Microsoft Specific
Emits the Streaming SIMD Extensions 4 (SSE4) instruction dppd. This instruction computes the dot product of double-precision floating-point values.
__m128d _mm_dp_pd(
   __m128d a,
   __m128d b,
   const int mask
);
Parameters
[in] a
A 128-bit parameter that contains two 64-bit floating point values.
[in] b
A 128-bit parameter that contains two 64-bit floating point values.
[in] mask
A constant mask that determines which components will be multiplied and where to place the results.
Return value
A 128-bit value that contains the two 64-bit results of the dot product.
The result can be expressed with the following equations:
tmp0 := (mask4 == 1) ? (a0 * b0) : +0.0
tmp1 := (mask5 == 1) ? (a1 * b1) : +0.0
tmp2 := tmp0 + tmp1
r0 := (mask0 == 1) ? tmp2 : +0.0
r1 := (mask1 == 1) ? tmp2 : +0.0
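As a reading aid, the equations above can be sketched as a plain scalar function. This is only an illustrative model, not how the intrinsic is implemented: the `pair_f64` type and the `dp_pd_model` name are hypothetical stand-ins introduced here, and the sketch mirrors the equations only, without modeling any hardware-specific rounding or exception behavior.

```c
/* Hypothetical helper type: a pair of doubles standing in for __m128d.
   lo corresponds to a0/b0/r0, hi corresponds to a1/b1/r1. */
typedef struct {
    double lo;
    double hi;
} pair_f64;

/* Scalar model of the equations above (illustrative name, not a real API). */
pair_f64 dp_pd_model(pair_f64 a, pair_f64 b, int mask)
{
    /* Bits 4 and 5 select which products contribute to the sum. */
    double tmp0 = ((mask >> 4) & 1) ? (a.lo * b.lo) : 0.0;
    double tmp1 = ((mask >> 5) & 1) ? (a.hi * b.hi) : 0.0;
    double tmp2 = tmp0 + tmp1;

    /* Bits 0 and 1 select which result elements receive the sum. */
    pair_f64 r;
    r.lo = ((mask >> 0) & 1) ? tmp2 : 0.0;
    r.hi = ((mask >> 1) & 1) ? tmp2 : 0.0;
    return r;
}
```

For instance, with a mask of 0x13 only the low product is computed (bit 4), and that single product is written to both result elements (bits 0 and 1).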
Requirements
| Intrinsic | Architecture |
|---|---|
| _mm_dp_pd | x86, x64 |

Header file <smmintrin.h>
Remarks
Bits 4-5 of the immediate mask determine which of the corresponding source operand pairs are multiplied. Bits 0-1 determine which elements of the return value receive the dot product result. If a mask bit is 0, the corresponding product or result element is +0.0.
r0, a0, and b0 are the lowest 64 bits of return value r and parameters a and b, respectively. r1, a1, and b1 are the highest 64 bits of return value r and parameters a and b, respectively.
maski is bit i of parameter mask, where bit 0 is the least significant bit.
Before using this intrinsic, ensure that the underlying processor supports the instruction.
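One possible way to perform that check at run time is to query CPUID. The sketch below assumes that dppd belongs to the SSE4.1 subset, which CPUID leaf 1 reports in bit 19 of ECX. The function name `has_sse41` is a hypothetical helper, and the non-MSVC branch is an added assumption for GCC and Clang builds that is not covered by this page.

```c
#if defined(_MSC_VER)
#include <intrin.h>      /* __cpuid */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>       /* __get_cpuid */
#endif

/* Hypothetical helper: returns 1 if CPUID reports SSE4.1, otherwise 0. */
int has_sse41(void)
{
#if defined(_MSC_VER)
    int regs[4];                     /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 1);                /* leaf 1: feature information */
    return (regs[2] >> 19) & 1;      /* ECX bit 19: SSE4.1 */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                    /* leaf 1 is not available */
    return (int)((ecx >> 19) & 1u);  /* ECX bit 19: SSE4.1 */
#else
    return 0;                        /* not an x86 target */
#endif
}
```

A caller could test `has_sse41()` once at startup and fall back to a scalar code path when it returns 0.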
Example
#include <stdio.h>
#include <smmintrin.h>

int main ()
{
    __m128d a, b;
    // Bits 4 and 5 set: multiply both element pairs.
    // Bit 0 set, bit 1 clear: write the sum to the low element only.
    const int mask = 0x31;

    a.m128d_f64[0] = 1.5;
    a.m128d_f64[1] = 10.25;
    b.m128d_f64[0] = -1.5;
    b.m128d_f64[1] = 3.125;

    __m128d res = _mm_dp_pd(a, b, mask);

    printf_s("Original a: %f\t%f\nOriginal b: %f\t%f\n",
                a.m128d_f64[0], a.m128d_f64[1], b.m128d_f64[0], b.m128d_f64[1]);
    printf_s("Result res: %f\t%f\n",
                res.m128d_f64[0], res.m128d_f64[1]);

    return 0;
}
Original a: 1.500000	10.250000
Original b: -1.500000	3.125000
Result res: 29.781250	0.000000