🔬 This is a nightly-only experimental API. (stdsimd #48556)
Available on x86 or x86-64 only.
Advanced Vector Extensions (AVX)
The references are:
- Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2: Instruction Set Reference, A-Z.
- AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions.
Wikipedia provides a quick overview of the instructions available.
Constants
- Equal (ordered, non-signaling)
- Equal (ordered, signaling)
- Equal (unordered, non-signaling)
- Equal (unordered, signaling)
- False (ordered, non-signaling)
- False (ordered, signaling)
- Greater-than-or-equal (ordered, non-signaling)
- Greater-than-or-equal (ordered, signaling)
- Greater-than (ordered, non-signaling)
- Greater-than (ordered, signaling)
- Less-than-or-equal (ordered, non-signaling)
- Less-than-or-equal (ordered, signaling)
- Less-than (ordered, non-signaling)
- Less-than (ordered, signaling)
- Not-equal (ordered, non-signaling)
- Not-equal (ordered, signaling)
- Not-equal (unordered, non-signaling)
- Not-equal (unordered, signaling)
- Not-greater-than-or-equal (unordered, non-signaling)
- Not-greater-than-or-equal (unordered, signaling)
- Not-greater-than (unordered, non-signaling)
- Not-greater-than (unordered, signaling)
- Not-less-than-or-equal (unordered, non-signaling)
- Not-less-than-or-equal (unordered, signaling)
- Not-less-than (unordered, non-signaling)
- Not-less-than (unordered, signaling)
- Ordered (non-signaling)
- Ordered (signaling)
- True (unordered, non-signaling)
- True (unordered, signaling)
- Unordered (non-signaling)
- Unordered (signaling)
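
The predicates listed above are passed to the compare intrinsics as an immediate operand. The following is a minimal sketch rather than anything taken from this page: it assumes the constants are exposed under their usual names (here `_CMP_LT_OQ` and `_CMP_NLT_UQ`) and that the intrinsics are reachable through `std::arch::x86_64`; `lt_masks` and `lt_masks_avx` are hypothetical helper names. It illustrates how an ordered and an unordered predicate differ when a lane holds NaN.

```rust
/// Returns two 4-bit lane masks (bit i describes lane i):
/// .0 uses "less-than (ordered, non-signaling)",
/// .1 uses "not-less-than (unordered, non-signaling)".
pub fn lt_masks(a: [f64; 4], b: [f64; 4]) -> (u8, u8) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { lt_masks_avx(a, b) };
        }
    }
    // Portable fallback (assumption: used only when AVX is unavailable).
    // `a < b` is false whenever either side is NaN, matching the ordered
    // predicate; its negation matches the unordered "not-less-than".
    let mut lt = 0u8;
    let mut nlt = 0u8;
    for i in 0..4 {
        if a[i] < b[i] {
            lt |= 1 << i;
        } else {
            nlt |= 1 << i;
        }
    }
    (lt, nlt)
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn lt_masks_avx(a: [f64; 4], b: [f64; 4]) -> (u8, u8) {
    use std::arch::x86_64::*;
    // Unaligned loads: plain arrays carry no 32-byte alignment guarantee.
    let va = _mm256_loadu_pd(a.as_ptr());
    let vb = _mm256_loadu_pd(b.as_ptr());
    // Each lane becomes all-ones when the predicate holds, all-zeros otherwise.
    let lt = _mm256_cmp_pd::<_CMP_LT_OQ>(va, vb);
    let nlt = _mm256_cmp_pd::<_CMP_NLT_UQ>(va, vb);
    // movemask collects the most significant bit of every lane.
    (_mm256_movemask_pd(lt) as u8, _mm256_movemask_pd(nlt) as u8)
}
```

In any lane where one input is NaN, the ordered predicate reports false and the unordered one reports true, so the two masks are complements of each other.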
Functions
- _mm256_add_pd⚠
  avx: Adds packed double-precision (64-bit) floating-point elements in `a` and `b`.
- _mm256_add_ps⚠
  avx: Adds packed single-precision (32-bit) floating-point elements in `a` and `b`.
- _mm256_addsub_pd⚠
  avx: Alternately adds and subtracts packed double-precision (64-bit) floating-point elements in `a` to/from packed elements in `b`.
- _mm256_addsub_ps⚠
  avx: Alternately adds and subtracts packed single-precision (32-bit) floating-point elements in `a` to/from packed elements in `b`.
- _mm256_and_pd⚠
  avx: Computes the bitwise AND of packed double-precision (64-bit) floating-point elements in `a` and `b`.
- _mm256_and_ps⚠
  avx: Computes the bitwise AND of packed single-precision (32-bit) floating-point elements in `a` and `b`.
- _mm256_andnot_pd⚠
  avx: Computes the bitwise NOT of packed double-precision (64-bit) floating-point elements in `a`, and then ANDs the result with `b`.
- _mm256_andnot_ps⚠
  avx: Computes the bitwise NOT of packed single-precision (32-bit) floating-point elements in `a`, and then ANDs the result with `b`.
- _mm256_blend_pd⚠
  avx: Blends packed double-precision (64-bit) floating-point elements from `a` and `b` using control mask `imm8`.
- _mm256_blend_ps⚠
  avx: Blends packed single-precision (32-bit) floating-point elements from `a` and `b` using control mask `imm8`.
- _mm256_blendv_pd⚠
  avx: Blends packed double-precision (64-bit) floating-point elements from `a` and `b` using `c` as a mask.
- _mm256_blendv_ps⚠
  avx: Blends packed single-precision (32-bit) floating-point elements from `a` and `b` using `c` as a mask.
- Broadcasts 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of the returned vector.
- Broadcasts 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of the returned vector.
- Broadcasts a double-precision (64-bit) floating-point element from memory to all elements of the returned vector.
- Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
- Casts vector of type __m128d to type __m256d; the upper 128 bits of the result are undefined.
- Casts vector of type __m256d to type __m128d.
- _mm256_castpd_ps⚠
  avx: Casts vector of type __m256d to type __m256.
- Casts vector of type __m256d to type __m256i.
- Casts vector of type __m128 to type __m256; the upper 128 bits of the result are undefined.
- Casts vector of type __m256 to type __m128.
- _mm256_castps_pd⚠
  avx: Casts vector of type __m256 to type __m256d.
- Casts vector of type __m256 to type __m256i.
- Casts vector of type __m128i to type __m256i; the upper 128 bits of the result are undefined.
- Casts vector of type __m256i to type __m256d.
- Casts vector of type __m256i to type __m256.
- Casts vector of type __m256i to type __m128i.
- _mm256_ceil_pd⚠
  avx: Rounds packed double-precision (64-bit) floating-point elements in `a` toward positive infinity.
- _mm256_ceil_ps⚠
  avx: Rounds packed single-precision (32-bit) floating-point elements in `a` toward positive infinity.
- _mm256_cmp_pd⚠
  avx: Compares packed double-precision (64-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `IMM5`.
- _mm256_cmp_ps⚠
  avx: Compares packed single-precision (32-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `IMM5`.
- Converts packed 32-bit integers in `a` to packed double-precision (64-bit) floating-point elements.
- Converts packed 32-bit integers in `a` to packed single-precision (32-bit) floating-point elements.
- Converts packed double-precision (64-bit) floating-point elements in `a` to packed 32-bit integers.
- _mm256_cvtpd_ps⚠
  avx: Converts packed double-precision (64-bit) floating-point elements in `a` to packed single-precision (32-bit) floating-point elements.
- Converts packed single-precision (32-bit) floating-point elements in `a` to packed 32-bit integers.
- _mm256_cvtps_pd⚠
  avx: Converts packed single-precision (32-bit) floating-point elements in `a` to packed double-precision (64-bit) floating-point elements.
- _mm256_cvtss_f32⚠
  avx: Returns the first element of the input vector of `[8 x float]`.
- Converts packed double-precision (64-bit) floating-point elements in `a` to packed 32-bit integers with truncation.
- Converts packed single-precision (32-bit) floating-point elements in `a` to packed 32-bit integers with truncation.
- _mm256_div_pd⚠
  avx: Computes the division of each of the 4 packed 64-bit floating-point elements in `a` by the corresponding packed elements in `b`.
- _mm256_div_ps⚠
  avx: Computes the division of each of the 8 packed 32-bit floating-point elements in `a` by the corresponding packed elements in `b`.
- _mm256_dp_ps⚠
  avx: Conditionally multiplies the packed single-precision (32-bit) floating-point elements in `a` and `b` using the high 4 bits in `imm8`, sums the four products, and conditionally returns the sum using the low 4 bits of `imm8`.
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from `a`, selected with `imm8`.
- Extracts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from `a`, selected with `imm8`.
- Extracts 128 bits (composed of integer data) from `a`, selected with `imm8`.
- _mm256_floor_pd⚠
  avx: Rounds packed double-precision (64-bit) floating-point elements in `a` toward negative infinity.
- _mm256_floor_ps⚠
  avx: Rounds packed single-precision (32-bit) floating-point elements in `a` toward negative infinity.
- _mm256_hadd_pd⚠
  avx: Horizontal addition of adjacent pairs in the two packed vectors of 4 64-bit floating-point elements `a` and `b`. In the result, sums of elements from `a` are returned in even locations, while sums of elements from `b` are returned in odd locations.
- _mm256_hadd_ps⚠
  avx: Horizontal addition of adjacent pairs in the two packed vectors of 8 32-bit floating-point elements `a` and `b`. In the result, sums of elements from `a` are returned in locations 0, 1, 4, 5, while sums of elements from `b` are returned in locations 2, 3, 6, 7.
- _mm256_hsub_pd⚠
  avx: Horizontal subtraction of adjacent pairs in the two packed vectors of 4 64-bit floating-point elements `a` and `b`. In the result, differences of elements from `a` are returned in even locations, while differences of elements from `b` are returned in odd locations.
- _mm256_hsub_ps⚠
  avx: Horizontal subtraction of adjacent pairs in the two packed vectors of 8 32-bit floating-point elements `a` and `b`. In the result, differences of elements from `a` are returned in locations 0, 1, 4, 5, while differences of elements from `b` are returned in locations 2, 3, 6, 7.
- Copies `a` to result, and inserts the 8-bit integer `i` into result at the location specified by `index`.
- Copies `a` to result, and inserts the 16-bit integer `i` into result at the location specified by `index`.
- Copies `a` to result, and inserts the 32-bit integer `i` into result at the location specified by `index`.
- Copies `a` to result, then inserts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from `b` into result at the location specified by `imm8`.
- Copies `a` to result, then inserts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from `b` into result at the location specified by `imm8`.
- Copies `a` to result, then inserts 128 bits from `b` into result at the location specified by `imm8`.
- Loads 256-bits of integer data from unaligned memory into result. This intrinsic may perform better than `_mm256_loadu_si256` when the data crosses a cache line boundary.
- _mm256_load_pd⚠
  avx: Loads 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_load_ps⚠
  avx: Loads 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- Loads 256-bits of integer data from memory into result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_loadu2_m128⚠
  avx,sse: Loads two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combines them into a 256-bit value. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_loadu2_m128d⚠
  avx,sse2: Loads two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combines them into a 256-bit value. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_loadu2_m128i⚠
  avx,sse2: Loads two 128-bit values (composed of integer data) from memory, and combines them into a 256-bit value. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_loadu_pd⚠
  avx: Loads 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into result. `mem_addr` does not need to be aligned on any particular boundary.
- _mm256_loadu_ps⚠
  avx: Loads 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into result. `mem_addr` does not need to be aligned on any particular boundary.
- Loads 256-bits of integer data from memory into result. `mem_addr` does not need to be aligned on any particular boundary.
- Loads packed double-precision (64-bit) floating-point elements from memory into result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
- Loads packed single-precision (32-bit) floating-point elements from memory into result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
- Stores packed double-precision (64-bit) floating-point elements from `a` into memory using `mask`.
- Stores packed single-precision (32-bit) floating-point elements from `a` into memory using `mask`.
- _mm256_max_pd⚠
  avx: Compares packed double-precision (64-bit) floating-point elements in `a` and `b`, and returns packed maximum values.
- _mm256_max_ps⚠
  avx: Compares packed single-precision (32-bit) floating-point elements in `a` and `b`, and returns packed maximum values.
- _mm256_min_pd⚠
  avx: Compares packed double-precision (64-bit) floating-point elements in `a` and `b`, and returns packed minimum values.
- _mm256_min_ps⚠
  avx: Compares packed single-precision (32-bit) floating-point elements in `a` and `b`, and returns packed minimum values.
- Duplicates even-indexed double-precision (64-bit) floating-point elements from `a`, and returns the results.
- Duplicates odd-indexed single-precision (32-bit) floating-point elements from `a`, and returns the results.
- Duplicates even-indexed single-precision (32-bit) floating-point elements from `a`, and returns the results.
- Sets each bit of the returned mask based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in `a`.
- Sets each bit of the returned mask based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in `a`.
- _mm256_mul_pd⚠
  avx: Multiplies packed double-precision (64-bit) floating-point elements in `a` and `b`.
- _mm256_mul_ps⚠
  avx: Multiplies packed single-precision (32-bit) floating-point elements in `a` and `b`.
- _mm256_or_pd⚠
  avx: Computes the bitwise OR of packed double-precision (64-bit) floating-point elements in `a` and `b`.
- _mm256_or_ps⚠
  avx: Computes the bitwise OR of packed single-precision (32-bit) floating-point elements in `a` and `b`.
- Shuffles 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) selected by `imm8` from `a` and `b`.
- Shuffles 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) selected by `imm8` from `a` and `b`.
- Shuffles 128-bits (composed of integer data) selected by `imm8` from `a` and `b`.
- Shuffles double-precision (64-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
- Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
- Shuffles double-precision (64-bit) floating-point elements in `a` within 128-bit lanes using the control in `b`.
- Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `b`.
- _mm256_rcp_ps⚠
  avx: Computes the approximate reciprocal of packed single-precision (32-bit) floating-point elements in `a`, and returns the results. The maximum relative error for this approximation is less than 1.5*2^-12.
- _mm256_round_pd⚠
  avx: Rounds packed double-precision (64-bit) floating-point elements in `a` according to the flag `ROUNDING`. The value of `ROUNDING` may be as follows:
- _mm256_round_ps⚠
  avx: Rounds packed single-precision (32-bit) floating-point elements in `a` according to the flag `ROUNDING`. The value of `ROUNDING` may be as follows:
- _mm256_rsqrt_ps⚠
  avx: Computes the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in `a`, and returns the results. The maximum relative error for this approximation is less than 1.5*2^-12.
- _mm256_set1_epi8⚠
  avx: Broadcasts 8-bit integer `a` to all elements of returned vector. This intrinsic may generate the `vpbroadcastb` instruction.
- Broadcasts 16-bit integer `a` to all elements of returned vector. This intrinsic may generate the `vpbroadcastw` instruction.
- Broadcasts 32-bit integer `a` to all elements of returned vector. This intrinsic may generate the `vpbroadcastd` instruction.
- Broadcasts 64-bit integer `a` to all elements of returned vector. This intrinsic may generate the `vpbroadcastq` instruction.
- _mm256_set1_pd⚠
  avx: Broadcasts double-precision (64-bit) floating-point value `a` to all elements of returned vector.
- _mm256_set1_ps⚠
  avx: Broadcasts single-precision (32-bit) floating-point value `a` to all elements of returned vector.
- _mm256_set_epi8⚠
  avx: Sets packed 8-bit integers in returned vector with the supplied values.
- _mm256_set_epi16⚠
  avx: Sets packed 16-bit integers in returned vector with the supplied values.
- _mm256_set_epi32⚠
  avx: Sets packed 32-bit integers in returned vector with the supplied values.
- Sets packed 64-bit integers in returned vector with the supplied values.
- _mm256_set_m128⚠
  avx: Sets packed __m256 returned vector with the supplied values.
- _mm256_set_m128d⚠
  avx: Sets packed __m256d returned vector with the supplied values.
- _mm256_set_m128i⚠
  avx: Sets packed __m256i returned vector with the supplied values.
- _mm256_set_pd⚠
  avx: Sets packed double-precision (64-bit) floating-point elements in returned vector with the supplied values.
- _mm256_set_ps⚠
  avx: Sets packed single-precision (32-bit) floating-point elements in returned vector with the supplied values.
- _mm256_setr_epi8⚠
  avx: Sets packed 8-bit integers in returned vector with the supplied values in reverse order.
- Sets packed 16-bit integers in returned vector with the supplied values in reverse order.
- Sets packed 32-bit integers in returned vector with the supplied values in reverse order.
- Sets packed 64-bit integers in returned vector with the supplied values in reverse order.
- _mm256_setr_m128⚠
  avx: Sets packed __m256 returned vector with the supplied values.
- Sets packed __m256d returned vector with the supplied values.
- Sets packed __m256i returned vector with the supplied values.
- _mm256_setr_pd⚠
  avx: Sets packed double-precision (64-bit) floating-point elements in returned vector with the supplied values in reverse order.
- _mm256_setr_ps⚠
  avx: Sets packed single-precision (32-bit) floating-point elements in returned vector with the supplied values in reverse order.
- Returns vector of type __m256d with all elements set to zero.
- Returns vector of type __m256 with all elements set to zero.
- Returns vector of type __m256i with all elements set to zero.
- Shuffles double-precision (64-bit) floating-point elements within 128-bit lanes using the control in `imm8`.
- Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
- _mm256_sqrt_pd⚠
  avx: Returns the square root of packed double-precision (64-bit) floating-point elements in `a`.
- _mm256_sqrt_ps⚠
  avx: Returns the square root of packed single-precision (32-bit) floating-point elements in `a`.
- _mm256_store_pd⚠
  avx: Stores 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_store_ps⚠
  avx: Stores 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- Stores 256-bits of integer data from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
- _mm256_storeu2_m128⚠
  avx,sse: Stores the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from `a` into two different 128-bit memory locations. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_storeu2_m128d⚠
  avx,sse2: Stores the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from `a` into two different 128-bit memory locations. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_storeu2_m128i⚠
  avx,sse2: Stores the high and low 128-bit halves (each composed of integer data) from `a` into two different 128-bit memory locations. `hiaddr` and `loaddr` do not need to be aligned on any particular boundary.
- _mm256_storeu_pd⚠
  avx: Stores 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
- _mm256_storeu_ps⚠
  avx: Stores 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
- Stores 256-bits of integer data from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
- _mm256_stream_pd⚠
  avx: Moves double-precision values from a 256-bit vector of `[4 x double]` to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
- _mm256_stream_ps⚠
  avx: Moves single-precision floating-point values from a 256-bit vector of `[8 x float]` to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
- Moves integer data from a 256-bit integer vector to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
- _mm256_sub_pd⚠
  avx: Subtracts packed double-precision (64-bit) floating-point elements in `b` from packed elements in `a`.
- _mm256_sub_ps⚠
  avx: Subtracts packed single-precision (32-bit) floating-point elements in `b` from packed elements in `a`.
- _mm256_testc_pd⚠
  avx: Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `CF` value.
- _mm256_testc_ps⚠
  avx: Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `CF` value.
- Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and sets `ZF` to 1 if the result is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, and sets `CF` to 1 if the result is zero, otherwise sets `CF` to 0. Returns the `CF` value.
- Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns 1 if both the `ZF` and `CF` values are zero, otherwise returns 0.
- Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns 1 if both the `ZF` and `CF` values are zero, otherwise returns 0.
- Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and sets `ZF` to 1 if the result is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, and sets `CF` to 1 if the result is zero, otherwise sets `CF` to 0. Returns 1 if both the `ZF` and `CF` values are zero, otherwise returns 0.
- _mm256_testz_pd⚠
  avx: Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `ZF` value.
- _mm256_testz_ps⚠
  avx: Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 256-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `ZF` value.
- Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and sets `ZF` to 1 if the result is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, and sets `CF` to 1 if the result is zero, otherwise sets `CF` to 0. Returns the `ZF` value.
- Returns vector of type `__m256d` with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to `mem::MaybeUninit`. In practice, this is equivalent to `mem::zeroed`.
- Returns vector of type `__m256` with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to `mem::MaybeUninit`. In practice, this is equivalent to `mem::zeroed`.
- Returns vector of type __m256i with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to `mem::MaybeUninit`. In practice, this is equivalent to `mem::zeroed`.
- Unpacks and interleaves double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in `a` and `b`.
- Unpacks and interleaves single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in `a` and `b`.
- Unpacks and interleaves double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in `a` and `b`.
- Unpacks and interleaves single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in `a` and `b`.
- _mm256_xor_pd⚠
  avx: Computes the bitwise XOR of packed double-precision (64-bit) floating-point elements in `a` and `b`.
- _mm256_xor_ps⚠
  avx: Computes the bitwise XOR of packed single-precision (32-bit) floating-point elements in `a` and `b`.
- _mm256_zeroall⚠
  avx: Zeroes the contents of all XMM or YMM registers.
- _mm256_zeroupper⚠
  avx: Zeroes the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.
- _mm256_zextpd128_pd256⚠
  avx,sse2: Constructs a 256-bit floating-point vector of `[4 x double]` from a 128-bit floating-point vector of `[2 x double]`. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
- _mm256_zextps128_ps256⚠
  avx,sse: Constructs a 256-bit floating-point vector of `[8 x float]` from a 128-bit floating-point vector of `[4 x float]`. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
- _mm256_zextsi128_si256⚠
  avx,sse2: Constructs a 256-bit integer vector from a 128-bit integer vector. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
- _mm_broadcast_ss⚠
  avx: Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
- _mm_cmp_pd⚠
  avx,sse2: Compares packed double-precision (64-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `IMM5`.
- _mm_cmp_ps⚠
  avx,sse: Compares packed single-precision (32-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `IMM5`.
- _mm_cmp_sd⚠
  avx,sse2: Compares the lower double-precision (64-bit) floating-point element in `a` and `b` based on the comparison operand specified by `IMM5`, stores the result in the lower element of returned vector, and copies the upper element from `a` to the upper element of returned vector.
- _mm_cmp_ss⚠
  avx,sse: Compares the lower single-precision (32-bit) floating-point element in `a` and `b` based on the comparison operand specified by `IMM5`, stores the result in the lower element of returned vector, and copies the upper 3 packed elements from `a` to the upper elements of returned vector.
- _mm_maskload_pd⚠
  avx: Loads packed double-precision (64-bit) floating-point elements from memory into result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
- _mm_maskload_ps⚠
  avx: Loads packed single-precision (32-bit) floating-point elements from memory into result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
- _mm_maskstore_pd⚠
  avx: Stores packed double-precision (64-bit) floating-point elements from `a` into memory using `mask`.
- _mm_maskstore_ps⚠
  avx: Stores packed single-precision (32-bit) floating-point elements from `a` into memory using `mask`.
- _mm_permute_pd⚠
  avx,sse2: Shuffles double-precision (64-bit) floating-point elements in `a` using the control in `imm8`.
- _mm_permute_ps⚠
  avx,sse: Shuffles single-precision (32-bit) floating-point elements in `a` using the control in `imm8`.
- Shuffles double-precision (64-bit) floating-point elements in `a` using the control in `b`.
- Shuffles single-precision (32-bit) floating-point elements in `a` using the control in `b`.
- _mm_testc_pd⚠
  avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `CF` value.
- _mm_testc_ps⚠
  avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `CF` value.
- _mm_testnzc_pd⚠
  avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns 1 if both the `ZF` and `CF` values are zero, otherwise returns 0.
- _mm_testnzc_ps⚠
  avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns 1 if both the `ZF` and `CF` values are zero, otherwise returns 0.
- _mm_testz_pd⚠
  avx: Computes the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `ZF` value.
- _mm_testz_ps⚠
  avx: Computes the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, producing an intermediate 128-bit value, and sets `ZF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `ZF` to 0. Computes the bitwise NOT of `a` and then ANDs it with `b`, producing an intermediate value, and sets `CF` to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise sets `CF` to 0. Returns the `ZF` value.
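
Every function in the list above is marked ⚠ and tagged with the CPU features it requires, so a caller has to confirm those features before invoking it. The following is a rough sketch of one common calling pattern, under the assumption that the intrinsics are imported from `std::arch::x86_64`; `add_slices` and `add_slices_avx` are hypothetical names, and the scalar fallback is an illustrative choice rather than anything this page prescribes. It combines `_mm256_loadu_ps`, `_mm256_add_ps` and `_mm256_storeu_ps` behind a run-time feature check.

```rust
/// Element-wise `out[i] = a[i] + b[i]`, using AVX when the CPU supports it.
pub fn add_slices(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == out.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX was detected, and all three slices have equal length.
            unsafe { add_slices_avx(a, b, out) };
            return;
        }
    }
    // Scalar fallback for other targets or CPUs without AVX.
    for i in 0..a.len() {
        out[i] = a[i] + b[i];
    }
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn add_slices_avx(a: &[f32], b: &[f32], out: &mut [f32]) {
    use std::arch::x86_64::*;
    let full = a.len() / 8; // number of complete 8-lane groups
    for i in 0..full {
        let off = i * 8;
        // The unaligned variants are used because slices of f32 are only
        // guaranteed 4-byte alignment, not the 32 bytes `_mm256_load_ps` needs.
        let va = _mm256_loadu_ps(a.as_ptr().add(off));
        let vb = _mm256_loadu_ps(b.as_ptr().add(off));
        _mm256_storeu_ps(out.as_mut_ptr().add(off), _mm256_add_ps(va, vb));
    }
    // Handle the remaining 0..=7 elements one at a time.
    for i in full * 8..a.len() {
        out[i] = a[i] + b[i];
    }
}
```

Switching to the aligned `_mm256_load_ps` and `_mm256_store_ps` would require the caller to guarantee 32-byte alignment of every pointer, which is why this sketch stays with the unaligned forms.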