Module core::core_arch::x86::sse2

🔬This is a nightly-only experimental API. (stdsimd #48556)
Available on x86 or x86-64 only.

Streaming SIMD Extensions 2 (SSE2)

Functions

  • _mm_slli_si128_impl 🔒 Experimental sse2
    Implementation detail: converts the immediate argument of the _mm_slli_si128 intrinsic into a compile-time constant.
  • _mm_srli_si128_impl 🔒 Experimental sse2
    Implementation detail: converts the immediate argument of the _mm_srli_si128 intrinsic into a compile-time constant.
  • clflush 🔒 Experimental
  • cmppd 🔒 Experimental
  • cmpsd 🔒 Experimental
  • comieqsd 🔒 Experimental
  • comigesd 🔒 Experimental
  • comigtsd 🔒 Experimental
  • comilesd 🔒 Experimental
  • comiltsd 🔒 Experimental
  • comineqsd 🔒 Experimental
  • cvtdq2ps 🔒 Experimental
  • cvtpd2dq 🔒 Experimental
  • cvtpd2ps 🔒 Experimental
  • cvtps2dq 🔒 Experimental
  • cvtps2pd 🔒 Experimental
  • cvtsd2si 🔒 Experimental
  • cvtsd2ss 🔒 Experimental
  • cvtss2sd 🔒 Experimental
  • cvttpd2dq 🔒 Experimental
  • cvttps2dq 🔒 Experimental
  • cvttsd2si 🔒 Experimental
  • lfence 🔒 Experimental
  • maskmovdqu 🔒 Experimental
  • maxpd 🔒 Experimental
  • maxsd 🔒 Experimental
  • mfence 🔒 Experimental
  • minpd 🔒 Experimental
  • minsd 🔒 Experimental
  • movmskpd 🔒 Experimental
  • packssdw 🔒 Experimental
  • packsswb 🔒 Experimental
  • packuswb 🔒 Experimental
  • pause 🔒 Experimental
  • pavgb 🔒 Experimental
  • pavgw 🔒 Experimental
  • pmaddwd 🔒 Experimental
  • pmulhuw 🔒 Experimental
  • pmulhw 🔒 Experimental
  • pmuludq 🔒 Experimental
  • psadbw 🔒 Experimental
  • pslld 🔒 Experimental
  • psllid 🔒 Experimental
  • pslliq 🔒 Experimental
  • pslliw 🔒 Experimental
  • psllq 🔒 Experimental
  • psllw 🔒 Experimental
  • psrad 🔒 Experimental
  • psraid 🔒 Experimental
  • psraiw 🔒 Experimental
  • psraw 🔒 Experimental
  • psrld 🔒 Experimental
  • psrlid 🔒 Experimental
  • psrliq 🔒 Experimental
  • psrliw 🔒 Experimental
  • psrlq 🔒 Experimental
  • psrlw 🔒 Experimental
  • sqrtpd 🔒 Experimental
  • sqrtsd 🔒 Experimental
  • storeudq 🔒 Experimental
  • storeupd 🔒 Experimental
  • ucomieqsd 🔒 Experimental
  • ucomigesd 🔒 Experimental
  • ucomigtsd 🔒 Experimental
  • ucomilesd 🔒 Experimental
  • ucomiltsd 🔒 Experimental
  • ucomineqsd 🔒 Experimental
  • Adds packed 8-bit integers in a and b.
  • Adds packed 16-bit integers in a and b.
  • Adds packed 32-bit integers in a and b.
  • Adds packed 64-bit integers in a and b.
  • _mm_add_pd sse2
    Adds packed double-precision (64-bit) floating-point elements in a and b.
  • _mm_add_sd sse2
    Returns a new vector with the low element of a replaced by the sum of the low elements of a and b.
  • Adds packed 8-bit integers in a and b using saturation.
  • Adds packed 16-bit integers in a and b using saturation.
  • Adds packed unsigned 8-bit integers in a and b using saturation.
  • Adds packed unsigned 16-bit integers in a and b using saturation.
  • _mm_and_pd sse2
    Computes the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b.
  • Computes the bitwise AND of 128 bits (representing integer data) in a and b.
  • Computes the bitwise NOT of a and then AND with b.
  • Computes the bitwise NOT of 128 bits (representing integer data) in a and then AND with b.
  • Averages packed unsigned 8-bit integers in a and b.
  • Averages packed unsigned 16-bit integers in a and b.
  • Shifts a left by IMM8 bytes while shifting in zeros.
  • Shifts a right by IMM8 bytes while shifting in zeros.
  • Casts a 128-bit floating-point vector of [2 x double] into a 128-bit floating-point vector of [4 x float].
  • Casts a 128-bit floating-point vector of [2 x double] into a 128-bit integer vector.
  • Casts a 128-bit floating-point vector of [4 x float] into a 128-bit floating-point vector of [2 x double].
  • Casts a 128-bit floating-point vector of [4 x float] into a 128-bit integer vector.
  • Casts a 128-bit integer vector into a 128-bit floating-point vector of [2 x double].
  • Casts a 128-bit integer vector into a 128-bit floating-point vector of [4 x float].
  • Invalidates and flushes the cache line that contains p from all levels of the cache hierarchy.
  • Compares packed 8-bit integers in a and b for equality.
  • Compares packed 16-bit integers in a and b for equality.
  • Compares packed 32-bit integers in a and b for equality.
  • Compares corresponding elements in a and b for equality.
  • Returns a new vector with the low element of a replaced by the equality comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for greater-than-or-equal.
  • Returns a new vector with the low element of a replaced by the greater-than-or-equal comparison of the lower elements of a and b.
  • Compares packed 8-bit integers in a and b for greater-than.
  • Compares packed 16-bit integers in a and b for greater-than.
  • Compares packed 32-bit integers in a and b for greater-than.
  • Compares corresponding elements in a and b for greater-than.
  • Returns a new vector with the low element of a replaced by the greater-than comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for less-than-or-equal.
  • Returns a new vector with the low element of a replaced by the less-than-or-equal comparison of the lower elements of a and b.
  • Compares packed 8-bit integers in a and b for less-than.
  • Compares packed 16-bit integers in a and b for less-than.
  • Compares packed 32-bit integers in a and b for less-than.
  • Compares corresponding elements in a and b for less-than.
  • Returns a new vector with the low element of a replaced by the less-than comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for not-equal.
  • Returns a new vector with the low element of a replaced by the not-equal comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for not-greater-than-or-equal.
  • Returns a new vector with the low element of a replaced by the not-greater-than-or-equal comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for not-greater-than.
  • Returns a new vector with the low element of a replaced by the not-greater-than comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for not-less-than-or-equal.
  • Returns a new vector with the low element of a replaced by the not-less-than-or-equal comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b for not-less-than.
  • Returns a new vector with the low element of a replaced by the not-less-than comparison of the lower elements of a and b.
  • Compares corresponding elements in a and b to see if neither is NaN.
  • Returns a new vector with the low element of a replaced by the result of comparing both of the lower elements of a and b to NaN. If neither is NaN, the low element is 0xFFFFFFFFFFFFFFFF; otherwise it is 0.
  • Compares corresponding elements in a and b to see if either is NaN.
  • Returns a new vector with the low element of a replaced by the result of comparing both of the lower elements of a and b to NaN. If either is NaN, the low element is 0xFFFFFFFFFFFFFFFF; otherwise it is 0.
  • Compares the lower element of a and b for equality.
  • Compares the lower element of a and b for greater-than-or-equal.
  • Compares the lower element of a and b for greater-than.
  • Compares the lower element of a and b for less-than-or-equal.
  • Compares the lower element of a and b for less-than.
  • Compares the lower element of a and b for not-equal.
  • Converts the lower two packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements.
  • Converts packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements.
  • Converts packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers.
  • Converts packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements.
  • Converts packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.
  • Converts packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements.
  • Returns the lower double-precision (64-bit) floating-point element of a.
  • Converts the lower double-precision (64-bit) floating-point element in a to a 32-bit integer.
  • Converts the lower double-precision (64-bit) floating-point element in b to a single-precision (32-bit) floating-point element, stores the result in the lower element of the return value, and copies the upper elements from a to the upper elements of the return value.
  • Returns a with its lower element replaced by b after converting it to an f64.
  • Returns a vector whose lowest element is a and all higher elements are 0.
  • Returns the lowest element of a.
  • Converts the lower single-precision (32-bit) floating-point element in b to a double-precision (64-bit) floating-point element, stores the result in the lower element of the return value, and copies the upper element from a to the upper element of the return value.
  • Converts packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation.
  • Converts packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation.
  • Converts the lower double-precision (64-bit) floating-point element in a to a 32-bit integer with truncation.
  • _mm_div_pd sse2
    Divides packed double-precision (64-bit) floating-point elements in a by packed elements in b.
  • _mm_div_sd sse2
    Returns a new vector with the low element of a replaced by the result of dividing the lower element of a by the lower element of b.
  • Returns the imm8 element of a.
  • Returns a new vector where the imm8 element of a is replaced with i.
  • _mm_lfence sse2
    Performs a serializing operation on all load-from-memory instructions that were issued prior to this instruction.
  • Loads a double-precision (64-bit) floating-point element from memory into both elements of the returned vector.
  • Loads 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Loads a double-precision (64-bit) floating-point element from memory into both elements of the returned vector.
  • Loads a 64-bit double-precision value to the low element of a 128-bit integer vector and clears the upper element.
  • Loads 128 bits of integer data from memory into a new vector.
  • Loads a double-precision value into the high-order bits of a 128-bit vector of [2 x double]. The low-order bits are copied from the low-order bits of the first operand.
  • Loads a 64-bit integer from memory into the first element of the returned vector.
  • Loads a double-precision value into the low-order bits of a 128-bit vector of [2 x double]. The high-order bits are copied from the high-order bits of the first operand.
  • Loads 2 double-precision (64-bit) floating-point elements from memory into the returned vector in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Loads 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector. mem_addr does not need to be aligned on any particular boundary.
  • Loads 128 bits of integer data from memory into a new vector.
  • Multiplies and then horizontally adds signed 16-bit integers in a and b.
  • Conditionally stores 8-bit integer elements from a into memory using mask.
  • Compares packed 16-bit integers in a and b, and returns the packed maximum values.
  • Compares packed unsigned 8-bit integers in a and b, and returns the packed maximum values.
  • _mm_max_pd sse2
    Returns a new vector with the maximum values from corresponding elements in a and b.
  • _mm_max_sd sse2
    Returns a new vector with the low element of a replaced by the maximum of the lower elements of a and b.
  • _mm_mfence sse2
    Performs a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior to this instruction.
  • Compares packed 16-bit integers in a and b, and returns the packed minimum values.
  • Compares packed unsigned 8-bit integers in a and b, and returns the packed minimum values.
  • _mm_min_pd sse2
    Returns a new vector with the minimum values from corresponding elements in a and b.
  • _mm_min_sd sse2
    Returns a new vector with the low element of a replaced by the minimum of the lower elements of a and b.
  • Returns a vector where the low element is extracted from a and its upper element is zero.
  • Constructs a 128-bit floating-point vector of [2 x double]. The lower 64 bits are set to the lower 64 bits of the second parameter. The upper 64 bits are set to the upper 64 bits of the first parameter.
  • Returns a mask of the most significant bit of each element in a.
  • Returns a mask of the most significant bit of each element in a.
  • Multiplies the low unsigned 32-bit integers from each packed 64-bit element in a and b.
  • _mm_mul_pd sse2
    Multiplies packed double-precision (64-bit) floating-point elements in a and b.
  • _mm_mul_sd sse2
    Returns a new vector with the low element of a replaced by multiplying the low elements of a and b.
  • Multiplies the packed 16-bit integers in a and b.
  • Multiplies the packed unsigned 16-bit integers in a and b.
  • Multiplies the packed 16-bit integers in a and b.
  • _mm_or_pd sse2
    Computes the bitwise OR of a and b.
  • Computes the bitwise OR of 128 bits (representing integer data) in a and b.
  • Converts packed 16-bit integers from a and b to packed 8-bit integers using signed saturation.
  • Converts packed 32-bit integers from a and b to packed 16-bit integers using signed saturation.
  • Converts packed 16-bit integers from a and b to packed 8-bit integers using unsigned saturation.
  • Provides a hint to the processor that the code sequence is a spin-wait loop.
  • Sums the absolute differences of packed unsigned 8-bit integers.
  • Broadcasts 8-bit integer a to all elements.
  • Broadcasts 16-bit integer a to all elements.
  • Broadcasts 32-bit integer a to all elements.
  • Broadcasts 64-bit integer a to all elements.
  • Broadcasts double-precision (64-bit) floating-point value a to all elements of the return value.
  • Sets packed 8-bit integers with the supplied values.
  • Sets packed 16-bit integers with the supplied values.
  • Sets packed 32-bit integers with the supplied values.
  • Sets packed 64-bit integers with the supplied values, from highest to lowest.
  • _mm_set_pd sse2
    Sets packed double-precision (64-bit) floating-point elements in the return value with the supplied values.
  • Broadcasts double-precision (64-bit) floating-point value a to all elements of the return value.
  • _mm_set_sd sse2
    Copies double-precision (64-bit) floating-point element a to the lower element of the packed 64-bit return value.
  • Sets packed 8-bit integers with the supplied values in reverse order.
  • Sets packed 16-bit integers with the supplied values in reverse order.
  • Sets packed 32-bit integers with the supplied values in reverse order.
  • Sets packed double-precision (64-bit) floating-point elements in the return value with the supplied values in reverse order.
  • Returns packed double-precision (64-bit) floating-point elements with all zeros.
  • Returns a vector with all elements set to zero.
  • Shuffles 32-bit integers in a using the control in IMM8.
  • Constructs a 128-bit floating-point vector of [2 x double] from two 128-bit vector parameters of [2 x double], using the immediate-value parameter as a specifier.
  • Shuffles 16-bit integers in the high 64 bits of a using the control in IMM8.
  • Shuffles 16-bit integers in the low 64 bits of a using the control in IMM8.
  • Shifts packed 16-bit integers in a left by count while shifting in zeros.
  • Shifts packed 32-bit integers in a left by count while shifting in zeros.
  • Shifts packed 64-bit integers in a left by count while shifting in zeros.
  • Shifts packed 16-bit integers in a left by IMM8 while shifting in zeros.
  • Shifts packed 32-bit integers in a left by IMM8 while shifting in zeros.
  • Shifts packed 64-bit integers in a left by IMM8 while shifting in zeros.
  • Shifts a left by IMM8 bytes while shifting in zeros.
  • Returns a new vector with the square root of each of the values in a.
  • Returns a new vector with the low element of a replaced by the square root of the lower element of b.
  • Shifts packed 16-bit integers in a right by count while shifting in sign bits.
  • Shifts packed 32-bit integers in a right by count while shifting in sign bits.
  • Shifts packed 16-bit integers in a right by IMM8 while shifting in sign bits.
  • Shifts packed 32-bit integers in a right by IMM8 while shifting in sign bits.
  • Shifts packed 16-bit integers in a right by count while shifting in zeros.
  • Shifts packed 32-bit integers in a right by count while shifting in zeros.
  • Shifts packed 64-bit integers in a right by count while shifting in zeros.
  • Shifts packed 16-bit integers in a right by IMM8 while shifting in zeros.
  • Shifts packed 32-bit integers in a right by IMM8 while shifting in zeros.
  • Shifts packed 64-bit integers in a right by IMM8 while shifting in zeros.
  • Shifts a right by IMM8 bytes while shifting in zeros.
  • Stores the lower double-precision (64-bit) floating-point element from a into 2 contiguous elements in memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Stores 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Stores the lower double-precision (64-bit) floating-point element from a into 2 contiguous elements in memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Stores the lower 64 bits of a 128-bit vector of [2 x double] to a memory location.
  • Stores 128 bits of integer data from a into memory.
  • Stores the upper 64 bits of a 128-bit vector of [2 x double] to a memory location.
  • Stores the lower 64-bit integer a to a memory location.
  • Stores the lower 64 bits of a 128-bit vector of [2 x double] to a memory location.
  • Stores 2 double-precision (64-bit) floating-point elements from a into memory in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
  • Stores 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
  • Stores 128 bits of integer data from a into memory.
  • Stores a 128-bit floating-point vector of [2 x double] to a 128-bit aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
  • Stores a 32-bit integer value in the specified memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
  • Stores a 128-bit integer vector to a 128-bit aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
  • Subtracts packed 8-bit integers in b from packed 8-bit integers in a.
  • Subtracts packed 16-bit integers in b from packed 16-bit integers in a.
  • Subtracts packed 32-bit integers in b from packed 32-bit integers in a.
  • Subtracts packed 64-bit integers in b from packed 64-bit integers in a.
  • _mm_sub_pd sse2
    Subtracts packed double-precision (64-bit) floating-point elements in b from a.
  • _mm_sub_sd sse2
    Returns a new vector with the low element of a replaced by subtracting the low element of b from the low element of a.
  • Subtracts packed 8-bit integers in b from packed 8-bit integers in a using saturation.
  • Subtracts packed 16-bit integers in b from packed 16-bit integers in a using saturation.
  • Subtracts packed unsigned 8-bit integers in b from packed unsigned 8-bit integers in a using saturation.
  • Subtracts packed unsigned 16-bit integers in b from packed unsigned 16-bit integers in a using saturation.
  • Compares the lower element of a and b for equality.
  • Compares the lower element of a and b for greater-than-or-equal.
  • Compares the lower element of a and b for greater-than.
  • Compares the lower element of a and b for less-than-or-equal.
  • Compares the lower element of a and b for less-than.
  • Compares the lower element of a and b for not-equal.
  • Returns a vector of type __m128d with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to mem::MaybeUninit. In practice, this is equivalent to mem::zeroed.
  • Returns a vector of type __m128i with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to mem::MaybeUninit. In practice, this is equivalent to mem::zeroed.
  • Unpacks and interleaves 8-bit integers from the high half of a and b.
  • Unpacks and interleaves 16-bit integers from the high half of a and b.
  • Unpacks and interleaves 32-bit integers from the high half of a and b.
  • Unpacks and interleaves 64-bit integers from the high half of a and b.
  • The resulting __m128d element is composed by the low-order values of the two __m128d interleaved input elements, i.e.:
  • Unpacks and interleaves 8-bit integers from the low half of a and b.
  • Unpacks and interleaves 16-bit integers from the low half of a and b.
  • Unpacks and interleaves 32-bit integers from the low half of a and b.
  • Unpacks and interleaves 64-bit integers from the low half of a and b.
  • The resulting __m128d element is composed by the high-order values of the two __m128d interleaved input elements, i.e.:
  • _mm_xor_pd sse2
    Computes the bitwise XOR of a and b.
  • Computes the bitwise XOR of 128 bits (representing integer data) in a and b.
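
A few of the operations summarized above can be tried out with a short sketch. It assumes an x86-64 target, where SSE2 is part of the baseline feature set, and it imports the intrinsics through the stable `std::arch::x86_64` re-export rather than through this nightly-only module path. The helper names `add_pairs`, `saturating_add_bytes`, and `equal_byte_mask` are hypothetical. Several intrinsic names used here (`_mm_loadu_pd`, `_mm_storeu_pd`, `_mm_loadu_si128`, `_mm_storeu_si128`, `_mm_adds_epu8`, `_mm_cmpeq_epi8`, `_mm_movemask_epi8`) are not spelled out in the listing; they are assumed to correspond to the unaligned load/store, unsigned saturating add, 8-bit equality compare, and most-significant-bit mask entries described there.

```rust
// Assumption: compiled for x86-64, where SSE2 is always available.
use std::arch::x86_64::*;

/// Adds two pairs of f64 values lane by lane using `_mm_add_pd`.
pub fn add_pairs(a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let mut out = [0.0f64; 2];
    // SAFETY: each pointer refers to two valid f64 values, and the
    // unaligned load/store variants have no alignment requirement.
    unsafe {
        let va = _mm_loadu_pd(a.as_ptr());
        let vb = _mm_loadu_pd(b.as_ptr());
        _mm_storeu_pd(out.as_mut_ptr(), _mm_add_pd(va, vb));
    }
    out
}

/// Adds sixteen unsigned bytes lane by lane, clamping each sum at 255.
pub fn saturating_add_bytes(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    // SAFETY: each array is exactly 16 bytes, matching one __m128i.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        let sum = _mm_adds_epu8(va, vb);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sum);
    }
    out
}

/// Returns a 16-bit mask in which bit `i` is set when `a[i] == b[i]`.
pub fn equal_byte_mask(a: [u8; 16], b: [u8; 16]) -> u16 {
    // SAFETY: each array is exactly 16 bytes, matching one __m128i.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // Equal lanes become 0xFF; movemask collects the top bit of each lane.
        _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) as u16
    }
}
```

The unaligned load and store variants are used so that ordinary arrays can be passed in; the aligned variants in the listing would additionally require 16-byte alignment of `mem_addr`.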