Module core::core_arch::x86::sse

🔬This is a nightly-only experimental API. (stdsimd #48556)

Available on x86 or x86-64 only.

Streaming SIMD Extensions (SSE)

Constants

_MM_EXCEPT_DENORM
See _mm_setcsr
_MM_EXCEPT_DIV_ZERO
See _mm_setcsr
_MM_EXCEPT_INEXACT
See _mm_setcsr
_MM_EXCEPT_INVALID
See _mm_setcsr
_MM_EXCEPT_MASK
See _MM_GET_EXCEPTION_STATE
_MM_EXCEPT_OVERFLOW
See _mm_setcsr
_MM_EXCEPT_UNDERFLOW
See _mm_setcsr
_MM_FLUSH_ZERO_MASK
See _MM_GET_FLUSH_ZERO_MODE
_MM_FLUSH_ZERO_OFF
See _mm_setcsr
_MM_FLUSH_ZERO_ON
See _mm_setcsr
_MM_HINT_ET0
See _mm_prefetch.
_MM_HINT_ET1
See _mm_prefetch.
_MM_HINT_NTA
See _mm_prefetch.
_MM_HINT_T0
See _mm_prefetch.
_MM_HINT_T1
See _mm_prefetch.
_MM_HINT_T2
See _mm_prefetch.
_MM_MASK_DENORM
See _mm_setcsr
_MM_MASK_DIV_ZERO
See _mm_setcsr
_MM_MASK_INEXACT
See _mm_setcsr
_MM_MASK_INVALID
See _mm_setcsr
_MM_MASK_MASK
See _MM_GET_EXCEPTION_MASK
_MM_MASK_OVERFLOW
See _mm_setcsr
_MM_MASK_UNDERFLOW
See _mm_setcsr
_MM_ROUND_DOWN
See _mm_setcsr
_MM_ROUND_MASK
See _MM_GET_ROUNDING_MODE
_MM_ROUND_NEAREST
See _mm_setcsr
_MM_ROUND_TOWARD_ZERO
See _mm_setcsr
_MM_ROUND_UP
See _mm_setcsr
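The _MM_HINT_* constants listed above are passed to _mm_prefetch as its STRATEGY parameter. The following is a minimal sketch, assuming an x86-64 target and the public std::arch::x86_64 path; sum_with_prefetch and PREFETCH_DISTANCE are hypothetical names chosen for illustration, and the distance is not a tuned value.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};

/// Sums a slice while hinting that data a little further ahead will be needed.
/// Hypothetical helper, for illustration only.
pub fn sum_with_prefetch(data: &[f32]) -> f32 {
    // Arbitrary illustrative look-ahead, measured in elements.
    const PREFETCH_DISTANCE: usize = 64;
    let mut total = 0.0f32;
    for (i, &x) in data.iter().enumerate() {
        if i + PREFETCH_DISTANCE < data.len() {
            // SAFETY: the offset is in bounds, so the pointer is valid.
            // The prefetch itself is only a hint to the cache hierarchy.
            unsafe {
                let ahead = data.as_ptr().add(i + PREFETCH_DISTANCE);
                _mm_prefetch::<_MM_HINT_T0>(ahead as *const i8);
            }
        }
        total += x;
    }
    total
}
```

The prefetch does not change the result; whether it helps performance depends on the access pattern and hardware, so it should be measured rather than assumed.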

Functions

_MM_SHUFFLE Experimental
A utility function for creating masks to use with Intel shuffle and permute intrinsics.
addss 🔒 ^⚠Experimental
cmpps 🔒 ^⚠Experimental
cmpss 🔒 ^⚠Experimental
comieq_ss 🔒 ^⚠Experimental
comige_ss 🔒 ^⚠Experimental
comigt_ss 🔒 ^⚠Experimental
comile_ss 🔒 ^⚠Experimental
comilt_ss 🔒 ^⚠Experimental
comineq_ss 🔒 ^⚠Experimental
cvtsi2ss 🔒 ^⚠Experimental
cvtss2si 🔒 ^⚠Experimental
cvttss2si 🔒 ^⚠Experimental
divss 🔒 ^⚠Experimental
ldmxcsr 🔒 ^⚠Experimental
maxps 🔒 ^⚠Experimental
maxss 🔒 ^⚠Experimental
minps 🔒 ^⚠Experimental
minss 🔒 ^⚠Experimental
movmskps 🔒 ^⚠Experimental
mulss 🔒 ^⚠Experimental
prefetch 🔒 ^⚠Experimental
rcpps 🔒 ^⚠Experimental
rcpss 🔒 ^⚠Experimental
rsqrtps 🔒 ^⚠Experimental
rsqrtss 🔒 ^⚠Experimental
sfence 🔒 ^⚠Experimental
sqrtps 🔒 ^⚠Experimental
sqrtss 🔒 ^⚠Experimental
stmxcsr 🔒 ^⚠Experimental
subss 🔒 ^⚠Experimental
ucomieq_ss 🔒 ^⚠Experimental
ucomige_ss 🔒 ^⚠Experimental
ucomigt_ss 🔒 ^⚠Experimental
ucomile_ss 🔒 ^⚠Experimental
ucomilt_ss 🔒 ^⚠Experimental
ucomineq_ss 🔒 ^⚠Experimental
_MM_GET_EXCEPTION_MASK^⚠sse
See _mm_setcsr
_MM_GET_EXCEPTION_STATE^⚠sse
See _mm_setcsr
_MM_GET_FLUSH_ZERO_MODE^⚠sse
See _mm_setcsr
_MM_GET_ROUNDING_MODE^⚠sse
See _mm_setcsr
_MM_SET_EXCEPTION_MASK^⚠sse
See _mm_setcsr
_MM_SET_EXCEPTION_STATE^⚠sse
See _mm_setcsr
_MM_SET_FLUSH_ZERO_MODE^⚠sse
See _mm_setcsr
_MM_SET_ROUNDING_MODE^⚠sse
See _mm_setcsr
_MM_TRANSPOSE4_PS^⚠sse
Transpose the 4x4 matrix formed by 4 rows of __m128 in place.
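As a rough illustration of the in-place transpose, the sketch below loads four rows, calls _MM_TRANSPOSE4_PS, and stores the rows back. It assumes an x86-64 target; transpose4x4 is a hypothetical wrapper, not part of this module.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Transposes a row-major 4x4 matrix. Hypothetical wrapper for illustration.
#[allow(unused_unsafe)]
pub fn transpose4x4(m: [[f32; 4]; 4]) -> [[f32; 4]; 4] {
    let mut out = [[0.0f32; 4]; 4];
    // SAFETY: every pointer refers to four valid f32 values, and the
    // unaligned load/store variants place no requirement on alignment.
    unsafe {
        let mut r0 = _mm_loadu_ps(m[0].as_ptr());
        let mut r1 = _mm_loadu_ps(m[1].as_ptr());
        let mut r2 = _mm_loadu_ps(m[2].as_ptr());
        let mut r3 = _mm_loadu_ps(m[3].as_ptr());
        // The four row registers are rewritten in place.
        _MM_TRANSPOSE4_PS(&mut r0, &mut r1, &mut r2, &mut r3);
        _mm_storeu_ps(out[0].as_mut_ptr(), r0);
        _mm_storeu_ps(out[1].as_mut_ptr(), r1);
        _mm_storeu_ps(out[2].as_mut_ptr(), r2);
        _mm_storeu_ps(out[3].as_mut_ptr(), r3);
    }
    out
}
```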
_mm_add_ps^⚠sse
Adds __m128 vectors.
_mm_add_ss^⚠sse
Adds the first components of a and b; the other components are copied from a.
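The difference between the packed (_ps) and scalar (_ss) forms can be sketched as follows, assuming an x86-64 target. packed_and_scalar_add is a hypothetical helper name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Returns (a + b over all four lanes, a + b in lane 0 only).
/// Hypothetical helper for illustration.
pub fn packed_and_scalar_add(a: [f32; 4], b: [f32; 4]) -> ([f32; 4], [f32; 4]) {
    let mut packed = [0.0f32; 4];
    let mut scalar = [0.0f32; 4];
    // SAFETY: all pointers refer to four valid f32 values; unaligned
    // loads and stores are used, so alignment does not matter.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // Packed form: every lane is added.
        _mm_storeu_ps(packed.as_mut_ptr(), _mm_add_ps(va, vb));
        // Scalar form: only lane 0 is added, lanes 1..=3 come from `a`.
        _mm_storeu_ps(scalar.as_mut_ptr(), _mm_add_ss(va, vb));
    }
    (packed, scalar)
}
```

The same packed/scalar split applies to the sub, mul, and div families in this list.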
_mm_and_ps^⚠sse
Bitwise AND of packed single-precision (32-bit) floating-point elements.
_mm_andnot_ps^⚠sse
Bitwise AND-NOT of packed single-precision (32-bit) floating-point elements.
_mm_cmpeq_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements were equal, or 0 otherwise.
_mm_cmpeq_ss^⚠sse
Compares the lowest f32 of both inputs for equality. The lowest 32 bits of the result will be 0xffffffff if the two inputs are equal, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpge_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpge_ss^⚠sse
Compares the lowest f32 of both inputs for greater than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpgt_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than the corresponding element in b, or 0 otherwise.
_mm_cmpgt_ss^⚠sse
Compares the lowest f32 of both inputs for greater than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmple_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmple_ss^⚠sse
Compares the lowest f32 of both inputs for less than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmplt_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than the corresponding element in b, or 0 otherwise.
_mm_cmplt_ss^⚠sse
Compares the lowest f32 of both inputs for less than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpneq_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements are not equal, or 0 otherwise.
_mm_cmpneq_ss^⚠sse
Compares the lowest f32 of both inputs for inequality. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnge_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnge_ss^⚠sse
Compares the lowest f32 of both inputs for not-greater-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpngt_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than the corresponding element in b, or 0 otherwise.
_mm_cmpngt_ss^⚠sse
Compares the lowest f32 of both inputs for not-greater-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnle_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than or equal to the corresponding element in b, or 0 otherwise.
_mm_cmpnle_ss^⚠sse
Compares the lowest f32 of both inputs for not-less-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpnlt_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than the corresponding element in b, or 0 otherwise.
_mm_cmpnlt_ss^⚠sse
Compares the lowest f32 of both inputs for not-less-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpord_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are ordered (i.e., neither of them is a NaN), or 0 otherwise.
_mm_cmpord_ss^⚠sse
Checks if the lowest f32 of both inputs are ordered. The lowest 32 bits of the result will be 0xffffffff if neither a.extract(0) nor b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
_mm_cmpunord_ps^⚠sse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are unordered (i.e., at least one of them is a NaN), or 0 otherwise.
_mm_cmpunord_ss^⚠sse
Checks if the lowest f32 of both inputs are unordered. The lowest 32 bits of the result will be 0xffffffff if either a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
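The packed comparisons above produce an all-ones or all-zeros bit pattern per lane. One common way to consume such a result is to collapse it to an integer with _mm_movemask_ps, listed further down. A minimal sketch, assuming an x86-64 target; lanes_less_than and nan_lanes are hypothetical helper names.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Bit i of the result is set when a[i] < b[i]. Hypothetical helper.
pub fn lanes_less_than(a: [f32; 4], b: [f32; 4]) -> i32 {
    // SAFETY: both pointers refer to four valid f32 values.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // Each lane of the comparison is 0xffffffff or 0; movemask keeps
        // only the top bit of each lane.
        _mm_movemask_ps(_mm_cmplt_ps(va, vb))
    }
}

/// Bit i of the result is set when a[i] is a NaN. Hypothetical helper.
pub fn nan_lanes(a: [f32; 4]) -> i32 {
    // SAFETY: the pointer refers to four valid f32 values.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        // A value is unordered with itself only if it is a NaN.
        _mm_movemask_ps(_mm_cmpunord_ps(va, va))
    }
}
```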
_mm_comieq_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise.
_mm_comige_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise.
_mm_comigt_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise.
_mm_comile_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise.
_mm_comilt_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise.
_mm_comineq_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise.
_mm_cvt_si2ss^⚠sse
Alias for _mm_cvtsi32_ss.
_mm_cvt_ss2si^⚠sse
Alias for _mm_cvtss_si32.
_mm_cvtsi32_ss^⚠sse
Converts a 32-bit integer to a 32-bit float. The result is the input vector a with its lowest 32-bit float replaced by the converted integer.
_mm_cvtss_f32^⚠sse
Extracts the lowest 32-bit float from the input vector.
_mm_cvtss_si32^⚠sse
Converts the lowest 32-bit float in the input vector to a 32-bit integer.
_mm_cvtt_ss2si^⚠sse
Alias for _mm_cvttss_si32.
_mm_cvttss_si32^⚠sse
Converts the lowest 32-bit float in the input vector to a 32-bit integer with truncation.
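The two float-to-integer conversions differ only in rounding. The sketch below assumes an x86-64 target and an MXCSR left at its default rounding mode (round to nearest, ties to even); round_and_truncate is a hypothetical helper name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Returns (_mm_cvtss_si32 result, _mm_cvttss_si32 result) for `x`.
/// Hypothetical helper; assumes the default MXCSR rounding mode.
#[allow(unused_unsafe)] // newer toolchains mark some of these intrinsics safe
pub fn round_and_truncate(x: f32) -> (i32, i32) {
    // SAFETY: SSE is available on every x86-64 CPU.
    unsafe {
        let v = _mm_set_ss(x);
        // cvtss rounds using the current rounding mode; cvttss always
        // truncates toward zero.
        (_mm_cvtss_si32(v), _mm_cvttss_si32(v))
    }
}
```

For example, 2.7 converts to 3 with _mm_cvtss_si32 and to 2 with _mm_cvttss_si32 under the default rounding mode.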
_mm_div_ps^⚠sse
Divides __m128 vectors.
_mm_div_ss^⚠sse
Divides the first component of a by the first component of b; the other components are copied from a.
_mm_getcsr^⚠sse
Gets the unsigned 32-bit value of the MXCSR control and status register.
_mm_load1_ps^⚠sse
Construct a __m128 by duplicating the value read from p into all elements.
_mm_load_ps^⚠sse
Loads four f32 values from aligned memory into a __m128. If the pointer is not aligned to a 128-bit boundary (16 bytes) a general protection fault will be triggered (fatal program crash).
_mm_load_ps1^⚠sse
Alias for _mm_load1_ps
_mm_load_ss^⚠sse
Construct a __m128 with the lowest element read from p and the other elements set to zero.
_mm_loadr_ps^⚠sse
Loads four f32 values from aligned memory into a __m128 in reverse order.
_mm_loadu_ps^⚠sse
Loads four f32 values from memory into a __m128. There are no restrictions on memory alignment. For aligned memory _mm_load_ps may be faster.
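The alignment requirement of _mm_load_ps versus _mm_loadu_ps can be sketched as follows, assuming an x86-64 target. Aligned16 and load_both are hypothetical names; the repr(align(16)) wrapper is one way to guarantee a 16-byte boundary.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical wrapper that forces 16-byte alignment of the buffer.
#[repr(C, align(16))]
pub struct Aligned16(pub [f32; 8]);

/// Loads elements 0..4 with the aligned load and 1..5 with the unaligned one.
pub fn load_both(buf: &Aligned16) -> ([f32; 4], [f32; 4]) {
    let mut aligned = [0.0f32; 4];
    let mut unaligned = [0.0f32; 4];
    // SAFETY: both source ranges lie inside the eight-element buffer.
    unsafe {
        // The buffer starts on a 16-byte boundary, so _mm_load_ps is allowed.
        let a = _mm_load_ps(buf.0.as_ptr());
        // One f32 further on, the address is only 4-byte aligned, so the
        // unaligned variant is required.
        let u = _mm_loadu_ps(buf.0.as_ptr().add(1));
        _mm_storeu_ps(aligned.as_mut_ptr(), a);
        _mm_storeu_ps(unaligned.as_mut_ptr(), u);
    }
    (aligned, unaligned)
}
```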
_mm_loadu_si64^⚠sse
Loads 64 bits of unaligned integer data from memory into a new vector.
_mm_max_ps^⚠sse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding maximum values.
_mm_max_ss^⚠sse
Compares the first single-precision (32-bit) floating-point element of a and b, and returns the maximum value in the first element of the return value; the other elements are copied from a.
_mm_min_ps^⚠sse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding minimum values.
_mm_min_ss^⚠sse
Compares the first single-precision (32-bit) floating-point element of a and b, and returns the minimum value in the first element of the return value; the other elements are copied from a.
_mm_move_ss^⚠sse
Returns a __m128 with the first component from b and the remaining components from a.
_mm_movehl_ps^⚠sse
Combines the higher halves of a and b. The higher half of b occupies the lower half of the result.
_mm_movelh_ps^⚠sse
Combines the lower halves of a and b. The lower half of b occupies the higher half of the result.
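The lane layout produced by the two half-moving intrinsics can be sketched as follows, assuming an x86-64 target. move_halves is a hypothetical helper name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Returns (_mm_movehl_ps(a, b), _mm_movelh_ps(a, b)) as arrays.
/// Hypothetical helper for illustration.
pub fn move_halves(a: [f32; 4], b: [f32; 4]) -> ([f32; 4], [f32; 4]) {
    let mut hl = [0.0f32; 4];
    let mut lh = [0.0f32; 4];
    // SAFETY: all pointers refer to four valid f32 values.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // movehl: [b2, b3, a2, a3]
        _mm_storeu_ps(hl.as_mut_ptr(), _mm_movehl_ps(va, vb));
        // movelh: [a0, a1, b0, b1]
        _mm_storeu_ps(lh.as_mut_ptr(), _mm_movelh_ps(va, vb));
    }
    (hl, lh)
}
```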
_mm_movemask_ps^⚠sse
Returns a mask of the most significant bit of each element in a.
_mm_mul_ps^⚠sse
Multiplies __m128 vectors.
_mm_mul_ss^⚠sse
Multiplies the first components of a and b; the other components are copied from a.
_mm_or_ps^⚠sse
Bitwise OR of packed single-precision (32-bit) floating-point elements.
_mm_prefetch^⚠sse
Fetch the cache line that contains address p using the given STRATEGY.
_mm_rcp_ps^⚠sse
Returns the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a.
_mm_rcp_ss^⚠sse
Returns the approximate reciprocal of the first single-precision (32-bit) floating-point element in a; the other elements are unchanged.
_mm_rsqrt_ps^⚠sse
Returns the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a.
_mm_rsqrt_ss^⚠sse
Returns the approximate reciprocal square root of the first single-precision (32-bit) floating-point element in a; the other elements are unchanged.
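Because the rcp and rsqrt intrinsics return approximations, callers that need more accuracy often refine the result. The sketch below applies one Newton-Raphson step to the output of _mm_rsqrt_ss. It assumes an x86-64 target; rsqrt_refined is a hypothetical helper, and the refinement step is a standard technique rather than something this module provides.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Returns (raw hardware estimate, estimate after one Newton-Raphson step)
/// of 1 / sqrt(x). Hypothetical helper for illustration.
#[allow(unused_unsafe)] // newer toolchains mark some of these intrinsics safe
pub fn rsqrt_refined(x: f32) -> (f32, f32) {
    // SAFETY: SSE is available on every x86-64 CPU.
    let y0 = unsafe { _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x))) };
    // Newton-Raphson iteration for f(y) = 1/y^2 - x:
    //   y1 = y0 * (1.5 - 0.5 * x * y0 * y0)
    let y1 = y0 * (1.5 - 0.5 * x * y0 * y0);
    (y0, y1)
}
```

The raw estimate is typically good to roughly three decimal digits; the refined value is usually close to full f32 precision for well-scaled inputs.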
_mm_set1_ps^⚠sse
Construct a __m128 with all elements set to a.
_mm_set_ps^⚠sse
Construct a __m128 from four floating point values highest to lowest.
_mm_set_ps1^⚠sse
Alias for _mm_set1_ps
_mm_set_ss^⚠sse
Construct a __m128 with the lowest element set to a and the rest set to zero.
_mm_setcsr^⚠sse
Sets the MXCSR register with the 32-bit unsigned integer value.
_mm_setr_ps^⚠sse
Construct a __m128 from four floating point values lowest to highest.
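The argument order of _mm_set_ps and _mm_setr_ps is a frequent source of confusion. A minimal sketch, assuming an x86-64 target, that builds the same vector both ways; set_orders is a hypothetical helper name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Builds one vector with _mm_set_ps and one with _mm_setr_ps and returns
/// both in memory order (lane 0 first). Hypothetical helper.
pub fn set_orders() -> ([f32; 4], [f32; 4]) {
    let mut from_set = [0.0f32; 4];
    let mut from_setr = [0.0f32; 4];
    // SAFETY: both destinations hold four f32 values.
    unsafe {
        // _mm_set_ps takes the highest lane first ...
        _mm_storeu_ps(from_set.as_mut_ptr(), _mm_set_ps(4.0, 3.0, 2.0, 1.0));
        // ... while _mm_setr_ps takes the lowest lane first.
        _mm_storeu_ps(from_setr.as_mut_ptr(), _mm_setr_ps(1.0, 2.0, 3.0, 4.0));
    }
    (from_set, from_setr)
}
```

Both arrays come back as [1.0, 2.0, 3.0, 4.0], since lane 0 is stored at the lowest address.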
_mm_setzero_ps^⚠sse
Construct a __m128 with all elements initialized to zero.
_mm_sfence^⚠sse
Performs a serializing operation on all store-to-memory instructions that were issued prior to this instruction.
_mm_shuffle_ps^⚠sse
Shuffles packed single-precision (32-bit) floating-point elements in a and b using MASK.
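The MASK parameter packs four 2-bit lane selectors into one byte. Since _MM_SHUFFLE is listed as experimental here, the sketch below uses a hypothetical stand-in, shuffle_mask, with the same bit layout, and assumes an x86-64 target. reverse_lanes is likewise a hypothetical name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hypothetical stand-in for _MM_SHUFFLE: `w` selects result lane 0,
/// `x` lane 1, `y` lane 2, and `z` lane 3.
pub const fn shuffle_mask(z: i32, y: i32, x: i32, w: i32) -> i32 {
    (z << 6) | (y << 4) | (x << 2) | w
}

/// Reverses the four lanes of `a`. Hypothetical helper for illustration.
pub fn reverse_lanes(a: [f32; 4]) -> [f32; 4] {
    // Lane 0 <- element 3, lane 1 <- element 2, lane 2 <- 1, lane 3 <- 0.
    const REVERSE: i32 = shuffle_mask(0, 1, 2, 3);
    let mut out = [0.0f32; 4];
    // SAFETY: both pointers refer to four valid f32 values.
    unsafe {
        let v = _mm_loadu_ps(a.as_ptr());
        // Result lanes 0 and 1 are picked from the first argument and
        // lanes 2 and 3 from the second; here both are `v`.
        _mm_storeu_ps(out.as_mut_ptr(), _mm_shuffle_ps::<REVERSE>(v, v));
    }
    out
}
```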
_mm_sqrt_ps^⚠sse
Returns the square root of packed single-precision (32-bit) floating-point elements in a.
_mm_sqrt_ss^⚠sse
Returns the square root of the first single-precision (32-bit) floating-point element in a; the other elements are unchanged.
_mm_store1_ps^⚠sse
Stores the lowest 32-bit float of a repeated four times into aligned memory.
_mm_store_ps^⚠sse
Stores four 32-bit floats into aligned memory.
_mm_store_ps1^⚠sse
Alias for _mm_store1_ps
_mm_store_ss^⚠sse
Stores the lowest 32-bit float of a into memory.
_mm_storer_ps^⚠sse
Stores four 32-bit floats into aligned memory in reverse order.
_mm_storeu_ps^⚠sse
Stores four 32-bit floats into memory. There are no restrictions on memory alignment. For aligned memory _mm_store_ps may be faster.
_mm_stream_ps^⚠sse
Stores a into the memory at mem_addr using a non-temporal memory hint.
_mm_sub_ps^⚠sse
Subtracts __m128 vectors.
_mm_sub_ss^⚠sse
Subtracts the first component of b from the first component of a; the other components are copied from a.
_mm_ucomieq_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomige_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomigt_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomile_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomilt_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_ucomineq_ss^⚠sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
_mm_undefined_ps^⚠sse
Returns a vector of type __m128 with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to mem::MaybeUninit. In practice, this is equivalent to mem::zeroed.
_mm_unpackhi_ps^⚠sse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the higher half of a and b.
_mm_unpacklo_ps^⚠sse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the lower half of a and b.
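The interleaving performed by the two unpack intrinsics can be sketched as follows, assuming an x86-64 target. unpack_both is a hypothetical helper name.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Returns (_mm_unpacklo_ps(a, b), _mm_unpackhi_ps(a, b)) as arrays.
/// Hypothetical helper for illustration.
pub fn unpack_both(a: [f32; 4], b: [f32; 4]) -> ([f32; 4], [f32; 4]) {
    let mut lo = [0.0f32; 4];
    let mut hi = [0.0f32; 4];
    // SAFETY: all pointers refer to four valid f32 values.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // unpacklo: [a0, b0, a1, b1]
        _mm_storeu_ps(lo.as_mut_ptr(), _mm_unpacklo_ps(va, vb));
        // unpackhi: [a2, b2, a3, b3]
        _mm_storeu_ps(hi.as_mut_ptr(), _mm_unpackhi_ps(va, vb));
    }
    (lo, hi)
}
```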
_mm_xor_ps^⚠sse
Bitwise exclusive OR of packed single-precision (32-bit) floating-point elements.