🔬This is a nightly-only experimental API. (stdsimd #48556)
Available on x86 or x86-64 only.
Streaming SIMD Extensions (SSE)
Constants
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_prefetch.
- See _mm_prefetch.
- See _mm_prefetch.
- See _mm_prefetch.
- See _mm_prefetch.
- See _mm_prefetch.
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
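The prefetch-related constants above are passed to _mm_prefetch as its const STRATEGY parameter. The listing does not show the constant names, so the sketch below uses _MM_HINT_T0 as exported by std::arch::x86_64. It assumes an x86-64 target, and the helper name sum_with_prefetch and the LOOKAHEAD distance are illustrative choices, not part of the API.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};

/// Sums a slice while hinting the cache line a few elements ahead.
/// `LOOKAHEAD` is a hypothetical tuning value, not a recommendation.
fn sum_with_prefetch(data: &[f32]) -> f32 {
    const LOOKAHEAD: usize = 16;
    let mut total = 0.0f32;
    for (i, &x) in data.iter().enumerate() {
        if i + LOOKAHEAD < data.len() {
            // SAFETY: SSE is always available on x86-64, and the pointer
            // stays inside the slice because of the bounds check above.
            unsafe {
                _mm_prefetch::<_MM_HINT_T0>(data.as_ptr().add(i + LOOKAHEAD) as *const i8);
            }
        }
        total += x;
    }
    total
}
```

A prefetch is only a hint, so the result is the same with or without it; whether it helps performance depends on the access pattern and the hardware.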
Functions
- _MM_SHUFFLE
Experimental
A utility function for creating masks to use with Intel shuffle and permute intrinsics.
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- See _mm_setcsr
- Transpose the 4x4 matrix formed by 4 rows of __m128 in place.
- _mm_add_ps⚠
sse
Adds __m128 vectors.
- _mm_add_ss⚠
sse
Adds the first component of a and b, the other components are copied from a.
- _mm_and_ps⚠
sse
Bitwise AND of packed single-precision (32-bit) floating-point elements.
- _mm_andnot_ps⚠
sse
Bitwise AND-NOT of packed single-precision (32-bit) floating-point elements.
- _mm_cmpeq_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements were equal, or 0 otherwise.
- _mm_cmpeq_ss⚠
sse
Compares the lowest f32 of both inputs for equality. The lowest 32 bits of the result will be 0xffffffff if the two inputs are equal, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpge_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than or equal to the corresponding element in b, or 0 otherwise.
- _mm_cmpge_ss⚠
sse
Compares the lowest f32 of both inputs for greater than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpgt_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is greater than the corresponding element in b, or 0 otherwise.
- _mm_cmpgt_ss⚠
sse
Compares the lowest f32 of both inputs for greater than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmple_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than or equal to the corresponding element in b, or 0 otherwise.
- _mm_cmple_ss⚠
sse
Compares the lowest f32 of both inputs for less than or equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmplt_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is less than the corresponding element in b, or 0 otherwise.
- _mm_cmplt_ss⚠
sse
Compares the lowest f32 of both inputs for less than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpneq_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input elements are not equal, or 0 otherwise.
- _mm_cmpneq_ss⚠
sse
Compares the lowest f32 of both inputs for inequality. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpnge_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than or equal to the corresponding element in b, or 0 otherwise.
- _mm_cmpnge_ss⚠
sse
Compares the lowest f32 of both inputs for not-greater-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpngt_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not greater than the corresponding element in b, or 0 otherwise.
- _mm_cmpngt_ss⚠
sse
Compares the lowest f32 of both inputs for not-greater-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not greater than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpnle_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than or equal to the corresponding element in b, or 0 otherwise.
- _mm_cmpnle_ss⚠
sse
Compares the lowest f32 of both inputs for not-less-than-or-equal. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than or equal to b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpnlt_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. The result in the output vector will be 0xffffffff if the input element in a is not less than the corresponding element in b, or 0 otherwise.
- _mm_cmpnlt_ss⚠
sse
Compares the lowest f32 of both inputs for not-less-than. The lowest 32 bits of the result will be 0xffffffff if a.extract(0) is not less than b.extract(0), or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpord_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are ordered (i.e., neither of them is a NaN), or 0 otherwise.
- _mm_cmpord_ss⚠
sse
Checks if the lowest f32 of both inputs are ordered. The lowest 32 bits of the result will be 0xffffffff if neither a.extract(0) nor b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_cmpunord_ps⚠
sse
Compares each of the four floats in a to the corresponding element in b. Returns four floats that have one of two possible bit patterns. The element in the output vector will be 0xffffffff if the input elements in a and b are unordered (i.e., at least one of them is a NaN), or 0 otherwise.
- _mm_cmpunord_ss⚠
sse
Checks if the lowest f32 of both inputs are unordered. The lowest 32 bits of the result will be 0xffffffff if either a.extract(0) or b.extract(0) is a NaN, or 0 otherwise. The upper 96 bits of the result are the upper 96 bits of a.
- _mm_comieq_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise.
- _mm_comige_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise.
- _mm_comigt_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise.
- _mm_comile_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise.
- _mm_comilt_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise.
- _mm_comineq_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise.
- _mm_cvt_si2ss⚠
sse
Alias for _mm_cvtsi32_ss.
- _mm_cvt_ss2si⚠
sse
Alias for _mm_cvtss_si32.
- _mm_cvtsi32_ss⚠
sse
Converts a 32-bit integer to a 32-bit float. The result vector is the input vector a with the lowest 32-bit float replaced by the converted integer.
- _mm_cvtss_f32⚠
sse
Extracts the lowest 32-bit float from the input vector.
- _mm_cvtss_si32⚠
sse
Converts the lowest 32-bit float in the input vector to a 32-bit integer.
- _mm_cvtt_ss2si⚠
sse
Alias for _mm_cvttss_si32.
- _mm_cvttss_si32⚠
sse
Converts the lowest 32-bit float in the input vector to a 32-bit integer with truncation.
- _mm_div_ps⚠
sse
Divides __m128 vectors.
- _mm_div_ss⚠
sse
Divides the first component of a by the first component of b, the other components are copied from a.
- _mm_getcsr⚠
sse
Gets the unsigned 32-bit value of the MXCSR control and status register.
- _mm_load1_ps⚠
sse
Constructs a __m128 by duplicating the value read from p into all elements.
- _mm_load_ps⚠
sse
Loads four f32 values from aligned memory into a __m128. If the pointer is not aligned to a 128-bit boundary (16 bytes) a general protection fault will be triggered (fatal program crash).
- _mm_load_ps1⚠
sse
Alias for _mm_load1_ps.
- _mm_load_ss⚠
sse
Constructs a __m128 with the lowest element read from p and the other elements set to zero.
- _mm_loadr_ps⚠
sse
Loads four f32 values from aligned memory into a __m128 in reverse order.
- _mm_loadu_ps⚠
sse
Loads four f32 values from memory into a __m128. There are no restrictions on memory alignment. For aligned memory _mm_load_ps may be faster.
- _mm_loadu_si64⚠
sse
Loads 64 bits of unaligned integer data from memory into a new vector.
- _mm_max_ps⚠
sse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding maximum values.
- _mm_max_ss⚠
sse
Compares the first single-precision (32-bit) floating-point element of a and b, and returns the maximum value in the first element of the return value; the other elements are copied from a.
- _mm_min_ps⚠
sse
Compares packed single-precision (32-bit) floating-point elements in a and b, and returns the corresponding minimum values.
- _mm_min_ss⚠
sse
Compares the first single-precision (32-bit) floating-point element of a and b, and returns the minimum value in the first element of the return value; the other elements are copied from a.
- _mm_move_ss⚠
sse
Returns a __m128 with the first component from b and the remaining components from a.
- _mm_movehl_ps⚠
sse
Combines the higher halves of a and b. The higher half of b occupies the lower half of the result.
- _mm_movelh_ps⚠
sse
Combines the lower halves of a and b. The lower half of b occupies the higher half of the result.
- _mm_movemask_ps⚠
sse
Returns a mask of the most significant bit of each element in a.
- _mm_mul_ps⚠
sse
Multiplies __m128 vectors.
- _mm_mul_ss⚠
sse
Multiplies the first component of a and b, the other components are copied from a.
- _mm_or_ps⚠
sse
Bitwise OR of packed single-precision (32-bit) floating-point elements.
- _mm_prefetch⚠
sse
Fetches the cache line that contains address p using the given STRATEGY.
- _mm_rcp_ps⚠
sse
Returns the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a.
- _mm_rcp_ss⚠
sse
Returns the approximate reciprocal of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
- _mm_rsqrt_ps⚠
sse
Returns the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a.
- _mm_rsqrt_ss⚠
sse
Returns the approximate reciprocal square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
- _mm_set1_ps⚠
sse
Constructs a __m128 with all elements set to a.
- _mm_set_ps⚠
sse
Constructs a __m128 from four floating-point values, highest to lowest.
- _mm_set_ps1⚠
sse
Alias for _mm_set1_ps.
- _mm_set_ss⚠
sse
Constructs a __m128 with the lowest element set to a and the rest set to zero.
- _mm_setcsr⚠
sse
Sets the MXCSR register with the 32-bit unsigned integer value.
- _mm_setr_ps⚠
sse
Constructs a __m128 from four floating-point values, lowest to highest.
- _mm_setzero_ps⚠
sse
Constructs a __m128 with all elements initialized to zero.
- _mm_sfence⚠
sse
Performs a serializing operation on all store-to-memory instructions that were issued prior to this instruction.
- _mm_shuffle_ps⚠
sse
Shuffles packed single-precision (32-bit) floating-point elements in a and b using MASK.
- _mm_sqrt_ps⚠
sse
Returns the square root of packed single-precision (32-bit) floating-point elements in a.
- _mm_sqrt_ss⚠
sse
Returns the square root of the first single-precision (32-bit) floating-point element in a, the other elements are unchanged.
- _mm_store1_ps⚠
sse
Stores the lowest 32-bit float of a repeated four times into aligned memory.
- _mm_store_ps⚠
sse
Stores four 32-bit floats into aligned memory.
- _mm_store_ps1⚠
sse
Alias for _mm_store1_ps.
- _mm_store_ss⚠
sse
Stores the lowest 32-bit float of a into memory.
- _mm_storer_ps⚠
sse
Stores four 32-bit floats into aligned memory in reverse order.
- _mm_storeu_ps⚠
sse
Stores four 32-bit floats into memory. There are no restrictions on memory alignment. For aligned memory _mm_store_ps may be faster.
- _mm_stream_ps⚠
sse
Stores a into the memory at mem_addr using a non-temporal memory hint.
- _mm_sub_ps⚠
sse
Subtracts __m128 vectors.
- _mm_sub_ss⚠
sse
Subtracts the first component of b from a, the other components are copied from a.
- _mm_ucomieq_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_ucomige_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_ucomigt_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is greater than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_ucomile_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than or equal to the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_ucomilt_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if the value from a is less than the one from b, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_ucomineq_ss⚠
sse
Compares two 32-bit floats from the low-order bits of a and b. Returns 1 if they are not equal, or 0 otherwise. This instruction will not signal an exception if either argument is a quiet NaN.
- _mm_undefined_ps⚠
sse
Returns a vector of type __m128 with indeterminate elements. Despite being “undefined”, this is some valid value and not equivalent to mem::MaybeUninit. In practice, this is equivalent to mem::zeroed.
- _mm_unpackhi_ps⚠
sse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the higher half of a and b.
- _mm_unpacklo_ps⚠
sse
Unpacks and interleaves single-precision (32-bit) floating-point elements from the lower half of a and b.
- _mm_xor_ps⚠
sse
Bitwise exclusive OR of packed single-precision (32-bit) floating-point elements.
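As a rough illustration of how the packed entries in this list fit together, the sketch below loads two arrays with _mm_loadu_ps, adds them with _mm_add_ps, and turns a _mm_cmplt_ps result into a bitmask with _mm_movemask_ps. It assumes an x86-64 target; the helper names add4 and less_than_mask are hypothetical and not part of the module.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Adds two groups of four floats lane by lane.
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is always available on x86-64, and the unaligned
    // load/store variants place no alignment requirement on the pointers.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), _mm_add_ps(va, vb));
    }
    out
}

/// Returns a 4-bit mask with bit i set when a[i] < b[i].
fn less_than_mask(a: [f32; 4], b: [f32; 4]) -> i32 {
    // SAFETY: same reasoning as in `add4`.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // Each lane of the comparison is 0xffffffff or 0; movemask keeps
        // the most significant bit of each lane.
        _mm_movemask_ps(_mm_cmplt_ps(va, vb))
    }
}
```

Because a comparison involving NaN is false, a NaN lane contributes a zero bit to the mask returned by less_than_mask.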
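The _ss entries operate on the lowest lane only, and the two float-to-integer conversions differ in rounding. The following sketch, again assuming an x86-64 target and the default MXCSR rounding mode (round to nearest), shows _mm_div_ss dividing the lowest lane of a by the lowest lane of b, and compares _mm_cvtss_si32 with _mm_cvttss_si32. The helper names lanes, div_lowest, and round_and_truncate are hypothetical.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Copies the four lanes of a vector into an array, lowest lane first.
fn lanes(v: __m128) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is always available on x86-64; the store is unaligned.
    unsafe { _mm_storeu_ps(out.as_mut_ptr(), v) };
    out
}

/// Divides only the lowest lane; lanes 1 to 3 pass through from `a`.
fn div_lowest(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // SAFETY: SSE is always available on x86-64.
    unsafe {
        // _mm_setr_ps takes its arguments lowest lane first.
        let va = _mm_setr_ps(a[0], a[1], a[2], a[3]);
        let vb = _mm_setr_ps(b[0], b[1], b[2], b[3]);
        lanes(_mm_div_ss(va, vb))
    }
}

/// Converts a float to i32 twice: once with the current MXCSR rounding
/// mode (assumed to be the default, round to nearest) and once truncating.
fn round_and_truncate(x: f32) -> (i32, i32) {
    // SAFETY: SSE is always available on x86-64.
    unsafe {
        let v = _mm_set_ss(x);
        (_mm_cvtss_si32(v), _mm_cvttss_si32(v))
    }
}
```

If the rounding mode has been changed through _mm_setcsr, the first value returned by round_and_truncate may differ, while the truncating conversion is unaffected.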
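_mm_shuffle_ps takes its selector as the const MASK parameter. Since _MM_SHUFFLE is marked experimental in this listing, the sketch below builds the same 8-bit selector by hand, following the conventional Intel layout of two bits per output lane with the highest lane first. It assumes an x86-64 target; shuffle_mask and reverse4 are hypothetical helper names.

```rust
// Assumes an x86-64 target, where SSE is part of the baseline feature set.
use std::arch::x86_64::*;

/// Hand-written stand-in for the experimental _MM_SHUFFLE helper:
/// packs four 2-bit lane indices, highest output lane first.
const fn shuffle_mask(z: u32, y: u32, x: u32, w: u32) -> i32 {
    ((z << 6) | (y << 4) | (x << 2) | w) as i32
}

/// Reverses the order of four floats with a single shuffle.
fn reverse4(a: [f32; 4]) -> [f32; 4] {
    // Output lanes 0 and 1 pick from the first operand and lanes 2 and 3
    // from the second; passing the same vector twice gives a permutation.
    const MASK: i32 = shuffle_mask(0, 1, 2, 3);
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is always available on x86-64, and the unaligned
    // load/store variants place no alignment requirement on the pointers.
    unsafe {
        let v = _mm_loadu_ps(a.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), _mm_shuffle_ps::<MASK>(v, v));
    }
    out
}
```

Keeping the selector in a const fn means the mask is still evaluated at compile time, which the const generic parameter requires.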