🔬 This is a nightly-only experimental API. (stdsimd #48556)
Available on x86 or x86-64 only.
Streaming SIMD Extensions 4.1 (SSE4.1)
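Because these intrinsics exist only on x86 and x86-64 and need SSE4.1 support from the CPU, callers typically check for the feature at run time before using them. The sketch below shows one way to do that. It assumes a toolchain that exposes the intrinsics through `std::arch` and provides the `is_x86_feature_detected!` macro; `ceil_pair` and `ceil_pair_sse41` are hypothetical helper names, not part of this module.

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: rounds both lanes up with `_mm_ceil_pd`.
/// Must only be called when the CPU supports SSE4.1.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn ceil_pair_sse41(v: [f64; 2]) -> [f64; 2] {
    let mut out = [0.0f64; 2];
    // Unaligned load and store, so plain arrays are acceptable here.
    let x = _mm_loadu_pd(v.as_ptr());
    _mm_storeu_pd(out.as_mut_ptr(), _mm_ceil_pd(x));
    out
}

/// Hypothetical safe wrapper: uses SSE4.1 when it is detected at run time
/// and falls back to scalar code otherwise (including on non-x86 targets).
pub fn ceil_pair(v: [f64; 2]) -> [f64; 2] {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.1") {
            // SAFETY: the run-time check above confirmed SSE4.1 support.
            return unsafe { ceil_pair_sse41(v) };
        }
    }
    [v[0].ceil(), v[1].ceil()]
}
```

Calling `ceil_pair([1.2, -1.2])` should give `[2.0, -1.0]` on either path. Keeping the `unsafe` call behind a single checked wrapper is a design choice that confines the feature requirement to one place.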
Constants
- round up and do not suppress exceptions
- use MXCSR.RC; see `vendor::_MM_SET_ROUNDING_MODE`
- round down and do not suppress exceptions
- use MXCSR.RC and suppress exceptions; see `vendor::_MM_SET_ROUNDING_MODE`
- round to nearest and do not suppress exceptions
- suppress exceptions
- do not suppress exceptions
- use MXCSR.RC and do not suppress exceptions; see `vendor::_MM_SET_ROUNDING_MODE`
- round to nearest
- round down
- round up
- truncate
- truncate and do not suppress exceptions
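As an illustration of how the rounding constants combine, the sketch below rounds one value in each of the four directions with exceptions suppressed. The constant names (`_MM_FROUND_TO_NEAREST_INT`, `_MM_FROUND_TO_NEG_INF`, `_MM_FROUND_TO_POS_INF`, `_MM_FROUND_TO_ZERO`, `_MM_FROUND_NO_EXC`) are the ones `std::arch` exposes, and the code assumes a toolchain where `_mm_round_pd` takes its rounding mode as a const generic. `round_modes` and `round_modes_sse41` are hypothetical helpers.

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: returns `x` rounded to nearest, down, up, and
/// toward zero, in that order. Requires SSE4.1.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn round_modes_sse41(x: f64) -> [f64; 4] {
    let v = _mm_set1_pd(x);
    // A direction constant OR-ed with _MM_FROUND_NO_EXC picks the rounding
    // direction and suppresses floating-point exceptions.
    let nearest = _mm_round_pd::<{ _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC }>(v);
    let down = _mm_round_pd::<{ _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC }>(v);
    let up = _mm_round_pd::<{ _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC }>(v);
    let trunc = _mm_round_pd::<{ _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC }>(v);
    // _mm_cvtsd_f64 reads the lower lane of each result.
    [
        _mm_cvtsd_f64(nearest),
        _mm_cvtsd_f64(down),
        _mm_cvtsd_f64(up),
        _mm_cvtsd_f64(trunc),
    ]
}

/// Hypothetical safe wrapper: `None` when SSE4.1 is unavailable.
pub fn round_modes(x: f64) -> Option<[f64; 4]> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.1") {
            // SAFETY: the run-time check above confirmed SSE4.1 support.
            return Some(unsafe { round_modes_sse41(x) });
        }
    }
    None
}
```

On a CPU with SSE4.1, `round_modes(-1.7)` should yield `[-2.0, -2.0, -1.0, -1.0]`. Round-to-nearest resolves ties to the even integer, so `2.5` rounds to `2.0` in the first slot.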
Functions
- _mm_blend_epi16⚠
sse4.1
Blend packed 16-bit integers from `a` and `b` using the mask `IMM8`.
- _mm_blend_pd⚠
sse4.1
Blend packed double-precision (64-bit) floating-point elements from `a` and `b` using control mask `IMM2`.
- _mm_blend_ps⚠
sse4.1
Blend packed single-precision (32-bit) floating-point elements from `a` and `b` using mask `IMM4`.
- _mm_blendv_epi8⚠
sse4.1
Blend packed 8-bit integers from `a` and `b` using `mask`.
- _mm_blendv_pd⚠
sse4.1
Blend packed double-precision (64-bit) floating-point elements from `a` and `b` using `mask`.
- _mm_blendv_ps⚠
sse4.1
Blend packed single-precision (32-bit) floating-point elements from `a` and `b` using `mask`.
- _mm_ceil_pd⚠
sse4.1
Round the packed double-precision (64-bit) floating-point elements in `a` up to an integer value, and store the results as packed double-precision floating-point elements.
- _mm_ceil_ps⚠
sse4.1
Round the packed single-precision (32-bit) floating-point elements in `a` up to an integer value, and store the results as packed single-precision floating-point elements.
- _mm_ceil_sd⚠
sse4.1
Round the lower double-precision (64-bit) floating-point element in `b` up to an integer value, store the result as a double-precision floating-point element in the lower element of the intrinsic result, and copy the upper element from `a` to the upper element of the intrinsic result.
- _mm_ceil_ss⚠
sse4.1
Round the lower single-precision (32-bit) floating-point element in `b` up to an integer value, store the result as a single-precision floating-point element in the lower element of the intrinsic result, and copy the upper 3 packed elements from `a` to the upper elements of the intrinsic result.
- _mm_cmpeq_epi64⚠
sse4.1
Compares packed 64-bit integers in `a` and `b` for equality.
- _mm_cvtepi8_epi16⚠
sse4.1
Sign-extend packed 8-bit integers in `a` to packed 16-bit integers.
- _mm_cvtepi8_epi32⚠
sse4.1
Sign-extend packed 8-bit integers in `a` to packed 32-bit integers.
- _mm_cvtepi8_epi64⚠
sse4.1
Sign-extend packed 8-bit integers in the low 2 bytes of `a` to packed 64-bit integers.
- _mm_cvtepi16_epi32⚠
sse4.1
Sign-extend packed 16-bit integers in `a` to packed 32-bit integers.
- _mm_cvtepi16_epi64⚠
sse4.1
Sign-extend packed 16-bit integers in `a` to packed 64-bit integers.
- _mm_cvtepi32_epi64⚠
sse4.1
Sign-extend packed 32-bit integers in `a` to packed 64-bit integers.
- _mm_cvtepu8_epi16⚠
sse4.1
Zero-extend packed unsigned 8-bit integers in `a` to packed 16-bit integers.
- _mm_cvtepu8_epi32⚠
sse4.1
Zero-extend packed unsigned 8-bit integers in `a` to packed 32-bit integers.
- _mm_cvtepu8_epi64⚠
sse4.1
Zero-extend packed unsigned 8-bit integers in `a` to packed 64-bit integers.
- _mm_cvtepu16_epi32⚠
sse4.1
Zero-extend packed unsigned 16-bit integers in `a` to packed 32-bit integers.
- _mm_cvtepu16_epi64⚠
sse4.1
Zero-extend packed unsigned 16-bit integers in `a` to packed 64-bit integers.
- _mm_cvtepu32_epi64⚠
sse4.1
Zero-extend packed unsigned 32-bit integers in `a` to packed 64-bit integers.
- _mm_dp_pd⚠
sse4.1
Returns the dot product of two __m128d vectors.
- _mm_dp_ps⚠
sse4.1
Returns the dot product of two __m128 vectors.
- _mm_extract_epi8⚠
sse4.1
Extracts an 8-bit integer from `a`, selected with `IMM8`. Returns a 32-bit integer containing the zero-extended integer data.
- _mm_extract_epi32⚠
sse4.1
Extracts a 32-bit integer from `a`, selected with `IMM8`.
- _mm_extract_ps⚠
sse4.1
Extracts a single-precision (32-bit) floating-point element from `a`, selected with `IMM8`. The returned `i32` stores the float's bit pattern, which can be reinterpreted as a floating-point number with `f32::from_bits(x as u32)`; a numeric `as f32` cast would convert the integer value instead of preserving the bits.
- _mm_floor_pd⚠
sse4.1
Round the packed double-precision (64-bit) floating-point elements in `a` down to an integer value, and store the results as packed double-precision floating-point elements.
- _mm_floor_ps⚠
sse4.1
Round the packed single-precision (32-bit) floating-point elements in `a` down to an integer value, and store the results as packed single-precision floating-point elements.
- _mm_floor_sd⚠
sse4.1
Round the lower double-precision (64-bit) floating-point element in `b` down to an integer value, store the result as a double-precision floating-point element in the lower element of the intrinsic result, and copy the upper element from `a` to the upper element of the intrinsic result.
- _mm_floor_ss⚠
sse4.1
Round the lower single-precision (32-bit) floating-point element in `b` down to an integer value, store the result as a single-precision floating-point element in the lower element of the intrinsic result, and copy the upper 3 packed elements from `a` to the upper elements of the intrinsic result.
- _mm_insert_epi8⚠
sse4.1
Returns a copy of `a` with the 8-bit integer from `i` inserted at a location specified by `IMM8`.
- _mm_insert_epi32⚠
sse4.1
Returns a copy of `a` with the 32-bit integer from `i` inserted at a location specified by `IMM8`.
- _mm_insert_ps⚠
sse4.1
Select a single value in `b` to store at some position in `a`, then zero elements according to `IMM8`.
- _mm_max_epi8⚠
sse4.1
Compares packed 8-bit integers in `a` and `b`, and returns packed maximum values.
- _mm_max_epi32⚠
sse4.1
Compares packed 32-bit integers in `a` and `b`, and returns packed maximum values.
- _mm_max_epu16⚠
sse4.1
Compares packed unsigned 16-bit integers in `a` and `b`, and returns packed maximum values.
- _mm_max_epu32⚠
sse4.1
Compares packed unsigned 32-bit integers in `a` and `b`, and returns packed maximum values.
- _mm_min_epi8⚠
sse4.1
Compares packed 8-bit integers in `a` and `b`, and returns packed minimum values.
- _mm_min_epi32⚠
sse4.1
Compares packed 32-bit integers in `a` and `b`, and returns packed minimum values.
- _mm_min_epu16⚠
sse4.1
Compares packed unsigned 16-bit integers in `a` and `b`, and returns packed minimum values.
- _mm_min_epu32⚠
sse4.1
Compares packed unsigned 32-bit integers in `a` and `b`, and returns packed minimum values.
- _mm_minpos_epu16⚠
sse4.1
Finds the minimum unsigned 16-bit element in the 128-bit __m128i vector, returning a vector containing its value in its first position, and its index in its second position; all other elements are set to zero.
- _mm_mpsadbw_epu8⚠
sse4.1
Subtracts 8-bit unsigned integer values and computes the absolute values of the differences. Sums of the absolute differences are then returned according to the bit fields in the immediate operand.
- _mm_mul_epi32⚠
sse4.1
Multiplies the low 32-bit integers from each packed 64-bit element in `a` and `b`, and returns the signed 64-bit results.
- _mm_mullo_epi32⚠
sse4.1
Multiplies the packed 32-bit integers in `a` and `b`, producing intermediate 64-bit integers, and returns the low 32 bits of each product, reinterpreted as a signed integer. While `pmulld __m128i::splat(2), __m128i::splat(2)` returns the obvious `__m128i::splat(4)`, due to wrapping arithmetic `pmulld __m128i::splat(i32::MAX), __m128i::splat(2)` would return a negative number.
- _mm_packus_epi32⚠
sse4.1
Converts packed 32-bit integers from `a` and `b` to packed 16-bit integers using unsigned saturation.
- _mm_round_pd⚠
sse4.1
Round the packed double-precision (64-bit) floating-point elements in `a` using the `ROUNDING` parameter, and store the results as packed double-precision floating-point elements. Rounding is done according to the `ROUNDING` parameter, which takes the rounding constants listed under Constants.
- _mm_round_ps⚠
sse4.1
Round the packed single-precision (32-bit) floating-point elements in `a` using the `ROUNDING` parameter, and store the results as packed single-precision floating-point elements. Rounding is done according to the `ROUNDING` parameter, which takes the rounding constants listed under Constants.
- _mm_round_sd⚠
sse4.1
Round the lower double-precision (64-bit) floating-point element in `b` using the `ROUNDING` parameter, store the result as a double-precision floating-point element in the lower element of the intrinsic result, and copy the upper element from `a` to the upper element of the intrinsic result. Rounding is done according to the `ROUNDING` parameter, which takes the rounding constants listed under Constants.
- _mm_round_ss⚠
sse4.1
Round the lower single-precision (32-bit) floating-point element in `b` using the `ROUNDING` parameter, store the result as a single-precision floating-point element in the lower element of the intrinsic result, and copy the upper 3 packed elements from `a` to the upper elements of the intrinsic result. Rounding is done according to the `ROUNDING` parameter, which takes the rounding constants listed under Constants.
- _mm_test_all_ones⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are all ones.
- _mm_test_all_zeros⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are all zeros.
- _mm_test_mix_ones_zeros⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are neither all zeros nor all ones.
- _mm_testc_si128⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are all ones.
- _mm_testnzc_si128⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are neither all zeros nor all ones.
- _mm_testz_si128⚠
sse4.1
Tests whether the specified bits in a 128-bit integer vector are all zeros.
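To make a few of the descriptions above concrete, the sketch below exercises `_mm_blend_epi16`, `_mm_cvtepi8_epi32`, `_mm_max_epi32`, `_mm_mullo_epi32`, and `_mm_testz_si128` on small inputs. It assumes a toolchain where these intrinsics are reachable through `std::arch` and immediates are passed as const generics. The `Sse41Demo` struct and the `lanes`, `sse41_demo_impl`, and `sse41_demo` functions are hypothetical names introduced only for illustration.

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical container for the lanes produced by each intrinsic.
#[derive(Debug, PartialEq)]
pub struct Sse41Demo {
    pub blended: [i32; 4],
    pub widened: [i32; 4],
    pub maxed: [i32; 4],
    pub wrapped: [i32; 4],
    pub disjoint: bool,
}

/// Copies the four 32-bit lanes of `v` into an array (lane 0 first).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn lanes(v: __m128i) -> [i32; 4] {
    let mut out = [0i32; 4];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, v);
    out
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn sse41_demo_impl() -> Sse41Demo {
    let a = _mm_setr_epi32(1, 2, 3, 4);
    let b = _mm_setr_epi32(10, 20, 30, 40);
    Sse41Demo {
        // Mask bit n set means 16-bit lane n comes from `b`; the low four
        // bits therefore take the two low 32-bit lanes from `b`.
        blended: lanes(_mm_blend_epi16::<0b0000_1111>(a, b)),
        // Sign-extends the four lowest bytes to 32-bit integers.
        widened: lanes(_mm_cvtepi8_epi32(_mm_setr_epi8(
            -1, 2, -3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        ))),
        // Lane-wise signed maximum.
        maxed: lanes(_mm_max_epi32(
            _mm_setr_epi32(-1, 5, 7, -9),
            _mm_setr_epi32(0, 4, 8, -10),
        )),
        // Keeps only the low 32 bits of each product, so i32::MAX * 2 wraps.
        wrapped: lanes(_mm_mullo_epi32(_mm_set1_epi32(i32::MAX), _mm_set1_epi32(2))),
        // Returns 1 when the bitwise AND of both operands is all zeros.
        disjoint: _mm_testz_si128(_mm_set1_epi32(0b0101), _mm_set1_epi32(0b1010)) == 1,
    }
}

/// Hypothetical safe wrapper: `None` when SSE4.1 is unavailable.
pub fn sse41_demo() -> Option<Sse41Demo> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.1") {
            // SAFETY: the run-time check above confirmed SSE4.1 support.
            return Some(unsafe { sse41_demo_impl() });
        }
    }
    None
}
```

On a CPU with SSE4.1, `blended` should be `[10, 20, 3, 4]`, `widened` should be `[-1, 2, -3, 4]`, `maxed` should be `[0, 5, 8, -9]`, and every lane of `wrapped` should be `-2`, which is the negative result the `_mm_mullo_epi32` description mentions.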