Module core::core_arch::x86::avx512vbmi2

🔬This is a nightly-only experimental API. (stdsimd #48556)
Available on x86 or x86-64 only.

Functions

  • _mm256_mask_compress_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm256_mask_compress_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm256_mask_compressstoreu_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm256_mask_compressstoreu_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm256_mask_expand_epi8Experimentalavx512vbmi2,avx512vl
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_expand_epi16Experimentalavx512vbmi2,avx512vl
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_expandloadu_epi8Experimentalavx512f,avx512bw,avx512vbmi2,avx512vl,avx
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_expandloadu_epi16Experimentalavx512f,avx512vbmi2,avx512vl,avx
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_mask_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_mask_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_mask_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm256_mask_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_mask_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_mask_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm256_maskz_compress_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm256_maskz_compress_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm256_maskz_expand_epi8Experimentalavx512vbmi2,avx512vl
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_expand_epi16Experimentalavx512vbmi2,avx512vl
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_expandloadu_epi8Experimentalavx512f,avx512bw,avx512vbmi2,avx512vl,avx
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_expandloadu_epi16Experimentalavx512f,avx512vbmi2,avx512vl,avx
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_maskz_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm256_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst.
  • _mm256_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst.
  • _mm256_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst.
  • _mm256_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst.
  • _mm256_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst.
  • _mm256_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst.
  • _mm256_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst.
  • _mm256_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst.
  • _mm256_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst.
  • _mm256_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst.
  • _mm256_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.
  • _mm256_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst.
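
The compress and expand entries above can be easier to follow with a scalar model. The following is a minimal sketch in plain Rust, not the intrinsics themselves: `mask_compress_model` and `mask_expand_model` are hypothetical helper names, the lanes are modelled as a `[u8; N]` array with `N` at most 64, and bit `i` of `k` is assumed to control lane `i`.

```rust
/// Scalar model of a writemask compress (hypothetical helper, not an intrinsic).
/// Lanes of `a` whose bit is set in `k` are packed to the front of the result;
/// the lanes after the packed run keep the values from `src`.
/// Assumes N <= 64 so that every lane has a bit in `k`.
fn mask_compress_model<const N: usize>(src: [u8; N], k: u64, a: [u8; N]) -> [u8; N] {
    let mut dst = src;
    let mut next = 0; // index of the next free slot at the front of dst
    for i in 0..N {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}

/// Scalar model of a writemask expand (hypothetical helper, not an intrinsic).
/// Consecutive low lanes of `a` are scattered to the lanes whose bit is set
/// in `k`; lanes with a clear bit keep the values from `src`.
/// Assumes N <= 64 so that every lane has a bit in `k`.
fn mask_expand_model<const N: usize>(src: [u8; N], k: u64, a: [u8; N]) -> [u8; N] {
    let mut dst = src;
    let mut next = 0; // index of the next unread lane at the front of a
    for i in 0..N {
        if (k >> i) & 1 == 1 {
            dst[i] = a[next];
            next += 1;
        }
    }
    dst
}
```

With `a = [10, 20, 30, 40, 50, 60, 70, 80]`, `src` filled with 255 and `k = 0b1010_0110`, the compress model yields `[20, 30, 60, 80, 255, 255, 255, 255]` and the expand model yields `[255, 10, 20, 255, 255, 30, 255, 40]`. Passing an all-zero `src` gives the behaviour described for the zeromask (`maskz`) variants.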
  • _mm512_mask_compress_epi8Experimentalavx512vbmi2
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm512_mask_compress_epi16Experimentalavx512vbmi2
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm512_mask_compressstoreu_epi8Experimentalavx512vbmi2
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm512_mask_compressstoreu_epi16Experimentalavx512vbmi2
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm512_mask_expand_epi8Experimentalavx512vbmi2
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_expand_epi16Experimentalavx512vbmi2
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_expandloadu_epi8Experimentalavx512f,avx512bw,avx512vbmi2
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_expandloadu_epi16Experimentalavx512f,avx512bw,avx512vbmi2
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shldi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shldi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shldi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shldv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_mask_shldv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_mask_shldv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_mask_shrdi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shrdi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shrdi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm512_mask_shrdv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_mask_shrdv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_mask_shrdv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm512_maskz_compress_epi8Experimentalavx512vbmi2
    Contiguously store the active 8-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm512_maskz_compress_epi16Experimentalavx512vbmi2
    Contiguously store the active 16-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm512_maskz_expand_epi8Experimentalavx512vbmi2
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_expand_epi16Experimentalavx512vbmi2
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_expandloadu_epi8Experimentalavx512f,avx512bw,avx512vbmi2
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_expandloadu_epi16Experimentalavx512f,avx512bw,avx512vbmi2
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shldv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_maskz_shrdv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm512_shldi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst.
  • _mm512_shldi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst.
  • _mm512_shldi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst.
  • _mm512_shldv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst.
  • _mm512_shldv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst.
  • _mm512_shldv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst.
  • _mm512_shrdi_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst.
  • _mm512_shrdi_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst.
  • _mm512_shrdi_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst.
  • _mm512_shrdv_epi16Experimentalavx512vbmi2
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst.
  • _mm512_shrdv_epi32Experimentalavx512vbmi2
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.
  • _mm512_shrdv_epi64Experimentalavx512vbmi2
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst.
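
The concatenate-and-shift entries above (shld and shrd) can be sketched for a single 16-bit lane as follows. This is a scalar model under stated assumptions, not the intrinsics: `shld16_model`, `shrd16_model` and `mask_shld16_lanes_model` are hypothetical names, and reducing the shift count modulo the lane width is an assumption that the descriptions above do not state.

```rust
/// Scalar model of one 16-bit lane of a left concatenate-and-shift
/// (hypothetical helper). `a` forms the upper half of the 32-bit
/// intermediate and `b` the lower half; the upper 16 bits are returned.
fn shld16_model(a: u16, b: u16, count: u32) -> u16 {
    let count = count & 15; // assumption: count is taken modulo the lane width
    let wide = ((a as u32) << 16) | b as u32;
    ((wide << count) >> 16) as u16
}

/// Scalar model of one 16-bit lane of a right concatenate-and-shift
/// (hypothetical helper). `b` forms the upper half of the 32-bit
/// intermediate and `a` the lower half; the lower 16 bits are returned.
fn shrd16_model(a: u16, b: u16, count: u32) -> u16 {
    let count = count & 15; // assumption: count is taken modulo the lane width
    let wide = ((b as u32) << 16) | a as u32;
    (wide >> count) as u16
}

/// Lane-wise writemask wrapper (hypothetical helper): lanes whose bit is
/// clear in `k` are copied from `src`. Assumes N <= 32.
fn mask_shld16_lanes_model<const N: usize>(
    src: [u16; N],
    k: u32,
    a: [u16; N],
    b: [u16; N],
    count: u32,
) -> [u16; N] {
    std::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            shld16_model(a[i], b[i], count)
        } else {
            src[i]
        }
    })
}
```

For example, `shld16_model(0x1234, 0xABCD, 4)` gives `0x234A`, since the top four bits of `b` are shifted into the bottom of `a`, and `shrd16_model(0x1234, 0xABCD, 4)` gives `0xD123`, since the bottom four bits of `b` are shifted into the top of `a`. The variable-count (`v`) forms would take `count` from the matching lane of `c` and, per the entries above, fall back to `a` rather than `src` for unselected lanes.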
  • _mm_mask_compress_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm_mask_compress_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src.
  • _mm_mask_compressstoreu_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm_mask_compressstoreu_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
  • _mm_mask_expand_epi8Experimentalavx512vbmi2,avx512vl
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_expand_epi16Experimentalavx512vbmi2,avx512vl
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_expandloadu_epi8Experimentalavx512f,avx512vbmi2,avx512vl,avx,sse
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_expandloadu_epi16Experimentalavx512f,avx512vbmi2,avx512vl,avx,sse
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_mask_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_mask_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_mask_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
  • _mm_mask_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_mask_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_mask_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
  • _mm_maskz_compress_epi8Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 8-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm_maskz_compress_epi16Experimentalavx512vbmi2,avx512vl
    Contiguously store the active 16-bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero.
  • _mm_maskz_expand_epi8Experimentalavx512vbmi2,avx512vl
    Load contiguous active 8-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_expand_epi16Experimentalavx512vbmi2,avx512vl
    Load contiguous active 16-bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_expandloadu_epi8Experimentalavx512f,avx512vbmi2,avx512vl,avx,sse
    Load contiguous active 8-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_expandloadu_epi16Experimentalavx512f,avx512vbmi2,avx512vl,avx,sse
    Load contiguous active 16-bit integers from unaligned memory at mem_addr (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_maskz_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
  • _mm_shldi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by imm8 bits, and store the upper 16-bits in dst.
  • _mm_shldi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by imm8 bits, and store the upper 32-bits in dst.
  • _mm_shldi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by imm8 bits, and store the upper 64-bits in dst).
  • _mm_shldv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in a and b producing an intermediate 32-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 16-bits in dst.
  • _mm_shldv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in a and b producing an intermediate 64-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 32-bits in dst.
  • _mm_shldv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in a and b producing an intermediate 128-bit result. Shift the result left by the amount specified in the corresponding element of c, and store the upper 64-bits in dst.
  • _mm_shrdi_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by imm8 bits, and store the lower 16-bits in dst.
  • _mm_shrdi_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by imm8 bits, and store the lower 32-bits in dst.
  • _mm_shrdi_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by imm8 bits, and store the lower 64-bits in dst.
  • _mm_shrdv_epi16Experimentalavx512vbmi2,avx512vl
    Concatenate packed 16-bit integers in b and a producing an intermediate 32-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 16-bits in dst.
  • _mm_shrdv_epi32Experimentalavx512vbmi2,avx512vl
    Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.
  • _mm_shrdv_epi64Experimentalavx512vbmi2,avx512vl
    Concatenate packed 64-bit integers in b and a producing an intermediate 128-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 64-bits in dst.
  • vcompressstoreb 🔒 Experimental
  • vcompressstoreb128 🔒 Experimental
  • vcompressstoreb256 🔒 Experimental
  • vcompressstorew 🔒 Experimental
  • vcompressstorew128 🔒 Experimental
  • vcompressstorew256 🔒 Experimental
  • vpcompressb 🔒 Experimental
  • vpcompressb128 🔒 Experimental
  • vpcompressb256 🔒 Experimental
  • vpcompressw 🔒 Experimental
  • vpcompressw128 🔒 Experimental
  • vpcompressw256 🔒 Experimental
  • vpexpandb 🔒 Experimental
  • vpexpandb128 🔒 Experimental
  • vpexpandb256 🔒 Experimental
  • vpexpandw 🔒 Experimental
  • vpexpandw128 🔒 Experimental
  • vpexpandw256 🔒 Experimental
  • vpshldvd 🔒 Experimental
  • vpshldvd128 🔒 Experimental
  • vpshldvd256 🔒 Experimental
  • vpshldvq 🔒 Experimental
  • vpshldvq128 🔒 Experimental
  • vpshldvq256 🔒 Experimental
  • vpshldvw 🔒 Experimental
  • vpshldvw128 🔒 Experimental
  • vpshldvw256 🔒 Experimental
  • vpshrdvd 🔒 Experimental
  • vpshrdvd128 🔒 Experimental
  • vpshrdvd256 🔒 Experimental
  • vpshrdvq 🔒 Experimental
  • vpshrdvq128 🔒 Experimental
  • vpshrdvq256 🔒 Experimental
  • vpshrdvw 🔒 Experimental
  • vpshrdvw128 🔒 Experimental
  • vpshrdvw256 🔒 Experimental
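
The compress, expand, and concatenate-and-shift descriptions above can be easier to follow with a scalar model. The sketch below is an illustration only: it does not call the intrinsics, and every function name in it (`maskz_compress16`, `mask_expand16`, `shld16`, `shrd16`, `mask_merge16`) is hypothetical. It models eight 16-bit lanes (one 128-bit vector), with bit i of `k` controlling lane i. It also assumes the shift count is reduced modulo the lane width, which the summaries above do not state.

```rust
// Scalar models of the lane semantics described in the listing.
// All names here are hypothetical helpers, not part of core::arch.

/// Zeromask compress: pack the active lanes of `a` into the low lanes, zero the rest.
fn maskz_compress16(k: u8, a: [u16; 8]) -> [u16; 8] {
    let mut dst = [0u16; 8];
    let mut next = 0;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[next] = a[i];
            next += 1;
        }
    }
    dst
}

/// Writemask expand: each active lane takes the next unread low lane of `a`;
/// inactive lanes are copied from `src`.
fn mask_expand16(src: [u16; 8], k: u8, a: [u16; 8]) -> [u16; 8] {
    let mut dst = src;
    let mut next = 0;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[next];
            next += 1;
        }
    }
    dst
}

/// One lane of a left concatenate-and-shift: form a:b, shift left, keep the upper 16 bits.
/// Assumption: the count is taken modulo 16.
fn shld16(a: u16, b: u16, count: u32) -> u16 {
    let wide = ((a as u32) << 16) | (b as u32);
    ((wide << (count & 15)) >> 16) as u16
}

/// One lane of a right concatenate-and-shift: form b:a, shift right, keep the lower 16 bits.
/// Assumption: the count is taken modulo 16.
fn shrd16(a: u16, b: u16, count: u32) -> u16 {
    let wide = ((b as u32) << 16) | (a as u32);
    (wide >> (count & 15)) as u16
}

/// Writemask merge: keep `computed` where the mask bit is set, else `fallback`.
/// In the listing, `fallback` is src for the shldi/shrdi forms and a for shldv/shrdv.
fn mask_merge16(fallback: [u16; 8], k: u8, computed: [u16; 8]) -> [u16; 8] {
    let mut dst = fallback;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = computed[i];
        }
    }
    dst
}

fn main() {
    let a: [u16; 8] = [10, 11, 12, 13, 14, 15, 16, 17];
    let k: u8 = 0b0010_0101; // lanes 0, 2 and 5 are active

    assert_eq!(maskz_compress16(k, a), [10, 12, 15, 0, 0, 0, 0, 0]);
    assert_eq!(mask_expand16([99; 8], k, a), [10, 99, 11, 99, 99, 12, 99, 99]);

    assert_eq!(shld16(0x1234, 0xABCD, 4), 0x234A);
    assert_eq!(shrd16(0x1234, 0xABCD, 4), 0xD123);

    // Masked shift: only lane 0 receives the shifted value, the rest keep the fallback.
    let shifted = [shld16(0x1234, 0xABCD, 4); 8];
    assert_eq!(mask_merge16([0; 8], 0b0000_0001, shifted), [0x234A, 0, 0, 0, 0, 0, 0, 0]);
    println!("all scalar models agree with the worked values");
}
```

The zeromask variants in the listing correspond to calling `mask_merge16` with an all-zero fallback. The 8-bit, 32-bit, and 64-bit forms follow the same pattern with a different lane width and lane count.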