all-about-SIMD-on_x86_64

5.Intel 64 Base Architecture:3

The Intel AVX extension introduced 256-bit vector processing. AVX uses the VEX prefix to extend access to 256-bit registers. In 64-bit mode the new registers are YMM0 through YMM15, and the lower 128 bits of each YMM register alias the corresponding XMM register. Legacy SSE instructions (without a VEX prefix) cannot access the upper 128 bits of the YMM registers.
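The aliasing between a YMM register and its XMM lower half can be sketched with compiler intrinsics. This is a minimal sketch, assuming GCC or Clang on x86-64; the function names `low_lanes_avx` and `low_lanes` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are compiler extensions rather than anything the notes prescribe.

```c
#include <immintrin.h>

/* AVX path: load 8 floats into a 256-bit value, then view its low 128 bits
   as a 128-bit value. The cast only reinterprets the register; it is not a
   data conversion. */
__attribute__((target("avx")))
static void low_lanes_avx(const float in[8], float out[4]) {
    __m256 y = _mm256_loadu_ps(in);          /* 8 lanes, YMM-sized */
    __m128 x = _mm256_castps256_ps128(y);    /* lanes 0..3, XMM-sized view */
    _mm_storeu_ps(out, x);
}

/* Hypothetical wrapper: scalar fallback when the CPU lacks AVX. */
void low_lanes(const float in[8], float out[4]) {
    if (__builtin_cpu_supports("avx")) {
        low_lanes_avx(in, out);
        return;
    }
    for (int i = 0; i < 4; i++) {
        out[i] = in[i];
    }
}
```

Calling `low_lanes` on an array of eight floats copies out the first four, which are the lanes an XMM-based instruction would see.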

AVX provides 256-bit floating-point arithmetic, including add, subtract, multiply, divide, and more.
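As an illustration of 256-bit floating-point arithmetic, the sketch below adds eight single-precision values with one vector add. It assumes GCC or Clang on x86-64; `add8_avx` and `add8` are hypothetical names chosen for this example.

```c
#include <immintrin.h>

/* AVX path: one 256-bit add processes 8 floats at once. */
__attribute__((target("avx")))
static void add8_avx(const float *a, const float *b, float *out) {
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
}

/* Hypothetical wrapper: falls back to a scalar loop without AVX. */
void add8(const float *a, const float *b, float *out) {
    if (__builtin_cpu_supports("avx")) {
        add8_avx(a, b, out);
        return;
    }
    for (int i = 0; i < 8; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Swapping `_mm256_add_ps` for `_mm256_sub_ps`, `_mm256_mul_ps`, or `_mm256_div_ps` gives the other basic operations with the same load/store pattern.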

AVX also provides 256-bit non-arithmetic instructions, including load, store, shuffle, blend, insert, extract, and more.
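Two of these non-arithmetic operations, blend and extract, can be sketched as follows. The example assumes GCC or Clang on x86-64; `blend_upper_avx` and `blend_upper` are hypothetical names, and the blend mask `0xAA` is an arbitrary choice for illustration.

```c
#include <immintrin.h>

/* AVX path: blend two vectors lane by lane, then extract the upper
   128 bits of the result. Mask 0xAA has bits 1,3,5,7 set, so odd lanes
   come from b and even lanes come from a. */
__attribute__((target("avx")))
static void blend_upper_avx(const float a[8], const float b[8], float out[4]) {
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    __m256 mixed = _mm256_blend_ps(va, vb, 0xAA);
    __m128 hi = _mm256_extractf128_ps(mixed, 1);  /* lanes 4..7 */
    _mm_storeu_ps(out, hi);
}

/* Hypothetical wrapper: scalar fallback computing the same lanes. */
void blend_upper(const float a[8], const float b[8], float out[4]) {
    if (__builtin_cpu_supports("avx")) {
        blend_upper_avx(a, b, out);
        return;
    }
    for (int i = 0; i < 4; i++) {
        int lane = i + 4;
        out[i] = (lane & 1) ? b[lane] : a[lane];
    }
}
```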

AVX provides a full complement of 128-bit numeric processing instructions that use VEX-prefix encoding. These VEX-encoded instructions generally provide the same functionality as the instructions operating on XMM registers that are encoded with SIMD prefixes.

The same holds for non-arithmetic processing: AVX provides VEX-encoded counterparts of the SIMD-prefixed 128-bit non-arithmetic instructions.

AVX2 follows the same programming model as AVX. It promotes most 128-bit SIMD integer instructions to VEX encoding and extends them to 256 bits. There are also many VEX-only instructions; VEX-encoded mnemonics usually start with a 'v'.
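A 256-bit integer operation of the kind AVX2 adds can be sketched as below. It assumes GCC or Clang on x86-64; `add8_i32_avx2` and `add8_i32` are hypothetical names for this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX2 path: add eight 32-bit integers with a single 256-bit integer add. */
__attribute__((target("avx2")))
static void add8_i32_avx2(const int32_t *a, const int32_t *b, int32_t *out) {
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi32(va, vb));
}

/* Hypothetical wrapper: scalar fallback when AVX2 is not available.
   The unsigned casts give the same wrap-around behavior as the vector add. */
void add8_i32(const int32_t *a, const int32_t *b, int32_t *out) {
    if (__builtin_cpu_supports("avx2")) {
        add8_i32_avx2(a, b, out);
        return;
    }
    for (int i = 0; i < 8; i++) {
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
    }
}
```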

Alongside AVX2, a set of general-purpose bit-manipulation instructions was introduced. lzcnt and tzcnt are encoded without a VEX prefix; the other instructions in this set are VEX-encoded.
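The sketch below pairs one instruction of each encoding: tzcnt (not VEX-encoded) finds the lowest set bit, and blsr (VEX-encoded) clears it. It assumes GCC or Clang on x86-64 with the BMI1 intrinsics `_tzcnt_u32` and `_blsr_u32` available through `<immintrin.h>`; `set_bit_positions_bmi` and `set_bit_positions` are hypothetical names.

```c
#include <immintrin.h>
#include <stdint.h>

/* BMI1 path: walk the set bits of x from lowest to highest.
   Writes each bit index into pos[] and returns how many were found. */
__attribute__((target("bmi")))
static int set_bit_positions_bmi(uint32_t x, int pos[32]) {
    int n = 0;
    while (x != 0) {
        pos[n++] = (int)_tzcnt_u32(x);  /* index of lowest set bit */
        x = _blsr_u32(x);               /* clear lowest set bit */
    }
    return n;
}

/* Hypothetical wrapper: portable fallback when BMI1 is not available. */
int set_bit_positions(uint32_t x, int pos[32]) {
    if (__builtin_cpu_supports("bmi")) {
        return set_bit_positions_bmi(x, pos);
    }
    int n = 0;
    while (x != 0) {
        pos[n++] = __builtin_ctz(x);    /* x is nonzero here */
        x &= x - 1;                     /* same effect as blsr */
    }
    return n;
}
```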

In summary:

  • Legacy SIMD instructions cannot access the upper 128 bits of the YMM registers. VEX.128-prefixed instructions zero those bits, and VEX.256-prefixed instructions are the ones that normally read and write them.

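  • Memory alignment: instructions are either explicitly aligned, explicitly unaligned, or neither, and under legacy encoding most of that last group still require aligned memory operands. VEX-prefixed instructions relax this: apart from the explicitly aligned ones, they accept unaligned memory operands.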
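The difference between explicitly aligned and explicitly unaligned access can be sketched with the two AVX load intrinsics. This is a minimal sketch, assuming GCC or Clang on x86-64; `sum8_avx` and `sum8` are hypothetical names. The aligned load is only used when the address is a multiple of 32, since it would fault otherwise.

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX path: pick the explicitly aligned load only when the pointer is
   32-byte aligned; otherwise use the explicitly unaligned load. */
__attribute__((target("avx")))
static float sum8_avx(const float *p) {
    __m256 v;
    if (((uintptr_t)p & 31) == 0) {
        v = _mm256_load_ps(p);   /* requires 32-byte alignment */
    } else {
        v = _mm256_loadu_ps(p);  /* works at any address */
    }
    float tmp[8];
    _mm256_storeu_ps(tmp, v);
    float sum = 0.0f;
    for (int i = 0; i < 8; i++) {
        sum += tmp[i];
    }
    return sum;
}

/* Hypothetical wrapper: scalar fallback when the CPU lacks AVX. */
float sum8(const float *p) {
    if (__builtin_cpu_supports("avx")) {
        return sum8_avx(p);
    }
    float sum = 0.0f;
    for (int i = 0; i < 8; i++) {
        sum += p[i];
    }
    return sum;
}
```

Passing a 32-byte-aligned buffer takes the aligned branch, while passing the same buffer offset by one element takes the unaligned branch; both return the sum of the eight floats starting at the given address.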


Last updated 4 years ago
