De / Interleave bits
Interleaving the binary coordinate values yields binary z-values. Connecting the z-values in their numerical order produces the recursively Z-shaped curve. - Morton Codes
morton(0b1100,0b0011)
│││└──┐││││
││└─┐ │││││
│└┐ │ │││││
│┌─────┘│││
│││ │ │ │││
│││┌────┘││
│││││ │ ││
│││││┌───┘│
│││││││┌──┘
0b10100101
see also:
- Morton (github) / HN - a SWAR implementation is used which takes 2 nanoseconds. If -mbmi2 is available, then morton() and unmorton() take less than one nanosecond.
- Left Pack
- Fastest way to unpack 32 bits to a 32 byte vector
- Optimal uint8_t bitmap into a 8 x 32bit SIMD “bool” vector
- Unpacking a bitfield (Inverse of movmskb)
- How to create a byte out of 8 bool values (and vice versa)?
Morton Codes
Provides an ordering along a space-filling curve while preserving data locality. - Constructing Acceleration Datastructures (raytracing).
- How to use Morton Order(z order curve) in range search?
- Morton-code
- How to compute a 3D Morton number (interleave the bits of 3 ints)
Interleaving bits
x &= 0x0000ffff; // x = ---- ---- ---- ---- fedc ba98 7654 3210
x = (x ^ (x << 8)) & 0x00ff00ff; // x = ---- ---- fedc ba98 ---- ---- 7654 3210
x = (x ^ (x << 4)) & 0x0f0f0f0f; // x = ---- fedc ---- ba98 ---- 7654 ---- 3210
x = (x ^ (x << 2)) & 0x33333333; // x = --fe --dc --ba --98 --76 --54 --32 --10
x = (x ^ (x << 1)) & 0x55555555; // x = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
De-interleaving bits
x &= 0x55555555; // x = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
x = (x ^ (x >> 1)) & 0x33333333; // x = --fe --dc --ba --98 --76 --54 --32 --10
x = (x ^ (x >> 2)) & 0x0f0f0f0f; // x = ---- fedc ---- ba98 ---- 7654 ---- 3210
x = (x ^ (x >> 4)) & 0x00ff00ff; // x = ---- ---- fedc ba98 ---- ---- 7654 3210
x = (x ^ (x >> 8)) & 0x0000ffff; // x = ---- ---- ---- ---- fedc ba98 7654 3210
Parallel bit deposit and extract
The PDEP and PEXT instructions are new generalized bit-level compress and expand instructions.
This can be used to extract any bitfield of the input, and even do a lot of bit-level shuffling that previously would have been expensive. While what these instructions do is similar to bit level gather-scatter SIMD instructions, PDEP and PEXT instructions operate on general-purpose registers.
Instruction | Selector mask | Source | Destination |
PEXT | 0xff00fff0 | 0x12345678 | 0x00012567 |
PDEP | 0xff00fff0 | 0x00012567 | 0x12005670 |
They are accessible in c++
_pext_u64(__X, __Y);
_pdep_u64(__X, __Y);
see also: