NN Quantisation

we managed to selectively quantize certain layers to higher bits (like 4bit) & leave most MoE layers (like those used in GPT-4) to 1.5bit. - Run DeepSeek R1 Dynamic 1.58-bit / HN
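To make the idea of per-layer bit widths concrete, here is a minimal toy sketch in plain Python. It is not the scheme used in the quoted post: the layer names, the `is_moe` predicate, and both quantizer functions are hypothetical, and the ternary grid is only one simple way to land near 1.58 bits per weight (three states cost log2(3) bits). It keeps a non-MoE layer on a 4-bit symmetric grid and pushes the MoE layer down to three values.

```python
import math


def quantize_symmetric(weights, bits):
    """Round weights onto a symmetric integer grid, then map back to floats."""
    levels = 2 ** (bits - 1) - 1              # 7 positive levels for 4 bits
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def quantize_ternary(weights):
    """Map each weight to one of {-s, 0, +s}, with s the mean absolute weight."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    return [max(-1, min(1, round(w / scale))) * scale for w in weights]


def quantize_model(layers, is_moe):
    """Pick the quantizer per layer: ternary for MoE layers, 4-bit otherwise."""
    return {
        name: quantize_ternary(w) if is_moe(name) else quantize_symmetric(w, 4)
        for name, w in layers.items()
    }


# Hypothetical layer names and weights, for illustration only.
layers = {
    "attention.q_proj": [0.12, -0.40, 0.33, 0.05],
    "moe.expert_0.up_proj": [0.9, -0.1, 0.02, -0.7],
}
quantized = quantize_model(layers, is_moe=lambda name: name.startswith("moe."))

# Information content of one three-state weight, in bits.
bits_per_ternary_weight = math.log2(3)
```

Since MoE expert weights make up most of the parameters in such a model, the average bits per weight ends up close to the low setting even though a few layers stay at 4 bits.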

DeepSeek-R1 Models

Written on March 17, 2025, Last update on March 17, 2025
