Show HN: HIGGS – new SOTA data-free LLM quantization
huggingface.co
My colleagues and I wrote a paper and integrated it into transformers.
It is both more accurate and faster than NF4.
We have compressed Hugging Face models available for everyone to try: https://huggingface.co/collections/ISTA-DASLab/higgs-675308e...
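
For anyone who wants to quantize a model on the fly instead, here is a minimal sketch of what that might look like. It assumes a recent transformers release that exposes `HiggsConfig`, plus a CUDA GPU and the extra kernel packages the integration depends on. The helper name `load_higgs_quantized` and the `model_id` argument are hypothetical placeholders, not names from the paper or the collection above.

```python
def load_higgs_quantized(model_id: str, bits: int = 4):
    """Load a causal LM and quantize it with HIGGS at load time (sketch).

    model_id is a placeholder: pass any Hugging Face model ID you have
    access to. bits=4 is an assumed default chosen to mirror NF4's width.
    """
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        # Data-free quantization: no calibration dataset is passed in.
        quantization_config=HiggsConfig(bits=bits),
        device_map="auto",
    )
    return model, tokenizer
```

Calling `load_higgs_quantized("<your-model-id>")` should return a quantized model and its tokenizer. Check the transformers quantization docs for the exact requirements on your setup.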