Quantisation and co. Reducing inference times on LLMs by 80%
Showing techniques to optimise your own LLMs – with code examples

Source: Towards Data Science