Alexandre Marques
Alexandre Marques's contributions
Article
We ran over half a million evaluations on quantized LLMs—here's what we found
Eldar Kurtić and 3 others
Across more than 500K evaluations, quantized LLMs achieve near-full accuracy with minimal trade-offs, providing efficient, high-performance solutions for AI model deployment.
Article
How well do quantized models handle long-context tasks?
Eldar Kurtić and 3 others
4-bit and 8-bit quantized LLMs excel in long-context tasks, retaining over 99% accuracy across 4K to 64K sequence lengths.