Hugging Face Quantization Tutorial

NeMo Export-Deploy

[03/12/2026] Deprecating Python 3.10 support: We're officially dropping Python 3.10 support with the upcoming 0.4.0 release. Downstream applications must raise their lower boundary to 3.12 to stay ...

Demystifying LLM Quantization: GPTQ, AWQ, and GGUF Explained

If VRAM is the brake pedal on local LLMs, quantization is how we ease the pressure. At its core, it’s simple: store numbers with fewer bits. But in practice, modern methods like GPTQ, AWQ, and GGUF ...

Geeky Gadgets

Fine-Tuning GPT-OSS: Complete Tutorial for Beginners & AI Developers

What if you could take an innovative language model like GPT-OSS and tailor it to your unique needs, all without needing a supercomputer or a PhD in machine learning? Fine-tuning large language models ...

How to Run DeepSeek Locally: Using Hugging Face and Quantization for Efficient Deployment

Most recent tutorials on implementing DeepSeek locally have used tools like Ollama for quick and easy deployment. While these tools are fast and user-friendly, they come with some limitations, such as ...

lablab

No-Code Phi3 Fine-Tuning: A Hands-On Guide Using LlamaFactory

Hello!👋🏽 I'm Tommy, and today I'm excited to show you how to fine-tune the powerful Phi3 model without writing any code. Whether you're a software developer, AI enthusiast, or just someone curious ...

GitHub

EleutherAI/lm-evaluation-harness

Please see our updated documentation pages in docs/ for more details. Development will be continuing on the main branch, and we encourage you to give us feedback on what features are desired and how ...

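Sohu

How to Build Your Own LLM Coding Assistant with Code Llama

Have you had your dose of knowledge today? Create a local LLM chatbot using CodeLlama-7b-Instruction-hf and Streamlit. The coding assistant chatbot we will build in this article. In this hands-on tutorial, we will implement an AI code assistant that is free to use and runs on a local GPU. You can ask the chatbot questions, and it will answer in natural language and ...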
