Theory is nice — but how does fine-tuning actually work? Here we show the three most common approaches: managed (OpenAI/Anthropic), open source with LoRA/QLoRA, and when each approach fits.
The easiest entry point — no own GPU needed.
# 1. Prepare data (JSONL)
openai tools fine_tunes.prepare_data -f training_data.jsonl
# 2. Start fine-tuning
openai api fine_tuning.jobs.create \
-t training_data.jsonl \
-m gpt-4o-mini-2025-09-01 \
--suffix "my-usecase"
# 3. Check status
openai api fine_tuning.jobs.list
# 4. Use finished model
openai api chat.completions.create \
-m ft:gpt-4o-mini-2025-09-01:org:my-usecase:abc123 \
-g user "Write a product text for..."
Cost (approx.): $8/1M training tokens, $3/1M inference tokens Duration: 30 min – 2 hours (depending on data size) Limitation: Only OpenAI models, no access to weights
Since 2025, Anthropic offers fine-tuning for Claude models.
For full control — on your own hardware or cloud GPUs.
Low-Rank Adaptation doesn't train all model parameters, just small additional matrices. This drastically reduces GPU requirements.
| Method | GPU RAM | Training Time | Quality |
|---|---|---|---|
| Full Fine-Tuning | 80+ GB | Hours–days | ✅ Maximum |
| LoRA | 16–24 GB | 30–60 min | ✅ Very good |
| QLoRA | 8–12 GB | 30–60 min | ✅ Good |
QLoRA loads the base model in 4-bit and trains LoRA adapters in 16-bit. Result: Fine-tuning a 70B model on a single A100 GPU.
| Model | Parameters | Strengths |
|---|---|---|
| Llama 3.1 | 8B / 70B / 405B | All-rounder, Meta license |
| Mistral Large | 123B | Strong for EU languages |
| Qwen 2.5 | 7B / 72B | Code + multilingual |
| Gemma 2 | 9B / 27B | Compact, Google-optimized |
Unsloth / Hugging Face TRL → Training
Weights & Biases / MLflow → Experiment tracking
vLLM / TGI → Inference server
| Criterion | Managed (OpenAI) | Open Source (LoRA) |
|---|---|---|
| Setup | 5 minutes | 2–4 hours |
| Cost per training | $10–100 | $5–50 (cloud GPU) |
| Control | Low | Full |
| Data privacy | Data at OpenAI | Data with you |
| Model ownership | No | Yes |
Practical tip: Start with managed fine-tuning (OpenAI/Anthropic) for proof of concept. If it works and data privacy or costs become an issue, migrate to open source with QLoRA.