Fine-tuning LLMs
How to read this page: This article maps the topic from beginner to expert across six levels: Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating. Scan the headings to see the full scope, then read from wherever your knowledge starts to feel uncertain.
Fine-tuning is the process of taking a large language model (LLM) that has already been pre-trained on a vast corpus and continuing its training on a smaller, task-specific dataset to specialize its capabilities. It is one of the most powerful techniques in practical AI deployment, enabling organizations to adapt frontier models to domain-specific language, formats, reasoning styles, or behaviors — often with only thousands of examples. Fine-tuning sits at the intersection of deep learning theory and production engineering.
Remembering
- Pre-training — The initial phase where a model is trained on massive, general-purpose datasets to develop broad language capabilities. This is done once and is extremely expensive.
- Fine-tuning — Continuing training of a pre-trained model on a smaller dataset to specialize behavior. The model's weights are adjusted, typically starting from the pre-trained state.
- Supervised Fine-Tuning (SFT) — Fine-tuning on labeled input-output pairs, teaching the model to follow instructions or produce specific response formats.
- Instruction tuning — A form of SFT where the model is trained on instruction-following examples to make it more helpful and controllable.
- RLHF (Reinforcement Learning from Human Feedback) — A multi-stage process: SFT, then reward model training, then RL optimization — used to align model outputs with human preferences.
- LoRA (Low-Rank Adaptation) — A parameter-efficient fine-tuning technique that adds small trainable low-rank matrices to frozen base model weights, drastically reducing compute and memory requirements.
- QLoRA — LoRA applied to a quantized base model (typically 4-bit), enabling fine-tuning of large models on consumer GPUs.
- PEFT (Parameter-Efficient Fine-Tuning) — An umbrella term for methods like LoRA, Prefix Tuning, and Adapter layers that update only a small fraction of model parameters.
- Catastrophic forgetting — The tendency of a model to lose previously learned capabilities when trained extensively on new data.
- Learning rate — Typically much lower during fine-tuning than pre-training (e.g., 1e-5 to 2e-4) to avoid destroying pre-trained representations.
- Chat template — A structured format for instruction-tuned models defining how system prompts, user turns, and assistant turns are delimited (a short template sketch follows this list).
- Prompt template — The format used to structure training examples, which must match the format used at inference time.
- Validation loss — The key metric monitored during fine-tuning to detect overfitting and determine when to stop.
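
Getting chat and prompt templates right by hand is error-prone; modern tokenizers can render them for you. Here is a minimal sketch using the HuggingFace apply_chat_template API; the checkpoint name and message contents are placeholder assumptions:

<syntaxhighlight lang="python">
from transformers import AutoTokenizer

# Placeholder checkpoint; any instruction-tuned model that ships a chat template works
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain LoRA in one sentence."},
]

# Renders the conversation with the model's own special tokens and delimiters,
# keeping training and inference formats in sync
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(text)
</syntaxhighlight>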
Understanding
Fine-tuning works because pre-trained LLMs have already learned rich representations of language, facts, and reasoning patterns. Fine-tuning doesn't teach the model new knowledge so much as it reconfigures how the model accesses and expresses what it already knows.
Analogy: A pre-trained LLM is like a broadly educated graduate. Fine-tuning is like a specialized internship — they don't forget everything they learned in university; they learn how to apply their knowledge in a specific context, following specific conventions and communicating in specific ways.
Full fine-tuning updates all model parameters. It is most powerful but requires enormous compute (multiple GPUs, hours to days) and is prone to catastrophic forgetting of general capabilities.
LoRA (Low-Rank Adaptation) is the dominant technique in practice. It freezes the original weights and adds small trainable matrices A and B to each attention layer such that the effective weight update is W + ΔW = W + AB, where A is d×r and B is r×d, with rank r ≪ d. With r=16, a 7B model might add only ~20M trainable parameters (0.3% of total). This dramatically reduces compute, memory, and overfitting risk.
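
To make the algebra concrete, here is a minimal LoRA-style linear layer in PyTorch. This is an illustrative sketch of the W + AB idea, not how the peft library implements it internally (details such as initialization, dropout, and scaling conventions differ):

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update AB."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weight W (and bias)
        d_in, d_out = base.in_features, base.out_features
        # A is d_in x r and B is r x d_out, so AB has the shape of a weight update
        self.A = nn.Parameter(torch.randn(d_in, r) * 0.01)  # small random init
        self.B = nn.Parameter(torch.zeros(r, d_out))        # zero init: ΔW = 0 at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A) @ self.B)

layer = LoRALinear(nn.Linear(4096, 4096), r=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 4096 * 16 = 131,072 per adapted projection
</syntaxhighlight>

The total number of added parameters scales with the rank and with how many projection matrices are adapted, which is why reported counts vary between setups.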
The data format matters enormously. Fine-tuning teaches the model a specific input-output pattern. If training examples don't precisely match the inference format (including chat templates, special tokens, and prompt structures), the model will underperform.
Applying
LoRA fine-tuning with HuggingFace + PEFT:
<syntaxhighlight lang="python">
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model, TaskType
from trl import SFTTrainer
import datasets

# Load base model (quantized for efficiency)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    load_in_4bit=True,  # QLoRA: quantize to 4-bit
    device_map="auto"
)

# LoRA configuration
lora_config = LoraConfig(
    r=16,                                 # Rank
    lora_alpha=32,                        # Scaling factor
    target_modules=["q_proj", "v_proj"],  # Which layers to adapt
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# trainable params: 4,194,304 || all params: 6,742,609,920 || trainable%: 0.06%

# Training setup
training_args = TrainingArguments(
    output_dir="./finetuned_model",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    save_steps=100,
    logging_steps=25,
)

# Dataset: each sample has a "text" field with the full formatted prompt+response
dataset = datasets.load_dataset("json", data_files="train.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
)
trainer.train()
</syntaxhighlight>
Data format for instruction tuning (Llama chat template); a string-level sketch follows this list:
- System → Defines the model's role and constraints
- User turn → The instruction or question
- Assistant turn → The desired response (what the model learns to produce)
- Special tokens → [INST], [/INST], <<SYS>> etc. must exactly match the model's chat template
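
At the string level, a Llama-2-style SFT example looks roughly like the following. This is a sketch: the exact delimiters belong to the model's chat template, so prefer generating them with apply_chat_template rather than hard-coding strings; the prompts here are invented:

<syntaxhighlight lang="python">
def format_llama2_example(system: str, user: str, assistant: str) -> str:
    # Llama-2 convention: <<SYS>> wraps the system prompt inside the first [INST] block
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{user} [/INST] {assistant} </s>"
    )

example = format_llama2_example(
    "You are a support agent for Acme Corp.",  # hypothetical system prompt
    "How do I reset my password?",
    "Go to Settings > Security and choose Reset password.",
)
print(example)
</syntaxhighlight>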
Analyzing
Fine-tuning Method Comparison:

| Method | Params Updated | GPU Memory | Risk of Forgetting | Quality |
|---|---|---|---|---|
| Full fine-tuning | 100% | Very high (multiple GPUs) | High | Highest |
| LoRA | 0.1–1% | Low (1 GPU possible) | Low | Near-full for most tasks |
| QLoRA | 0.1–1% (on 4-bit model) | Very low (fits on 24GB GPU) | Low | Slightly below LoRA |
| Prefix tuning | ~0.1% | Low | Very low | Moderate |
| Prompt tuning | ~0.01% | Very low | Very low | Lower than LoRA |
Failure modes:
- Overfitting on small datasets — With <500 examples, the model can memorize rather than generalize. Monitor validation loss and stop early (see the early-stopping sketch after this list).
- Format mismatch — Training on incorrectly formatted examples causes the model to generate malformed outputs or include spurious tokens.
- Instruction following collapse — Aggressive fine-tuning can make the model rigid, losing the flexibility to handle instructions it wasn't trained on.
- Reward hacking (RLHF) — The model learns to produce responses that score well according to the reward model without actually being more helpful — for example, becoming verbose without substance.
- Capability regression — Fine-tuning on a narrow task can degrade performance on other tasks. Evaluate on a broad benchmark before and after.
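
For the overfitting failure mode above, the HuggingFace Trainer can stop on stalled validation loss. A sketch, assuming a recent transformers version (argument names have shifted slightly across releases):

<syntaxhighlight lang="python">
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./finetuned_model",
    evaluation_strategy="steps",        # evaluate on the validation split during training
    eval_steps=50,
    save_steps=50,                      # align with eval_steps so best-model loading works
    load_best_model_at_end=True,        # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Pass an eval_dataset and this callback to the Trainer / SFTTrainer:
# callbacks=[EarlyStoppingCallback(early_stopping_patience=3)]
</syntaxhighlight>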
Evaluating
Expert practitioners treat fine-tuning evaluation as multi-dimensional:
Task-specific metrics: Whatever the downstream task demands — ROUGE for summarization, exact match for QA, pass@k for code generation, human preference rates for chat.
General capability retention: Run the fine-tuned model on standard benchmarks (MMLU, HellaSwag, HumanEval) to verify general capabilities weren't degraded. A model fine-tuned for customer service shouldn't lose its ability to reason.
Alignment and safety evaluation: Does fine-tuning introduce new failure modes? Run adversarial prompts, jailbreak attempts, and harmful content evaluations on the fine-tuned model.
Human preference evaluation (A/B testing): For conversational models, human raters compare base model vs. fine-tuned model outputs on real user queries. This is the ground truth for whether fine-tuning achieved its goal.
Expert practitioners maintain a regression test suite — a fixed set of prompts with expected behaviors — and run it after every fine-tuning run to catch regressions automatically.
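
A minimal sketch of such a suite; the file name, record schema, and generate_fn are hypothetical stand-ins for your own harness:

<syntaxhighlight lang="python">
import json

def run_regression_suite(generate_fn, suite_path="regression_suite.jsonl"):
    """Each line: {"prompt": "...", "must_contain": ["..."]} (hypothetical schema)."""
    failures = []
    with open(suite_path) as f:
        for line in f:
            case = json.loads(line)
            output = generate_fn(case["prompt"])
            missing = [s for s in case["must_contain"] if s not in output]
            if missing:
                failures.append({"prompt": case["prompt"], "missing": missing})
    return failures

# Run after every fine-tuning job and fail the pipeline if the list is non-empty:
# failures = run_regression_suite(lambda p: my_model_generate(p))
</syntaxhighlight>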
Creating
Designing a full fine-tuning pipeline:
1. Dataset curation (most important step); a runnable sketch of this pipeline follows the diagram.

<syntaxhighlight lang="text">
Source data collection (domain documents, logs, demonstrations)
        ↓
Quality filtering (deduplication, length filtering, toxic content removal)
        ↓
Formatting (convert to chat template, add system prompt)
        ↓
Review sample (manually inspect 100+ examples)
        ↓
Train/validation split (90/10 or 95/5)
</syntaxhighlight>
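
A sketch of the filtering and splitting stages using only the standard library; the input file, field names, and length thresholds are assumptions:

<syntaxhighlight lang="python">
import json
import hashlib
import random

def load_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

examples = load_jsonl("raw.jsonl")  # hypothetical source file with a "text" field

# Exact-match deduplication via content hashes
seen, deduped = set(), []
for ex in examples:
    h = hashlib.sha256(ex["text"].encode("utf-8")).hexdigest()
    if h not in seen:
        seen.add(h)
        deduped.append(ex)

# Length filtering (character thresholds are illustrative)
filtered = [ex for ex in deduped if 50 < len(ex["text"]) < 8000]

# 90/10 train/validation split with a fixed seed for reproducibility
random.seed(0)
random.shuffle(filtered)
cut = int(0.9 * len(filtered))
train, val = filtered[:cut], filtered[cut:]
</syntaxhighlight>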
2. Training configuration decision tree
- <1k examples and 1 GPU → QLoRA with early stopping (see the 4-bit loading sketch after this list)
- 1k–100k examples and 2–8 GPUs → LoRA with gradient checkpointing
- >100k examples and production budget → Full fine-tune with DDP/FSDP
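
For the first branch, a hedged sketch of loading a 4-bit base model for QLoRA with a BitsAndBytesConfig (the config object is the more explicit alternative to the load_in_4bit shortcut used earlier; bitsandbytes must be installed):

<syntaxhighlight lang="python">
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA paper's data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 on top of 4-bit storage
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model.gradient_checkpointing_enable()  # trade recompute for memory during training
</syntaxhighlight>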
3. Iterative refinement loop (a preference-record sketch for the v3 stage follows the diagram).

<syntaxhighlight lang="text">
v1: SFT on demonstrations
    ↓ evaluate → identify failure cases
v2: Add failure case examples to dataset, retrain
    ↓ evaluate → identify preference gaps
v3: Collect human preference data → train reward model → PPO/DPO fine-tune
</syntaxhighlight>
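
For the v3 stage, DPO-style trainers in trl consume preference pairs rather than single demonstrations. A sketch of one record (the field names follow trl's documented prompt/chosen/rejected convention; the content is invented):

<syntaxhighlight lang="python">
# One preference record for DPO training (stored as JSONL, one object per line)
preference_example = {
    "prompt": "Summarize our refund policy in two sentences.",
    "chosen": "Refunds are available within 14 days of purchase with proof of payment. "
              "Contact support to start the process.",
    "rejected": "Our refund policy is great! You will love it.",
}
</syntaxhighlight>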
4. Serving the fine-tuned model
- Merge LoRA adapters into base model: model.merge_and_unload() (see the merge sketch after this list)
- Export to GGUF format for llama.cpp (local/edge deployment)
- Push to HuggingFace Hub or deploy with vLLM for API serving
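
A sketch of the merge-and-save step with peft; the paths are placeholders. merge_and_unload folds the low-rank update into the base weights, so serving needs no adapter-specific code:

<syntaxhighlight lang="python">
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base, "./finetuned_model")  # adapter checkpoint path

merged = model.merge_and_unload()          # W <- W + (alpha/r) * AB, adapters removed
merged.save_pretrained("./merged_model")   # ready for vLLM or conversion to GGUF
</syntaxhighlight>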