Fine-Tuning and Conditioning
Customize base models using curated examples or on-the-fly instructions so they speak your organization’s language.
Introduction
Fine-tuning adapts a model with new training examples, while conditioning supplies context at inference time using prompts, embeddings, or system messages.
- Definition: Fine-tuning = updating model weights; conditioning = steering output without retraining.
- Why: Improve accuracy on domain language, tone, and repetitive tasks while controlling cost.
- Where: Customer support macros, document classification, branded copywriting, automated QA responses.
Syntax
from openai import OpenAI
import json
client = OpenAI()
# Step 1: Prepare training file (JSONL, one chat-format example per line)
with open("support_faq.jsonl", "w", encoding="utf-8") as fp:
    fp.write(json.dumps({
        "messages": [
            {"role": "system", "content": "You are a polite support agent."},
            {"role": "user", "content": "Reset password instructions"},
            {"role": "assistant", "content": "Sure! Visit portal.example.com/reset and follow the OTP prompts."}
        ]
    }) + "\n")
training_file = client.files.create(file=open("support_faq.jsonl", "rb"), purpose="fine-tune")
# Step 2: Kick off fine-tune job
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini"
)
print(job.id)
Example
Example 1 — Using a Fine-Tuned Model
completion = client.chat.completions.create(
    model="ft:gpt-4o-mini:ittechgenie:support-v1",
    messages=[
        {"role": "system", "content": "Always keep answers under 80 words."},
        {"role": "user", "content": "My MFA token expired. What should I do?"}
    ]
)
print(completion.choices[0].message.content)
Output
Please open the security portal, choose “Replace MFA token,” and scan the new QR code. If you cannot access the portal, call the service desk at 555-0100 for manual verification.
Example 2 — Conditioning without Fine-Tuning
messages = [
    {"role": "system", "content": "You are the ItTechGenie release manager. Reply with risk level, rollback plan, and sign-off."},
    {"role": "user", "content": "Deployment summary: build v5.6, tests green, one low-severity bug deferred."}
]
completion = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(completion.choices[0].message.content)
Explanation
- Training file uses short, representative conversations. Cover success and failure cases.
- Fine-tune job produces a new model ID. Track metrics like loss and validation accuracy.
- Conditioning remains vital after fine-tuning—system prompts reinforce policy and reduce drift.
- Cost/latency considerations: fine-tuned models may carry extra hosting or storage fees, but they often need shorter prompts, cutting per-request token costs.
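To track the validation metrics the bullets mention, you need a held-out slice of your dataset and a scoring function. A minimal sketch, assuming examples in the chat format above; `reply_fn` is a hypothetical stand-in for a call to either the base or the fine-tuned model, and exact-match scoring is deliberately simple (swap in your own rubric for production).

```python
import random

def holdout_split(examples, holdout_fraction=0.2, seed=42):
    """Shuffle deterministically and reserve a slice for evaluation."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[cut:], shuffled[:cut]  # (train, holdout)

def exact_match_rate(holdout, reply_fn):
    """Score a model callable against held-out reference answers."""
    hits = 0
    for example in holdout:
        question = example["messages"][-2]["content"]   # last user turn
        reference = example["messages"][-1]["content"]  # gold assistant reply
        if reply_fn(question).strip() == reference.strip():
            hits += 1
    return hits / len(holdout)
```

Running the same held-out set through the base and the fine-tuned model gives you a like-for-like before/after comparison.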
Real-World Use Case
An MSP specializing in SaaS migrations fine-tunes a model on historical support chats. Agents now receive draft replies that mirror the company’s tone and escalation paths, cutting handle time by 35% while keeping satisfaction scores high.
Key Notes / Tips
- Audit training data for personally identifiable information before uploading.
- Start with 50–200 high-quality examples; quantity helps, but quality matters more.
- Version datasets and prompts so you can revert if performance regresses.
- Evaluate with hold-out conversations and human review before production rollout.
- Combine with retrieval (RAG) for rapidly changing facts instead of retraining every week.
Practice Exercise
- Create three JSONL examples for a billing support assistant. Include both correct answers and polite fallback messages.
- Simulate conditioning by writing a system prompt that forces answers into a bullet list with SLAs.
- Challenge: Design an evaluation rubric (accuracy, tone, escalation) to compare base vs. fine-tuned model outputs.
Summary
Fine-tuning captures your organization’s voice, while conditioning keeps responses aligned with current goals. Use both strategically to deliver assistants that feel bespoke yet remain maintainable.