Fine-tuning and Conditioning

Customize base models using curated examples or on-the-fly instructions so they speak your organization’s language.

Introduction

Fine-tuning adapts a model with new training examples, while conditioning supplies context at inference time using prompts, embeddings, or system messages.

  • Definition: Fine-tuning = updating model weights; conditioning = steering output without retraining.
  • Why: Improve accuracy on domain language, tone, and repetitive tasks while controlling cost.
  • Where: Customer support macros, document classification, branded copywriting, automated QA responses.

Syntax

from openai import OpenAI
import json

client = OpenAI()

# Step 1: Prepare a training file (JSONL, one chat-formatted example per line).
# A real job needs many such lines; OpenAI requires at least 10 examples.
with open("support_faq.jsonl", "w", encoding="utf-8") as fp:
    fp.write(json.dumps({
        "messages": [
            {"role": "system", "content": "You are a polite support agent."},
            {"role": "user", "content": "Reset password instructions"},
            {"role": "assistant", "content": "Sure! Visit portal.example.com/reset and follow the OTP prompts."}
        ]
    }) + "\n")

training_file = client.files.create(file=open("support_faq.jsonl", "rb"), purpose="fine-tune")

# Step 2: Kick off fine-tune job
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini"
)
print(job.id)
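Before uploading, it can help to sanity-check the JSONL file locally. The sketch below is a hypothetical helper (not part of the OpenAI SDK) that checks each line parses as JSON and carries the expected chat roles:

```python
import json

REQUIRED_ROLES = {"system", "user", "assistant"}

def validate_jsonl(path):
    """Return a list of (line_number, problem) tuples; empty means the file looks sane."""
    problems = []
    with open(path, encoding="utf-8") as fp:
        for lineno, raw in enumerate(fp, start=1):
            if not raw.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(raw)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc}"))
                continue
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                problems.append((lineno, "missing or empty 'messages' list"))
                continue
            roles = {m.get("role") for m in messages}
            if not REQUIRED_ROLES <= roles:
                problems.append((lineno, f"missing roles: {REQUIRED_ROLES - roles}"))
    return problems
```

Run it against support_faq.jsonl before calling client.files.create; fixing a malformed line locally is cheaper than a failed fine-tune job.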

Example

Example 1 — Using a Fine-Tuned Model
completion = client.chat.completions.create(
    model="ft:gpt-4o-mini:ittechgenie:support-v1",  # fine_tuned_model ID returned once the job succeeds
    messages=[
        {"role": "system", "content": "Always keep answers under 80 words."},
        {"role": "user", "content": "My MFA token expired. What should I do?"}
    ]
)
print(completion.choices[0].message.content)
Output
Please open the security portal, choose “Replace MFA token,” and scan the new QR code. If you cannot access the portal, call the service desk at 555-0100 for manual verification.
Example 2 — Conditioning without Fine-Tuning
[
  {"role": "system", "content": "You are the ItTechGenie release manager. Reply with risk level, rollback plan, and sign-off."},
  {"role": "user", "content": "Deployment summary: build v5.6, tests green, one low-severity bug deferred."}
]
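The payload above is pure conditioning: no weights change, and the system message does all the steering. A minimal sketch of sending it to a base model (the build_release_messages helper is illustrative, not an SDK function):

```python
def build_release_messages(summary):
    """Wrap a deployment summary in the release-manager system prompt."""
    return [
        {"role": "system", "content": (
            "You are the ItTechGenie release manager. "
            "Reply with risk level, rollback plan, and sign-off."
        )},
        {"role": "user", "content": f"Deployment summary: {summary}"},
    ]

messages = build_release_messages("build v5.6, tests green, one low-severity bug deferred.")

# With an API key configured, the call is the same as for any chat model:
# from openai import OpenAI
# client = OpenAI()
# completion = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
# print(completion.choices[0].message.content)
```

Because the prompt lives in code rather than in model weights, you can change the required reply format (risk level, rollback plan, sign-off) without retraining.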

Explanation

  • Training file uses short, representative conversations. Cover success and failure cases.
  • Fine-tune job produces a new model ID. Track metrics like loss and validation accuracy.
  • Conditioning remains vital after fine-tuning—system prompts reinforce policy and reduce drift.
  • Cost/latency considerations: fine-tuned models may carry extra hosting or storage fees, but they often need shorter prompts (fewer few-shot examples), cutting per-request token costs.
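The fine-tune job created in the Syntax section runs asynchronously, so you poll its status before the new model ID is usable. A sketch, assuming the terminal status strings documented by OpenAI ("succeeded", "failed", "cancelled"):

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def is_terminal(status):
    """True once a fine-tune job can no longer change state."""
    return status in TERMINAL_STATUSES

# With the client and job objects from the Syntax section:
# while not is_terminal(job.status):
#     time.sleep(30)                                  # avoid hammering the API
#     job = client.fine_tuning.jobs.retrieve(job.id)  # refresh job state
# print(job.status, job.fine_tuned_model)             # model ID is set only on success
```

The same retrieve call exposes training metrics, which is where the loss and validation tracking mentioned above comes from.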

Real-World Use Case

An MSP specializing in SaaS migrations fine-tunes a model on historical support chats. Agents now receive draft replies that mirror the company’s tone and escalation paths, cutting handle time by 35% while keeping satisfaction scores high.

Key Notes / Tips

  • Audit training data for personally identifiable information before uploading.
  • Start with 50–200 high-quality examples; quantity helps, but quality matters more.
  • Version datasets and prompts so you can revert if performance regresses.
  • Evaluate with hold-out conversations and human review before production rollout.
  • Combine with retrieval (RAG) for rapidly changing facts instead of retraining every week.
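The hold-out evaluation tip can start very simply: score each draft reply on a few binary checks and compare averages between the base and fine-tuned models. A hypothetical rubric sketch (the criteria and thresholds here are examples, not a standard):

```python
def score_reply(reply, must_mention, max_words=80, banned=("sorry for the inconvenience",)):
    """Score one reply on three binary criteria; returns a value between 0.0 and 1.0."""
    lowered = reply.lower()
    checks = [
        all(term.lower() in lowered for term in must_mention),  # accuracy proxy: key facts present
        len(reply.split()) <= max_words,                        # policy: length cap from the system prompt
        not any(b in lowered for b in banned),                  # tone: no boilerplate filler phrases
    ]
    return sum(checks) / len(checks)

def compare_models(base_replies, tuned_replies, must_mention):
    """Average rubric score for each model over the same hold-out set."""
    base = sum(score_reply(r, must_mention) for r in base_replies) / len(base_replies)
    tuned = sum(score_reply(r, must_mention) for r in tuned_replies) / len(tuned_replies)
    return base, tuned
```

Automated checks like these are a first filter only; pair them with the human review recommended above before any production rollout.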

Practice Exercise

  1. Create three JSONL examples for a billing support assistant. Include both correct answers and polite fallback messages.
  2. Simulate conditioning by writing a system prompt that forces answers into a bullet list with SLAs.
  3. Challenge: Design an evaluation rubric (accuracy, tone, escalation) to compare base vs. fine-tuned model outputs.

Summary

Fine-tuning captures your organization’s voice, while conditioning keeps responses aligned with current goals. Use both strategically to deliver assistants that feel bespoke yet remain maintainable.

© ItTechGenie