Architecture Pattern

Parameter-Efficient Fine-Tuning (LoRA / QLoRA)

Fine-tuning shines when your domain has highly specialised vocabulary, a strict output format, or a latency requirement below 300 ms. LoRA freezes the base model and trains small low-rank adapter matrices; QLoRA does the same on top of a quantised base model. Either way, only a small fraction of the parameters is trained, which keeps training costs manageable ($1 K–$25 K per run). Because inference needs no retrieval hop, responses are faster than with a retrieval-based pipeline, but the model cannot incorporate new information without a retraining cycle.
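To make the "small fraction of weights" claim concrete, here is a minimal sketch of the parameter arithmetic for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen matrix of shape d_out × d_in receives a trainable update B·A of rank r. The function name and the example dimensions (4096 × 4096, rank 8) are illustrative assumptions, not values from this page.

```python
def lora_trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Fraction of parameters trained for one weight matrix under LoRA.

    The original d_out x d_in matrix stays frozen; only the two adapter
    matrices B (d_out x rank) and A (rank x d_in) are trained.
    """
    full_params = d_out * d_in
    adapter_params = rank * (d_out + d_in)
    return adapter_params / full_params


# Illustrative dimensions (assumed): a 4096 x 4096 projection with rank 8.
fraction = lora_trainable_fraction(4096, 4096, 8)
print(f"Trainable share of this matrix: {fraction:.4%}")
```

The share grows linearly with the rank, so the chosen rank directly trades adapter capacity against training cost.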

Cost model

  • Training compute ($1 K–$25 K per LoRA run depending on model size and dataset)
  • Amortised training cost over 6-month deployment window
  • Hosted fine-tuned model inference (typically 20% premium over base)
  • Periodic retraining reserve (~2× initial cost per year)
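The line items above can be combined into a rough monthly figure. The sketch below is one possible reading, not a definitive cost model: it assumes the amortised training cost, the inference premium, and the retraining reserve are simply added together, and that the reserve is spread evenly over 12 months. The class name, field names, and the example inputs ($12 K training run, $1 K per month base inference) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FineTuneCostModel:
    training_cost_usd: float              # one LoRA run ($1 K-$25 K per the list above)
    base_inference_usd_per_month: float   # hypothetical input: cost of serving the base model
    deployment_months: int = 6            # amortisation window from the list above
    inference_premium: float = 0.20       # typical hosted fine-tuned premium
    retrain_reserve_per_year: float = 2.0  # reserve as a multiple of the initial cost

    def monthly_cost(self) -> float:
        # Training cost spread over the deployment window.
        amortised = self.training_cost_usd / self.deployment_months
        # Base inference cost plus the fine-tuned hosting premium.
        inference = self.base_inference_usd_per_month * (1 + self.inference_premium)
        # Yearly retraining reserve spread over 12 months (assumption).
        reserve = self.training_cost_usd * self.retrain_reserve_per_year / 12
        return amortised + inference + reserve


# Hypothetical example inputs.
model = FineTuneCostModel(training_cost_usd=12_000, base_inference_usd_per_month=1_000)
print(f"Estimated monthly cost: ${model.monthly_cost():,.2f}")
```

With these example inputs the amortised training share and the retraining reserve each come to $2,000 per month, which shows how the training-related items can outweigh inference at low traffic.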

When to pick this pattern

  • Domain vocabulary is highly specialised (medical, legal, financial jargon)
  • Consistent output format or tone is required
  • Latency SLA < 300 ms and retrieval hop is unacceptable
  • Dataset of 1 K–100 K high-quality labelled examples is available
  • Static or slow-moving corpus (refreshes monthly or less)
  • Strong in-house ML team (capability level ≥ 4)

When to avoid it

  • Data freshness requirement is daily or faster
  • Audit-grade citations are mandatory (fine-tuned models can fabricate provenance)
  • ML team is at capability level 1 or 2
  • Budget ceiling < $5 K (training run alone may exceed this)
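The two checklists can be encoded as simple predicates. The sketch below is illustrative only: the thresholds come from the bullets above, while the `Workload` fields, the function names, and the interpretation of "monthly" as 30 days are assumptions. The qualitative criteria (specialised vocabulary, consistent format or tone) are left out because they are not numeric.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    latency_sla_ms: float
    labelled_examples: int
    refresh_interval_days: float   # how often the corpus must be refreshed
    ml_capability: int             # team capability level, as used in the lists above
    needs_audit_citations: bool
    budget_usd: float


def pick_signals(w: Workload) -> list[str]:
    """Return the 'when to pick' criteria this workload satisfies."""
    signals = []
    if w.latency_sla_ms < 300:
        signals.append("latency SLA below 300 ms")
    if 1_000 <= w.labelled_examples <= 100_000:
        signals.append("1 K-100 K labelled examples")
    if w.refresh_interval_days >= 30:  # assumption: monthly means 30 days
        signals.append("static or slow-moving corpus")
    if w.ml_capability >= 4:
        signals.append("strong in-house ML team")
    return signals


def avoid_reasons(w: Workload) -> list[str]:
    """Return the 'when to avoid' criteria this workload triggers."""
    reasons = []
    if w.refresh_interval_days <= 1:
        reasons.append("data must be fresh daily or faster")
    if w.needs_audit_citations:
        reasons.append("audit-grade citations are mandatory")
    if w.ml_capability <= 2:
        reasons.append("ML team at capability level 1 or 2")
    if w.budget_usd < 5_000:
        reasons.append("budget ceiling below $5 K")
    return reasons


# Hypothetical workload for illustration.
example = Workload(200, 5_000, 30, 4, False, 20_000)
print(pick_signals(example))
print(avoid_reasons(example))
```

Any entry returned by `avoid_reasons` is worth treating as a blocker in this sketch, regardless of how many pick signals are present.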

Common pitfalls

  • Catastrophic forgetting if training data distribution is too narrow
  • Hallucinated citations: the model may fabricate sources that are not in its training data
  • Hidden retraining costs when the underlying base model is deprecated
