Fine-tune Transcriber
Configure and monitor transcription model fine-tuning jobs
Configure training parameters and monitor fine-tuning progress. Select a base model, training dataset, and hyperparameters to start a new fine-tuning job.
Training Configuration
Enable for mixed-language audio. Disable for single-language audio (e.g. Lithuanian only).
LoRA Parameters
64
128
Select which attention layers receive LoRA adapters. Default: all projection layers.
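As a rough illustration of what these settings control, the sketch below assumes that the two values shown (64 and 128) are the LoRA rank and the scaling alpha, which this page does not state. It also assumes the projection layers are named q_proj, k_proj, v_proj, and out_proj, as in the Hugging Face Whisper implementation. LoraSettings and its fields are hypothetical names, not this tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the LoRA fields on this page."""
    rank: int = 64      # assumed meaning of the first value shown
    alpha: int = 128    # assumed meaning of the second value shown
    # Assumed layer names for "all projection layers".
    target_modules: tuple = ("q_proj", "k_proj", "v_proj", "out_proj")

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def extra_params(self, d_in: int, d_out: int) -> int:
        # One adapter adds A (rank x d_in) and B (d_out x rank).
        return self.rank * (d_in + d_out)


settings = LoraSettings()
# 1280 is only an example hidden size for a square projection layer.
per_layer = settings.extra_params(1280, 1280) * len(settings.target_modules)
print(settings.scaling, per_layer)
```

Under these assumptions, fewer target modules or a lower rank shrink the adapter proportionally, since the added parameter count is linear in both.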
Data Preparation
Applied
Converts written numbers to spoken form
16 kHz → 8 kHz
Simulates telephony audio quality
Threshold: 30%
Excludes samples whose baseline WER exceeds the threshold
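The WER filter can be sketched as follows. This is a minimal illustration, not the tool's implementation: it assumes each sample carries a reference transcript and a baseline model transcript (the "reference" and "baseline" keys are hypothetical), and it computes WER as word-level edit distance divided by the reference length.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def filter_by_baseline_wer(samples, threshold=0.30):
    """Keep samples whose baseline WER is at or below the threshold."""
    return [s for s in samples
            if word_error_rate(s["reference"], s["baseline"]) <= threshold]


# Hypothetical samples: "baseline" is the untuned model's transcript.
samples = [
    {"id": "a", "reference": "one two three four", "baseline": "one two three four"},
    {"id": "b", "reference": "one two three four", "baseline": "one too three four"},
    {"id": "c", "reference": "one two three four", "baseline": "won too tree four"},
]
kept = filter_by_baseline_wer(samples)
print([s["id"] for s in kept])
```

Sample "c" has three wrong words out of four (75% WER) and is dropped, while "b" (25%) stays under the 30% threshold.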
Training Progress
Training
Epoch 7 / 10 — 70%
Training Loss: 0.234
Validation WER: 12.3%
Learning Rate: 0.0001
ETA: 8m 30s
[14:23:45] Starting epoch 7/10...
[14:23:46] Batch 1/16 — loss: 0.241
[14:23:48] Batch 8/16 — loss: 0.228
[14:23:50] Batch 16/16 — loss: 0.234
[14:23:51] Epoch 7 complete — val_wer: 12.3%, val_loss: 0.198
[14:23:51] Checkpoint saved: whisper-lg-v3-ft-epoch7.pt
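For monitoring outside this page, log lines in the format above could be parsed roughly as follows. This is a sketch that assumes the line format stays exactly as shown; parse_log and the dictionary keys are hypothetical names.

```python
import re

# Matches per-batch lines, e.g. "[14:23:46] Batch 1/16 ... loss: 0.241"
BATCH_RE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\] Batch (?P<batch>\d+)/(?P<total>\d+)"
    r".*?loss: (?P<loss>[\d.]+)"
)
# Matches end-of-epoch lines that carry val_wer and val_loss.
EPOCH_RE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\] Epoch (?P<epoch>\d+) complete"
    r".*?val_wer: (?P<wer>[\d.]+)%, val_loss: (?P<loss>[\d.]+)"
)


def parse_log(lines):
    """Split log lines into per-batch and per-epoch metric records."""
    batches, epochs = [], []
    for line in lines:
        if m := EPOCH_RE.search(line):
            epochs.append({"time": m["time"], "epoch": int(m["epoch"]),
                           "val_wer": float(m["wer"]),
                           "val_loss": float(m["loss"])})
        elif m := BATCH_RE.search(line):
            batches.append({"time": m["time"], "batch": int(m["batch"]),
                            "total": int(m["total"]),
                            "loss": float(m["loss"])})
    return batches, epochs


log = [
    "[14:23:45] Starting epoch 7/10...",
    "[14:23:46] Batch 1/16 — loss: 0.241",
    "[14:23:48] Batch 8/16 — loss: 0.228",
    "[14:23:50] Batch 16/16 — loss: 0.234",
    "[14:23:51] Epoch 7 complete — val_wer: 12.3%, val_loss: 0.198",
    "[14:23:51] Checkpoint saved: whisper-lg-v3-ft-epoch7.pt",
]
batches, epochs = parse_log(log)
print(len(batches), epochs[0]["val_wer"])
```

Lines that match neither pattern, such as the epoch start and checkpoint messages, are skipped.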
Post-Training Pipeline
After training completes, the model goes through merge, conversion, and upload stages before deployment.
Train LoRA Adapter ⟳
Merge LoRA Weights —
Convert to CTranslate2 —
Upload to HuggingFace
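To show what the merge stage does conceptually, here is a toy sketch of folding a LoRA adapter into a base weight matrix: the merged weight is W + (alpha / rank) * B A. It uses plain Python lists and tiny matrices for readability. The function names are hypothetical, and a real pipeline would more likely rely on a library routine (for example, PEFT's merge_and_unload) than on hand-written code like this.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: d_out x d_in, same as weight
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy example: 2x2 base weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # rank x d_in
lora_b = [[0.5], [1.0]]  # d_out x rank
merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why this step comes before format conversion.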