ML & Data Pipelines
Live training progress on your desktop. Know when your model finishes, when accuracy plateaus, and when the pipeline breaks — without watching a terminal.
The problem with training in the background
You kick off a training run, switch to writing documentation, and check back 45 minutes later only to find it crashed at epoch 3. Or it finished 20 minutes ago and you have been waiting on nothing. GPU time is expensive — wasted idle time between runs adds up fast.
syncfu puts training progress directly on your screen as a persistent overlay. You see the epoch count, loss value, and ETA without leaving your current window. When training completes, an overlay with Deploy and Reject buttons lets you act immediately.
Training progress with live updates
Send a notification at the start of training, then update it after each epoch with the latest metrics:
import requests

# Start training notification
resp = requests.post("http://localhost:9868/notify", json={
    "sender": "training",
    "title": "Training run #7",
    "body": "Starting training...",
    "icon": "brain",
    "progress": {"value": 0, "label": "Epoch 0/100", "style": "bar"},
    "group": "training-run-7",
})
notif_id = resp.json()["id"]

# After each epoch, update the overlay
for epoch in range(1, 101):
    train_one_epoch()
    loss = get_current_loss()
    val_acc = get_val_accuracy()
    requests.post(f"http://localhost:9868/notify/{notif_id}/update", json={
        "progress": {"value": epoch / 100, "label": f"Epoch {epoch}/100"},
        "body": f"Loss: {loss:.4f} — Val accuracy: {val_acc:.1%}",
    })

# Training complete — show results with action buttons
requests.post(f"http://localhost:9868/notify/{notif_id}/update", json={
    "title": "Training complete",
    "body": f"Final accuracy: {val_acc:.1%} — Loss: {loss:.4f}",
    "progress": {"value": 1.0, "label": "Done"},
    "actions": [
        {"id": "deploy", "label": "Deploy model", "style": "primary"},
        {"id": "discard", "label": "Discard", "style": "danger"},
    ],
})

Data pipeline stage tracking
For multi-stage ETL or data processing pipelines, update a single notification as each stage completes:
#!/usr/bin/env bash
# Data pipeline with stage tracking
ID=$(syncfu send -t "Data Pipeline" --progress 0.0 \
  --progress-label "Extract" -i loader --json \
  "Starting data extraction..." | jq -r .id)

# Stage 1: Extract
run_extract
syncfu update $ID --progress 0.33 \
  --progress-label "Transform" \
  --body "Extracted 2.4M rows. Transforming..."

# Stage 2: Transform
run_transform
syncfu update $ID --progress 0.66 \
  --progress-label "Load" \
  --body "Transformed. Loading to warehouse..."

# Stage 3: Load
run_load
syncfu update $ID --progress 1.0 \
  --progress-label "Complete" \
  --body "Pipeline complete. 2.4M rows loaded in 12m 34s."
Model evaluation with Deploy / Reject gate
After training completes, the evaluation results appear as a notification with action buttons. Use --wait to block until you decide:
# Block until human decision
ACTION=$(syncfu send -t "Model v2.3 Evaluation" -p high \
  -i trophy -s evaluator \
  -a "deploy:Deploy to prod:primary" \
  -a "staging:Deploy to staging:secondary" \
  -a "reject:Reject:danger" \
  --wait \
  "Accuracy: 94.2% (+1.8%) | F1: 0.91 | Latency: 12ms")

case $ACTION in
  deploy)  deploy_model production ;;
  staging) deploy_model staging ;;
  reject)  echo "Model rejected" ;;
esac
GPU utilization alerts
# Alert when GPU is idle (training crashed or finished)
GPU_UTIL=$(nvidia-smi --query-gpu=utilization.gpu \
--format=csv,noheader,nounits | head -1)
if [ "$GPU_UTIL" -lt 5 ]; then
syncfu send -t "GPU Idle" -p high -i cpu \
"GPU utilization is $GPU_UTIL%. Training may have stopped." \
-a "check:Check logs:primary" -a "dismiss:Dismiss:secondary"
fiTraining progress
Live progress bar with epoch count, loss, and accuracy. Updates in real time after each epoch. Works with PyTorch, TensorFlow, JAX, and any framework.
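Framework independence follows from the HTTP API: the per-epoch update is just a POST, so any training loop can call a small helper. A minimal sketch, assuming the endpoints shown above; `epoch_payload` and `report_epoch` are illustrative names, not part of syncfu:

```python
import requests

SYNCFU_URL = "http://localhost:9868"

def epoch_payload(epoch, total_epochs, loss, val_acc):
    """Build the per-epoch update payload for any training framework."""
    return {
        "progress": {
            "value": epoch / total_epochs,
            "label": f"Epoch {epoch}/{total_epochs}",
        },
        "body": f"Loss: {loss:.4f} | Val accuracy: {val_acc:.1%}",
    }

def report_epoch(notif_id, epoch, total_epochs, loss, val_acc):
    """POST the epoch update to the running syncfu daemon."""
    requests.post(
        f"{SYNCFU_URL}/notify/{notif_id}/update",
        json=epoch_payload(epoch, total_epochs, loss, val_acc),
    )
```

Call `report_epoch(...)` from a raw PyTorch loop, a Keras callback's `on_epoch_end`, or after each JAX training step; the overlay does not care which framework produced the numbers.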
Pipeline stages
Track extract → transform → load stages with a single notification that updates as each stage completes. Know exactly where your pipeline is.
Evaluation gates
Model evaluation results with Deploy / Staging / Reject buttons. The --wait flag blocks until you make a decision.
GPU monitoring
Detect idle GPUs when training crashes or finishes. Stop wasting expensive compute time waiting on runs that already stopped.
Frequently asked questions
Can syncfu show live training progress from PyTorch or TensorFlow?
Yes. Send an initial notification with --progress 0, then POST updates to /notify/{id}/update after each epoch. The overlay updates in real time with an animated progress bar and custom label showing loss, accuracy, or ETA. This works from any language that can make HTTP requests.
How do I get notified when a training run finishes overnight?
syncfu is a desktop overlay — it shows notifications when you are at your machine. For overnight runs, pair it with ntfy for phone alerts. When you return to your desk, the syncfu overlay will still be showing the final results with action buttons to deploy or discard the model.
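One way to wire up that pairing, sketched in Python: send the same completion message to the local overlay and to an ntfy topic for the phone push. The topic name is a placeholder and `notify_everywhere` is a hypothetical helper, not part of either tool:

```python
import requests

def completion_message(val_acc, loss):
    """Format the final-results body shared by both channels."""
    return f"Final accuracy: {val_acc:.1%} | Loss: {loss:.4f}"

def notify_everywhere(title, val_acc, loss, topic="my-training-alerts"):
    body = completion_message(val_acc, loss)
    # Desktop overlay via the local syncfu daemon
    requests.post("http://localhost:9868/notify", json={
        "sender": "training", "title": title, "body": body,
    })
    # Phone push via ntfy: plain-text POST to the topic URL
    requests.post(f"https://ntfy.sh/{topic}", data=body.encode(),
                  headers={"Title": title})
```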
Does this work with Jupyter notebooks?
Yes. Use Python requests to POST to http://localhost:9868/notify from any notebook cell. You can send notifications at the end of training, after evaluation, or at any checkpoint. The overlay appears on your desktop regardless of which browser tab or window is focused.
Can I track multiple training runs simultaneously?
Yes. Each notification has a unique ID and can be grouped with the --group flag. Run multiple experiments and each one updates its own notification independently. The overlays stack on your screen so you can see all runs at a glance.
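For example, a driver script could start one notification per experiment and keep the returned IDs around for per-run updates. A sketch with hypothetical helper names, reusing the payload fields from the training example above:

```python
import requests

SYNCFU_URL = "http://localhost:9868/notify"

def start_run_payload(run_name, total_epochs):
    """Initial notification payload for one experiment, grouped per run."""
    return {
        "sender": "training",
        "title": run_name,
        "body": "Starting...",
        "progress": {"value": 0, "label": f"Epoch 0/{total_epochs}"},
        "group": run_name,  # each run gets its own group, so overlays stack
    }

def launch_runs(runs, total_epochs=100):
    """POST one notification per experiment; return {run_name: notification_id}."""
    ids = {}
    for run in runs:
        resp = requests.post(SYNCFU_URL, json=start_run_payload(run, total_epochs))
        ids[run] = resp.json()["id"]
    return ids

# e.g. notif_ids = launch_runs(["lr-3e-4", "lr-1e-3"])
# then update notif_ids["lr-3e-4"] independently after each of its epochs
```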
Related
- Python integration — send notifications from any training script
- Notification payload — progress bars, action buttons, and style properties
Never miss a training run again
Open source, runs locally, zero config. Install in 30 seconds and get live training progress on your desktop.
Get started →