ML & Data Pipelines

Live training progress on your desktop. Know when your model finishes, when accuracy plateaus, and when the pipeline breaks — without watching a terminal.

The problem with training in the background

You kick off a training run, switch to writing documentation, and check back 45 minutes later only to find it crashed at epoch 3. Or it finished 20 minutes ago and you have been waiting on nothing. GPU time is expensive — wasted idle time between runs adds up fast.

syncfu puts training progress directly on your screen as a persistent overlay. You see the epoch count, loss value, and ETA without leaving your current window. When training completes, an overlay with Deploy and Reject buttons lets you act immediately.

Training progress with live updates

Send a notification at the start of training, then update it after each epoch with the latest metrics:

import requests

# Start training notification
resp = requests.post("http://localhost:9868/notify", json={
    "sender": "training",
    "title": "Training run #7",
    "body": "Starting training...",
    "icon": "brain",
    "progress": {"value": 0, "label": "Epoch 0/100", "style": "bar"},
    "group": "training-run-7",
})
notif_id = resp.json()["id"]

# After each epoch, update the overlay
for epoch in range(1, 101):
    train_one_epoch()
    loss = get_current_loss()
    val_acc = get_val_accuracy()

    requests.post(f"http://localhost:9868/notify/{notif_id}/update", json={
        "progress": {"value": epoch / 100, "label": f"Epoch {epoch}/100"},
        "body": f"Loss: {loss:.4f} — Val accuracy: {val_acc:.1%}",
    })

# Training complete — show results with action buttons
requests.post(f"http://localhost:9868/notify/{notif_id}/update", json={
    "title": "Training complete",
    "body": f"Final accuracy: {val_acc:.1%} — Loss: {loss:.4f}",
    "progress": {"value": 1.0, "label": "Done"},
    "actions": [
        {"id": "deploy", "label": "Deploy model", "style": "primary"},
        {"id": "discard", "label": "Discard", "style": "danger"},
    ],
})

Data pipeline stage tracking

For multi-stage ETL or data processing pipelines, update a single notification as each stage completes:

#!/usr/bin/env bash
# Data pipeline with stage tracking

ID=$(syncfu send -t "Data Pipeline" --progress 0.0 \
  --progress-label "Extract" -i loader --json \
  "Starting data extraction..." | jq -r .id)

# Stage 1: Extract
run_extract
syncfu update $ID --progress 0.33 \
  --progress-label "Transform" \
  --body "Extracted 2.4M rows. Transforming..."

# Stage 2: Transform
run_transform
syncfu update $ID --progress 0.66 \
  --progress-label "Load" \
  --body "Transformed. Loading to warehouse..."

# Stage 3: Load
run_load
syncfu update $ID --progress 1.0 \
  --progress-label "Complete" \
  --body "Pipeline complete. 2.4M rows loaded in 12m 34s."

Model evaluation with Deploy / Reject gate

After training completes, the evaluation results appear as a notification with action buttons. Use --wait to block until you decide:

# Block until human decision
ACTION=$(syncfu send -t "Model v2.3 Evaluation" -p high \
  -i trophy -s evaluator \
  -a "deploy:Deploy to prod:primary" \
  -a "staging:Deploy to staging:secondary" \
  -a "reject:Reject:danger" \
  --wait \
  "Accuracy: 94.2% (+1.8%) | F1: 0.91 | Latency: 12ms")

case $ACTION in
  deploy)  deploy_model production ;;
  staging) deploy_model staging ;;
  reject)  echo "Model rejected" ;;
esac

GPU utilization alerts

# Alert when GPU is idle (training crashed or finished)
GPU_UTIL=$(nvidia-smi --query-gpu=utilization.gpu \
  --format=csv,noheader,nounits | head -1)

if [ "$GPU_UTIL" -lt 5 ]; then
  syncfu send -t "GPU Idle" -p high -i cpu \
    "GPU utilization is $GPU_UTIL%. Training may have stopped." \
    -a "check:Check logs:primary" -a "dismiss:Dismiss:secondary"
fi
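The snippet above checks once. To watch a long run continuously you could poll on an interval; in this minimal Python sketch the threshold, the poll interval, and the assumption that nvidia-smi and syncfu are both on PATH are all arbitrary choices, not syncfu requirements:

```python
import subprocess
import time

IDLE_THRESHOLD = 5   # percent; arbitrary cutoff
POLL_INTERVAL = 300  # seconds between checks; arbitrary

def parse_utilization(csv_out: str) -> int:
    """First GPU's utilization % from nvidia-smi CSV output."""
    return int(csv_out.strip().splitlines()[0])

def gpu_utilization() -> int:
    """Query the first GPU via nvidia-smi (assumes NVIDIA drivers installed)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_utilization(out)

def watch() -> None:
    """Poll until the GPU looks idle, then raise one overlay alert."""
    while gpu_utilization() >= IDLE_THRESHOLD:
        time.sleep(POLL_INTERVAL)
    subprocess.run(["syncfu", "send", "-t", "GPU Idle", "-p", "high", "-i", "cpu",
                    "GPU utilization dropped. Training may have stopped."])
```

Run watch() in a background process alongside the training job, or fire the bash one-shot from cron if you prefer.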

Training progress

Live progress bar with epoch count, loss, and accuracy. Updates in real time after each epoch. Works with PyTorch, TensorFlow, JAX, or any framework that can make an HTTP request.

Pipeline stages

Track extract → transform → load stages with a single notification that updates as each stage completes. Know exactly where your pipeline is.

Evaluation gates

Model evaluation results with Deploy / Staging / Reject buttons. The --wait flag blocks until you make a decision.

GPU monitoring

Detect idle GPUs when training crashes or finishes. Stop wasting expensive compute time waiting on runs that already stopped.

Frequently asked questions

Can syncfu show live training progress from PyTorch or TensorFlow?

Yes. Send an initial notification with --progress 0, then POST updates to /notify/{id}/update after each epoch. The overlay updates in real time with an animated progress bar and custom label showing loss, accuracy, or ETA. This works from any language that can make HTTP requests.

How do I get notified when a training run finishes overnight?

syncfu is a desktop overlay — it shows notifications when you are at your machine. For overnight runs, pair it with ntfy for phone alerts. When you return to your desk, the syncfu overlay will still be showing the final results with action buttons to deploy or discard the model.
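If you go the ntfy route, the pairing can be a stdlib-only helper like this (the topic name is a placeholder; ntfy topics are public, so pick something unguessable):

```python
from urllib import request

NTFY_TOPIC = "my-training-runs"  # placeholder: use your own private topic

def ntfy_request(message: str, title: str, topic: str = NTFY_TOPIC) -> request.Request:
    """Build a POST to ntfy.sh; the request body is the message text."""
    return request.Request(
        f"https://ntfy.sh/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": title, "Priority": "high"},
        method="POST",
    )
```

Then, at the very end of the training script, send it with `request.urlopen(ntfy_request("Run #7 finished overnight", "Training complete"))` and the alert lands on your phone.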

Does this work with Jupyter notebooks?

Yes. Use Python requests to POST to http://localhost:9868/notify from any notebook cell. You can send notifications at the end of training, after evaluation, or at any checkpoint. The overlay appears on your desktop regardless of which browser tab or window is focused.
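For instance, here is a stdlib-only cell (no requests dependency) that fires a one-shot notification and stays silent if the daemon is not running; the sender and icon values are just examples:

```python
import json
from urllib import request

def notebook_notify(title: str, body: str) -> None:
    """POST a one-shot notification to the local syncfu daemon."""
    payload = json.dumps({
        "sender": "notebook",  # example values; pick your own
        "title": title,
        "body": body,
        "icon": "brain",
    }).encode()
    req = request.Request(
        "http://localhost:9868/notify",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        request.urlopen(req, timeout=2)
    except OSError:
        pass  # daemon not running; don't crash the notebook

notebook_notify("Cell finished", "Hyperparameter sweep done")
```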

Can I track multiple training runs simultaneously?

Yes. Each notification has a unique ID and can be grouped with the --group flag. Run multiple experiments and each one updates its own notification independently. The overlays stack on your screen so you can see all runs at a glance.
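As a sketch, here are two parallel runs sharing the HTTP API from earlier, with one notification ID per run (the run names and helper functions are illustrative, not part of syncfu):

```python
import json
from urllib import request

API = "http://localhost:9868/notify"

def start_payload(run: str) -> bytes:
    """Initial notification for one experiment; 'group' keeps its updates together."""
    return json.dumps({
        "sender": "training",
        "title": f"Run {run}",
        "progress": {"value": 0, "label": "Epoch 0", "style": "bar"},
        "group": f"training-{run}",
    }).encode()

def update_payload(epoch: int, total: int, loss: float) -> bytes:
    """Per-epoch update for one run's own notification."""
    return json.dumps({
        "progress": {"value": epoch / total, "label": f"Epoch {epoch}/{total}"},
        "body": f"Loss: {loss:.4f}",
    }).encode()

# One notification ID per run; each later updates independently
# via POST to f"{API}/{ids[run]}/update".
ids = {}
for run in ("lr-3e-4", "lr-1e-3"):  # made-up experiment names
    req = request.Request(API, data=start_payload(run),
                          headers={"Content-Type": "application/json"},
                          method="POST")
    try:
        with request.urlopen(req, timeout=2) as resp:
            ids[run] = json.load(resp)["id"]
    except OSError:
        pass  # daemon not running
```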


Never miss a training run again

Open source, runs locally, zero config. Install in 30 seconds and get live training progress on your desktop.

Get started →