Background Workers (Celery)
OrbitingFox uses Celery with a Redis broker to run slow or heavy operations in the background, so users never wait on a slow HTTP request. This guide explains what background jobs exist, how to monitor them, and what to do when things go wrong.
What Runs in the Background?
| Task name | Trigger | What it does | Module |
|---|---|---|---|
| `orbitpilot.build_memory` | User clicks "Build My OrbitPilot Memory" | Runs the full memory pipeline (extract → clean → chunk → embed → store) for the user. | `orbitpilot/tasks.py` |
| `orbitpilot.generate_answer` | Reserved for future async Ask flow | Runs retrieval + answer generation; caches result under `op_ask:{task_id}`. | `orbitpilot/tasks.py` |
| `notifications.send_push` | New message, group invite, channel invite, Blink share | Sends a Web Push notification to the user's browser subscription. | `notifications/tasks.py` |
| `config.send_verification_email` | User signs up | Sends the email-verification link. | `config/tasks.py` |
| `blink.transcribe_audio` | Audio upload or manual "Transcribe" button | Calls OpenAI Whisper to transcribe a Blink audio recording. | `blink/tasks.py` |
| `blink.analyse_audio` | "Analyse" button on audio profile | Calls GPT-4o to extract summary, action items, sentiment, speaker count. | `blink/tasks.py` |
How the System Is Structured
- Broker: Redis (`REDIS_URL` env var). Same Redis instance used by Django Channels.
- Result backend: Redis (task results stored for up to 1 hour).
- Worker process: Separate from the web server. Start command: `celery -A config worker --loglevel=info --concurrency=2`
- Time limits: Hard kill at 5 minutes; soft warning at 4.5 minutes.
- Retry policy: Each task type has its own retry count (1–3 retries) with a backoff delay.
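The structure above maps onto a handful of Celery settings. The following is a minimal sketch of how they might appear in the Django settings module, assuming the Celery app loads settings with the `CELERY_` namespace (suggested by the `CELERY_TASK_ACKS_LATE` name used later in this guide). The local fallback URL and the exact layout of the real settings file are assumptions.

```python
import os

# Assumption: REDIS_URL is injected by the host; the fallback is the local default.
REDIS_URL = os.environ.get("REDIS_URL", "redis://127.0.0.1:6379/0")

CELERY_BROKER_URL = REDIS_URL        # broker: Redis
CELERY_RESULT_BACKEND = REDIS_URL    # result backend: Redis
CELERY_RESULT_EXPIRES = 60 * 60      # task results kept for up to 1 hour (seconds)
CELERY_TASK_TIME_LIMIT = 5 * 60      # hard kill at 5 minutes
CELERY_TASK_SOFT_TIME_LIMIT = 270    # soft limit at 4.5 minutes
CELERY_TASK_ACKS_LATE = True         # acknowledge a task only after it finishes
```

When the soft limit is reached, Celery raises `SoftTimeLimitExceeded` inside the task, which gives the task about 30 seconds to clean up before the hard limit terminates it.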
Status Polling — How the UI Stays Live
When a user triggers a slow action, the view returns immediately and the UI polls a status endpoint:
- Memory build: `GET /orbitpilot/build/status/` returns the current build status. Dashboard JS polls every 4 seconds and reloads when done.
- Audio transcription / analysis: `GET /blink-audio/<pk>/status/` returns `transcription_status` and `ai_analysis_status`. Audio page JS polls every 3 seconds.
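The polling pattern itself is simple enough to sketch outside the browser, for example in a smoke test or script. The helper below is hypothetical and not part of the codebase: `fetch_status` stands in for a call to one of the status endpoints, and the status names `running` and `completed` are assumptions (only `planned`, `queued` and `failed` appear in this guide).

```python
import time
from typing import Callable, Iterable


def poll_until_done(
    fetch_status: Callable[[], str],
    terminal: Iterable[str] = ("completed", "failed"),  # assumed terminal names
    interval: float = 4.0,   # the dashboard polls every 4 seconds
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status every `interval` seconds until a terminal status appears."""
    terminal_set = set(terminal)
    waited = 0.0
    while True:
        status = fetch_status()
        if status in terminal_set:
            return status
        if waited >= timeout:
            raise TimeoutError(f"status still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Usage with canned responses and a no-op sleep, so it runs instantly.
responses = iter(["planned", "running", "completed"])
print(poll_until_done(lambda: next(responses), sleep=lambda seconds: None))  # completed
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A timeout matters because, as described later in this guide, a status can stay unchanged indefinitely while the worker is down.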
Monitoring on Render
- Go to your Render dashboard → select the worker service.
- Click Logs. You will see lines like:

      [2026-05-23 10:12:33,501: INFO/MainProcess] Task orbitpilot.build_memory[abc-123] received
      [2026-05-23 10:12:45,812: INFO/ForkPoolWorker-1] Task orbitpilot.build_memory[abc-123] succeeded in 12.3s

- Failed tasks show `FAILED` with a traceback. The `OrbitPilotMemoryBuild` record will also have `status=failed` and an `error_message`.
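If you export worker logs for offline inspection, the sample lines above can be pulled apart with a regular expression. This is a rough sketch that covers only the two formats shown in this guide (`received` and `succeeded in …s`). The function name is hypothetical, and failure lines are not parsed because their exact format is not shown here.

```python
import re
from typing import Dict, List, Optional, Union

# Matches only the "received" and "succeeded in N.Ns" formats shown in this guide.
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+?): (?P<level>[A-Z]+)/(?P<process>[^\]]+)\] "
    r"Task (?P<task>[\w.]+)\[(?P<task_id>[^\]]+)\] "
    r"(?P<event>received|succeeded)(?: in (?P<seconds>[\d.]+)s)?"
)

Event = Dict[str, Union[str, Optional[float]]]


def parse_task_events(log_text: str) -> List[Event]:
    """Return one dict per task event found in the given log text."""
    events: List[Event] = []
    for match in LOG_LINE.finditer(log_text):
        event: Event = dict(match.groupdict())
        seconds = match.group("seconds")
        # Duration is only present on "succeeded" lines.
        event["seconds"] = float(seconds) if seconds is not None else None
        events.append(event)
    return events


SAMPLE = (
    "[2026-05-23 10:12:33,501: INFO/MainProcess] "
    "Task orbitpilot.build_memory[abc-123] received\n"
    "[2026-05-23 10:12:45,812: INFO/ForkPoolWorker-1] "
    "Task orbitpilot.build_memory[abc-123] succeeded in 12.3s\n"
)

for event in parse_task_events(SAMPLE):
    print(event["task"], event["event"], event["seconds"])
```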
What Happens if the Worker Is Down?
- Tasks are queued in Redis. They are not lost while Redis is running.
- When the worker restarts, it picks up and processes the queued tasks automatically.
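- If Redis itself goes down, tasks that have not been acknowledged are redelivered once it is back, because `CELERY_TASK_ACKS_LATE=True` means a task is acknowledged only after it finishes. This relies on Redis persistence being enabled; messages held only in memory do not survive a Redis restart.
- Users will see "Memory build started…" on the dashboard, but the status will remain "planned" until the worker processes the task. No data is lost.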
Restarting the Worker on Render
- Render dashboard → Worker service → Manual Deploy (or it auto-deploys on push to the production branch).
- Alternatively, use the Render API or Render Shell to restart.
Local Development
Start the worker in a separate terminal:

    conda activate orbitingfox100-env
    celery -A config worker --loglevel=info

You also need Redis running locally (default: redis://127.0.0.1:6379/0). If Redis is not running, tasks will fail silently and the UI will appear to hang at "queued" status.
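Because a missing Redis shows up only as a UI stuck at "queued", a quick pre-flight check can save some confusion. The sketch below uses only the standard library; both function names are hypothetical. It confirms that something accepts TCP connections on the Redis host and port, not that the listener is actually Redis.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse

DEFAULT_REDIS_URL = "redis://127.0.0.1:6379/0"


def redis_host_port(url: str = DEFAULT_REDIS_URL) -> Tuple[str, int]:
    """Extract host and port from a redis:// URL, falling back to the defaults."""
    parsed = urlparse(url)
    return parsed.hostname or "127.0.0.1", parsed.port or 6379


def redis_is_reachable(url: str = DEFAULT_REDIS_URL, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the Redis host and port succeeds."""
    host, port = redis_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or DNS failure.
        return False


if __name__ == "__main__":
    state = "reachable" if redis_is_reachable() else "NOT reachable"
    print(f"Redis at {DEFAULT_REDIS_URL} is {state}")
```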
Performance Caching
Two caches reduce DB load on every OrbitPilot ask:
- Monthly usage (`usage:{user_pk}:{YYYY-MM}`): 5-minute TTL. Avoids repeated SQL aggregations on the `AIUsageLog` table.
- User snapshot counts (`orbit_snapshot_counts:{user_pk}`): 2-minute TTL. Avoids 7 COUNT queries every time a user loads the ask result page.
Both caches use Django's cache backend (LocMem in dev, Redis in production).
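To make the key formats and TTLs concrete, here is a small self-contained sketch. The key builders follow the patterns listed above. `TTLCache` is a hypothetical in-memory stand-in, used only so the example runs without Django; its `get_or_set` is loosely modelled on the Django cache method of the same name, and the real lookup code may be organised differently.

```python
import time
from datetime import date
from typing import Any, Callable, Dict, Optional, Tuple

USAGE_TTL = 5 * 60      # monthly usage: 5-minute TTL
SNAPSHOT_TTL = 2 * 60   # snapshot counts: 2-minute TTL


def usage_key(user_pk: int, today: Optional[date] = None) -> str:
    """Build the monthly-usage key, e.g. usage:42:2026-05."""
    today = today or date.today()
    return f"usage:{user_pk}:{today:%Y-%m}"


def snapshot_counts_key(user_pk: int) -> str:
    """Build the per-user snapshot-counts key."""
    return f"orbit_snapshot_counts:{user_pk}"


class TTLCache:
    """Hypothetical in-memory stand-in for a cache backend with per-key TTLs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get_or_set(self, key: str, compute: Callable[[], Any], ttl: float) -> Any:
        now = self._clock()
        hit = self._data.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # still fresh: skip the expensive query
        value = compute()              # expired or missing: recompute and store
        self._data[key] = (now + ttl, value)
        return value


cache = TTLCache()
key = usage_key(42, date(2026, 5, 23))
print(key)  # usage:42:2026-05
print(cache.get_or_set(key, lambda: {"asks": 17}, USAGE_TTL))  # {'asks': 17}
```

Including the month in the usage key means a new month starts from an empty entry without any explicit invalidation.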