Background Workers (Celery)

OrbitingFox uses Celery with a Redis broker to run slow or heavy operations in the background, so users never wait on a slow HTTP request. This guide explains what background jobs exist, how to monitor them, and what to do when things go wrong.


What Runs in the Background?

| Task name | Trigger | What it does | Module |
|---|---|---|---|
| orbitpilot.build_memory | User clicks "Build My OrbitPilot Memory" | Runs the full memory pipeline (extract → clean → chunk → embed → store) for the user. | orbitpilot/tasks.py |
| orbitpilot.generate_answer | Reserved for future async Ask flow | Runs retrieval + answer generation; caches result under op_ask:{task_id}. | orbitpilot/tasks.py |
| notifications.send_push | New message, group invite, channel invite, Blink share | Sends a Web Push notification to the user's browser subscription. | notifications/tasks.py |
| config.send_verification_email | User signs up | Sends the email-verification link. | config/tasks.py |
| blink.transcribe_audio | Audio upload or manual "Transcribe" button | Calls OpenAI Whisper to transcribe a Blink audio recording. | blink/tasks.py |
| blink.analyse_audio | "Analyse" button on audio profile | Calls GPT-4o to extract summary, action items, sentiment, speaker count. | blink/tasks.py |

How the System Is Structured

  • Broker: Redis (REDIS_URL env var). Same Redis instance used by Django Channels.
  • Result backend: Redis (task results stored for up to 1 hour).
  • Worker process: Separate from the web server. Start command: celery -A config worker --loglevel=info --concurrency=2
  • Time limits: Soft limit at 4.5 minutes (270 seconds); hard kill at 5 minutes (300 seconds).
  • Retry policy: Each task type has its own retry count (1–3 retries) with a backoff delay.
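
The settings listed above might look roughly like the following in a Django settings module. This is a sketch rather than the project's actual configuration: it assumes Celery reads settings through the CELERY_ namespace (as the CELERY_TASK_ACKS_LATE setting mentioned in this guide suggests), and the local fallback URL is an assumption.

```python
import os

# Assumption: a single Redis instance serves as both broker and result backend.
REDIS_URL = os.environ.get("REDIS_URL", "redis://127.0.0.1:6379/0")

CELERY_BROKER_URL = REDIS_URL
CELERY_RESULT_BACKEND = REDIS_URL
CELERY_RESULT_EXPIRES = 60 * 60      # keep task results for up to 1 hour
CELERY_TASK_SOFT_TIME_LIMIT = 270    # soft limit at 4.5 minutes
CELERY_TASK_TIME_LIMIT = 5 * 60      # hard kill at 5 minutes
CELERY_TASK_ACKS_LATE = True         # acknowledge a task only after it finishes
```

When the soft limit is reached, Celery raises SoftTimeLimitExceeded inside the task, which leaves a short window for cleanup before the hard limit kills the process.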

Status Polling — How the UI Stays Live

When a user triggers a slow action, the view returns immediately and the UI polls a status endpoint:

  • Memory build: GET /orbitpilot/build/status/ — returns current build status. Dashboard JS polls every 4 seconds and reloads when done.
  • Audio transcription / analysis: GET /blink-audio/<pk>/status/ — returns transcription_status and ai_analysis_status. Audio page JS polls every 3 seconds.
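
The polling pattern described above can be sketched in Python, for example for a script or an integration test that waits on a build. The real polling is done by the dashboard and audio page JavaScript; the function name, the timeout, and the "completed" status string below are assumptions for illustration (only "planned" and "failed" appear in this guide).

```python
import time
from typing import Callable, Optional

# Assumption: these strings mark the end of a job. "completed" is hypothetical.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 4.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch_status every `interval` seconds until a terminal status.

    Returns the terminal status, or None if `timeout` seconds of waiting
    pass without reaching one.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            return None
        sleep(interval)
        waited += interval
```

Here fetch_status would wrap a GET request to the relevant status endpoint and return the status field from the response. Passing it in as a callable keeps the loop independent of any particular HTTP client.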

Monitoring on Render

  1. Go to your Render dashboard → select the worker service.
  2. Click Logs. You will see lines like:
    [2026-05-23 10:12:33,501: INFO/MainProcess] Task orbitpilot.build_memory[abc-123] received
    [2026-05-23 10:12:45,812: INFO/ForkPoolWorker-1] Task orbitpilot.build_memory[abc-123] succeeded in 12.3s
  3. Failed tasks show FAILED with a traceback. The OrbitPilotMemoryBuild record will also have status=failed and an error_message.
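
If you export worker logs and want to pull out task timings, lines in the format shown above can be parsed with a regular expression. This is a minimal sketch that only handles the "received" and "succeeded" lines quoted in this guide; the names TaskEvent and parse_task_line are illustrative, and failure lines are not covered because their exact format is not shown here.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as:
# [2026-05-23 10:12:45,812: INFO/ForkPoolWorker-1] Task name[id] succeeded in 12.3s
LOG_LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): "
    r"(?P<level>\w+)/(?P<process>[^\]]+)\] "
    r"Task (?P<task>[\w.]+)\[(?P<task_id>[^\]]+)\] "
    r"(?P<event>received|succeeded)(?: in (?P<seconds>[\d.]+)s)?"
)


class TaskEvent(NamedTuple):
    timestamp: str
    task: str
    task_id: str
    event: str
    seconds: Optional[float]  # only set for "succeeded" lines


def parse_task_line(line: str) -> Optional[TaskEvent]:
    """Return a TaskEvent for a recognised log line, or None otherwise."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    seconds = match.group("seconds")
    return TaskEvent(
        timestamp=match.group("timestamp"),
        task=match.group("task"),
        task_id=match.group("task_id"),
        event=match.group("event"),
        seconds=float(seconds) if seconds else None,
    )
```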

What Happens if the Worker Is Down?

  • Tasks are queued in Redis. They are not lost while Redis is running.
  • When the worker restarts, it picks up and processes the queued tasks automatically.
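  • If a worker dies mid-task, the unacknowledged task is redelivered once a worker is available again (because CELERY_TASK_ACKS_LATE=True). If Redis itself goes down, queued tasks survive the restart only if Redis persistence is enabled.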
  • Users will see "Memory build started…" on the dashboard but the status will remain "planned" until the worker processes it. No data is lost.
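
To see how many tasks are waiting while the worker is down, you can check the length of the broker queue. The sketch below assumes Celery's default queue name ("celery") and no queue priorities, in which case pending messages sit in a Redis list of that name. The helper name pending_task_count is illustrative; it accepts any client object with an llen method, such as a redis-py client.

```python
from typing import Protocol


class SupportsLlen(Protocol):
    """Anything with a Redis-style LLEN method, e.g. a redis-py client."""

    def llen(self, name: str) -> int: ...


def pending_task_count(client: SupportsLlen, queue: str = "celery") -> int:
    """Return the number of messages waiting in the given broker queue.

    Assumption: the queue is a plain Redis list named after the Celery
    queue, which is the case for the default queue without priorities.
    """
    return int(client.llen(queue))
```

With redis-py installed, a client can be created with redis.Redis.from_url(os.environ["REDIS_URL"]) and passed straight to pending_task_count. A count that keeps growing while no "received" lines appear in the logs suggests the worker is not consuming.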

Restarting the Worker on Render

  1. Render dashboard → Worker service → Manual Deploy. (The service also auto-deploys on every push to the production branch.)
  2. Alternatively, use the Render API or Render Shell to restart.

Local Development

Start the worker in a separate terminal:

conda activate orbitingfox100-env
celery -A config worker --loglevel=info

You also need Redis running locally (default: redis://127.0.0.1:6379/0). If Redis is not running, tasks will fail silently and the UI will appear to hang at "queued" status.
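
Before starting the worker, a quick check that something is listening on the Redis port can save some confusion. The helper below is a sketch using only the standard library; the name redis_reachable is illustrative, and it only confirms that a TCP connection succeeds, not that the server is actually Redis.

```python
import socket
from urllib.parse import urlparse


def redis_reachable(url: str = "redis://127.0.0.1:6379/0", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the host and port in `url` succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or 6379  # standard Redis port when the URL omits one
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False
```

For example, running `print(redis_reachable())` in a Python shell prints False when nothing is listening on the default local address.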


Performance Caching

Two caches reduce DB load on every OrbitPilot ask:

  • Monthly usage (usage:{user_pk}:{YYYY-MM}): 5-minute TTL. Avoids repeated SQL aggregations on the AIUsageLog table.
  • User snapshot counts (orbit_snapshot_counts:{user_pk}): 2-minute TTL. Avoids 7 COUNT queries every time a user loads the ask result page.

Both caches use Django's cache backend (LocMem in dev, Redis in production).
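
The key formats and TTLs above can be illustrated with a small get-or-compute sketch. The TTLCache class is a hypothetical in-memory stand-in so the example runs on its own; in the application the equivalent call would go through Django's cache API (for example cache.get_or_set(key, default, timeout)). The helper names are assumptions, not the project's actual functions.

```python
import time
from datetime import date
from typing import Any, Callable, Dict, Tuple

USAGE_TTL = 5 * 60      # monthly usage: 5-minute TTL
SNAPSHOT_TTL = 2 * 60   # user snapshot counts: 2-minute TTL


def usage_key(user_pk: int, day: date) -> str:
    """Build the monthly usage key, e.g. usage:42:2026-05."""
    return f"usage:{user_pk}:{day:%Y-%m}"


def snapshot_key(user_pk: int) -> str:
    """Build the snapshot counts key for a user."""
    return f"orbit_snapshot_counts:{user_pk}"


class TTLCache:
    """Tiny in-memory stand-in for a cache backend, for illustration only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get_or_set(self, key: str, compute: Callable[[], Any], ttl: float) -> Any:
        """Return the cached value, recomputing it once the TTL has expired."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._data[key] = (now + ttl, value)
        return value
```

The expensive aggregation or COUNT queries would be passed as the compute callable, so they run at most once per TTL window for each key.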