
Architecture

Understanding how Kanchi works under the hood

System Overview

Kanchi is built as a standalone monitoring service that connects to your Celery broker and captures all task and worker events in real time.

graph LR
    A[Celery Workers] -->|Events| B[Broker]
    B -->|Subscribe| C[Kanchi]
    C -->|WebSocket| D[Dashboard]
    C -->|Store| E[PostgreSQL]

Core Components

Event Capture Service

The event capture service connects to your Celery broker and subscribes to all task and worker events (a sketch follows the list):

  • Task Sent: When a task is submitted
  • Task Started: When a worker begins execution
  • Task Succeeded: When a task completes successfully
  • Task Failed: When a task raises an exception
  • Task Retried: When a task is retried
  • Worker Online/Offline: Worker lifecycle events
  • Worker Heartbeat: Periodic health signals
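
In Python, such a subscriber can be sketched with Celery's own event-receiver API; the broker URL and handler below are illustrative, not Kanchi's actual code:

from celery import Celery

app = Celery(broker="amqp://guest@localhost//")  # illustrative broker URL

def on_event(event):
    # Each event arrives as a plain dict, e.g.
    # {"type": "task-started", "uuid": "...", "hostname": "worker@node-1", ...}
    # Worker events (heartbeats, online/offline) carry no "uuid".
    print(event["type"], event.get("uuid"))

with app.connection() as connection:
    receiver = app.events.Receiver(connection, handlers={"*": on_event})
    receiver.capture(limit=None, timeout=None, wakeup=True)  # blocks and consumes events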

Event Processing Pipeline

Event → Validation → Enrichment → Storage → WebSocket Broadcast

  1. Validation: Ensure event integrity
  2. Enrichment: Add metadata and relationships
  3. Storage: Persist to database
  4. Broadcast: Push to connected clients
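
As a rough sketch, the four stages compose into a single coroutine; the function bodies below are illustrative stubs, not Kanchi's implementation:

def validate(raw: dict) -> dict:
    # 1. Validation: reject malformed events before anything else touches them.
    for key in ("type", "uuid", "timestamp"):
        if key not in raw:
            raise ValueError(f"malformed event: missing {key}")
    return raw

def enrich(event: dict) -> dict:
    # 2. Enrichment: derive metadata, e.g. a coarse state for dashboard filters.
    event["state"] = event["type"].removeprefix("task-").upper()
    return event

async def store(event: dict) -> None:
    ...  # 3. Storage: persist to PostgreSQL (see Database Layer below)

async def broadcast(event: dict) -> None:
    ...  # 4. Broadcast: push to connected WebSocket clients

async def process_event(raw: dict) -> None:
    event = enrich(validate(raw))
    await store(event)
    await broadcast(event)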

Real-Time WebSocket Server

Kanchi uses WebSockets for real-time updates:

  • Zero Polling: Push-based updates
  • Efficient: Smart event deduplication
  • Scalable: Handles thousands of concurrent connections
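
Given that the backend is FastAPI (see Technology Stack below), a minimal push-based broadcaster might look like this; the /ws/events path and in-memory client set are assumptions, not Kanchi's real endpoint:

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
clients: set[WebSocket] = set()

@app.websocket("/ws/events")
async def events_ws(websocket: WebSocket):
    await websocket.accept()
    clients.add(websocket)
    try:
        while True:
            # Keep the connection open; clients receive pushes, so no polling.
            await websocket.receive_text()
    except WebSocketDisconnect:
        clients.discard(websocket)

async def broadcast(event: dict) -> None:
    # Push each new event to every connected dashboard client.
    for ws in list(clients):
        await ws.send_json(event)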

Database Layer

PostgreSQL stores all historical data:

  • Indexed Queries: Fast search and filtering
  • Connection Pooling: Efficient resource usage
  • Migrations: Schema versioning with Alembic
  • Partitioning: Optional for high-volume deployments
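
With SQLAlchemy's async engine, pooling is configured when the engine is created; the DSN and pool sizes below are illustrative, not Kanchi defaults:

from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker

engine = create_async_engine(
    "postgresql+asyncpg://kanchi:secret@localhost/kanchi",  # hypothetical DSN
    pool_size=10,        # persistent connections kept in the pool
    max_overflow=20,     # extra connections allowed under burst load
    pool_pre_ping=True,  # transparently replace dead connections
)
Session = async_sessionmaker(engine, expire_on_commit=False)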

Technology Stack

Backend

  • FastAPI: High-performance Python web framework
  • SQLAlchemy: ORM with async support
  • Kombu: Celery event subscription
  • Pydantic: Data validation and serialization
  • WebSockets: Real-time communication

Frontend

  • Next.js: React framework with SSR
  • TypeScript: Type-safe development
  • TanStack Query: Data fetching and caching
  • Tailwind CSS: Utility-first styling
  • Framer Motion: Smooth animations

Infrastructure

  • PostgreSQL: Primary data store
  • Redis/RabbitMQ: Celery broker (your existing one)
  • Docker: Containerized deployment

Event Flow

Task Lifecycle

sequenceDiagram
    participant Client
    participant Celery
    participant Broker
    participant Kanchi
    participant DB
    participant Dashboard

    Client->>Celery: task.delay()
    Celery->>Broker: task-sent event
    Broker->>Kanchi: event notification
    Kanchi->>DB: store event
    Kanchi->>Dashboard: WebSocket update

    Celery->>Broker: task-started
    Broker->>Kanchi: event notification
    Kanchi->>DB: update status
    Kanchi->>Dashboard: status change

    Celery->>Broker: task-succeeded
    Broker->>Kanchi: event notification
    Kanchi->>DB: final update
    Kanchi->>Dashboard: completion
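
Note that Celery workers only publish these events when event publishing is enabled, either by starting workers with the -E flag or via configuration; the app name and broker below are illustrative:

from celery import Celery

app = Celery("myapp", broker="redis://localhost:6379/0")

# Equivalent to running workers with -E; without this, there is nothing to capture.
app.conf.worker_send_task_events = True  # task-started/succeeded/failed/retried
app.conf.task_send_sent_event = True     # task-sent is published by the caller and is opt-in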

Data Model

Core Entities

Task Event

{
  "task_id": "uuid",
  "name": "tasks.process_data",
  "state": "STARTED",
  "args": [...],
  "kwargs": {...},
  "worker": "worker@hostname",
  "timestamp": "2025-01-15T10:30:00Z",
  "runtime": 1.23
}
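
Since Pydantic handles validation in the backend, the shape above maps naturally onto a model; the types here are inferred from the example, not taken from Kanchi's source:

from datetime import datetime
from typing import Any
from pydantic import BaseModel

class TaskEvent(BaseModel):
    task_id: str
    name: str
    state: str                     # e.g. "STARTED", "SUCCESS", "FAILURE"
    args: list[Any] = []
    kwargs: dict[str, Any] = {}
    worker: str | None = None      # "worker@hostname"
    timestamp: datetime
    runtime: float | None = None   # seconds; only set once the task finishes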

Worker

{
  "hostname": "worker@node-1",
  "status": "online",
  "active_tasks": 5,
  "processed": 1234,
  "load_average": [1.5, 1.3, 1.2],
  "last_heartbeat": "2025-01-15T10:30:00Z"
}

Workflow Rule

{
  "name": "Retry failed tasks",
  "trigger": {
    "event": "task.failed",
    "filters": [...]
  },
  "actions": [
    {"type": "retry"},
    {"type": "webhook", "url": "..."}
  ]
}
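
Conceptually, rule evaluation is a match-then-dispatch loop. The sketch below is hypothetical (including the dot-vs-dash mapping of event names) and glosses over filter semantics:

def handle_event(event: dict, rules: list[dict]) -> None:
    # Assumed mapping: trigger names like "task.failed" correspond to
    # Celery event types like "task-failed".
    for rule in rules:
        if rule["trigger"]["event"] != event["type"].replace("-", "."):
            continue
        for action in rule["actions"]:
            run_action(action, event)

def run_action(action: dict, event: dict) -> None:
    if action["type"] == "retry":
        ...  # re-submit the task by id
    elif action["type"] == "webhook":
        ...  # POST the event payload to action["url"]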

Scalability

Horizontal Scaling

Kanchi supports horizontal scaling for high-volume deployments:

  • Multiple Instances: Deploy multiple Kanchi instances
  • Load Balancing: Use a load balancer for the web interface
  • Shared Database: All instances share PostgreSQL
  • Event Deduplication: Built-in to prevent duplicates
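
One common way to achieve deduplication across instances is an idempotent write keyed on event identity; the sketch below assumes a unique index on (task_id, type, timestamp), which may not match Kanchi's actual schema:

from sqlalchemy.dialects.postgresql import insert

async def store_once(session, events_table, event: dict) -> None:
    # ON CONFLICT DO NOTHING makes the write idempotent, so two Kanchi
    # instances receiving the same broker event store it exactly once.
    stmt = (
        insert(events_table)
        .values(**event)
        .on_conflict_do_nothing(index_elements=["task_id", "type", "timestamp"])
    )
    await session.execute(stmt)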

Performance Optimizations

  • Event Batching: Group database writes
  • Connection Pooling: Efficient database connections
  • Caching: Redis cache for frequently accessed data
  • Indexing: Optimized database indexes
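
Event batching can be as simple as a buffer with size and time thresholds; the numbers and flush callback below are illustrative:

import asyncio

class EventBatcher:
    def __init__(self, flush, max_size: int = 100, max_delay: float = 0.5):
        self._flush_fn = flush   # async callable taking a list of events
        self.max_size = max_size
        self.max_delay = max_delay
        self.buffer: list[dict] = []

    async def add(self, event: dict) -> None:
        self.buffer.append(event)
        if len(self.buffer) >= self.max_size:
            await self.flush()

    async def flush(self) -> None:
        batch, self.buffer = self.buffer, []
        if batch:
            await self._flush_fn(batch)  # e.g. one multi-row INSERT instead of 100

    async def run(self) -> None:
        # Time-based flush so a slow trickle of events still lands promptly.
        while True:
            await asyncio.sleep(self.max_delay)
            await self.flush()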

Resource Requirements

Minimum (Development):

  • 512MB RAM
  • 1 CPU core
  • 1GB disk

Recommended (Production):

  • 2GB RAM
  • 2 CPU cores
  • 10GB disk (scales with retention period)

High Volume (>1000 tasks/sec):

  • 4GB+ RAM
  • 4+ CPU cores
  • 50GB+ disk
  • PostgreSQL tuning
