Changelog
Follow the latest updates, features, and improvements to Kanchi.
MySQL support and pickle opt-in
Steady refinements focused on compatibility and operational clarity.
What's new
MySQL deployments are now supported
Kanchi now bundles a compatible MySQL driver so MySQL stacks start cleanly and run with first-class support.
Optional pickle serialization
Celery payloads remain JSON-first, but you can opt in to pickle by setting ENABLE_PICKLE_SERIALIZATION=true. When enabled, Kanchi accepts application/x-python-serialize and logs a startup warning to flag the risk.
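As a rough illustration (not Kanchi's actual internals), an opt-in flag like this typically gates which content types the consumer will accept — the function name below is hypothetical:

```python
import os

# Illustrative sketch: how a flag like ENABLE_PICKLE_SERIALIZATION might
# gate accepted content types. JSON stays on by default; pickle is opt-in.
def accepted_content_types() -> list[str]:
    accept = ["application/json"]  # JSON-first by default
    if os.environ.get("ENABLE_PICKLE_SERIALIZATION", "").lower() == "true":
        # Pickle can execute arbitrary code on deserialization, which is
        # why a startup warning is logged when this is enabled.
        accept.append("application/x-python-serialize")
    return accept

os.environ["ENABLE_PICKLE_SERIALIZATION"] = "true"
print(accepted_content_types())
# ['application/json', 'application/x-python-serialize']
```

Only enable this if your producers actually send pickled payloads and you trust everything that can publish to the broker.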
Improvements
Health probes use /api/health
The healthcheck endpoint moved for consistency. Update liveness/readiness probes and scripts from /health to /api/health.
Version visibility
The running app version is now exposed in the UI and API responses to speed up support and troubleshooting.
Safer event logging
_json_safe now understands Enum values so event logs stay stable instead of failing on uncommon payloads.
Upgrade as usual: pull the latest image or code, apply your environment variables, and restart the services.
Refinements and fixes
Small but important improvements to deployment, monitoring, and reliability.
What's new
ARM64 Docker support
Docker images now build for both linux/amd64 and linux/arm64 platforms. If you're running on ARM64 systems (Apple Silicon, AWS Graviton, etc.), you can pull and run Kanchi images without manifest errors.
Health endpoint improvements
The health statistics endpoint now respects authentication state. When you're logged in and auth is enabled, the dashboard shows detailed health metrics instead of the minimal public view. The frontend properly detects auth state and calls the appropriate endpoint.
Python 3.13 compatibility
Updated asyncpg to the latest version for Python 3.13 support. Both SQLite and PostgreSQL work correctly on newer Python versions now.
What's fixed
Docker deployment flow
Fixed the deployment workflow for users who download only docker-compose.yaml. The compose file now uses pre-built images from Docker Hub (getkanchi/kanchi:latest) instead of trying to build locally. No more missing Dockerfile errors when you just want to spin up Kanchi quickly.
Task truncation handling
Tasks with extremely large payloads no longer cause failures. The system now gracefully handles truncation when task arguments or results exceed reasonable size limits.
Standard upgrade process: pull the latest Docker image or git changes and restart. Database migrations run automatically if needed.
Authentication and access control
Kanchi now supports authentication. If you need to restrict access to your monitoring dashboard, you can enable login with Google, GitHub, or basic username/password authentication.
This is entirely opt-in. If you don't enable it, Kanchi works exactly as before — open access, zero friction. But when you need security, it's there.
Why we built this
Some deployments need authentication. Maybe you're exposing Kanchi to the internet. Maybe your infrastructure requires access control. Maybe you just don't want everyone on the network poking around production task queues.
We wanted authentication that doesn't get in the way when you don't need it, but provides proper security when you do.
How it works
OAuth providers
Login with Google or GitHub. Configure client credentials, set allowed email patterns, and you're done. Kanchi handles the OAuth flow, validates emails against your domain allowlist, and maintains sessions with refresh tokens.
Both providers support profile pictures and metadata — so your team members show up in the UI with their actual names and avatars instead of anonymous sessions.
Basic authentication
For simpler setups, use username and password authentication. Passwords are hashed with PBKDF2-SHA256 (260,000 iterations, unique salt per password). No plaintext secrets in the database.
Useful for internal deployments where you don't want to configure OAuth apps but still need basic access control.
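Python's standard library can produce hashes in this scheme. The encoding below is a sketch that matches the parameters described (260,000 iterations, unique salt per password); Kanchi's exact on-disk format is not shown here:

```python
import base64
import hashlib
import hmac
import os

# Sketch: PBKDF2-SHA256 with 260,000 iterations and a random per-password
# salt, encoded as pbkdf2_sha256$<iterations>$<salt>$<derived key>.
def hash_password(password: str, iterations: int = 260_000) -> str:
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return "pbkdf2_sha256$%d$%s$%s" % (
        iterations,
        base64.b64encode(salt).decode(),
        base64.b64encode(dk).decode(),
    )

def verify_password(password: str, encoded: str) -> bool:
    _scheme, iters, salt_b64, dk_b64 = encoded.split("$")
    dk = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), base64.b64decode(salt_b64), int(iters)
    )
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(dk, base64.b64decode(dk_b64))
```

Because the salt is random, hashing the same password twice yields different strings — only `verify_password` can confirm a match.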
Email-based access control
Restrict access by email pattern. Use wildcards to allow entire domains or specify individual addresses:
# Allow specific domain
ALLOWED_EMAIL_PATTERNS='*@example.com'
# Multiple domains
ALLOWED_EMAIL_PATTERNS='*@example.com,*@example.org'
# Specific addresses
ALLOWED_EMAIL_PATTERNS='admin@example.com,ops@company.io'
Emails are normalized (lowercased) and validated before access is granted. If someone authenticates with an email that doesn't match your patterns, they're denied — even if OAuth succeeds.
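Wildcard matching of this shape can be expressed with `fnmatch` from the standard library. This is an illustrative sketch of the semantics described above, not Kanchi's actual matcher:

```python
from fnmatch import fnmatch

# Sketch: lowercase the email, then grant access if any comma-separated
# pattern (wildcards allowed) matches it.
def email_allowed(email: str, patterns: str) -> bool:
    email = email.strip().lower()
    return any(
        fnmatch(email, p.strip().lower())
        for p in patterns.split(",")
        if p.strip()
    )

print(email_allowed("Admin@Example.com", "*@example.com"))  # True
print(email_allowed("evil@attacker.io", "*@example.com"))   # False
```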
Token-based sessions
Authentication uses short-lived access tokens (30 minutes) and long-lived refresh tokens (24 hours). Tokens are signed with HMAC-SHA256 and validated on every request.
The system stores SHA256 hashes of tokens in the database — never the tokens themselves. Even if someone compromises the database, they can't reconstruct valid tokens.
Tokens automatically refresh when they expire. The frontend handles this transparently, so sessions don't drop while you're actively using the dashboard.
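The two properties described — HMAC-SHA256 signing and hash-at-rest storage — can be sketched like this. The token format and names here are illustrative, not Kanchi's actual implementation:

```python
import hashlib
import hmac
import secrets

TOKEN_SECRET_KEY = secrets.token_hex(32)  # stand-in for the env var

# Sketch: sign the token body with HMAC-SHA256 so it can be validated
# on every request without a database lookup of the raw token.
def issue_token(user_id: str) -> str:
    body = f"{user_id}:{secrets.token_urlsafe(16)}"
    sig = hmac.new(
        TOKEN_SECRET_KEY.encode(), body.encode(), hashlib.sha256
    ).hexdigest()
    return f"{body}.{sig}"

def stored_hash(token: str) -> str:
    # Only this SHA-256 digest lands in the database -- never the token.
    return hashlib.sha256(token.encode()).hexdigest()

def verify(token: str) -> bool:
    body, _, sig = token.rpartition(".")
    expected = hmac.new(
        TOKEN_SECRET_KEY.encode(), body.encode(), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(sig, expected)
```

Because only the digest is stored, a database dump alone cannot be replayed as a valid session token.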
WebSocket authentication
WebSocket connections require valid access tokens when authentication is enabled. The token is passed via query parameter during connection initialization — no manual header management needed.
If a token expires mid-session, the connection closes gracefully and the frontend reconnects with a refreshed token automatically.
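Passing the token as a query parameter on connect looks roughly like this (the URL shape and parameter name are illustrative):

```python
from urllib.parse import urlencode

# Sketch: append the access token as a query parameter when opening the
# WebSocket, so no manual header management is needed.
def ws_url(base: str, token: str) -> str:
    return f"{base}?{urlencode({'token': token})}"

print(ws_url("wss://kanchi.example.com/ws", "abc123"))
# wss://kanchi.example.com/ws?token=abc123
```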
Configuration
Authentication is disabled by default. Enable it with a single environment variable:
AUTH_ENABLED=true
Then configure at least one authentication method:
Basic authentication:
AUTH_BASIC_ENABLED=true
BASIC_AUTH_USERNAME=kanchi-admin
BASIC_AUTH_PASSWORD_HASH='pbkdf2_sha256$260000$...'
Google OAuth:
AUTH_GOOGLE_ENABLED=true
GOOGLE_CLIENT_ID=your-client-id
GOOGLE_CLIENT_SECRET=your-client-secret
OAUTH_REDIRECT_BASE_URL=https://kanchi.example.com
GitHub OAuth:
AUTH_GITHUB_ENABLED=true
GITHUB_CLIENT_ID=your-client-id
GITHUB_CLIENT_SECRET=your-client-secret
OAUTH_REDIRECT_BASE_URL=https://kanchi.example.com
You can enable multiple methods simultaneously. Users will see all available login options on the login page and can choose their preferred method.
Security settings:
# Generate secure secrets (required)
SESSION_SECRET_KEY=$(openssl rand -hex 32)
TOKEN_SECRET_KEY=$(openssl rand -hex 32)
# Configure CORS
ALLOWED_ORIGINS='https://kanchi.example.com'
ALLOWED_HOSTS='kanchi.example.com'
# Restrict access by email
ALLOWED_EMAIL_PATTERNS='*@example.com'
What changed
Backend
New /api/auth/* endpoints for login, logout, token refresh, and OAuth flows. All other API routes now check for authentication when AUTH_ENABLED=true.
Two new database tables: users (stores authenticated user profiles) and extensions to user_sessions (links sessions to users and stores token hashes).
Migrations run automatically on startup — no manual schema changes needed.
Frontend
New login page at /login with OAuth buttons and basic auth form. Navigation middleware redirects unauthenticated users to the login page when auth is enabled.
User avatar dropdown in the navbar shows current user info and logout button. Tokens are stored in localStorage and automatically refreshed before expiration.
WebSocket connections include access tokens when auth is enabled. If a connection is rejected due to invalid auth, the frontend handles it gracefully and prompts for re-authentication.
Backward compatibility
If you don't enable authentication, nothing changes. Existing deployments continue working without modification.
Anonymous sessions are still supported when AUTH_ENABLED=false. The new database tables exist but remain empty.
Bug fixes
Workflow deletion issue
Fixed a bug where deleting workflows would sometimes fail silently or leave orphaned records. Deletion now properly cascades through related workflow executions and action configurations.
For detailed setup instructions, OAuth configuration, troubleshooting, and production deployment guidelines, check out the Authentication documentation.
Built for teams that need security without the ceremony. Enable it when you need it, ignore it when you don't. Happy Halloween! 🎃
If you run into issues or have questions, open an issue on GitHub — we're here to help.
Database migration fix
If you upgraded to v1.2.0 and ran into database migration issues, this patch fixes that.
What was broken
The workflow circuit breaker migration wasn't properly sequenced with earlier migrations. In some deployment scenarios, this could cause the migration to run out of order or fail entirely.
What's fixed
Migrations now run in the correct sequence. If you experienced issues upgrading to v1.2.0, pull the latest version and the database will migrate cleanly.
Standard upgrade process applies—pull the update and restart. Database migrations run automatically on startup.
Redis broker support
Kanchi now supports Redis as an alternative message broker. If you're already running Redis or prefer it over RabbitMQ, you can use it directly — no changes to your Celery configuration needed.
Why this matters
Some teams already have Redis in their infrastructure and don't want to add RabbitMQ just for Celery. Others prefer Redis for its simplicity or operational familiarity. Now you can choose.
Both brokers work identically from Kanchi's perspective. Pick whichever fits your stack.
Configuration
We've standardized on a single environment variable that works with any Celery-compatible broker:
For RabbitMQ:
CELERY_BROKER_URL=amqp://user:password@localhost:5672//
For Redis:
CELERY_BROKER_URL=redis://localhost:6379/0
That's it. Set the variable, restart Kanchi, and it connects to your broker automatically.
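Since both brokers are expressed through one variable, the URL scheme alone distinguishes them. A minimal sketch of that check (illustrative, not Kanchi's internals):

```python
from urllib.parse import urlparse

# Sketch: classify a Celery-compatible broker URL by its scheme.
def broker_kind(url: str) -> str:
    scheme = urlparse(url).scheme
    if scheme in ("amqp", "amqps"):
        return "rabbitmq"
    if scheme in ("redis", "rediss"):
        return "redis"
    return "unknown"

print(broker_kind("amqp://user:password@localhost:5672//"))  # rabbitmq
print(broker_kind("redis://localhost:6379/0"))               # redis
```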
Breaking change
If you're upgrading from v1.1.0 or earlier, the environment variable name has changed:
Old: RABBITMQ_URL
New: CELERY_BROKER_URL
The new name reflects that Kanchi supports multiple broker types. Your existing RabbitMQ connection will work — just rename the environment variable.
That's the update. Broker flexibility without the complexity. If you hit any issues, open an issue on GitHub — we are always happy to help. Happy monitoring!
Kanchi 1.1: Workflows and automation
We've been shipping fast since v1.0.0. This release brings intelligent automation, safer deployments, and the kind of polish that comes from actually using the thing every day.
Workflow Engine
The headline feature: automated response to task failures. If you've ever manually retried a batch of failed tasks at 2am or wished Slack would just tell you when things break, this is for you.
Event-driven automation
Create workflows that trigger on specific conditions — task failures, orphan detection, execution time thresholds. Each workflow runs independently with full execution history and rollback support.
Circuit breaker protection
Workflows include configurable circuit breakers to prevent infinite retry loops. Set thresholds for execution count and time windows. When limits are hit, the circuit opens and notifications are sent instead of blindly retrying tasks into oblivion.
Slack integration
Native Slack action for workflow notifications. Configure webhook URLs, customize message templates with task context variables, and get alerted when automation kicks in or circuits trip. Supports multiple webhook configurations per workflow.
Retry orchestration
Built-in retry action with configurable delays, max attempts, and exponential backoff. Workflows can automatically retry failed tasks with the same arguments, track retry chains across parent-child relationships, and stop gracefully when manual intervention is needed.
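Exponential backoff with a max-attempt cap reduces to a short schedule calculation. This sketch uses illustrative parameter names and defaults:

```python
# Sketch: delays for each retry attempt, doubling from base_delay and
# capped so a long chain never waits unboundedly.
def backoff_delays(base_delay: float, max_attempts: int,
                   factor: float = 2.0, cap: float = 300.0) -> list[float]:
    return [min(base_delay * factor ** n, cap) for n in range(max_attempts)]

print(backoff_delays(base_delay=5, max_attempts=5))
# [5.0, 10.0, 20.0, 40.0, 80.0]
```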
Workflow catalog
Browse and clone pre-built workflow templates for common scenarios: auto-retry on transient failures, Slack alerts for critical task types, orphan task recovery. Each template includes sensible defaults and inline documentation.
Task Detail Pages
Click any task to see everything: full argument inspection, result payloads, stack traces with syntax highlighting, retry chain visualization, and execution timeline. Deep-linkable URLs mean you can share specific task failures with your team without writing a novel in Slack.
Failed Task Dashboard
Unified view of both failed tasks and orphans on the home page. Real-time counters, filterable lists, bulk retry controls. The interface automatically refreshes as new failures come in via WebSocket.
Orphan tasks now show their last-known worker and estimated abandonment time. Failed tasks display error messages inline so you can triage without clicking through.
Deployment Improvements
Docker Compose workflow
We modeled the deployment experience after Docmost — download a single docker-compose.yaml, set RABBITMQ_URL, run one command. No configuration sprawl, no vendor lock-in. Bring your own RabbitMQ and PostgreSQL.
curl -O https://raw.githubusercontent.com/getkanchi/kanchi/main/docker-compose.yaml
# Configure in docker-compose.yaml
docker compose up -d --build --pull always --force-recreate
The same command handles both initial deployment and updates. Pull new code, re-run the command, you're done.
Better database support
SQLite still works great for development and small deployments. For production, point DATABASE_URL at PostgreSQL and migrations run automatically on startup. No manual Alembic commands, no schema drift.
Quality of Life
Pagination fix
The select dropdown for page limits now correctly shows your current selection instead of always displaying the default. Small thing, but it was annoying.
SQL dialect detection
Timeline queries now use the correct SQL dialect based on your configured database. No more SQLite-specific syntax breaking PostgreSQL deployments.
Cleaner codebase
Removed unused keybindings that were triggering accidentally. Added comprehensive test coverage for workflow circuit breakers, retry limits, and infinite loop prevention. Tests run in CI and locally via ./run_tests.sh.
What's Next
This release focused on automation and stability. Now that the foundation is solid, we're shifting focus to the details that make software feel good to use — refined interactions, thoughtful design decisions, and the kind of polish that comes from sweating the small stuff ✨.
If you're upgrading from v1.0.0, database migrations run automatically. Existing orphan batch and retry logic remains unchanged — workflows are opt-in automation on top of the existing foundation.
Kanchi 1.0: Ship with confidence
Today we're releasing Kanchi—a monitoring system for Celery that actually feels good to use. If you've spent years squinting at logs or refreshing Flower tabs, this one's for you.
Why we built this
Distributed task queues are critical infrastructure, but most observability tools haven't kept up. We wanted something that feels fun to use while providing deep insights into task execution.
Kanchi connects directly to your message broker (no agents, no SDK changes) and gives you real-time visibility into what's actually happening. Tasks, workers, retries, failures — everything you need to debug issues and prevent outages.
What you get
Live task monitoring
WebSocket-based updates mean you see tasks as they flow through your system. No refresh button, no polling lag. Switch between streaming mode for active debugging and paginated views for historical analysis.
Each task shows the full context: arguments, results, execution time, and stack traces. If a task retries, you'll see the entire chain with parent-child relationships mapped out.
Search and filter that actually works
Date range picker with calendar support. Filter by status, task name, worker, queue—or combine them all. Full-text search across task arguments and results. The interface updates in real time as you refine queries.
Worker health at a glance
Live worker status with heartbeat tracking. See which workers are handling the most load, which ones are idle, and which ones disappeared mid-task. The dashboard updates as workers join or leave the pool.
Task registry
Browse every registered Celery task in your application. Multi-environment support means you can monitor prod, staging, and dev separately without mixing state.
Orphan detection
When tasks fail because a worker crashed or the network dropped, Kanchi flags them automatically. Bulk retry with batch tracking so you can recover gracefully instead of scrambling through logs.
Persistence without the pain
SQLite by default, PostgreSQL for production. Task history with full state transitions, daily statistics, trend analysis. Auto-migrations with Alembic — zero manual schema work.
Built for developers who need intuitive visibility into distributed systems. Try it out and let us know what breaks 😅.