General
Production Checklist
Checklist for deploying a self-hosted Rivet Engine to production.
We recommend passing this page to your coding agent to verify your configuration before deploying.
PostgreSQL is the recommended backend for multi-node self-hosted deployments today, but it remains experimental. For a production-ready single-node Rivet deployment, use the file system backend (RocksDB-based). Enterprise teams can contact enterprise support about FoundationDB for the most scalable production-ready deployment.
Also review the general production checklist.
Security
- Validate that you have an admin token configured. Generate a strong, random token for engine authentication. See Configuration.
- Verify your admin token is not exposed publicly. Do not include the admin token in
RIVET_PUBLIC_ENDPOINTor anywhere accessible to clients. See Endpoints. - Configure TLS termination. Ensure connections to the engine are encrypted via a reverse proxy or load balancer.
Resources
- Set container resource limits. Recommended at least 1 CPU and 2 GB of RAM per Rivet Engine instance.
- Configure health checks. Set up liveness and readiness probes on port
6421at/health. Recommended timeout of 5 seconds.
Scaling
- Configure autoscaling for the Rivet Engine. Set target CPU utilization to 70% and memory to 80% to ensure headroom for traffic spikes. In Kubernetes, this is configured via a Horizontal Pod Autoscaler (HPA).
- Use 2+ engine nodes for redundancy. Running a single engine node is a single point of failure. Deploy at least two engine instances behind a load balancer.
- RocksDB only supports a single node. Do not run multiple RocksDB nodes. For a production-ready single-node Rivet deployment, use the file system backend (RocksDB-based). For multi-node deployments, PostgreSQL is the recommended backend today, though it remains experimental as we evaluate the best fit for scalability and performance.
- Validate the rate limit on your serverless actor host. Actor start requests are sent from your engine instances, so they all originate from a small set of IPs. Per-IP rate limits on the actor host will throttle the engine before they would throttle end-user traffic. Size the limit to your peak actor create and wake rate, and configure platform max concurrency (e.g. on GCP Cloud Run) to match your expected concurrent actor count.
PostgreSQL
- PostgreSQL is recommended for multi-node deployments, but remains experimental. Validate the deployment carefully before rollout.
- Configure automated backups. Set up regular backups for your PostgreSQL database to prevent data loss.
- Configure failover. Set up a standby replica with automatic failover to ensure high availability.
- Use FoundationDB for the most scalable production-ready deployments. FoundationDB provides the best performance, scalability, and uptime for Rivet. Contact enterprise support for FoundationDB guidance.
NATS
- Use NATS for pub/sub (recommended). By default, Rivet uses PostgreSQL
LISTEN/NOTIFYfor pub/sub which has limited throughput. NATS significantly improves performance for high-traffic deployments. This is not needed if using RocksDB. See Configuration. - Deploy 2+ NATS replicas. Run at least two NATS replicas for high availability.
Monitoring
- Configure OpenTelemetry. The Rivet Engine supports exporting traces and metrics via OpenTelemetry. Set
RIVET_OTEL_ENABLED=1andRIVET_OTEL_GRPC_ENDPOINTto your collector endpoint (defaults tohttp://localhost:4317). AdjustRIVET_OTEL_SAMPLER_RATIOto control trace sampling (defaults to0.001). See Configuration. - Set up alerts for critical metrics. Monitor engine CPU, memory, request latency, and error rates. Configure alerts to notify your team before issues become outages.
Enterprise
- Contact enterprise support for production-ready deployments. We can help with architecture review, scaling guidance, and FoundationDB support.