Zero-Downtime Deployments with AWS ECS and GitHub Actions

TawiCode Team
Jan 22, 2026

Production deployments should be boring. They should be fast, repeatable, and completely unnoticed by your users. If your team dreads release day, if you wait for low-traffic windows to push code, or if a failed deployment means manual rollbacks and lost sleep, this article describes the pipeline that will change that. We use this exact setup in production for multiple client applications.

The Architecture: Blue-Green on ECS

Blue-green deployment maintains two identical production environments: the "blue" environment (currently live) and the "green" environment (next version). When you deploy, you bring up the green environment, run health checks, then shift traffic from blue to green, either in a single switch or gradually. If anything goes wrong, you switch back to blue in seconds with zero downtime.

AWS Elastic Container Service (ECS) with Fargate is the foundation. With Fargate, ECS runs your Docker containers without requiring you to manage EC2 instances. An Application Load Balancer (ALB) sits in front, routing traffic to either the blue or the green set of tasks. AWS CodeDeploy, integrated with ECS, manages the traffic shift. It can do a linear shift (10% per minute), a canary shift (10% immediately, then the remaining 90% after validation), or an all-at-once shift.
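The three shift strategies differ only in how quickly the green side's share of traffic reaches 100%. The sketch below models each one as a list of (minute, percent) steps. It is an illustration, not CodeDeploy's implementation: the names `ShiftStep`, `linear_schedule`, `canary_schedule`, and `all_at_once_schedule` are hypothetical, the first linear increment is assumed to apply at minute 0, and the five-minute canary wait is an assumed stand-in for the validation period.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftStep:
    minute: int         # minutes after the traffic shift begins
    green_percent: int  # share of traffic routed to the green environment


def linear_schedule(step_percent: int = 10, interval_minutes: int = 1) -> list[ShiftStep]:
    """Shift a fixed percentage at each interval until green serves 100%."""
    steps = []
    percent, minute = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        steps.append(ShiftStep(minute, percent))
        minute += interval_minutes
    return steps


def canary_schedule(canary_percent: int = 10, wait_minutes: int = 5) -> list[ShiftStep]:
    """Send a small slice first, then everything else after the wait.

    wait_minutes is an assumption standing in for the validation period.
    """
    return [ShiftStep(0, canary_percent), ShiftStep(wait_minutes, 100)]


def all_at_once_schedule() -> list[ShiftStep]:
    """Route all traffic to green in one step."""
    return [ShiftStep(0, 100)]


if __name__ == "__main__":
    for step in linear_schedule():
        print(f"minute {step.minute}: {step.green_percent}% on green")
```

Under these assumptions, the linear strategy has ten steps and finishes at minute 9, while the canary strategy exposes only 10% of users to a bad release before the full cutover.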

Setting Up the GitHub Actions Pipeline

The CI/CD pipeline is triggered on every push to the `main` branch. The workflow has four stages: test, build, push, and deploy. In the test stage, unit and integration tests run in parallel using Jest. In the build stage, Docker builds the application image using a multi-stage Dockerfile: the Dockerfile's first stage compiles the application, and its production stage copies only the compiled output into a minimal Node.js Alpine base image, keeping the final image under 150MB.

The built image is tagged with the Git SHA and pushed to Amazon ECR (Elastic Container Registry). This tagging strategy means every image is traceable to the exact commit that produced it — critical for debugging production issues and for rollback scenarios.
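As a rough illustration of the tagging scheme, the helper below builds the full image reference that the push stage would use. The function name `ecr_image_uri`, the account ID, the region, and the repository name are all hypothetical. The article does not say whether the full or abbreviated SHA is used, so this sketch accepts either.

```python
import re

# Accept abbreviated (7+) or full (40) lowercase hex commit SHAs.
_GIT_SHA = re.compile(r"[0-9a-f]{7,40}")


def ecr_image_uri(account_id: str, region: str, repository: str, git_sha: str) -> str:
    """Return an ECR image reference tagged with the commit SHA."""
    if not _GIT_SHA.fullmatch(git_sha):
        # Rejecting tags like "latest" keeps every image traceable to a commit.
        raise ValueError(f"not a Git SHA: {git_sha!r}")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{git_sha}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(ecr_image_uri("123456789012", "us-east-1", "my-app", "3f2a9c1"))
```

Refusing mutable tags such as `latest` is a deliberate choice here: a rollback then always points at an image whose source commit is known.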

Automated Rollback and Health Checks

The deployment step updates the ECS task definition with the new image tag, then triggers a CodeDeploy deployment. CodeDeploy runs the new task, waits for it to pass ALB health checks (three consecutive HTTP 200 responses on the `/health` endpoint), then begins traffic shifting. If health checks fail at any point in the fifteen-minute validation window, CodeDeploy automatically rolls back to the previous task definition. No manual intervention required.
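Two pieces of this step are easy to sketch as pure functions: swapping the image in a task definition, and deciding whether a sequence of health check responses meets the "three consecutive HTTP 200" threshold. Both helpers, `with_new_image` and `passes_health_checks`, are hypothetical names for illustration. In the real pipeline the ALB and CodeDeploy perform these checks, not application code.

```python
import copy
from typing import Iterable


def with_new_image(task_definition: dict, container_name: str, image_uri: str) -> dict:
    """Return a copy of the task definition with one container's image replaced."""
    updated = copy.deepcopy(task_definition)  # leave the caller's dict untouched
    for container in updated["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image_uri
            return updated
    raise KeyError(f"no container named {container_name!r} in task definition")


def passes_health_checks(status_codes: Iterable[int], required_consecutive: int = 3) -> bool:
    """True once the required number of consecutive HTTP 200 responses is seen."""
    streak = 0
    for code in status_codes:
        # Any non-200 response resets the streak.
        streak = streak + 1 if code == 200 else 0
        if streak >= required_consecutive:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical task definition fragment.
    current = {"family": "web", "containerDefinitions": [{"name": "app", "image": "old"}]}
    print(with_new_image(current, "app", "new")["containerDefinitions"][0]["image"])
    print(passes_health_checks([200, 503, 200, 200, 200]))
```

Returning a copy rather than mutating the input keeps the previous task definition available unchanged, which mirrors how a rollback needs the old definition to still exist.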

Environment Variable Management

Secrets and environment-specific configuration live in AWS Systems Manager (SSM) Parameter Store. The ECS task definition references SSM parameters by ARN, and ECS injects them as environment variables at container startup. This approach means sensitive values never appear in the GitHub Actions logs, never live in your repository, and can be rotated in SSM without a code change. Because injection happens at startup, running tasks pick up a rotated value only when they are next started.
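In a task definition, each injected value is an entry in the container's `secrets` list, pairing an environment variable name with a `valueFrom` ARN. The sketch below generates those entries. The helper names, the parameter paths, and the account and region values are hypothetical.

```python
from __future__ import annotations


def ssm_parameter_arn(region: str, account_id: str, name: str) -> str:
    """Build the ARN for a Parameter Store parameter.

    A leading slash in the parameter name is dropped so the ARN
    does not contain a double slash after "parameter".
    """
    return f"arn:aws:ssm:{region}:{account_id}:parameter/{name.lstrip('/')}"


def container_secrets(
    region: str, account_id: str, env_to_parameter: dict[str, str]
) -> list[dict[str, str]]:
    """Map environment variable names to `secrets` entries for a container definition."""
    return [
        {"name": env_name, "valueFrom": ssm_parameter_arn(region, account_id, parameter)}
        for env_name, parameter in env_to_parameter.items()
    ]


if __name__ == "__main__":
    # Hypothetical parameter paths, for illustration only.
    entries = container_secrets(
        "us-east-1",
        "123456789012",
        {"DATABASE_URL": "/my-app/prod/database-url", "API_KEY": "/my-app/prod/api-key"},
    )
    for entry in entries:
        print(entry)
```

Only the ARNs appear in the task definition and the pipeline; the values themselves are resolved by ECS when the container starts.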

The Result

With this pipeline in place, deployments take an average of four minutes from merge to live traffic. The pipeline runs automatically on every merge to `main`. Failed deployments roll back automatically. Developers push code at any time of day without coordination or ceremony. That is what zero-downtime deployments should feel like: invisible, automatic, and completely unremarkable.
