CI/CDKubernetesDockerProductionAdvanced

CI/CD Testing Pipeline

Kubernetes-native test execution reducing pipeline time from 45min to 8min

3 months

Started Jul 2023

Team of 2

Lead DevOps Engineer - Architecture and implementation lead

View on GitHub

Proof

CI Runs Latest Report

Recruiter note: this section is intentionally “evidence-first” (builds, runs, reports).

Quality Gates

This project is presented like a production system: measurable, reproducible, and backed by evidence. (Next step: make these gates fully project-specific and auto-fed into the Quality Dashboard.)

CI runs Report

CI pipeline

Test report artifact

API tests

E2E tests

Performance checks

Security checks

Accessibility checks

Run locally

git clone https://github.com/JasonTeixeira/CI-CD-Pipeline
# See repo README for setup
# Typical patterns:
# - npm test / npm run test
# - pytest -q
# - make test

500+

Tests

88%

Coverage

82% faster

Performance

CI/CD Testing Pipeline - Complete Case Study

Executive Summary

Built a Kubernetes-native CI/CD testing pipeline that reduced build times from 45 minutes to 8 minutes (82% reduction) while scaling to handle 500+ tests per build. The system processes 200+ builds per day with 99.9% uptime, enabling truly continuous deployment.

How this was measured

Pipeline time measured from CI job start→finish across baseline vs parallelized runs.
Uptime/health measured via successful job completion rate and retries.
Evidence: CI workflow runs linked in Proof.

The Problem

Background

When I joined the SaaS company, they were experiencing rapid growth - from 50K to 500K users in 6 months. The engineering team had grown from 5 to 30 developers, and the monolithic CI/CD pipeline had become a critical bottleneck:

Systems Under Test:

Core API - RESTful services (Node.js + PostgreSQL)
Web App - React SPA with complex state management
Mobile Apps - iOS + Android (React Native)
Background Jobs - Kafka consumers, cron tasks
Infrastructure - Kubernetes cluster, 50+ microservices

Pain Points

The existing Jenkins-based pipeline had serious problems:

45-minute build times - Developers waited hours for PR feedback
Sequential execution - Tests ran one-by-one, wasting resources
Flaky infrastructure - Jenkins nodes went offline randomly
No resource isolation - Tests interfered with each other
Manual scaling - DevOps spent hours provisioning nodes
No test parallelization - 500 tests × 5 seconds = 40+ minutes
Resource contention - Build queue backed up during peak hours
Poor visibility - Hard to debug failed builds

Business Impact

The slow pipeline was killing productivity:

$300K/year in developer time - 30 devs × 2 hours/day waiting
Deployment delays - Went from 10 deploys/day → 2 deploys/day
Developer frustration - "Pipeline is red again" became the norm
Missed opportunities - Couldn't iterate fast enough on features
Competitive disadvantage - Competitors shipping faster
Technical debt - Teams skipped tests to speed up builds

Why Existing Solutions Weren't Enough

The team had tried various approaches:

More Jenkins nodes - Costly, didn't solve sequential execution
Test sharding - Manual, error-prone, hard to maintain
Disabling tests - Reduced confidence, bugs escaped
Running tests after merge - Too late, broken master daily

We needed a fundamental redesign, not incremental improvements.

The Solution

Approach

I designed a Kubernetes-native testing infrastructure with these principles:

Containerization - Each test suite runs in isolated Docker containers
Parallel Execution - Distribute tests across multiple pods
Dynamic Scaling - Kubernetes auto-scales based on workload
Resource Efficiency - Pack tests efficiently, minimize waste

This architecture provided:

Speed - Parallel execution cuts time by 80%+
Reliability - Pod failures automatically retry
Scalability - Handle 10x load without manual intervention
Cost Efficiency - Only pay for resources actually used

Technology Choices

Why Kubernetes?

Dynamic scaling based on workload
Self-healing (pods restart on failure)
Resource limits prevent resource exhaustion
Industry standard, battle-tested

Why Docker?

Complete isolation between test suites
Consistent environments (dev = CI = prod)
Fast startup times (<5 seconds)
Easy to version and reproduce builds

Why GitHub Actions as Orchestrator?

Native GitHub integration
Free for open source, affordable for private repos
Matrix builds for parallelization
Great ecosystem of actions

Why pytest-xdist?

Built-in test parallelization
Smart work distribution
Minimal code changes needed
Works with existing pytest tests

Architecture

┌─────────────────────────────────────────────────┐
│         GitHub Actions (Orchestrator)           │
│  - Trigger on PR/push                           │
│  - Matrix strategy (10 parallel jobs)           │
│  - Artifact collection                          │
└───────────────────┬─────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────┐
│         Kubernetes Cluster                      │
│  ┌──────────────────────────────────────────┐   │
│  │  Test Runner Pods (Auto-scaling)         │   │
│  │  - Unit tests (200ms avg)                │   │
│  │  - Integration tests (2s avg)            │   │
│  │  - E2E tests (10s avg)                   │   │
│  └──────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────┐   │
│  │  Supporting Services                     │   │
│  │  - PostgreSQL (test DB)                  │   │
│  │  - Redis (caching)                       │   │
│  │  - Mock APIs                             │   │
│  └──────────────────────────────────────────┘   │
└───────────────────┬─────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────┐
│         Artifact Storage (S3)                   │
│  - Test results (JUnit XML)                     │
│  - Coverage reports                             │
│  - Screenshots (on failure)                     │
│  - Performance metrics                          │
└─────────────────────────────────────────────────┘

Implementation

Step 1: Dockerize the Test Suite

# Dockerfile.test
FROM python:3.9-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    postgresql-client \
    redis-tools \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy requirements first (for layer caching)
COPY requirements.txt requirements-test.txt ./
RUN pip install --no-cache-dir -r requirements.txt -r requirements-test.txt

# Copy application code
COPY . .

# Run tests
CMD ["pytest", "tests/", "-v", "--junit-xml=test-results.xml"]

Key Insights:

Layer caching speeds up builds (only reinstall deps when changed)
Multi-stage builds reduce image size (dev vs prod images)
Non-root user for security
Health checks for readiness probes

Step 2: Kubernetes Test Runner Deployment

# k8s/test-runner-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-runner
  namespace: ci-cd
spec:
  replicas: 3  # Auto-scaled by HPA
  selector:
    matchLabels:
      app: test-runner
  template:
    metadata:
      labels:
        app: test-runner
    spec:
      containers:
      - name: test-runner
        image: myapp/test-runner:latest
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: test-db-credentials
              key: url
        - name: REDIS_URL
          valueFrom:
            configMapKeyRef:
              name: test-config
              key: redis_url
        volumeMounts:
        - name: test-results
          mountPath: /app/test-results
      volumes:
      - name: test-results
        emptyDir: {}
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-runner-hpa
  namespace: ci-cd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-runner
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Why HPA (Horizontal Pod Autoscaler)?

Scales from 2 pods (idle) to 20 pods (peak load)
Responds to CPU usage automatically
Saves money during off-peak hours
Handles traffic spikes without manual intervention

Step 3: GitHub Actions Workflow with Matrix Strategy

# .github/workflows/test.yml
name: Test Suite

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}/test-runner

jobs:
  build-image:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      
      - name: Log in to registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile.test
          push: true
          tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  test:
    needs: build-image
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test-group: [unit, integration, e2e, api, database, cache, auth, payment, notifications, reports]
    
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
        with:
          version: 'v1.27.0'
      
      - name: Configure kubectl
        run: |
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > $HOME/.kube/config
      
      - name: Run tests in Kubernetes
        run: |
          # Create job for this test group
          kubectl create job test-${{ matrix.test-group }}-${{ github.run_id }} \
            --image=${{ needs.build-image.outputs.image-tag }} \
            --namespace=ci-cd \
            -- pytest tests/${{ matrix.test-group }}/ \
                -v \
                --junit-xml=/results/${{ matrix.test-group }}.xml \
                --cov=src \
                --cov-report=xml:/results/coverage-${{ matrix.test-group }}.xml
          
          # Wait for job completion (timeout 10min)
          kubectl wait --for=condition=complete \
            --timeout=600s \
            job/test-${{ matrix.test-group }}-${{ github.run_id }} \
            -n ci-cd
      
      - name: Copy test results
        if: always()
        run: |
          POD=$(kubectl get pods -n ci-cd \
            --selector=job-name=test-${{ matrix.test-group }}-${{ github.run_id }} \
            -o jsonpath='{.items[0].metadata.name}')
          
          kubectl cp ci-cd/$POD:/results/ ./test-results/
      
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-results-${{ matrix.test-group }}
          path: test-results/
      
      - name: Cleanup
        if: always()
        run: |
          kubectl delete job test-${{ matrix.test-group }}-${{ github.run_id }} -n ci-cd

  report:
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Download all artifacts
        uses: actions/download-artifact@v3
      
      - name: Publish test report
        uses: dorny/test-reporter@v1
        with:
          name: Test Results
          path: '**/*.xml'
          reporter: java-junit
      
      - name: Comment PR
        uses: actions/github-script@v6
        if: github.event_name == 'pull_request'
        with:
          script: |
            // Aggregate results and post summary comment
            const fs = require('fs');
            // ... (read XML, calculate stats, format comment)

Key Features:

Matrix strategy - 10 parallel jobs, each handling different test category
Docker layer caching - Speeds up image builds
Kubernetes Jobs - Each test group runs in isolated pod
Automatic cleanup - Jobs deleted after completion
Artifact collection - Test results aggregated for reporting

Step 4: Test Parallelization with pytest-xdist

# pytest.ini
[pytest]
addopts = 
    -n auto
    --maxfail=5
    --tb=short
    --strict-markers
    --cov=src
    --cov-report=term-missing
    --cov-report=xml
    --junit-xml=test-results.xml

markers =
    unit: Unit tests (fast, isolated)
    integration: Integration tests (database, external services)
    e2e: End-to-end tests (full user workflows)
    slow: Tests that take >5 seconds
    flaky: Tests with known flakiness (retry 3 times)

[coverage:run]
parallel = true
concurrency = multiprocessing

# conftest.py - Shared fixtures
import pytest
import docker
from sqlalchemy import create_engine
from redis import Redis

@pytest.fixture(scope="session")
def docker_client():
    """Docker client for spinning up test services"""
    return docker.from_env()

@pytest.fixture(scope="session")
def postgres_container(docker_client):
    """Spin up PostgreSQL for tests"""
    container = docker_client.containers.run(
        "postgres:14",
        environment={
            "POSTGRES_USER": "test",
            "POSTGRES_PASSWORD": "test",
            "POSTGRES_DB": "testdb"
        },
        ports={'5432/tcp': None},  # Random port
        detach=True,
        remove=True
    )
    
    # Wait for PostgreSQL to be ready
    for _ in range(30):
        try:
            port = container.attrs['NetworkSettings']['Ports']['5432/tcp'][0]['HostPort']
            engine = create_engine(f"postgresql://test:test@localhost:{port}/testdb")
            engine.connect()
            break
        except:
            time.sleep(1)
    
    yield f"postgresql://test:test@localhost:{port}/testdb"
    container.stop()

@pytest.fixture(scope="function")
def db(postgres_container):
    """Fresh database for each test"""
    engine = create_engine(postgres_container)
    
    # Run migrations
    from alembic import command
    from alembic.config import Config
    alembic_cfg = Config("alembic.ini")
    command.upgrade(alembic_cfg, "head")
    
    yield engine
    
    # Rollback after test
    command.downgrade(alembic_cfg, "base")

@pytest.fixture
def api_client(db):
    """Test client with database session"""
    from app import create_app
    app = create_app(database_url=db.url)
    with app.test_client() as client:
        yield client

Step 5: Smart Test Grouping

# scripts/group_tests.py
"""
Intelligently group tests based on execution time and dependencies
"""
import json
import statistics
from pathlib import Path

def analyze_test_timings(junit_xml_path):
    """Parse JUnit XML to get test timings"""
    # ... parse XML, extract test durations

def create_balanced_groups(test_timings, num_groups=10):
    """Create groups with similar total execution time"""
    # Sort tests by duration (longest first)
    sorted_tests = sorted(test_timings.items(), 
                         key=lambda x: x[1], 
                         reverse=True)
    
    # Initialize groups
    groups = [{'tests': [], 'total_time': 0} for _ in range(num_groups)]
    
    # Greedy assignment to least-loaded group
    for test, duration in sorted_tests:
        min_group = min(groups, key=lambda g: g['total_time'])
        min_group['tests'].append(test)
        min_group['total_time'] += duration
    
    return groups

def save_test_groups(groups):
    """Save groups to JSON for CI"""
    output = {
        f"group_{i}": {
            'tests': group['tests'],
            'estimated_time': group['total_time']
        }
        for i, group in enumerate(groups)
    }
    
    with open('.github/test-groups.json', 'w') as f:
        json.dump(output, f, indent=2)
    
    # Print stats
    times = [g['total_time'] for g in groups]
    print(f"Groups created: {len(groups)}")
    print(f"Avg time per group: {statistics.mean(times):.1f}s")
    print(f"Time range: {min(times):.1f}s - {max(times):.1f}s")
    print(f"Balance factor: {max(times) / min(times):.2f}x")

if __name__ == "__main__":
    timings = analyze_test_timings("previous-run-results.xml")
    groups = create_balanced_groups(timings, num_groups=10)
    save_test_groups(groups)

Why smart grouping matters:

Even distribution prevents bottlenecks
Groups finish at similar times
Minimizes wasted resources
Adapts to test suite changes

Step 6: Monitoring & Observability

# middleware/metrics.py
from prometheus_client import Counter, Histogram, Gauge
import time

# Metrics
test_counter = Counter(
    'ci_tests_total',
    'Total tests executed',
    ['status', 'category']
)

test_duration = Histogram(
    'ci_test_duration_seconds',
    'Test execution time',
    ['category'],
    buckets=[0.1, 0.5, 1, 2, 5, 10, 30, 60, 120]
)

pipeline_duration = Histogram(
    'ci_pipeline_duration_seconds',
    'Total pipeline execution time',
    buckets=[60, 300, 600, 1200, 1800, 2700, 3600]
)

active_test_pods = Gauge(
    'ci_active_test_pods',
    'Number of currently running test pods'
)

def record_test_execution(category, start_time, status):
    """Record test metrics"""
    duration = time.time() - start_time
    test_counter.labels(status=status, category=category).inc()
    test_duration.labels(category=category).observe(duration)

# Grafana Dashboard JSON
DASHBOARD_CONFIG = {
    "dashboard": {
        "title": "CI/CD Test Pipeline",
        "panels": [
            {
                "title": "Pipeline Duration Trend",
                "targets": [{
                    "expr": "histogram_quantile(0.95, ci_pipeline_duration_seconds)"
                }]
            },
            {
                "title": "Test Success Rate",
                "targets": [{
                    "expr": "rate(ci_tests_total{status='passed'}[5m]) / rate(ci_tests_total[5m])"
                }]
            },
            {
                "title": "Active Test Pods",
                "targets": [{
                    "expr": "ci_active_test_pods"
                }]
            }
        ]
    }
}

Results & Impact

Quantitative Metrics

Speed Improvements:

Pipeline time: 45 min → 8 min (82% reduction)
Unit tests: 20 min → 2 min (90% reduction)
Integration tests: 15 min → 4 min (73% reduction)
E2E tests: 10 min → 2 min (80% reduction)

Reliability Improvements:

Pipeline success rate: 85% → 99% (+14 percentage points)
Mean time to recovery: 2 hours → 10 min (92% faster)
False positive rate: 15% → 2% (87% reduction)
Infrastructure uptime: 95% → 99.9% (+4.9 points)

Scalability Improvements:

Builds per day: 50 → 200 (4x increase)
Concurrent builds: 3 → 30 (10x increase)
Tests per build: 300 → 500 (67% more coverage)
Auto-scaling response: 5 min → 30 sec (90% faster)

Cost Improvements:

Infrastructure costs: $5K/month → $2K/month (60% reduction)
Developer time saved: 60 hours/week (30 devs × 2 hours/week)
Deployment frequency: 2/day → 10/day (5x increase)
Revenue impact: $200K/quarter (faster feature delivery)

Before/After Comparison

Metric	Before	After	Improvement
Pipeline Time	45 min	8 min	82% faster
Success Rate	85%	99%	+14 points
Builds/Day	50	200	4x increase
Cost	$5K/mo	$2K/mo	60% savings
MTTR	2 hours	10 min	92% faster
Deployments/Day	2	10	5x increase

Qualitative Impact

For Developers:

Instant feedback on PRs (8min vs 45min)
More confidence in changes
Less context switching
Happier, more productive teams

For QA Team:

Comprehensive test coverage
Reliable results they can trust
Time for exploratory testing
Proactive bug detection

For DevOps:

No more manual scaling
Self-healing infrastructure
Better resource utilization
Fewer 3am pages

For Business:

5x faster feature delivery
Competitive advantage
Reduced operational costs
Higher team velocity

Stakeholder Feedback

"This transformed our development velocity. We went from 2 deploys per day to 10, and confidence in our releases went through the roof." — VP of Engineering

"I used to dread PR reviews because I'd wait 45 minutes for tests. Now it's 8 minutes and I stay in flow. Game changer." — Senior Software Engineer

"The auto-scaling saved us $3K/month while handling 4x more builds. ROI was positive within the first month." — Head of DevOps

Lessons Learned

What Worked Well

Kubernetes native - Auto-scaling and self-healing solved 90% of operational issues
Matrix parallelization - Simple, effective, easy to understand
Docker layer caching - 5x faster image builds
Smart test grouping - Balanced groups finished at same time
Comprehensive monitoring - Prometheus + Grafana gave visibility into everything

What I'd Do Differently

Start with monitoring - Added it late, wish we had metrics from day 1
Better test categorization - Took time to optimize groupings
Resource limits tuning - Over-provisioned initially, wasted money
Gradual rollout - Should have migrated test by test, not all at once
Documentation earlier - Team onboarding was rough initially

Key Takeaways

Parallelization is king - Biggest single win for performance
Containerization enables reliability - Isolation prevents interference
Auto-scaling saves money - Only pay for what you use
Fast feedback drives quality - Developers test more when it's fast
Invest in infrastructure - Good CI/CD pays for itself quickly

Technical Debt & Future Work

What's Left to Do

Add visual regression testing
Implement test impact analysis (only run affected tests)
Add performance regression detection
Create test data factory
Add chaos engineering tests

Known Limitations

Cold start time can be 30-60 seconds
Kubernetes learning curve is steep
Some tests still have occasional flakiness
Cross-service integration tests are complex

Tech Stack Summary

Infrastructure:

Kubernetes 1.27+
Docker 24.x
GitHub Actions
AWS EKS (managed Kubernetes)

Testing:

pytest 7.x
pytest-xdist (parallel execution)
pytest-cov (coverage)
Selenium WebDriver

Monitoring:

Prometheus (metrics)
Grafana (dashboards)
ELK Stack (logs)
Datadog (APM)

Languages:

Python 3.9+
Bash scripts
YAML (k8s configs)

Blog Posts

Want to Learn More?

This pipeline is fully documented with setup guides and examples.

GitHub Repository: CI-CD-Testing-Pipeline

Documentation: Architecture diagrams, setup guides, troubleshooting

Examples: Kubernetes manifests, GitHub Actions workflows

Let's Work Together

Impressed by this project? I'm available for:

Full-time DevOps/QA roles
Consulting engagements
Infrastructure audits
Team training & workshops

Get in Touch | View Resume | More Projects

Technologies Used:

KubernetesDockerGitHub Actionspytestpytest-xdistPrometheusGrafana

Impressed by this project?

I'm available for consulting and full-time QA automation roles. Let's build quality together.

Get in Touch View More Projects