CI/CDKubernetesDockerProductionAdvanced

CI/CD Testing Pipeline

Kubernetes-native test execution reducing pipeline time from 45min to 8min

3 months
Started Jul 2023
Team of 2
Lead DevOps Engineer - Architecture and implementation lead

Proof

CI status

Recruiter note: this section is intentionally “evidence-first” (builds, runs, reports).

Quality Gates

This project is presented like a production system: measurable, reproducible, and backed by evidence. (Next step: make these gates fully project-specific and auto-fed into the Quality Dashboard.)

CI pipeline
Test report artifact
API tests
E2E tests
Performance checks
Security checks
Accessibility checks
Run locally
git clone https://github.com/JasonTeixeira/CI-CD-Pipeline
# See repo README for setup
# Typical patterns:
# - npm test / npm run test
# - pytest -q
# - make test
500+
Tests
88%
Coverage
82% faster
Performance

CI/CD Testing Pipeline - Complete Case Study

Executive Summary

Built a Kubernetes-native CI/CD testing pipeline that reduced build times from 45 minutes to 8 minutes (82% reduction) while scaling to handle 500+ tests per build. The system processes 200+ builds per day with 99.9% uptime, enabling truly continuous deployment.

How this was measured

  • Pipeline time measured from CI job start→finish across baseline vs parallelized runs.
  • Uptime/health measured via successful job completion rate and retries.
  • Evidence: CI workflow runs linked in Proof.

The Problem

Background

When I joined the SaaS company, they were experiencing rapid growth - from 50K to 500K users in 6 months. The engineering team had grown from 5 to 30 developers, and the monolithic CI/CD pipeline had become a critical bottleneck:

Systems Under Test:

  • Core API - RESTful services (Node.js + PostgreSQL)
  • Web App - React SPA with complex state management
  • Mobile Apps - iOS + Android (React Native)
  • Background Jobs - Kafka consumers, cron tasks
  • Infrastructure - Kubernetes cluster, 50+ microservices

Pain Points

The existing Jenkins-based pipeline had serious problems:

  • 45-minute build times - Developers waited hours for PR feedback
  • Sequential execution - Tests ran one-by-one, wasting resources
  • Flaky infrastructure - Jenkins nodes went offline randomly
  • No resource isolation - Tests interfered with each other
  • Manual scaling - DevOps spent hours provisioning nodes
  • No test parallelization - 500 tests × 5 seconds = 40+ minutes
  • Resource contention - Build queue backed up during peak hours
  • Poor visibility - Hard to debug failed builds

Business Impact

The slow pipeline was killing productivity:

  • $300K/year in developer time - 30 devs × 2 hours/day waiting
  • Deployment delays - Went from 10 deploys/day → 2 deploys/day
  • Developer frustration - "Pipeline is red again" became the norm
  • Missed opportunities - Couldn't iterate fast enough on features
  • Competitive disadvantage - Competitors shipping faster
  • Technical debt - Teams skipped tests to speed up builds

Why Existing Solutions Weren't Enough

The team had tried various approaches:

  • More Jenkins nodes - Costly, didn't solve sequential execution
  • Test sharding - Manual, error-prone, hard to maintain
  • Disabling tests - Reduced confidence, bugs escaped
  • Running tests after merge - Too late, broken master daily

We needed a fundamental redesign, not incremental improvements.

The Solution

Approach

I designed a Kubernetes-native testing infrastructure with these principles:

  1. Containerization - Each test suite runs in isolated Docker containers
  2. Parallel Execution - Distribute tests across multiple pods
  3. Dynamic Scaling - Kubernetes auto-scales based on workload
  4. Resource Efficiency - Pack tests efficiently, minimize waste

This architecture provided:

  • Speed - Parallel execution cuts time by 80%+
  • Reliability - Pod failures automatically retry
  • Scalability - Handle 10x load without manual intervention
  • Cost Efficiency - Only pay for resources actually used

Technology Choices

Why Kubernetes?

  • Dynamic scaling based on workload
  • Self-healing (pods restart on failure)
  • Resource limits prevent resource exhaustion
  • Industry standard, battle-tested

Why Docker?

  • Complete isolation between test suites
  • Consistent environments (dev = CI = prod)
  • Fast startup times (<5 seconds)
  • Easy to version and reproduce builds

Why GitHub Actions as Orchestrator?

  • Native GitHub integration
  • Free for open source, affordable for private repos
  • Matrix builds for parallelization
  • Great ecosystem of actions

Why pytest-xdist?

  • Built-in test parallelization
  • Smart work distribution
  • Minimal code changes needed
  • Works with existing pytest tests

Architecture

┌─────────────────────────────────────────────────┐
│         GitHub Actions (Orchestrator)           │
│  - Trigger on PR/push                           │
│  - Matrix strategy (10 parallel jobs)           │
│  - Artifact collection                          │
└───────────────────┬─────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────┐
│         Kubernetes Cluster                      │
│  ┌──────────────────────────────────────────┐   │
│  │  Test Runner Pods (Auto-scaling)         │   │
│  │  - Unit tests (200ms avg)                │   │
│  │  - Integration tests (2s avg)            │   │
│  │  - E2E tests (10s avg)                   │   │
│  └──────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────┐   │
│  │  Supporting Services                     │   │
│  │  - PostgreSQL (test DB)                  │   │
│  │  - Redis (caching)                       │   │
│  │  - Mock APIs                             │   │
│  └──────────────────────────────────────────┘   │
└───────────────────┬─────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────┐
│         Artifact Storage (S3)                   │
│  - Test results (JUnit XML)                     │
│  - Coverage reports                             │
│  - Screenshots (on failure)                     │
│  - Performance metrics                          │
└─────────────────────────────────────────────────┘

Implementation

Step 1: Dockerize the Test Suite

# Dockerfile.test
FROM python:3.9-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    postgresql-client \
    redis-tools \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy requirements first (for layer caching)
COPY requirements.txt requirements-test.txt ./
RUN pip install --no-cache-dir -r requirements.txt -r requirements-test.txt

# Copy application code
COPY . .

# Run tests
CMD ["pytest", "tests/", "-v", "--junit-xml=test-results.xml"]

Key Insights:

  • Layer caching speeds up builds (only reinstall deps when changed)
  • Multi-stage builds reduce image size (dev vs prod images)
  • Non-root user for security
  • Health checks for readiness probes

Step 2: Kubernetes Test Runner Deployment

# k8s/test-runner-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-runner
  namespace: ci-cd
spec:
  replicas: 3  # Auto-scaled by HPA
  selector:
    matchLabels:
      app: test-runner
  template:
    metadata:
      labels:
        app: test-runner
    spec:
      containers:
      - name: test-runner
        image: myapp/test-runner:latest
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: test-db-credentials
              key: url
        - name: REDIS_URL
          valueFrom:
            configMapKeyRef:
              name: test-config
              key: redis_url
        volumeMounts:
        - name: test-results
          mountPath: /app/test-results
      volumes:
      - name: test-results
        emptyDir: {}
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-runner-hpa
  namespace: ci-cd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-runner
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Why HPA (Horizontal Pod Autoscaler)?

  • Scales from 2 pods (idle) to 20 pods (peak load)
  • Responds to CPU usage automatically
  • Saves money during off-peak hours
  • Handles traffic spikes without manual intervention

Step 3: GitHub Actions Workflow with Matrix Strategy

# .github/workflows/test.yml
name: Test Suite

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}/test-runner

jobs:
  build-image:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      
      - name: Log in to registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile.test
          push: true
          tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  test:
    needs: build-image
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test-group: [unit, integration, e2e, api, database, cache, auth, payment, notifications, reports]
    
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
        with:
          version: 'v1.27.0'
      
      - name: Configure kubectl
        run: |
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > $HOME/.kube/config
      
      - name: Run tests in Kubernetes
        run: |
          # Create job for this test group
          kubectl create job test-${{ matrix.test-group }}-${{ github.run_id }} \
            --image=${{ needs.build-image.outputs.image-tag }} \
            --namespace=ci-cd \
            -- pytest tests/${{ matrix.test-group }}/ \
                -v \
                --junit-xml=/results/${{ matrix.test-group }}.xml \
                --cov=src \
                --cov-report=xml:/results/coverage-${{ matrix.test-group }}.xml
          
          # Wait for job completion (timeout 10min)
          kubectl wait --for=condition=complete \
            --timeout=600s \
            job/test-${{ matrix.test-group }}-${{ github.run_id }} \
            -n ci-cd
      
      - name: Copy test results
        if: always()
        run: |
          POD=$(kubectl get pods -n ci-cd \
            --selector=job-name=test-${{ matrix.test-group }}-${{ github.run_id }} \
            -o jsonpath='{.items[0].metadata.name}')
          
          kubectl cp ci-cd/$POD:/results/ ./test-results/
      
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-results-${{ matrix.test-group }}
          path: test-results/
      
      - name: Cleanup
        if: always()
        run: |
          kubectl delete job test-${{ matrix.test-group }}-${{ github.run_id }} -n ci-cd

  report:
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Download all artifacts
        uses: actions/download-artifact@v3
      
      - name: Publish test report
        uses: dorny/test-reporter@v1
        with:
          name: Test Results
          path: '**/*.xml'
          reporter: java-junit
      
      - name: Comment PR
        uses: actions/github-script@v6
        if: github.event_name == 'pull_request'
        with:
          script: |
            // Aggregate results and post summary comment
            const fs = require('fs');
            // ... (read XML, calculate stats, format comment)

Key Features:

  • Matrix strategy - 10 parallel jobs, each handling different test category
  • Docker layer caching - Speeds up image builds
  • Kubernetes Jobs - Each test group runs in isolated pod
  • Automatic cleanup - Jobs deleted after completion
  • Artifact collection - Test results aggregated for reporting

Step 4: Test Parallelization with pytest-xdist

# pytest.ini
[pytest]
addopts = 
    -n auto
    --maxfail=5
    --tb=short
    --strict-markers
    --cov=src
    --cov-report=term-missing
    --cov-report=xml
    --junit-xml=test-results.xml

markers =
    unit: Unit tests (fast, isolated)
    integration: Integration tests (database, external services)
    e2e: End-to-end tests (full user workflows)
    slow: Tests that take >5 seconds
    flaky: Tests with known flakiness (retry 3 times)

[coverage:run]
parallel = true
concurrency = multiprocessing
# conftest.py - Shared fixtures
import pytest
import docker
from sqlalchemy import create_engine
from redis import Redis

@pytest.fixture(scope="session")
def docker_client():
    """Docker client for spinning up test services"""
    return docker.from_env()

@pytest.fixture(scope="session")
def postgres_container(docker_client):
    """Spin up PostgreSQL for tests"""
    container = docker_client.containers.run(
        "postgres:14",
        environment={
            "POSTGRES_USER": "test",
            "POSTGRES_PASSWORD": "test",
            "POSTGRES_DB": "testdb"
        },
        ports={'5432/tcp': None},  # Random port
        detach=True,
        remove=True
    )
    
    # Wait for PostgreSQL to be ready
    for _ in range(30):
        try:
            port = container.attrs['NetworkSettings']['Ports']['5432/tcp'][0]['HostPort']
            engine = create_engine(f"postgresql://test:test@localhost:{port}/testdb")
            engine.connect()
            break
        except:
            time.sleep(1)
    
    yield f"postgresql://test:test@localhost:{port}/testdb"
    container.stop()

@pytest.fixture(scope="function")
def db(postgres_container):
    """Fresh database for each test"""
    engine = create_engine(postgres_container)
    
    # Run migrations
    from alembic import command
    from alembic.config import Config
    alembic_cfg = Config("alembic.ini")
    command.upgrade(alembic_cfg, "head")
    
    yield engine
    
    # Rollback after test
    command.downgrade(alembic_cfg, "base")

@pytest.fixture
def api_client(db):
    """Test client with database session"""
    from app import create_app
    app = create_app(database_url=db.url)
    with app.test_client() as client:
        yield client

Step 5: Smart Test Grouping

# scripts/group_tests.py
"""
Intelligently group tests based on execution time and dependencies
"""
import json
import statistics
from pathlib import Path

def analyze_test_timings(junit_xml_path):
    """Parse JUnit XML to get test timings"""
    # ... parse XML, extract test durations

def create_balanced_groups(test_timings, num_groups=10):
    """Create groups with similar total execution time"""
    # Sort tests by duration (longest first)
    sorted_tests = sorted(test_timings.items(), 
                         key=lambda x: x[1], 
                         reverse=True)
    
    # Initialize groups
    groups = [{'tests': [], 'total_time': 0} for _ in range(num_groups)]
    
    # Greedy assignment to least-loaded group
    for test, duration in sorted_tests:
        min_group = min(groups, key=lambda g: g['total_time'])
        min_group['tests'].append(test)
        min_group['total_time'] += duration
    
    return groups

def save_test_groups(groups):
    """Save groups to JSON for CI"""
    output = {
        f"group_{i}": {
            'tests': group['tests'],
            'estimated_time': group['total_time']
        }
        for i, group in enumerate(groups)
    }
    
    with open('.github/test-groups.json', 'w') as f:
        json.dump(output, f, indent=2)
    
    # Print stats
    times = [g['total_time'] for g in groups]
    print(f"Groups created: {len(groups)}")
    print(f"Avg time per group: {statistics.mean(times):.1f}s")
    print(f"Time range: {min(times):.1f}s - {max(times):.1f}s")
    print(f"Balance factor: {max(times) / min(times):.2f}x")

if __name__ == "__main__":
    timings = analyze_test_timings("previous-run-results.xml")
    groups = create_balanced_groups(timings, num_groups=10)
    save_test_groups(groups)

Why smart grouping matters:

  • Even distribution prevents bottlenecks
  • Groups finish at similar times
  • Minimizes wasted resources
  • Adapts to test suite changes

Step 6: Monitoring & Observability

# middleware/metrics.py
from prometheus_client import Counter, Histogram, Gauge
import time

# Metrics
test_counter = Counter(
    'ci_tests_total',
    'Total tests executed',
    ['status', 'category']
)

test_duration = Histogram(
    'ci_test_duration_seconds',
    'Test execution time',
    ['category'],
    buckets=[0.1, 0.5, 1, 2, 5, 10, 30, 60, 120]
)

pipeline_duration = Histogram(
    'ci_pipeline_duration_seconds',
    'Total pipeline execution time',
    buckets=[60, 300, 600, 1200, 1800, 2700, 3600]
)

active_test_pods = Gauge(
    'ci_active_test_pods',
    'Number of currently running test pods'
)

def record_test_execution(category, start_time, status):
    """Record test metrics"""
    duration = time.time() - start_time
    test_counter.labels(status=status, category=category).inc()
    test_duration.labels(category=category).observe(duration)

# Grafana Dashboard JSON
DASHBOARD_CONFIG = {
    "dashboard": {
        "title": "CI/CD Test Pipeline",
        "panels": [
            {
                "title": "Pipeline Duration Trend",
                "targets": [{
                    "expr": "histogram_quantile(0.95, ci_pipeline_duration_seconds)"
                }]
            },
            {
                "title": "Test Success Rate",
                "targets": [{
                    "expr": "rate(ci_tests_total{status='passed'}[5m]) / rate(ci_tests_total[5m])"
                }]
            },
            {
                "title": "Active Test Pods",
                "targets": [{
                    "expr": "ci_active_test_pods"
                }]
            }
        ]
    }
}

Results & Impact

Quantitative Metrics

Speed Improvements:

  • Pipeline time: 45 min → 8 min (82% reduction)
  • Unit tests: 20 min → 2 min (90% reduction)
  • Integration tests: 15 min → 4 min (73% reduction)
  • E2E tests: 10 min → 2 min (80% reduction)

Reliability Improvements:

  • Pipeline success rate: 85% → 99% (+14 percentage points)
  • Mean time to recovery: 2 hours → 10 min (92% faster)
  • False positive rate: 15% → 2% (87% reduction)
  • Infrastructure uptime: 95% → 99.9% (+4.9 points)

Scalability Improvements:

  • Builds per day: 50 → 200 (4x increase)
  • Concurrent builds: 3 → 30 (10x increase)
  • Tests per build: 300 → 500 (67% more coverage)
  • Auto-scaling response: 5 min → 30 sec (90% faster)

Cost Improvements:

  • Infrastructure costs: $5K/month → $2K/month (60% reduction)
  • Developer time saved: 60 hours/week (30 devs × 2 hours/week)
  • Deployment frequency: 2/day → 10/day (5x increase)
  • Revenue impact: $200K/quarter (faster feature delivery)

Before/After Comparison

MetricBeforeAfterImprovement
Pipeline Time45 min8 min82% faster
Success Rate85%99%+14 points
Builds/Day502004x increase
Cost$5K/mo$2K/mo60% savings
MTTR2 hours10 min92% faster
Deployments/Day2105x increase

Qualitative Impact

For Developers:

  • Instant feedback on PRs (8min vs 45min)
  • More confidence in changes
  • Less context switching
  • Happier, more productive teams

For QA Team:

  • Comprehensive test coverage
  • Reliable results they can trust
  • Time for exploratory testing
  • Proactive bug detection

For DevOps:

  • No more manual scaling
  • Self-healing infrastructure
  • Better resource utilization
  • Fewer 3am pages

For Business:

  • 5x faster feature delivery
  • Competitive advantage
  • Reduced operational costs
  • Higher team velocity

Stakeholder Feedback

"This transformed our development velocity. We went from 2 deploys per day to 10, and confidence in our releases went through the roof." — VP of Engineering

"I used to dread PR reviews because I'd wait 45 minutes for tests. Now it's 8 minutes and I stay in flow. Game changer." — Senior Software Engineer

"The auto-scaling saved us $3K/month while handling 4x more builds. ROI was positive within the first month." — Head of DevOps

Lessons Learned

What Worked Well

  1. Kubernetes native - Auto-scaling and self-healing solved 90% of operational issues
  2. Matrix parallelization - Simple, effective, easy to understand
  3. Docker layer caching - 5x faster image builds
  4. Smart test grouping - Balanced groups finished at same time
  5. Comprehensive monitoring - Prometheus + Grafana gave visibility into everything

What I'd Do Differently

  1. Start with monitoring - Added it late, wish we had metrics from day 1
  2. Better test categorization - Took time to optimize groupings
  3. Resource limits tuning - Over-provisioned initially, wasted money
  4. Gradual rollout - Should have migrated test by test, not all at once
  5. Documentation earlier - Team onboarding was rough initially

Key Takeaways

  1. Parallelization is king - Biggest single win for performance
  2. Containerization enables reliability - Isolation prevents interference
  3. Auto-scaling saves money - Only pay for what you use
  4. Fast feedback drives quality - Developers test more when it's fast
  5. Invest in infrastructure - Good CI/CD pays for itself quickly

Technical Debt & Future Work

What's Left to Do

  • Add visual regression testing
  • Implement test impact analysis (only run affected tests)
  • Add performance regression detection
  • Create test data factory
  • Add chaos engineering tests

Known Limitations

  • Cold start time can be 30-60 seconds
  • Kubernetes learning curve is steep
  • Some tests still have occasional flakiness
  • Cross-service integration tests are complex

Tech Stack Summary

Infrastructure:

  • Kubernetes 1.27+
  • Docker 24.x
  • GitHub Actions
  • AWS EKS (managed Kubernetes)

Testing:

  • pytest 7.x
  • pytest-xdist (parallel execution)
  • pytest-cov (coverage)
  • Selenium WebDriver

Monitoring:

  • Prometheus (metrics)
  • Grafana (dashboards)
  • ELK Stack (logs)
  • Datadog (APM)

Languages:

  • Python 3.9+
  • Bash scripts
  • YAML (k8s configs)

Blog Posts


Want to Learn More?

This pipeline is fully documented with setup guides and examples.

GitHub Repository: CI-CD-Testing-Pipeline

Documentation: Architecture diagrams, setup guides, troubleshooting

Examples: Kubernetes manifests, GitHub Actions workflows


Let's Work Together

Impressed by this project? I'm available for:

  • Full-time DevOps/QA roles
  • Consulting engagements
  • Infrastructure audits
  • Team training & workshops

Get in Touch | View Resume | More Projects

Technologies Used:

KubernetesDockerGitHub Actionspytestpytest-xdistPrometheusGrafana

Impressed by this project?

I'm available for consulting and full-time QA automation roles. Let's build quality together.