System Design
Quality Telemetry Pipeline
This portfolio isn't just a UI; it's a small quality platform. It collects evidence from CI, turns it into structured telemetry, and displays it as a live-style dashboard.
GitHub Actions · Artifact ingestion (ZIP) · Schema-driven metrics · AWS S3 cloud mode · Graceful fallback · Vercel-friendly
Architecture (high level)
The key idea: treat CI as an event source and artifacts as a transport for evidence.
QA repos (pytest/playwright/allure)
└─ GitHub Actions workflow
├─ runs tests
├─ writes qa-metrics.json (schema)
├─ uploads artifact: qa-metrics (zip)
└─ uploads evidence artifacts (optional)
Optional Cloud Mode (AWS)
└─ GitHub OIDC → assume IAM role (no long-lived keys)
└─ write latest.json to S3 (cost-controlled retention)
qa-portfolio (Next.js on Vercel)
├─ /api/quality (server)
│ ├─ fetches recent workflow runs
│ ├─ finds newest run containing qa-metrics artifact
│ ├─ downloads artifact zip
│ ├─ extracts qa-metrics.json
│ └─ returns merged snapshot + telemetry (snapshot/live/cloud)
└─ /dashboard (client)
      └─ renders KPIs + links + debug/observability

Failure Modes (and how the system responds)
This dashboard is intentionally built with production-style degradation. When upstream systems fail, the UI stays usable and the API returns a coherent payload.
GitHub API rate limits / outages
- Impact: Live mode cannot fetch run metadata/artifacts.
- Detection: debug panel + response notes; CI remains the source of truth.
- Response: fall back to Snapshot mode (committed metrics.json).
Missing artifact / empty repo signal
- Impact: Live scan may not find qa-metrics on the newest run.
- Detection: debug fields show scan depth + matched run ID.
- Response: scan back through recent runs, or degrade to Snapshot if needed.
AWS proxy down / token mismatch
- Impact: Cloud mode cannot read metrics from AWS.
- Detection: CloudWatch alarms (errors/p95) + access logs.
- Response: fall back to Snapshot mode; Cloud mode still shows proof links.
S3 object missing / retention expired
- Impact: AWS mode returns 404/NoSuchKey.
- Detection: access logs + Lambda error alarm.
- Response: fail closed (no secrets) + degrade to Snapshot mode.
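The failure responses above all converge on the same resolution order: try live GitHub data, then the AWS cloud copy, and always land on the committed snapshot. A minimal sketch of that ladder, with `fetchLive`, `fetchCloud`, and `readSnapshot` as hypothetical stand-ins for the real fetchers:

```typescript
// Resolution order for the quality payload: live → cloud → snapshot.
// The three callbacks are illustrative stand-ins, not the project's real API.

type Snapshot = { source: "live" | "cloud" | "snapshot"; metrics: Record<string, unknown> };

async function resolveSnapshot(
  fetchLive: () => Promise<Snapshot>,
  fetchCloud: () => Promise<Snapshot>,
  readSnapshot: () => Snapshot,
): Promise<Snapshot> {
  try {
    return await fetchLive(); // best effort: GitHub API may rate-limit or be down
  } catch {
    try {
      return await fetchCloud(); // cloud mode: S3 object may be missing or expired
    } catch {
      return readSnapshot(); // deterministic baseline: committed metrics.json
    }
  }
}
```

Because the last rung is a synchronous read of a committed file, the API can always return a coherent payload, whatever fails upstream.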
Reliability / Fallback
Live data is best-effort. If GitHub rate-limits or a repo has no artifact on the latest run, the API scans recent runs and still returns a coherent response. If live fetch fails entirely, the dashboard degrades to the committed snapshot.
Pattern: progressive enrichment + deterministic baseline
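The backward scan described above can be sketched as a bounded search over recent runs (newest first). The `Run` shape below is a simplified stand-in for the GitHub REST API payload:

```typescript
// Scan recent workflow runs for one that produced a "qa-metrics" artifact,
// up to a bounded depth. Returns null so the caller can degrade to Snapshot mode.

interface Run {
  id: number;
  artifacts: string[]; // artifact names attached to this run
}

function findMetricsRun(
  runs: Run[], // assumed sorted newest-first
  maxDepth = 10,
): { runId: number; depth: number } | null {
  for (let i = 0; i < Math.min(runs.length, maxDepth); i++) {
    if (runs[i].artifacts.includes("qa-metrics")) {
      return { runId: runs[i].id, depth: i + 1 }; // depth is surfaced in the debug panel
    }
  }
  return null;
}
```

Bounding the depth keeps the serverless execution short even when a repo has many runs without the artifact.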
Data Contract
Metrics are schema-driven (see QUALITY_METRICS_SCHEMA.md). Workflows generate qa-metrics.json so the portfolio stays decoupled from individual frameworks.
Pattern: contract-first telemetry
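Enforcing the contract can be as small as a shape check before the payload reaches the dashboard. The field names below (`schemaVersion`, `totals`) are illustrative; the real contract lives in QUALITY_METRICS_SCHEMA.md:

```typescript
// Minimal contract check for qa-metrics.json. Rejecting malformed payloads
// here keeps framework-specific quirks out of the dashboard.
// Field names are illustrative, not the project's actual schema.

interface QaMetrics {
  schemaVersion: number;
  totals: { passed: number; failed: number };
}

function parseQaMetrics(raw: string): QaMetrics {
  const data = JSON.parse(raw);
  if (
    typeof data?.schemaVersion !== "number" ||
    typeof data?.totals?.passed !== "number" ||
    typeof data?.totals?.failed !== "number"
  ) {
    throw new Error("qa-metrics.json does not match the expected schema");
  }
  return data as QaMetrics;
}
```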
Security / Evidence
The API exposes only safe metadata (run IDs/URLs, scan depth). Secrets never leave the server. Evidence artifacts (reports, JUnit XML) are linked, not embedded.
Pattern: least privilege + safe observability
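The "safe metadata only" boundary is easiest to enforce as an allowlist projection at the API edge. A sketch, with illustrative upstream field names:

```typescript
// Project only public, safe fields into the API response. An allowlist
// (pick what to keep) is safer than a blocklist (guess what to strip).
// Upstream field names here are stand-ins for the real GitHub payload.

interface UpstreamRun {
  id: number;
  html_url: string;
  token?: string;    // must never cross this boundary
  rawLogs?: string;  // must never cross this boundary
}

function toSafeMetadata(run: UpstreamRun, scanDepth: number) {
  return { runId: run.id, runUrl: run.html_url, scanDepth };
}
```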
Performance
The API caches responses briefly to reduce GitHub API calls. Live mode is designed to be Vercel-friendly (serverless execution, short compute, explicit no-store).
Pattern: cache + rate-limit aware design
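A module-level variable is enough for the brief cache described above, since it survives warm serverless invocations. The 60-second TTL below is an illustrative choice, not the project's actual value:

```typescript
// Short-lived in-memory cache to keep GitHub API calls within rate limits.
// Cold starts simply repopulate it; the TTL value is illustrative.

let cached: { at: number; payload: unknown } | null = null;
const TTL_MS = 60_000;

async function getQuality(fetcher: () => Promise<unknown>): Promise<unknown> {
  const now = Date.now();
  if (cached && now - cached.at < TTL_MS) {
    return cached.payload; // fresh enough: skip the upstream call
  }
  const payload = await fetcher();
  cached = { at: now, payload };
  return payload;
}
```

The cache is server-side only; the route itself can still send `Cache-Control: no-store` so clients always see the latest server decision.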
Threat model (abuse cases + mitigations)
This is intentionally a read-only telemetry surface. The system is designed so that even if someone abuses the public endpoints, the blast radius stays small.
- GitHub API rate-limit abuse: requests are cached briefly, and the API falls back to the committed snapshot.
- Token exposure: GitHub token stays server-only; the client never receives it. Responses include only safe metadata (run URLs/IDs) and sanitized debug fields.
- Artifact / ZIP attacks: artifacts are treated as untrusted input. Extraction is scoped to expected filenames and the metrics payload must validate against the schema.
- Data leakage: no secrets, logs, or raw environment are embedded in the dashboard. Evidence is linked, not embedded.
- Denial-of-service: live mode is best-effort; failures degrade gracefully to static mode rather than cascading.
Patterns: least privilege + untrusted input handling + graceful degradation
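The "artifacts as untrusted input" mitigation hinges on gatekeeping ZIP entry names before extraction. The unzip step itself would use a library; this sketch shows only the allowlist and path-traversal check:

```typescript
// Treat artifact ZIP entries as untrusted: accept only expected filenames
// and reject path traversal, absolute paths, and backslash tricks.
// The allowlist contents are illustrative.

const EXPECTED_ENTRIES = new Set(["qa-metrics.json"]);

function isSafeEntry(name: string): boolean {
  if (name.includes("..") || name.startsWith("/") || name.includes("\\")) {
    return false; // path traversal / absolute path attempts
  }
  return EXPECTED_ENTRIES.has(name); // anything unexpected is ignored
}
```

Combined with the schema validation on the extracted payload, a malicious artifact can at worst produce a rejected response, never a written file outside the expected set.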
What this demonstrates
- Cloud/platform automation: CI as an event source, artifacts as evidence, optional AWS S3 cloud ingestion.
- Backend skills: API composition, schema contracts, caching, resilience to partial failure.
- Security posture: least privilege patterns (server-only tokens, OIDC-ready cloud auth).
- Product thinking: a dashboard that explains itself and links directly to proof (runs/reports).
Cloud deployment path (low cost)
Cloud mode is designed to be budget-friendly: S3-only storage (optional DynamoDB) with lifecycle retention. GitHub Actions can authenticate using OIDC (no long-lived AWS keys).