Warrior AI — QA Strategy
Full QA coverage from unit tests through 10,000-user load simulation. Every item Anna needs to sign off — functional, load, security, and CI/CD gates — documented with pass criteria and current status.
Every endpoint works correctly, auth is solid, user data is isolated per Warrior.
System stays operational at 5,000–10,000 concurrent Warriors with sub-3s p95 response times.
Malicious users cannot break, extract, or corrupt Warrior data across any attack surface.
All QA work uses Staging only. No test ever runs against the Production Firebase project.
| Environment | Firebase Project | Purpose | Access |
|---|---|---|---|
| Staging | attack-with-stack-staging | All QA, load testing, integration testing | Steffen, Weston, QA |
| Beta | TBD | Pre-production validation with invited Warriors | Weston to provision |
| Production | TBD | Live system for 3,000+ Warriors | Post-launch only |

| File | Test | Expected |
|---|---|---|
| firebase-auth.test.ts | Valid Firebase ID token | 200 (userId in context) |
| firebase-auth.test.ts | Missing Authorization header | 401 Unauthorized |
| firebase-auth.test.ts | Malformed bearer token | 401 Unauthorized |
| firebase-auth.test.ts | Expired token | 401 Unauthorized |
| firebase-auth.test.ts | Token for User A used by User B | 401 (invalid signature) |
| rate-limit.test.ts | 20 requests in 60s | All 200 |
| rate-limit.test.ts | 21st request in window | 429 with Retry-After header |
| rate-limit.test.ts | Two users, each at 20 req | Both 200 (per-user limits) |
| rate-limit.test.ts | /health, any request count | Always 200 (not rate limited) |
| chat.test.ts | Valid auth + message → Dify mock | 200 SSE stream |
| chat.test.ts | Missing/empty message field | 400 Bad Request |
| chat.test.ts | Dify returns 5xx | 502 Upstream Error |

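
The rate-limit rows above (20 requests per 60s per user, 429 with Retry-After on the 21st, /health exempt) can be illustrated with a minimal in-memory sketch. This is not the Gateway's implementation: the class name `FixedWindowLimiter`, the fixed-window strategy, and the injected `nowMs` clock are assumptions made so the behaviour is easy to unit test.

```typescript
// Hypothetical sketch: the real Gateway may use Redis or a sliding window instead.
interface RateDecision {
  status: 200 | 429;
  retryAfterSeconds?: number; // value for the Retry-After header on a 429
}

interface WindowState {
  start: number; // timestamp (ms) of the first request in the window
  count: number; // requests seen in the current window
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, WindowState>();

  constructor(limit: number = 20, windowMs: number = 60_000) {
    this.limit = limit;       // 20 requests, as in the table
    this.windowMs = windowMs; // 60s window, as in the table
  }

  check(userId: string, path: string, nowMs: number): RateDecision {
    // /health is never rate limited.
    if (path === "/health") return { status: 200 };

    const current = this.windows.get(userId);
    // First request, or the previous window has expired: start a new window.
    if (current === undefined || nowMs - current.start >= this.windowMs) {
      this.windows.set(userId, { start: nowMs, count: 1 });
      return { status: 200 };
    }
    if (current.count < this.limit) {
      current.count += 1;
      return { status: 200 };
    }
    // Over the limit: tell the client how long until the window resets.
    const retryAfterSeconds = Math.ceil((current.start + this.windowMs - nowMs) / 1000);
    return { status: 429, retryAfterSeconds };
  }
}
```

Passing the clock in as `nowMs` keeps tests deterministic: a test can issue 20 calls at one timestamp, assert the 21st is a 429, then advance the timestamp by 60s and assert the user is allowed again.
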
| Test | Expected |
|---|---|
| Stack content at 5,000 chars | Passes validation |
| Stack content at 5,001 chars | Rejected — content too long |
| Stack type not in allowed enum | Rejected — invalid enum |
| Missing required fields | Rejected — missing field error |
| digital_trainer_stack flag on create | Always set to true |
| GET /stacks/:userId — known user | 200 with array |
| POST /stacks/:userId — valid payload | 201 — digital_trainer_stack: true |
| PATCH /stacks/:userId/:stackId — non-existent | 404 |
| GET /context/:userId — cold Redis | Response < 300ms |
| GET /context/:userId — warm Redis | Response < 10ms |
| PUT /core4/:userId/:date — 6 of 8 toggles true | score = 75.0 |
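
The validation rules and the Core4 score in the table above can be sketched as two small pure functions. This is an illustration under assumptions: the field names `type` and `content`, the contents of `ALLOWED_STACK_TYPES`, the error strings, and the reading of the score as "percentage of toggles that are true" are not confirmed by the source beyond the 6 of 8 = 75.0 example.

```typescript
// Hypothetical sketch of the Bridge validation rules and Core4 scoring.
const MAX_CONTENT_LENGTH = 5000; // from the table: 5,000 passes, 5,001 is rejected

// Placeholder enum: the source does not list the allowed stack types.
const ALLOWED_STACK_TYPES: readonly string[] = ["power"];

interface StackInput {
  type?: string;
  content?: string;
}

interface StoredStack {
  type: string;
  content: string;
  digital_trainer_stack: true; // always forced to true on create
}

type ValidationResult =
  | { ok: true; stack: StoredStack }
  | { ok: false; error: string };

function validateStack(input: StackInput): ValidationResult {
  if (input.type === undefined || input.content === undefined) {
    return { ok: false, error: "missing field" };
  }
  if (!ALLOWED_STACK_TYPES.includes(input.type)) {
    return { ok: false, error: "invalid enum" };
  }
  if (input.content.length > MAX_CONTENT_LENGTH) {
    return { ok: false, error: "content too long" };
  }
  // The flag is set by the server, never taken from the client payload.
  return {
    ok: true,
    stack: { type: input.type, content: input.content, digital_trainer_stack: true },
  };
}

// Assumed formula: share of true toggles, expressed as a percentage.
function core4Score(toggles: boolean[]): number {
  if (toggles.length === 0) throw new Error("no toggles supplied");
  const done = toggles.filter((t) => t).length;
  return (done / toggles.length) * 100;
}
```

Keeping both functions free of Firestore and Redis calls means the boundary cases (5,000 vs 5,001 characters, 6 of 8 toggles) can run as plain unit tests without a staging session.
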

These are the scenarios that represent a passing QA sign-off. Each requires a real staging session.

| # | Scenario | Pass Signal |
|---|---|---|
| E2E-01 | Power Stack conversation (happy path) | SSE streams within 3s; agent correctly identified; prior stacks referenced |
| E2E-02 | Coordinator routing accuracy | ≥ 8/10 messages routed to correct specialist (80% threshold) |
| E2E-03 | Firebase personalization | Agent's response references user's prior stack content |
| E2E-04 | Stack write and verification | New stack in staging Firestore with digital_trainer_stack: true |
| E2E-05 | Auth rejection | 401 within 200ms; no Dify call made |
| E2E-06 | User data isolation | User A and User B see only their own data — zero cross-contamination |
| E2E-07 | Rate limit enforcement | 21st request returns 429 with Retry-After |
| E2E-08 | Core4 score calculation | PUT Core4 with 6/8 true → score = 75.0 in Firestore |
| E2E-09 | Bridge write rate limit | 11th write/minute returns 429 from Bridge |
| E2E-10 | Full voice conversation | Audio plays within 5s; no transcription gaps |
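
The E2E-02 pass signal (at least 8 of 10 messages routed to the correct specialist) is simple arithmetic, but writing it down removes ambiguity about what counts as a pass. The sketch below assumes a hypothetical `RoutingSample` record captured per test message; the agent names used in any sample are placeholders, not the real specialist list.

```typescript
// Hypothetical record of one routed message from a staging session.
interface RoutingSample {
  message: string;
  expectedAgent: string; // specialist the tester expected
  actualAgent: string;   // specialist the Coordinator actually chose
}

// Fraction of samples where the Coordinator picked the expected specialist.
function routingAccuracy(samples: RoutingSample[]): number {
  if (samples.length === 0) throw new Error("no samples recorded");
  const correct = samples.filter((s) => s.expectedAgent === s.actualAgent).length;
  return correct / samples.length;
}

// E2E-02 gate: accuracy must meet or exceed the 80% threshold.
function passesRoutingGate(samples: RoutingSample[], threshold: number = 0.8): boolean {
  return routingAccuracy(samples) >= threshold;
}
```

Recording the samples alongside the result gives the evidence that sign-off requires: the ten messages, the expected agent, and the agent actually chosen.
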
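
Tool: k6. TypeScript-friendly, clean dashboards, CI-ready. Note that k6 core has no built-in SSE client: its HTTP module returns only after the response completes, so per-event timing on SSE streams needs an extension such as xk6-sse.
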
| Phase | Users | Duration | Goal | Trigger |
|---|---|---|---|---|
| Baseline | 0 → 100 | 1 hour | Find current ceiling | Now (before Garrett demo) |
| Confidence | 0 → 200 | 2 hours | Confirm 200-user stability | After baseline passes |
| Stress | 0 → 500 | 4 hours | Find the failure mode | Pre-beta launch |
| Scale | 0 → 2,000 | 8 hours | Validate Step 1 upgrade | After VPS upgrade |
| Production Sim | 0 → 5,000 | 24 hours | Validate Stage 2 architecture | Pre-production launch |
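
The phases above map naturally onto k6 `stages` (a list of `{ duration, target }` steps). The sketch below generates that list from the table. The split between ramp-up and hold is an assumption: the table gives only a total duration per phase, so `rampFraction` (25% here) is a placeholder to tune, and `LoadPhase` and `buildStages` are illustrative names.

```typescript
// Shape of one k6 stage: ramp linearly to `target` virtual users over `duration`.
interface Stage {
  duration: string;
  target: number;
}

interface LoadPhase {
  name: string;
  peakUsers: number;
  totalMinutes: number;
}

// Users and durations copied from the load phase table.
const PHASES: LoadPhase[] = [
  { name: "Baseline", peakUsers: 100, totalMinutes: 60 },
  { name: "Confidence", peakUsers: 200, totalMinutes: 120 },
  { name: "Stress", peakUsers: 500, totalMinutes: 240 },
  { name: "Scale", peakUsers: 2000, totalMinutes: 480 },
  { name: "Production Sim", peakUsers: 5000, totalMinutes: 1440 },
];

// Assumption: ramp for the first `rampFraction` of the phase, then hold at peak.
function buildStages(phase: LoadPhase, rampFraction: number = 0.25): Stage[] {
  const rampMinutes = Math.round(phase.totalMinutes * rampFraction);
  const holdMinutes = phase.totalMinutes - rampMinutes;
  return [
    { duration: `${rampMinutes}m`, target: phase.peakUsers }, // 0 -> peak
    { duration: `${holdMinutes}m`, target: phase.peakUsers }, // hold at peak
  ];
}
```

In a k6 script the result would be exported as `export const options = { stages: buildStages(phase) }`, with the phase chosen per run (for example from an environment variable) so each phase stays a separate, reviewable test.
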

| Metric | Green | Yellow | Red |
|---|---|---|---|
| p95 response time | < 3s | 3–8s | > 8s |
| Error rate | < 0.5% | 0.5–2% | > 2% |
| VPS RAM | < 9GB | 9–11GB | > 11GB |
| VPS CPU | < 70% | 70–85% | > 85% |
| Dify worker queue depth | < 10 | 10–50 | > 50 |
| Redis cache hit rate | > 80% | 60–80% | < 60% |
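
The Green/Yellow/Red bands above can be encoded once so that every load run is graded the same way. This is a sketch under assumptions: the metric keys and the `classify` helper are illustrative names, and boundary values (for example exactly 3s) are treated as Yellow, following the ranges in the table. The `K6_THRESHOLDS` object shows how the two Green limits that k6 measures itself could be expressed in k6's threshold syntax.

```typescript
type Status = "green" | "yellow" | "red";
type MetricName =
  | "p95Seconds" | "errorRatePct" | "vpsRamGb"
  | "vpsCpuPct" | "difyQueueDepth" | "redisHitRatePct";

interface Band {
  greenLimit: number;       // beyond this (on the good side) is Green
  redLimit: number;         // beyond this (on the bad side) is Red
  higherIsBetter?: boolean; // true only for cache hit rate
}

// Limits copied from the threshold table.
const BANDS: Record<MetricName, Band> = {
  p95Seconds: { greenLimit: 3, redLimit: 8 },
  errorRatePct: { greenLimit: 0.5, redLimit: 2 },
  vpsRamGb: { greenLimit: 9, redLimit: 11 },
  vpsCpuPct: { greenLimit: 70, redLimit: 85 },
  difyQueueDepth: { greenLimit: 10, redLimit: 50 },
  redisHitRatePct: { greenLimit: 80, redLimit: 60, higherIsBetter: true },
};

function classify(metric: MetricName, value: number): Status {
  const band = BANDS[metric];
  if (band.higherIsBetter === true) {
    if (value > band.greenLimit) return "green";
    if (value < band.redLimit) return "red";
    return "yellow";
  }
  if (value < band.greenLimit) return "green";
  if (value > band.redLimit) return "red";
  return "yellow";
}

// Green limits for the metrics k6 records itself, in k6 threshold syntax.
const K6_THRESHOLDS = {
  http_req_duration: ["p(95)<3000"], // p95 under 3s (k6 reports milliseconds)
  http_req_failed: ["rate<0.005"],   // error rate under 0.5%
};
```

VPS RAM, CPU, Dify queue depth, and Redis hit rate are not visible to k6, so those values would have to come from server-side monitoring and be passed to `classify` separately.
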

All items require evidence, not just "looks good". All F-01–F-09 must be GREEN before the Garrett demo, and all S-01–S-20 must be GREEN before production launch.
Stack write verified with digital_trainer_stack: true
Auth rejection returns 401 within 200ms
Bridge write rate limit (429 on 11th write/minute) confirmed

Full security analysis in Security Documentation. Items below are the minimum gate. S-01–S-20 are all required before production launch.
| Item | Owner | Priority | Blocks |
|---|---|---|---|
| Set ALLOWED_ORIGIN in staging .env | Steffen / Weston | HIGH | S-01 |
| Implement Bridge write rate limit (ADR-W021 Redis) | Jeremy | HIGH | S-07 |
| Add input length limit on Gateway /chat message | Jeremy | MEDIUM | S-06 |
| Build unit test suite for both repos | Jeremy | HIGH | F-01, F-02 |
| Set up GitHub Actions CI pipeline | Jeremy | MEDIUM | CI-01 |
| Provision Beta Firebase project | Weston | MEDIUM | Beta launch |
| Full security audit — prompt injection, Dify sandboxing | Separate session | HIGH | Production |