# Marketing Readiness Analysis — veri-pr-agent (efix)

**Date:** 2026-02-23
**Reviewers:** Backend Engineer, Product Expert, Technical Architect, Devil's Advocate (agent panel)

---

## Executive Summary

The core idea — forcing an AI coding agent to follow a reproducible, evidence-backed protocol before opening a PR — is worth building. The policy framework and evidence recorder show real engineering thought. The architectural skeleton is extensible. **However, the product is not ready for public marketing.**

### Composite Scores

| Dimension | Score | Reviewer |
|---|---|---|
| Production readiness | 2/10 | Backend Engineer |
| Marketing readiness | 3.5/10 | Product Expert |
| Architectural soundness | 6.5/10 | Technical Architect |
| Premise validation | Pre-launch only | Devil's Advocate |

---

## What's Working Well

- **Evidence capture** is the most complete and differentiated part: structured JSON + Markdown artifacts, secret redaction, phase tracking, before/after command output capture.
- **Policy framework** (test, diff, command allowlist) is mature for an MVP — configurable, with sensible defaults and documented exceptions.
- **Zero-dependency architecture** is a deliberate, respectable trade-off — no supply chain risk, no build step, Node 22 native APIs only.
- **Engine interface** (`src/types.ts:136`) is cleanly designed for extensibility.
- **TypeScript + test coverage** is solid: 5 test files, dependency injection, mock engine pattern (`openaiEngine.ts:50-54`).

---

## Blocking Issues

### B1 — Primary User Story Is Unimplemented
**File:** `.github/workflows/efix.yml`
The PRD's core UX (the `/efix` PR comment trigger) is completely absent; only `workflow_dispatch` exists. Starting a run takes 5 manual UI clicks instead of one typed comment. Until a PR comment trigger is wired up, the product doesn't match its own pitch.
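
As a rough illustration of the comment-trigger side, the sketch below decides whether a PR comment should start a run. The helper name `parseEfixCommand` and the rule that the command must open the first non-empty line are assumptions made for this example, not behavior taken from the PRD.

```typescript
// Hypothetical helper: decides whether a PR comment should trigger a run.
// Assumption: the command must be the first token on the comment's first
// non-empty line; any trailing text is passed along as free-form arguments.
interface EfixCommand {
  args: string;
}

function parseEfixCommand(commentBody: string): EfixCommand | null {
  const firstLine = commentBody
    .split(/\r?\n/)
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  if (firstLine === undefined) return null;

  // Match "/efix" exactly, followed by end of line or whitespace, so that
  // "/efixture" or a mid-sentence mention does not trigger a run.
  const match = /^\/efix(?:\s+(.*))?$/.exec(firstLine);
  if (match === null) return null;
  return { args: (match[1] ?? "").trim() };
}
```

Requiring the command at the start of the comment keeps quoted replies and casual mentions of `/efix` from starting runs by accident.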

### B2 — GitHub Action Breaks on Real Repos
**File:** `.github/workflows/efix.yml:38-46`
No `npm install` / `pnpm install` step before the repro command. Any Node.js repo with dependencies will fail for the wrong reason — the bot diagnoses a phantom bug. This is a one-line fix.
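
One way to choose the install step is to key it off whichever lockfile is present. The sketch below is illustrative only: `pickInstallCommand` and the injected `fileExists` callback are hypothetical names, and the precedence order among lockfiles is an assumption.

```typescript
// Hypothetical helper: pick an install command from the lockfiles found in
// the repository root. `fileExists` is injected so the logic stays testable.
function pickInstallCommand(fileExists: (fileName: string) => boolean): string[] | null {
  // Not a Node.js project: nothing to install.
  if (!fileExists("package.json")) return null;

  if (fileExists("pnpm-lock.yaml")) return ["pnpm", "install", "--frozen-lockfile"];
  // Yarn 1.x flag; Yarn 2+ uses `--immutable` instead.
  if (fileExists("yarn.lock")) return ["yarn", "install", "--frozen-lockfile"];
  if (fileExists("package-lock.json")) return ["npm", "ci"];

  // No lockfile: fall back to a plain, non-reproducible install.
  return ["npm", "install"];
}
```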

### B3 — `git add -A` Is a Security Risk
**File:** `.github/workflows/efix.yml:68`
Stages everything in the working tree, including log files, secret files written during the repro phase, and AI-generated files outside `src/`. Staging must be scoped to an explicit path allowlist.
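
A path allowlist can be applied before anything is handed to `git add`. The sketch below is a minimal example under assumptions: the directory list in `STAGE_ALLOWLIST` and the function name `partitionStagePaths` are hypothetical, and a real list would come from configuration.

```typescript
// Hypothetical allowlist; the real set of directories would come from config.
const STAGE_ALLOWLIST = ["src/", "test/", "artifacts/"];

// Split changed paths into those safe to stage and those to leave alone.
function partitionStagePaths(
  changedPaths: string[],
  allowlist: string[] = STAGE_ALLOWLIST,
): { staged: string[]; skipped: string[] } {
  const staged: string[] = [];
  const skipped: string[] = [];
  for (const changedPath of changedPaths) {
    const normalized = changedPath.replace(/\\/g, "/");
    // Reject absolute paths and parent-directory traversal such as "src/../.env".
    const isRelativeAndContained =
      !normalized.startsWith("/") &&
      !normalized.split("/").some((segment) => segment === "..");
    const isAllowed =
      isRelativeAndContained && allowlist.some((prefix) => normalized.startsWith(prefix));
    (isAllowed ? staged : skipped).push(changedPath);
  }
  return { staged, skipped };
}
```

Only the `staged` paths would then be passed as explicit arguments to `git add`; logging `skipped` makes it visible when a fix touched files outside the allowlist.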

### B4 — No Validated Success Rate
Zero production runs on real repos. The claim "turns failing tests into PRs" is unsubstantiated. The LLM generates unified diffs from truncated context, and its real success rate has not been measured. Marketing without this data is a reputational risk.

### B5 — OpenAI-Only Despite Multi-Engine Pitch
**File:** `src/types.ts:31`
`engine: "openai"` is hardcoded. The PRD and architecture describe Claude Code and Codex support. Shipping one engine while marketing pluggability is a trust risk.
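
A small registry is one way to make the engine choice a lookup rather than a literal. In the sketch below, the `Engine` shape is a stand-in reduced to a `name` field; the project's real interface in `src/types.ts` is not reproduced here, and `EngineRegistry` is written from scratch as an assumption about how such a class might look.

```typescript
// Hypothetical minimal shape; the project's real Engine interface is richer.
interface Engine {
  readonly name: string;
}

class EngineRegistry {
  private readonly engines = new Map<string, Engine>();

  // Register an engine under its own name; duplicates are treated as a bug.
  register(engine: Engine): void {
    if (this.engines.has(engine.name)) {
      throw new Error(`Engine already registered: ${engine.name}`);
    }
    this.engines.set(engine.name, engine);
  }

  // Look up an engine by the name given in config; fail loudly if unknown.
  get(engineName: string): Engine {
    const engine = this.engines.get(engineName);
    if (engine === undefined) {
      const known = Array.from(this.engines.keys()).join(", ") || "(none)";
      throw new Error(`Unknown engine "${engineName}". Registered: ${known}`);
    }
    return engine;
  }
}
```

With this shape, the config field could be typed as a plain `string` and validated at startup against whatever engines are actually registered.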

---

## High-Priority Gaps

### H1 — `verify()` Is a Stub
**File:** `src/engine/openaiEngine.ts:131-139`
Counts pass/fail exit codes only. Does not validate that tests are meaningful or that the fix actually addresses the original failure.
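
A stricter check would compare test names across the repro and verify runs instead of exit codes alone. The sketch below assumes the test runner's output can be parsed into lists of passed and failed test names; the `TestRun` shape and `verifyFix` function are hypothetical and not part of the current engine.

```typescript
// Hypothetical result shape: test names parsed from the runner's output.
interface TestRun {
  passed: string[];
  failed: string[];
}

type VerifyVerdict = { ok: true } | { ok: false; reason: string };

function verifyFix(before: TestRun, after: TestRun): VerifyVerdict {
  // Without a recorded failure there is nothing to prove fixed.
  if (before.failed.length === 0) {
    return { ok: false, reason: "repro run recorded no failing tests" };
  }

  // Every originally failing test must now be reported as passing. A test
  // that was deleted or skipped appears in neither list and is rejected here.
  const passedAfter = new Set(after.passed);
  const notProven = before.failed.filter((testName) => !passedAfter.has(testName));
  if (notProven.length > 0) {
    return { ok: false, reason: `not passing after fix: ${notProven.join(", ")}` };
  }

  // The fix must not introduce failures that were absent in the repro run.
  const failedBefore = new Set(before.failed);
  const regressions = after.failed.filter((testName) => !failedBefore.has(testName));
  if (regressions.length > 0) {
    return { ok: false, reason: `new failures: ${regressions.join(", ")}` };
  }
  return { ok: true };
}
```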

### H2 — No Dry-Run Before `git apply`
**File:** `src/git/git.ts:25-28`
`git apply` runs without a preceding `git apply --check` dry run. If the patch partially applies, the working tree is left dirty with no rollback. Run `git apply --check` first, and roll back on failure.
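
The check-then-apply sequence can be sketched as follows. The `GitRunner` type is a hypothetical injected function that runs `git` with the given arguments (for instance a thin wrapper over a synchronous process spawn); it is an assumption for this example and not the signature used in `src/git/git.ts`.

```typescript
// Hypothetical injected runner: executes `git <args>` and reports the result.
type GitRunner = (args: string[]) => { exitCode: number; stderr: string };

type ApplyResult =
  | { applied: true }
  | { applied: false; stage: "check" | "apply"; stderr: string };

function applyPatchSafely(git: GitRunner, patchPath: string): ApplyResult {
  // Dry run: `git apply --check` reports whether the patch would apply,
  // without modifying the working tree.
  const check = git(["apply", "--check", patchPath]);
  if (check.exitCode !== 0) {
    return { applied: false, stage: "check", stderr: check.stderr };
  }

  const apply = git(["apply", patchPath]);
  if (apply.exitCode !== 0) {
    // Restore tracked files from the index. Files newly created by the
    // patch are untracked and would need separate cleanup.
    git(["checkout", "--", "."]);
    return { applied: false, stage: "apply", stderr: apply.stderr };
  }
  return { applied: true };
}
```

Returning the failing stage lets the caller distinguish a patch that never fit from one that failed mid-apply, which is useful evidence to record either way.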

### H3 — Shell Injection Risk
**File:** `src/evidence/shellRunner.ts:36`
`spawn()` runs with `shell: true`. The command allowlist uses prefix string matching, so `npm test; rm -rf /` would pass if `npm test` is in the allowlist. Needs stricter argument handling.
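
To make the weakness concrete, the sketch below contrasts a prefix check with a token-based one. Both functions are written for this example and are assumptions about the shape of the check, not copies of the code in `shellRunner.ts`; the stricter variant deliberately rejects quoting and other shell syntax rather than trying to parse it.

```typescript
// Naive check in the spirit of what the review describes: prefix matching.
function isAllowedByPrefix(command: string, allowlist: string[]): boolean {
  return allowlist.some((allowed) => command.startsWith(allowed));
}

// Characters with special meaning to a shell. Quotes are rejected too, so
// arguments containing spaces are unsupported in this sketch.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\\n\r"'*?~!#]/;

// Stricter sketch: reject shell syntax outright, split on whitespace, and
// require an allowlist entry to match the leading argv tokens exactly.
function toAllowedArgv(command: string, allowlist: string[]): string[] | null {
  if (SHELL_METACHARACTERS.test(command)) return null;
  const argv = command.trim().split(/\s+/).filter((token) => token.length > 0);
  if (argv.length === 0) return null;

  const matches = allowlist.some((allowed) => {
    const allowedTokens = allowed.trim().split(/\s+/);
    return allowedTokens.every((token, index) => argv[index] === token);
  });
  return matches ? argv : null;
}
```

The returned array is meant to be passed as `spawn(argv[0], argv.slice(1))` without `shell: true`, so no shell ever interprets the string.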

### H4 — Noisy Failure PRs
**File:** `.github/workflows/efix.yml:74-93`
Failed runs open a PR with a generic error body. Teams get noise PRs they must manually close. Failures should post a GitHub Check + comment, not open a PR.

### H5 — No Cost/Token Tracking
Every run makes 2+ OpenAI API calls with no token count logging, no per-run cost estimate, no rate-limit backoff. At scale this is blind spending.
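
Per-run accounting can be as simple as summing the token counts each API response reports. The sketch below is a minimal example: `RunCostTracker` and its field names are hypothetical, and the prices are parameters that must be supplied from the provider's current price list rather than values known to this document.

```typescript
// Token counts for one API call, as reported in the provider's response.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

// Caller-supplied prices in USD per one million tokens (assumed unit).
interface Pricing {
  promptUsdPerMillion: number;
  completionUsdPerMillion: number;
}

class RunCostTracker {
  private readonly calls: TokenUsage[] = [];

  // Record the usage of a single API call.
  record(usage: TokenUsage): void {
    this.calls.push(usage);
  }

  // Totals for the run plus a cost estimate under the given pricing.
  summary(pricing: Pricing) {
    const promptTokens = this.calls.reduce((sum, call) => sum + call.promptTokens, 0);
    const completionTokens = this.calls.reduce((sum, call) => sum + call.completionTokens, 0);
    const estimatedUsd =
      (promptTokens * pricing.promptUsdPerMillion +
        completionTokens * pricing.completionUsdPerMillion) / 1e6;
    return { callCount: this.calls.length, promptTokens, completionTokens, estimatedUsd };
  }
}
```

Writing the summary into the run's evidence artifact would give each PR a visible cost line alongside the test output.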

### H6 — No Getting Started Guide, Demo, or Screenshots
The README is a stub. No one can evaluate this tool without running it. Need: step-by-step setup guide, one end-to-end screenshot walkthrough, example PR body showing the Evidence section.

---

## Secondary Issues

| Issue | File | Severity |
|---|---|---|
| Test policy only checks whether a test file was modified, not whether the tests are meaningful | `src/policies/testPolicy.ts:65-77` | Medium |
| No overall orchestrator timeout or cleanup on failure | `src/orchestrator.ts:59-161` | Medium |
| No retry loop for repro/verify steps | `src/orchestrator.ts:70-122` | Medium |
| `simpleYaml.ts` is a hand-rolled partial parser | `src/config/simpleYaml.ts` | Low |
| "Localize" and "Prevent" protocol phases not implemented | Protocol gap | Medium |
| OpenAI retry logic does not implement exponential backoff | `src/engine/openaiEngine.ts:282-307` | Medium |
| No structured logging or observability hooks | Global | Medium |
| `getChangedFiles()` lists only the first N files in large repos | `src/engine/openaiEngine.ts:57` | Low |
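
For the backoff row in the table above, the delay calculation itself is small. The sketch below uses exponential growth with "full jitter" (a uniformly random delay below a doubling ceiling), which is one common variant and not necessarily what the project would choose; `backoffDelayMs` and its option names are hypothetical.

```typescript
interface BackoffOptions {
  baseMs: number; // delay ceiling for the first retry (attempt 0)
  maxMs: number; // upper bound on the ceiling for any retry
  random?: () => number; // injectable for tests; defaults to Math.random
}

// Returns how long to wait before retry number `attempt` (0-based).
function backoffDelayMs(attempt: number, options: BackoffOptions): number {
  const random = options.random ?? Math.random;
  // The ceiling doubles with each attempt and is capped at maxMs.
  const ceilingMs = Math.min(options.maxMs, options.baseMs * Math.pow(2, attempt));
  // Full jitter: a uniformly random delay in [0, ceilingMs), so that many
  // clients retrying at once do not synchronize.
  return Math.floor(random() * ceilingMs);
}
```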

---

## What Marketing Would Require

### Minimum Viable Announcement (~25–35 hours)

1. Implement `/efix` PR comment trigger — **[DONE in this sprint]**
2. Fix `npm install` step in workflow — **[DONE in this sprint]**
3. Fix `git add -A` to scoped allowlist — **[DONE in this sprint]**
4. Add `git apply --check` dry-run + rollback — **[DONE in this sprint]**
5. Fix failure-PR UX (GitHub Check + comment, not PR) — **[DONE in this sprint]**
6. Dogfood on 5–10 real repos and document results honestly
7. Write Getting Started guide with screenshot walkthrough — **[DONE in this sprint]**

### Strong Announcement (add ~15–20 more hours)

8. Add token cost logging per run — **[DONE in this sprint]**
9. Add a second engine (Claude Code stub minimum)
10. Publish a benchmark: "X% of Node.js unit-test regressions fixed in our test set"

---

## Architecture Recommendations (v1)

From the Technical Architect review:

- **Add orchestrator retry loop** (Option A: `runWithRetry()` wrapper per step) — 30 min, high impact
- **Stateful orchestration** — migrate to state machine, add checkpoint persistence
- **Multi-engine registry** — `EngineRegistry.get(name)` pattern for swappable engines
- **Distributed job queue** — for concurrent runs and rate-limit-aware scheduling
- **Structured logging** — JSON logs per phase for observability
- **Web dashboard** — real-time run progress, evidence viewer, manual overrides
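
The `runWithRetry()` wrapper named in the first recommendation could look roughly like the sketch below. The option names, the injected `sleep`, and the decision to retry on any thrown error are assumptions for illustration; the Technical Architect review does not specify them.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  delayMs: (attempt: number) => number; // wait before the next try (0-based)
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Runs `step` until it resolves or `maxAttempts` tries have failed,
// then rethrows the last error.
async function runWithRetry<T>(step: () => Promise<T>, options: RetryOptions): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await step();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < options.maxAttempts - 1) {
        await sleep(options.delayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

A real wrapper would likely also take a predicate for which errors are retryable, since retrying a deterministic test failure only burns time and tokens.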

---

## Devil's Advocate Summary

The five most important pre-marketing blockers (beyond the technical gaps):

1. **Core premise is unvalidated** — zero production success rate data
2. **Blast radius is uncontrolled** — `git add -A` + AI diffs + no safety perimeter
3. **"Evidence-first" isn't differentiated** — it's what every CI system already does; the real differentiator is the protocol enforcement and policy framework
4. **100% OpenAI dependency** — no cost controls, no fallback, no SLA
5. **UX doesn't match the pitch** — workflow_dispatch ≠ `/efix` comment

---

## Files Changed in This Sprint

See `docs/PROCESS.md` for implementation notes and change log.

---

## Post-Sprint Re-Analysis (Sprint 1 Results)

**Date:** 2026-02-23 (same day, after implementation)
**Tests:** 13/13 passing

### Updated Scores

| Dimension | Pre-Sprint | Post-Sprint | Change |
|---|---|---|---|
| Production readiness | 2/10 | 4/10 | +2 |
| Marketing readiness | 3.5/10 | 5.5/10 | +2 |
| Architectural soundness | 6.5/10 | 6.5/10 | — |
| Premise validation | unvalidated | unvalidated | — |

### Blocking Issues Status

| ID | Issue | Status |
|---|---|---|
| B1 | `/efix` PR comment trigger missing | **CLOSED**: `issue_comment` trigger implemented |
| B2 | No `npm install` in workflow | **CLOSED**: lockfile-aware install step added |
| B3 | `git add -A` stages secrets/artifacts | **CLOSED**: scoped to `artifacts/ src/ test/` |
| B4 | No validated success rate | **OPEN**: no runs on real repos yet |
| B5 | OpenAI-only despite multi-engine pitch | **OPEN**: second engine not yet added |
| H1 | `verify()` only counts exit codes | **OPEN**: deferred to v1 |
| H2 | No `git apply --check` dry-run | **CLOSED**: dry-run + rollback on failure |
| H3 | Shell injection via `shell: true` | **PARTIAL**: metacharacter guard added; `shell: true` still used |
| H4 | Failed runs open noise PRs | **CLOSED**: posts comment instead; PR only on success |
| H5 | No token cost tracking | **CLOSED**: token counts logged to stderr per API call |
| H6 | No Getting Started guide | **CLOSED**: `docs/GETTING_STARTED.md` created (324 lines) |

### New Issues Identified in Re-Analysis

| Issue | Severity | File | Notes |
|---|---|---|---|
| `git add -u` stages modifications and deletions in all tracked paths, not just `src/ test/ artifacts/` | Low | `.github/workflows/efix.yml:112` | Should be `git add -u artifacts/ src/ test/` |
| Metacharacter regex misses `>` redirect and newline injection | Low | `src/evidence/shellRunner.ts:29` | Band-aid; v1 fix is argument-array spawn |
| Hardcoded `src/ test/` paths don't match all repo layouts (e.g., `lib/`) | Medium | `.github/workflows/efix.yml:111` | Silent incomplete-patch risk |

### What Still Blocks a Broad Public Announcement

1. **Zero production success rate data** — the single highest-risk unvalidated assumption. Run efix against 5–10 real repos and publish honest results before any broad marketing.
2. **`verify()` semantic gap** — verification passes if the repro command exits 0; it does not confirm the specific failing test now passes. False positives are possible.
3. **Shell injection not fully eliminated** — `shell: true` remains. The metacharacter guard reduces risk but does not eliminate it. Must be addressed before marketing to security-conscious teams.

### What a Credible Announcement Looks Like Now

**Today (beta/research framing):** Credible with full disclosure of MVP status and unvalidated success rate. Good audience: early-adopter engineers, CI/CD researchers.

**In 2 weeks (with dogfood data + `verify()` fix):** Credible as a production beta. Good audience: engineering teams, DevEx platform builders.

**Recommended next sprint:** Dogfood on 5+ real repos → publish results → fix `verify()` semantic validation → fix `shell: true` → announce.
