The Last 10%: a production-readiness checklist for AI-built software
The concrete list of things that separate a prototype that demos well from software that survives real use — the gap where most AI projects stall.
Who it's for: founders and teams with an AI-built prototype that's functionally complete but not yet in production.
AI coding tools get you to a working prototype fast. The danger is mistaking 'it works on my machine in the happy path' for 'it's ready for real users.' The remaining work is unglamorous, hard to estimate, and exactly where projects die.
This is the checklist we run on every Last Mile engagement. It's organized by the categories that actually break in production, in rough order of how often they bite.
Confirm every route and API is actually protected, not just hidden in the UI. Check session expiry, token refresh, and what happens when a token is invalid mid-request.
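One way to make "protected, not just hidden" concrete is a server-side guard that every handler calls before doing any work. The sketch below is a minimal illustration, not a full auth system: `Session`, `AuthError`, and `authorize` are hypothetical names, and it assumes the session has already been loaded and verified by whatever auth library you use.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical server-side session record."""
    user_id: str
    roles: frozenset
    expires_at: float  # Unix timestamp in seconds


class AuthError(Exception):
    """Carries the HTTP status the handler should return."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def authorize(session, required_role, now=None):
    """Check the session on the server, on every request.

    Returns the user id on success; raises AuthError otherwise.
    """
    now = time.time() if now is None else now
    if session is None:
        raise AuthError(401, "missing or invalid session")
    if session.expires_at <= now:
        # Expiry is re-checked per request, so a token that dies
        # mid-session fails cleanly instead of being trusted.
        raise AuthError(401, "session expired")
    if required_role not in session.roles:
        raise AuthError(403, "forbidden")
    return session.user_id
```

Passing `now` explicitly keeps the expiry path testable without sleeping, which is what lets "verified by test" be cheap enough to actually do.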
If you use row-level security (Supabase RLS) or multi-tenant scoping, write tests that prove one user cannot read another's data. This is the single most common gap in AI-built apps.
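The shape of a cross-account test is simple: create data as one user, then try to read it as another and assert that nothing comes back. The sketch below uses an in-memory SQLite database and application-level scoping as a stand-in, so it is not Supabase RLS itself. The `notes` table and `get_note` helper are assumptions for illustration. With real RLS you would run the same assertions against your actual database, signed in as two different users.

```python
import sqlite3


def make_db():
    """Build a throwaway database with one note per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        "id INTEGER PRIMARY KEY, owner_id TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO notes (owner_id, body) VALUES (?, ?)",
        [("alice", "alice's note"), ("bob", "bob's note")],
    )
    conn.commit()
    return conn


def get_note(conn, requester_id, note_id):
    """Return the note body only if the requester owns the note."""
    row = conn.execute(
        "SELECT body FROM notes WHERE id = ? AND owner_id = ?",
        (note_id, requester_id),
    ).fetchone()
    return row[0] if row else None


def test_cross_account_isolation():
    conn = make_db()
    bob_note_id = conn.execute(
        "SELECT id FROM notes WHERE owner_id = 'bob'"
    ).fetchone()[0]
    # The owner can read their own note.
    assert get_note(conn, "bob", bob_note_id) == "bob's note"
    # Another user asking for the same id gets nothing back.
    assert get_note(conn, "alice", bob_note_id) is None
```

The important part is the second assertion. A test suite that only checks that owners can read their own data will pass even when isolation is completely missing.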
Test empty states, huge inputs, duplicate submissions, slow networks, and the back button. Each one is a place where a prototype quietly assumes the happy path.
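Three of these cases (empty input, oversized input, and double submission) can be handled at a single entry point. The sketch below is one possible approach, assuming the client sends an idempotency key with each submission. `FeedbackService`, `SubmissionError`, and the 10,000-character limit are hypothetical, and the in-memory dictionary stands in for what would normally be a unique constraint in the database.

```python
MAX_BODY_CHARS = 10_000  # assumed limit; pick one that fits your product


class SubmissionError(ValueError):
    """Raised for input the service refuses to store."""


class FeedbackService:
    def __init__(self):
        self._seen = {}   # idempotency key -> id of the saved record
        self.saved = []   # stand-in for a database table

    def submit(self, idempotency_key, body):
        # A retry or double-click with the same key returns the
        # original result instead of creating a second record.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]

        text = body.strip()
        if not text:
            raise SubmissionError("body is empty")
        if len(text) > MAX_BODY_CHARS:
            raise SubmissionError("body is too long")

        record_id = len(self.saved) + 1
        self.saved.append({"id": record_id, "body": text})
        self._seen[idempotency_key] = record_id
        return record_id
```

This version is not safe under concurrent requests. In a deployed app the deduplication would need to be enforced by the data store rather than by process memory.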
For AI features specifically: what happens when the model returns malformed output, times out, or refuses? Validate and handle every model response as untrusted input.
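Treating a model response as untrusted input means parsing and validating it before any other code touches it, with a defined fallback when validation fails. The sketch below assumes a hypothetical ticket-triage feature where the model is asked for a JSON object with a `label` and a `confidence`. The field names, allowed labels, and fallback value are all illustrative.

```python
import json

ALLOWED_LABELS = {"bug", "feature", "question"}  # assumed label set


class ModelOutputError(ValueError):
    """Raised when the model's output fails validation."""


def parse_triage(raw):
    """Validate raw model output; return a clean dict or raise."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError) as exc:
        raise ModelOutputError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ModelOutputError("expected a JSON object")

    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ModelOutputError("label is missing or not allowed")

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ModelOutputError("confidence is not a number")
    if not 0 <= confidence <= 1:
        raise ModelOutputError("confidence is out of range")

    return {"label": label, "confidence": float(confidence)}


def triage_or_fallback(raw):
    """Never let a bad model response crash the request."""
    try:
        return parse_triage(raw)
    except ModelOutputError:
        return {"label": "needs_human_review", "confidence": 0.0}
```

A refusal that comes back as plain prose fails the JSON parse and lands in the fallback path. Timeouts are not covered here and need their own handling around the API call itself.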
Get it onto real infrastructure with a real domain, not a preview URL. Audit environment variables: the classic failure is a value that differs only in production.
Confirm build, migrations, and secrets all work in the deployed environment, and that a rollback path exists.
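The environment-variable audit can be partly automated with a startup check that fails loudly instead of letting a missing value surface as a runtime error later. The sketch below is a minimal version. The variable names in `REQUIRED_VARS` are placeholders, not a recommendation, and the localhost check is one example of a value that is fine in development but wrong in production.

```python
import os

# Hypothetical names; replace with the variables your app actually reads.
REQUIRED_VARS = ("DATABASE_URL", "MODEL_API_KEY", "APP_BASE_URL")


def audit_env(env=None, required=REQUIRED_VARS):
    """Return a list of human-readable problems; empty means OK."""
    env = os.environ if env is None else env
    problems = []

    for name in required:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    # Catch a development value that leaked into the deployed config.
    base_url = env.get("APP_BASE_URL", "")
    if "localhost" in base_url or "127.0.0.1" in base_url:
        problems.append("APP_BASE_URL points at localhost")

    return problems
```

Running this at boot in the deployed environment, and refusing to start when the list is non-empty, turns a silent misconfiguration into a failed deploy that you notice immediately.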
You cannot operate what you cannot see. Add structured logging, error tracking, and — for AI features — a way to replay any run and see every model and tool call.
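Structured logging here means one machine-readable record per event, with a shared run identifier so all the model and tool calls from one run can be pulled back together. The sketch below uses only Python's standard `logging` module. `JsonFormatter`, `make_logger`, `log_model_call`, and the field names are assumptions for illustration, and a hosted tracing tool could fill the same role.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured fields are passed via extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


def make_logger(stream, name="app"):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_model_call(logger, run_id, model, prompt, output, latency_ms):
    """Record one model call so the run can be reconstructed later."""
    logger.info(
        "model_call",
        extra={"fields": {
            "run_id": run_id,
            "model": model,
            "prompt": prompt,
            "output": output,
            "latency_ms": latency_ms,
        }},
    )
```

Filtering the log by `run_id` then gives the ordered list of calls for one run. Whether prompts and outputs can be stored verbatim depends on what user data they contain.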
Put rate limits and cost controls on anything that calls a paid model API before the first surprise bill, not after.
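A guard that sits in front of every paid call can enforce both limits in one place. The sketch below is a simplified, single-process illustration: `ModelCallGuard` and its parameters are hypothetical, the cost per call is an estimate supplied by the caller, and resetting the budget (for example daily) is left out.

```python
import time


class RateLimited(Exception):
    """Too many calls in the current one-minute window."""


class BudgetExceeded(Exception):
    """The call would push spend past the configured cap."""


class ModelCallGuard:
    def __init__(self, max_calls_per_minute, budget_usd, clock=time.monotonic):
        self.max_calls = max_calls_per_minute
        self.budget_usd = budget_usd
        self.clock = clock        # injectable so tests need no sleeping
        self.call_times = []      # timestamps of recent allowed calls
        self.spent_usd = 0.0

    def check(self, estimated_cost_usd):
        """Call before each paid request; raises if it must not proceed."""
        now = self.clock()
        # Keep only the calls made within the last 60 seconds.
        self.call_times = [t for t in self.call_times if now - t < 60]

        if len(self.call_times) >= self.max_calls:
            raise RateLimited("per-minute call limit reached")
        if self.spent_usd + estimated_cost_usd > self.budget_usd:
            raise BudgetExceeded("budget cap would be exceeded")

        # Only allowed calls count toward the window and the spend.
        self.call_times.append(now)
        self.spent_usd += estimated_cost_usd
```

In a multi-instance deployment the counters would have to live in shared storage, and most providers also offer account-level spend limits that are worth setting as a second line of defense.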
Production-ready also means someone other than the original builder can run it. Write down how to deploy, where the secrets live, and what to do when it breaks.
A system only one person understands isn't deployed — it's a liability with good uptime.
- Every route and API is authenticated and authorized, verified by test
- Multi-tenant / RLS isolation proven with a cross-account test
- Empty, huge, duplicate, and slow-network inputs all handled gracefully
- Every model/tool response validated and handled as untrusted
- Deployed to real infrastructure with a real domain and a rollback path
- Environment variables audited; no prod-only surprises
- Structured logging, error tracking, and run replay in place
- Rate limits and cost caps on every paid model call
- Deploy, secrets, and incident steps documented for handoff
How long does the last 10% actually take?
For a focused, single-product prototype, the highest-risk items above usually fit into a one-week hardening sprint. That's exactly what the Last Mile engagement is scoped to.
Can't AI tools just finish this too?
They help, but the last mile is judgment work — knowing which edge cases matter, where the security gaps hide, and what 'done' means operationally. That's the senior-engineer part the tools don't replace yet.