Essay

The Verifiability Framework

A Practitioner's Guide to Municipal AI

By Dustin Good · Elgin City Council

I. The Problem with “AI in Government”

Every few months, a vendor walks into a municipal building somewhere and promises transformation. AI will streamline permitting. AI will optimize traffic. AI will predict crime, allocate resources, revolutionize constituent services. The pitch is always confident, the demos always impressive, the ROI projections always compelling.

And every few months, someone else publishes an op-ed warning that AI in government means algorithmic bias, job losses, surveillance, and the erosion of democratic accountability. The fears are real, the examples troubling, the caution warranted.

Here's what frustrates me about both narratives: they treat “AI in government” as a single thing you're either for or against. After four and a half years on Elgin's city council—and as someone who builds civic AI tools—I've come to believe that framing is the actual problem. It's not that AI is good or bad for government. It's that we lack a coherent framework for deciding where it belongs.

I want to offer one.

II. The Foundation: What Can We Verify?

The insight that changed how I think about this comes from Andrej Karpathy, former director of AI at Tesla. In discussing the current wave of AI development, he draws a distinction between two types of software:

Software 1.0 automates what you can specify—write explicit rules, and the computer follows them.

Software 2.0 (neural networks, large language models) automates what you can verify—define what success looks like, and the system learns to achieve it through iteration.

“Software 1.0 easily automates what you can specify. Software 2.0 easily automates what you can verify.”

This sounds abstract, but the implications are concrete. For AI to reliably automate something, three conditions need to be met:

  1. Resettable: You can start fresh and try again
  2. Efficient: You can make many attempts quickly
  3. Rewardable: You can automatically evaluate whether each attempt succeeded
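To make the three conditions concrete, here is a toy sketch (not from the essay, purely illustrative): a loop that guesses a target number. Each episode starts fresh (resettable), attempts are cheap (efficient), and success is checked automatically (rewardable). The random guesser stands in for a learned policy.

```python
import random

def reward(attempt: int, target: int) -> bool:
    """Rewardable: success can be evaluated automatically, with no human in the loop."""
    return attempt == target

def run_episode(target: int, max_tries: int = 1000) -> int:
    """Resettable + efficient: each call starts from scratch and attempts are cheap.
    Returns the number of tries used before the verifier accepted an attempt."""
    for tries in range(1, max_tries + 1):
        attempt = random.randint(0, 100)  # stand-in for a learned policy proposing answers
        if reward(attempt, target):
            return tries
    return max_tries
```

When any one of the three conditions fails (say, the reward function requires a human to judge the attempt), this loop stops being automatable at scale.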

Think about what this means. AI excels at chess because every game starts from the same position, games are fast, and win/lose is unambiguous. AI excels at code generation because you can run the code, see if it works, and iterate. AI excels at math because answers are provably correct or incorrect.

Now think about what AI struggles with: creative work with no “right answer,” strategic decisions involving tradeoffs between competing values, tasks requiring real-world context and common sense judgment. These aren't just hard—they're non-verifiable. There's no automated process to score whether a poem is good, a strategy is wise, or a judgment call was correct.

This is the lens I now apply to every AI deployment question in government: Can we verify success?

III. Applying the Framework to Municipal Government

When you look at local government through this lens, a clarifying pattern emerges.

What's Verifiable (Candidates for Automation)

Many government processes are genuinely rule-based. They have clear inputs, defined procedures, and unambiguous success criteria:

  • Permit completeness checks: Did the application include all required documents? Are the measurements within zoning tolerances? This is a checklist with yes/no answers.
  • Document classification and routing: Is this a FOIA request, a complaint, a compliment, a service request? Route accordingly. Categories are defined; accuracy is measurable.
  • Compliance verification: Does this contractor's insurance meet our requirements? Does this business license application satisfy the prerequisites? The rules exist in writing.
  • Data validation: Are these budget numbers internally consistent? Do these entries match the required format? Does this record contain the mandatory fields?
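Tasks like the permit completeness check above are so fully specifiable that they can be sketched in a few lines of rule-based code. The field names, required documents, and zoning tolerances below are hypothetical illustrations, not Elgin's actual rules:

```python
# Hypothetical rule-based permit completeness check.
# Document names and tolerances are illustrative only.
REQUIRED_DOCS = {"site_plan", "proof_of_ownership", "contractor_license"}
MAX_HEIGHT_FT = 35.0   # example zoning height limit
MIN_SETBACK_FT = 10.0  # example required setback

def check_permit(application: dict) -> list[str]:
    """Return a list of problems; an empty list means the application is complete."""
    problems = []
    missing = REQUIRED_DOCS - set(application.get("documents", []))
    for doc in sorted(missing):
        problems.append(f"missing document: {doc}")
    if application.get("height_ft", 0.0) > MAX_HEIGHT_FT:
        problems.append("structure height exceeds zoning limit")
    if application.get("setback_ft", 0.0) < MIN_SETBACK_FT:
        problems.append("setback below required minimum")
    return problems
```

Because "correct" is a checklist, the output is trivially verifiable: a staff member can audit any result against the written rules, which is exactly what makes this class of task a safe automation candidate.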

These tasks share a common feature: a human could explain exactly what “correct” looks like, and you could check whether the AI got it right. That's verifiability.

What's Not Verifiable (Requires Human Judgment)

Now consider the decisions that actually matter in municipal governance:

  • Budget priorities: Should we fund the new fire station or the road repairs? Both are legitimate needs. There's no algorithm for weighing public safety against infrastructure maintenance against tax burden.
  • Policy tradeoffs: Should we prioritize affordable housing development or neighborhood character preservation? Downtown economic development or residential parking? These aren't optimization problems—they're value conflicts that require democratic deliberation.
  • Community needs assessment: What does this neighborhood actually need? Not what the data suggests, but what the people who live there want for their community. That requires listening, judgment, and representation.
  • Accountability decisions: When a city-funded organization underperforms, what's the appropriate response? Increased oversight? Funding reduction? Partnership termination? These are judgment calls with no objectively correct answer.

These decisions are non-verifiable by definition. There's no ground truth to train against. You can't score whether a budget was “right”—only whether it reflected community priorities, balanced competing needs responsibly, and achieved intended outcomes over time. That's a human judgment, made democratically, with accountability.

IV. The Decision Heuristic

This gives us a simple but powerful heuristic for AI deployment in local government:

Before deploying AI, ask: “Can we verify this?”

If YES: The task is a candidate for automation. AI can practice, iterate, and improve. Success criteria are clear. Deploy with appropriate oversight, but deploy.

If NO: AI should augment human judgment, not replace it. Use it to expand your research, surface options you hadn't considered, stress-test your reasoning. But the human—accountable to constituents—makes the decision.
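The heuristic above is simple enough to state as a function. This is a sketch of the decision logic only (the condition names come from Karpathy's three criteria earlier in the essay; the return strings are my own shorthand):

```python
def deployment_mode(resettable: bool, efficient: bool, rewardable: bool) -> str:
    """Verifiability heuristic: a task is an automation candidate only when
    all three conditions hold; otherwise AI should augment, not replace,
    an accountable human decision-maker."""
    if resettable and efficient and rewardable:
        return "automate (with oversight)"
    return "augment human judgment"
```

Permit completeness checks pass all three tests; a budget vote fails the "rewardable" test outright, so it stays with the human.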

This framework does three things simultaneously:

It enables progress. Instead of paralyzing debates about whether AI belongs in government at all, we can identify specific, high-value automation opportunities and pursue them. Permit intake processing isn't a threat to democracy—it's a chance to reduce wait times and free staff for higher-value work.

It prevents misuse. When someone proposes automating something non-verifiable—using AI to make parole recommendations, allocate social services, or “optimize” policy decisions—we have principled grounds for resistance. Not technophobia, but a coherent framework: that's not a verifiable task, so automation is inappropriate.

It surfaces a third path: building internal capacity. Here's something the framework reveals that often gets missed. Many verifiable tasks don't actually need AI—they need better software. Traditional automation (Software 1.0) can handle anything you can fully specify with rules. AI is only necessary when the task is verifiable but too complex to write explicit rules for.

This matters because municipalities don't have to choose between expensive vendor solutions and doing nothing. With AI-assisted development tools, local governments can build custom software in-house for their specific needs. The same AI that might be overkill for running a process can be invaluable for building the software that runs it.

I'm focused on building local government's internal capacity for custom software development, starting with Elgin. Too often, municipalities run to vendors for solutions that could be built in-house with the right tools and approach. AI-assisted development makes this more accessible than ever, and it reduces long-term vendor dependency while building institutional knowledge.

Test Your Use Case

Use the interactive Verifiability Lab to assess whether your municipal AI project is a candidate for automation or augmentation, or should remain human-led.

Try the Assessment Tool

V. PolicyAide: The Framework in Practice

This is why I built PolicyAide the way I did.

PolicyAide is an 8-agent system for policy research. It helps me explore immigration policy, economic development options, AI governance frameworks—the complex questions that land on a council member's desk without obvious answers.

Crucially, it's designed as a thought partner, not a decision-maker. Policy questions are non-verifiable. There's no “correct” immigration policy, no objectively optimal economic development strategy. These are value-laden questions requiring democratic input.

So PolicyAide expands my research surface area. It finds perspectives I'd miss, identifies tradeoffs I hadn't considered, surfaces relevant precedents from other municipalities. It makes me a better-informed decision-maker.

But I make the decision. I'm the one accountable to my constituents, the one who has to explain my vote. That accountability can't be delegated to an algorithm—and with non-verifiable decisions, it shouldn't be.

This is what responsible civic AI looks like: matching the tool to the task based on verifiability, not hype or fear.

VI. Why This Matters Now

The window for establishing these norms is narrow.

AI capabilities are advancing rapidly. Vendors are pitching aggressively. Municipalities are under pressure to modernize, reduce costs, and do more with less. The temptation to automate everything—or to ban everything out of caution—is real.

We need practitioners inside government who can navigate this. Not consultants selling implementations, not academics theorizing from outside, but elected officials and public servants who understand both the technology and the context.

The verifiability framework gives us common language. When a vendor proposes AI for “predictive policing,” we can ask: what exactly are we verifying? If the answer is unclear—or if “success” requires predicting human behavior in ways that embed bias—we have grounds to decline. When IT proposes AI for form validation, we can recognize that as a verifiable task with clear success criteria and appropriate for automation.

This isn't about being pro-AI or anti-AI. It's about being precise about where AI helps and where it doesn't.

VII. Conclusion: Building from Inside

I'm a city council member who builds AI tools. That's unusual, and it gives me a perspective that's hard to get otherwise. I've sat in budget meetings where we debated values and tradeoffs—decisions no algorithm should make. I've also waded through processes that are pure rule-following, begging for automation.

The verifiability framework emerged from that experience. It's not theoretical—it's a heuristic I actually use, every week, as proposals cross my desk and tools cross my screen.

If you're in local government, I hope this framework helps you evaluate the AI pitches coming your way. Ask the verifiability question. Push vendors to be specific. Distinguish between automation opportunities and judgment calls that require human accountability.

If you're building civic technology, I hope this framework shapes what you build. Create tools that augment non-verifiable decisions rather than claiming to automate them. Focus automation efforts on genuinely verifiable tasks where success criteria are clear.

The municipalities that get this right will be the ones that adopt AI thoughtfully—capturing efficiency gains where appropriate while preserving democratic accountability where it matters. That's the goal. The verifiability framework is a tool for getting there.

The Companion Framework

The Verifiability Framework answers “Can AI do this?” The Trust Stack answers “When does investing in trust pay technical dividends?” Together they form a complete diagnostic for municipal AI deployment.

Read the Trust Stack