Why I'm building a runtime governance layer for AI agents and apps

Is “system prompting” enough for production? Why I’m building a runtime governance layer for AI agents. - Indie Hackers


I’ve spent the last few months obsessed with one realization:

System prompts alone are a weak foundation for production AI SaaS.

They are useful. They guide behavior. They make demos look impressive.

But once an AI product touches real users, messy context, business rules, customer pressure, pricing logic, memory, tools, and edge cases, the system prompt starts carrying too much responsibility.

That is why I’m building NEES Core Engine.

I’m trying to validate one honest question with the Indie Hackers community:

Are AI founders already feeling this pain in production, or am I still too early?

The problem: Agent Drift

In a demo, your AI agent can look perfect.

In production, it can slowly drift away from your intended product logic, business boundaries, tone, safety rules, and unit economics.

It is not always a dramatic failure.

Sometimes the answer sounds reasonable.

But behind that answer, the AI has ignored a policy, skipped a workflow step, used the wrong context, or made a decision nobody on the team can clearly trace.

A few examples:

Policy bypass

A support bot sounds polite and “safe,” but ignores company policy or terms just to satisfy the user.

Pricing hallucination

A sales assistant offers a discount, refund, or promise that was never approved by the business.

Context chaos

A CRM assistant changes tone or behavior based on messy, unfiltered, or outdated user history.

The black box problem

The model makes a decision, but the team cannot explain why that decision was allowed.

The LLM tax

The product keeps paying for repeated model calls for answers that should have been governed, reused, cached, or handled deterministically.

This is what I call Agent Drift.

The agent may still “work,” but it slowly moves away from the product’s intended behavior.

My thesis: prompts are for creativity; governance is for reliability.

Most builders try to fix this by adding longer prompts, more instructions, recursive checks, output filters, or simple guardrails.

That can help.

But I don’t think it is enough for production AI systems where behavior, policy, memory, cost, and traceability matter.

I believe production AI needs runtime governance.

The basic flow I’m building with NEES is:

App → NEES Governance Runtime → Model Provider → Governed Response

Instead of putting all behavioral responsibility inside a soft prompt, NEES adds a governance layer between the application and the model.

The goal is not to replace OpenAI, Anthropic, Google, LangChain, CrewAI, or any framework.

The goal is to make AI behavior more product-aligned before the response or action reaches the user.
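To make that flow concrete, here is a minimal sketch of what "a layer between the app and the model" could look like. This is not the NEES API. Every name in it (`governed_call`, `GovernedResponse`, `no_unapproved_discount`) is made up for illustration, and the keyword match stands in for whatever real policy logic a product would need.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical names throughout; this only illustrates
# App -> governance runtime -> model provider -> governed response.

@dataclass
class GovernedResponse:
    text: str
    allowed: bool
    reason: str

ModelFn = Callable[[str], str]
# A check returns a reason string when it wants to block, otherwise None.
PreCheck = Callable[[str], Optional[str]]
PostCheck = Callable[[str, str], Optional[str]]


def governed_call(user_input: str, model: ModelFn,
                  pre_checks: Sequence[PreCheck],
                  post_checks: Sequence[PostCheck]) -> GovernedResponse:
    # 1. Pre-execution checks run before any tokens are spent.
    for check in pre_checks:
        reason = check(user_input)
        if reason is not None:
            return GovernedResponse("Sorry, I can't help with that here.", False, reason)

    # 2. Only now is the model provider called.
    draft = model(user_input)

    # 3. The draft is checked against product rules before it reaches the user.
    for check in post_checks:
        reason = check(user_input, draft)
        if reason is not None:
            return GovernedResponse("Let me connect you with a teammate.", False, reason)

    return GovernedResponse(draft, True, "passed all checks")


def no_unapproved_discount(user_input: str, draft: str) -> Optional[str]:
    # Toy policy (assumption): the assistant may never mention a discount.
    return "unapproved discount" if "discount" in draft.lower() else None
```

Calling `governed_call(message, my_model, [], [no_unapproved_discount])` with any function that takes a string and returns a string would block a draft that offers a discount and return the `reason` alongside the safe reply. The point of the design is that the rule lives in code the team controls, not in a sentence inside the prompt.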

NEES is designed around things like:

Pre-execution intent checks

Understanding what the user is trying to do before spending tokens or allowing a workflow path.

Policy enforcement

Checking model behavior against product-specific rules instead of relying only on prompt instructions.

Memory boundaries

Controlling what the AI can remember, use, or carry forward across interactions.

Traceable decisions

Recording why a response or action was allowed, blocked, escalated, or modified.

Escalation logic

Knowing when the AI should not answer directly and should hand off, clarify, or stop.

Cost governance

Avoiding unnecessary model calls when a safe deterministic path, cached answer, or reusable governed response is enough.

Fallback behavior

Keeping the product stable when the model provider fails, latency spikes, or a lower-cost/local route is more appropriate.
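A few of the pieces above (cost governance, escalation logic, fallback behavior, and traceable decisions) can be sketched together in a few dozen lines. Again, this is an illustration under my own assumptions, not how NEES implements them: `GovernedAgent`, `TraceEntry`, the exact-match answer lookup, and the keyword-based escalation are all hypothetical simplifications.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class TraceEntry:
    step: str
    decision: str  # e.g. "served_from_cache", "escalated", "allowed", "fallback"
    reason: str
    at: float = field(default_factory=time.time)


class GovernedAgent:
    """Hypothetical sketch; primary and fallback are plain str -> str callables."""

    def __init__(self, primary: Callable[[str], str],
                 fallback: Callable[[str], str],
                 approved_answers: Dict[str, str],
                 escalate_terms: Sequence[str]) -> None:
        self.primary = primary
        self.fallback = fallback
        self.approved_answers = approved_answers
        self.escalate_terms = escalate_terms
        self.trace: List[TraceEntry] = []

    def handle(self, message: str) -> str:
        key = message.strip().lower()

        # Cost governance: reuse an approved answer instead of calling a model.
        if key in self.approved_answers:
            self.trace.append(TraceEntry("cost", "served_from_cache",
                                         "matched an approved answer"))
            return self.approved_answers[key]

        # Escalation logic: some topics should never be answered directly.
        for term in self.escalate_terms:
            if term in key:
                self.trace.append(TraceEntry("escalation", "escalated",
                                             f"matched term {term!r}"))
                return "I'm handing this over to a human teammate."

        # Fallback behavior: keep the product stable if the provider fails.
        try:
            answer = self.primary(message)
            self.trace.append(TraceEntry("model", "allowed",
                                         "primary provider answered"))
        except Exception as exc:
            answer = self.fallback(message)
            self.trace.append(TraceEntry("model", "fallback",
                                         f"primary failed: {exc}"))
        return answer
```

After each call, `agent.trace` holds one entry saying which path was taken and why, which is the kind of record that would let a team explain a decision later. A real system would need fuzzier matching than exact strings and keywords; the sketch only shows where each decision sits relative to the model call.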

I’m looking for design partners, not customers.

I’m not looking for broad marketing feedback right now.

I’m looking for honest signal from founders and developers building:

AI SaaS products

support agents

CRM assistants

workflow automation tools

internal copilots

education AI

agentic products with tool use

Have you hit the “system prompt wall” yet?

Are you struggling with inconsistent behavior, lack of traceability, memory concerns, repeated LLM cost, or AI actions that need stronger business-rule control?

Or do you feel that prompts, guardrails, and custom checks are still good enough for where production AI is in 2026?

I’m looking to talk to 2–3 founders who are willing to test this on one small workflow.

This is not a sales pitch.

I want to personally help map one real workflow into a NEES-style governance structure and see whether runtime governance can reduce Agent Drift in a practical product environment.

Progress so far:

Developer Preview:

https://github.com/NEES-Anna/nees-core-developer-preview

Live Sample App:

https://naina.nees.cloud

Would love honest feedback from AI builders here:

Is runtime governance becoming a real missing layer for production AI, or is the market still too early?

Posted by Anna2612 to Developers on May 26, 2026
