PlanVault platform for AI execution
Connect existing services, govern tool execution, and keep every run observable. Secrets stay outside prompts.
A model plan becomes a validated, replayable execution trace
PlanValidator passed
Schemas, params, and terminal path checked before execution.
Approval required
Runtime pauses side effects until policy approval.
Model-safe context
Secrets stay out of the prompt; raw large outputs stay out of the planner by default, except explicitly enabled bounded evidence for read-only replans.
The LLM returns a plan, not side effects
plan "notify_customers"
  orders = list_orders(limit: 10)
  for order in orders
    email = get_field(order, "billing_email")
    if email.present?
      notify_customer(order.id, email)
  reply("Notifications queued")
PlanVault validates and executes
tool shortlist: done
PlanValidator: done
approval gate: waiting
reply: queued
Every step leaves evidence
Approval event
Approval policy, actor, and decision are preserved with the run.
Secret boundary
Secrets resolve inside runtime and never become prompt text.
Replay trail
Diagnostics keep the trace inspectable without exposing raw output by default.
Controls for risks you cannot leave to the model
Data boundaries, routing, recovery, secrets, and large responses are controlled by the system, not prompt instructions.
Data stays under your control
Deploy inside your VPC or an air-gap-capable environment, connect local models through LiteLLM, and keep data inside your own perimeter.
Tool scale without prompt chaos
PlanVault narrows a large service catalog to the relevant tools before planning, so the model does not see thousands of descriptions or spend tokens on routing. A Semantic Routing Cache accumulates anonymized embedding vectors from successful runs and improves retrieval accuracy over time, without ever storing raw query text.
Recovery without uncontrolled side effects
Every execution step is recorded as an event. After a failure, a run can recover or stop safely without zombie processes or duplicated mutations.
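One way to picture this recovery model is an append-only event log in which a restarted run skips every step that already has a completion event. The sketch below is illustrative only: EventLog, run_steps, and the event names are hypothetical and are not PlanVault's API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EventLog:
    """Append-only record of what happened to each step (hypothetical shape)."""
    events: list[tuple[str, str]] = field(default_factory=list)

    def append(self, step_id: str, kind: str) -> None:
        self.events.append((step_id, kind))

    def completed(self) -> set[str]:
        return {step for step, kind in self.events if kind == "completed"}


def run_steps(steps: dict[str, Callable[[], None]], log: EventLog) -> None:
    """Run steps in order, skipping any step the log already marks completed."""
    done = log.completed()
    for step_id, action in steps.items():
        if step_id in done:
            continue  # applied before the crash; do not repeat the mutation
        log.append(step_id, "started")
        action()
        log.append(step_id, "completed")


sent: list[str] = []
crash_once = {"armed": True}


def send_email() -> None:
    sent.append("email")


def update_crm() -> None:
    # Fails on the first attempt to simulate a crash mid-run.
    if crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash")
    sent.append("crm")


steps = {"send_email": send_email, "update_crm": update_crm}
log = EventLog()
try:
    run_steps(steps, log)
except RuntimeError:
    pass  # the process "dies" here; the log survives
run_steps(steps, log)  # recovery: send_email is skipped, update_crm is retried
```

In this sketch a step that started but never completed is simply retried. A real runtime would also need idempotency safeguards for mutations that may have partially applied before the crash.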
Late-bound secrets, encrypted at rest
Secrets are late-bound through scoped references; prompts receive variable names, not plaintext credentials.
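As a rough illustration of late binding, the sketch below keeps only a reference in the planned parameters and substitutes the value at call time, after checking a scope allow-list. The {{secret:NAME}} placeholder syntax, resolve_secrets, and the in-memory vault are assumptions made for this example; the source does not specify PlanVault's reference format.

```python
import re

# Hypothetical placeholder syntax; the source only says prompts receive variable names.
SECRET_REF = re.compile(r"\{\{secret:([A-Z0-9_]+)\}\}")


def resolve_secrets(params: dict[str, str], vault: dict[str, str],
                    allowed: set[str]) -> dict[str, str]:
    """Replace secret references with values at call time, enforcing a scope allow-list."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in allowed:
            raise PermissionError(f"secret {name!r} is outside this tool's scope")
        return vault[name]

    return {key: SECRET_REF.sub(substitute, value) for key, value in params.items()}


# What the planner sees and emits: a reference, never the credential.
planned_params = {"Authorization": "Bearer {{secret:CRM_TOKEN}}"}

# What the runtime sends to the service after resolution.
vault = {"CRM_TOKEN": "s3cr3t-value", "BILLING_KEY": "other"}
resolved = resolve_secrets(planned_params, vault, allowed={"CRM_TOKEN"})
```

The planned parameters can be logged or shown to the model safely because they never contain the credential; only the resolved copy does, and it exists only inside the runtime call.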
Large responses without context overflow
The runtime works with large service responses without pushing raw payloads into the LLM by default; read-only tools can explicitly opt into a bounded evidence excerpt for replanning.
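The default-off, opt-in behavior described above could look roughly like this. The function planner_view, its flags, and the 200-character limit are hypothetical choices for the sketch, not documented PlanVault settings.

```python
import json


def planner_view(tool_name: str, payload: object, *, read_only: bool,
                 evidence_enabled: bool, max_chars: int = 200) -> dict:
    """Return what the planner may see about a tool result (hypothetical shape)."""
    raw = json.dumps(payload)
    view = {"tool": tool_name, "size_chars": len(raw)}
    # Evidence is opt-in and only for read-only tools; otherwise metadata only.
    if read_only and evidence_enabled:
        view["evidence"] = raw[:max_chars]
        view["truncated"] = len(raw) > max_chars
    return view


orders = [{"id": i, "status": "open"} for i in range(1000)]

default_view = planner_view("list_orders", orders,
                            read_only=True, evidence_enabled=False)
bounded_view = planner_view("list_orders", orders,
                            read_only=True, evidence_enabled=True)
write_view = planner_view("cancel_order", {"ok": True},
                          read_only=False, evidence_enabled=True)
```

Here default_view and write_view carry only the tool name and payload size, while bounded_view adds an excerpt capped at max_chars.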
Named outcomes for stakeholders
Short phrases teams can reuse when explaining the governed layer between models and APIs.
Controlled side effects
Approvals, policies, and runtime pauses stop unintended changes until the next step is explicitly safe.
Less context bloat and cost
Relevant tool selection and compact context reduce wasted tokens on large catalogs and schemas.
Faster integration work
OpenAPI, MCP, and webhooks attach to a catalog without rewriting every agent by hand.
Security review support
Approvals, policies, and execution records stay in one journal for security reviews.
Incident investigation
Replay and event logs help teams understand what actually ran after an outage or ticket.
GDPR and user-data operations
Export, delete, and pseudonymisation workflows for AI run history aligned to personal-data obligations.
Where PlanVault sits next to your agent stack
PlanVault is the governance and execution layer that connects to your agent stack. Existing LangChain, LangGraph, CrewAI, AutoGen, DSPy, OpenAI agent API, or MCP agents can run behind a PlanVault tool boundary, with tool calls, secrets, approvals, and recovery handled outside the model and every run recorded for audit and replay.
From existing services to governed AI execution
Five steps show where the model stops and the governed runtime takes over: tool selection, validation, execution, audit, and replay.
Connect the services that already do the work
Start with what your team already has: APIs, MCP servers, webhooks, session context, and correlation metadata. PlanVault turns them into a tool catalog that is easier to select from and safer to plan against.
OpenAPI/Swagger import runs asynchronously and versions tool definitions.
MCP servers and outbound webhooks become normal tools in the same catalog.
Flattened schemas, descriptions, and search documents make tools easier for retrieval and planner prompts.
X-Request-Id, traceparent, tags, and metadata help correlate runs with your systems.
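For the correlation headers named above, a caller might generate values like this. The traceparent layout follows the W3C Trace Context format (version, trace ID, parent ID, flags); the helper name correlation_headers is hypothetical, and how PlanVault consumes these headers is not specified in the source.

```python
import secrets
import uuid


def correlation_headers() -> dict[str, str]:
    """Build request-correlation headers for an outgoing call."""
    trace_id = secrets.token_hex(16)   # 32 lowercase hex characters
    parent_id = secrets.token_hex(8)   # 16 lowercase hex characters
    return {
        "X-Request-Id": str(uuid.uuid4()),
        # W3C Trace Context: version-traceid-parentid-flags ("01" = sampled)
        "traceparent": f"00-{trace_id}-{parent_id}-01",
    }


headers = correlation_headers()
```

Sending the same trace ID to your own services lets you join a run's events with your existing traces and logs.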
PlanVault prepares the shortlist and context before the LLM
Before the model is called, PlanVault narrows a large catalog into a relevant shortlist. Selection combines retrieval, scenarios, usage signals, and project limits so the model sees prepared context instead of the whole enterprise catalog.
Manual and auto-recorded scenarios provide boosts and optional planner guidance.
Adaptive retrieval moves from direct selection to FTS/vector/hierarchical search as the catalog grows. A Semantic Routing Cache accumulates anonymized embedding vectors after successful runs, improving retrieval precision over time without storing any raw query text.
Hybrid fusion combines retrieval score, scenario boost, and capped usage signal.
Secrets and raw large outputs do not become planner prompt content by default; bounded evidence replan is explicit for read-only tools.
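The hybrid fusion idea in the list above can be sketched as a simple additive score with a cap on the usage term. The weights, the cap, and the names Candidate, fused_score, and shortlist are assumptions for illustration; the source does not give the actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    tool: str
    retrieval: float       # similarity from search, assumed to lie in [0, 1]
    scenario_boost: float  # bonus when a scenario mentions the tool
    usage_count: int       # how often the tool was used successfully


def fused_score(c: Candidate, usage_weight: float = 0.01,
                usage_cap: float = 0.2) -> float:
    """Sum the three signals, capping usage so popularity cannot dominate relevance."""
    usage = min(c.usage_count * usage_weight, usage_cap)
    return c.retrieval + c.scenario_boost + usage


def shortlist(candidates: list[Candidate], limit: int) -> list[str]:
    """Return the names of the top-scoring tools, best first."""
    ranked = sorted(candidates, key=fused_score, reverse=True)
    return [c.tool for c in ranked[:limit]]


candidates = [
    Candidate("list_orders", retrieval=0.82, scenario_boost=0.10, usage_count=5),
    Candidate("get_weather", retrieval=0.30, scenario_boost=0.00, usage_count=10_000),
    Candidate("notify_customer", retrieval=0.74, scenario_boost=0.10, usage_count=0),
]
top = shortlist(candidates, limit=2)
```

Without the cap, the heavily used but irrelevant get_weather tool would outrank both relevant tools; with it, the usage signal can only break near-ties.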
The model plans, but does not execute side effects
The planner can use different providers through LiteLLM, and the planning mode depends on the selected model. The LLM creates a plan from allowed context; execution, secrets, and side effects stay in runtime.
The model is resolved at the session, project, organization, or global default level.
Structured JSON and Python-like DSL planning modes are supported.
A utility model can handle short auxiliary responses such as slot summaries.
An embedding model builds vectors for tools and scenarios in the background so future requests can find relevant tools faster.
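Model resolution across the levels listed above might work like the following lookup. The precedence order (most specific scope wins) is an assumption, since the source lists the levels without stating which overrides which; resolve_model and the model names are hypothetical.

```python
from typing import Optional

# Most specific scope first; this order is assumed from the list in the text.
SCOPES = ("session", "project", "organization", "global")


def resolve_model(settings: dict[str, Optional[str]]) -> str:
    """Return the model from the most specific scope that sets one."""
    for scope in SCOPES:
        model = settings.get(scope)
        if model:
            return model
    raise LookupError("no model configured at any scope")


chosen = resolve_model({
    "session": None,             # nothing pinned for this session
    "project": "local-llama",    # project-level override
    "global": "default-model",   # platform-wide fallback
})
```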
PlanVault validates the plan and executes it deterministically
The plan is validated and executed through a reactive JVM actor backend. Plan approval, runtime policy for concrete tool calls, idempotency safeguards, error handling, recovery, and event sourcing live here, instead of letting the model directly control tool calls.
PlanValidator checks tools, params, variables, and exactly one terminal reply/fail.
The execution graph runs through a crash-resilient engine with event-sourced state.
Runtime gates can wait for approval or block a tool call after live parameters are evaluated.
Retry and replan paths are bounded by configuration and runtime control.
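The validation checks named above (known tools, known params, defined variables, exactly one terminal reply/fail) can be sketched over a simplified plan. This example assumes a flat list of steps in a JSON-like shape, with no loops or branches, and a "$name" convention for variable references; both are inventions for the sketch, not PlanVault's plan format.

```python
def validate_plan(steps: list[dict], catalog: dict[str, set[str]]) -> list[str]:
    """Return a list of problems; an empty list means the plan may execute."""
    errors: list[str] = []
    defined: set[str] = set()
    terminals = 0
    for i, step in enumerate(steps):
        op = step["op"]
        if op in ("reply", "fail"):
            terminals += 1
            if i != len(steps) - 1:
                errors.append(f"step {i}: terminal {op} must be last")
            continue
        if op not in catalog:
            errors.append(f"step {i}: unknown tool {op!r}")
            continue
        args = step.get("args", {})
        unknown = set(args) - catalog[op]
        if unknown:
            errors.append(f"step {i}: unknown params {sorted(unknown)}")
        for value in args.values():
            # Strings starting with "$" refer to variables set by earlier steps.
            if isinstance(value, str) and value.startswith("$") and value[1:] not in defined:
                errors.append(f"step {i}: undefined variable {value}")
        if "out" in step:
            defined.add(step["out"])
    if terminals != 1:
        errors.append(f"expected exactly one terminal reply/fail, found {terminals}")
    return errors


catalog = {
    "list_orders": {"limit"},
    "get_field": {"source", "name"},
    "notify_customer": {"order", "email"},
}
good = [
    {"op": "list_orders", "args": {"limit": 10}, "out": "orders"},
    {"op": "get_field", "args": {"source": "$orders", "name": "billing_email"}, "out": "email"},
    {"op": "notify_customer", "args": {"order": "$orders", "email": "$email"}},
    {"op": "reply", "args": {"text": "Notifications queued"}},
]
bad = [
    {"op": "delete_everything", "args": {}},
    {"op": "notify_customer", "args": {"order": "$missing", "cc": "x"}},
]
good_errors = validate_plan(good, catalog)
bad_errors = validate_plan(bad, catalog)
```

The good plan yields no errors. The bad plan is rejected before anything runs, for an unknown tool, an unknown parameter, an undefined variable, and a missing terminal step.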
Get traceable output, audit, and production workflow integration
The result is more than a user-facing answer. PlanVault leaves an operational trail: SSE, history, diagnostics, replay, audit logs, encrypted data, and lifecycle hooks for your systems.
SSE and history expose planGraph, tool events, approvals, errors, and terminal states.
Diagnostics and replay help investigate runs without unsafe raw-content access by default.
Envelope encryption, retention, export/erasure, and pseudonymisation support GDPR-oriented operations.
Lifecycle webhooks and APIs let you integrate incrementally with the team and services you already have.
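A client consuming the SSE stream would parse standard Server-Sent Events framing: "event:" and "data:" fields, with a blank line ending each event. The parser below is a minimal sketch of that framing only; the event names and payload fields in the sample stream are hypothetical, not PlanVault's actual event schema.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, json_payload) pairs from Server-Sent Events lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\n")
        if line == "":                  # a blank line ends one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):      # comment line, often a keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())


# Hypothetical sample stream for illustration.
stream = [
    "event: tool_event",
    'data: {"tool": "list_orders", "status": "done"}',
    "",
    ": keep-alive",
    "event: approval",
    'data: {"status": "waiting"}',
    "",
]
events = list(parse_sse(stream))
```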
How systems connect, get governed, and emit signals
One catalog and one runtime for inbound integrations, internal routing, and outbound observability and compliance signals.
Inputs and existing surfaces
- REST / OpenAPI
- MCP servers
- Inbound webhooks
- Outbound webhooks
- Knowledge Base
- Context and session metadata
PlanVault control layer
- Tool catalog
- Adaptive retrieval
- Semantic Routing Cache
- Manual scenarios
- Runtime policy gates
- Secrets boundary
- Execution FSM
- Idempotency
- Audit / replay
- Cost tracking
Outputs and operations
- Controlled API calls
- Lifecycle webhooks
- SSE progress
- Diagnostics
- Replay
- Audit logs
- Spend, token limits, and budget caps
- Granular permissions
- GDPR export / delete
Architecture and developer workflow for governed AI execution
One section for the team that integrates and operates the platform: service contracts, tool catalog, runtime policy gates, diagnostics, replay, and the docs-to-production path.
Connect existing service contracts
Import OpenAPI, connect MCP servers, or add webhooks without rewriting services around an agent framework.
See what the model is allowed to use
Review flattened schemas, search documents, scenario guidance, and the shortlist that will become planner context.
Find integration failures faster
Diagnostics surface contract drift, failed tool calls, validation errors, and runtime state without manual payload archaeology.
Recover from persisted execution state
Replay and history let teams reproduce a run from the controlled event log instead of reconstructing it from chat.
Debug the system boundary, not the model transcript
Instead of debugging agent behavior from model logs, teams inspect service contracts, selected context, runtime state, and the replay path.
OpenAPI + MCP + webhooks indexed
Shortlist, schemas, and scenario guidance visible
Contract drift marked for shadow version
Replay from persisted run events