PlanVault platform for AI execution

Connect existing services, govern tool execution, and keep every run observable. Secrets stay outside prompts.

REST, MCP, webhooks

Policies and permissions

Audit and replay

app.planvault.ai/runs/pv-2481

Product proof

A model plan becomes a validated, replayable execution trace

Plan only · Runtime controlled · Audit ready

Validation

PlanValidator passed

Schemas, params, and terminal path checked before execution.

Action control

Approval required

Runtime pauses side effects until policy approval.

Data boundary

Model-safe context

Secrets stay out of the prompt. Raw large outputs stay out of the planner by default; the only exception is a bounded evidence excerpt, explicitly enabled for read-only replans.

Model output

The LLM returns a plan, not side effects

plan "notify_customers"
  orders = list_orders(limit: 10)
  for order in orders
    email = get_field(order, "billing_email")
    if email.present?
      notify_customer(order.id, email)
  reply("Notifications queued")

No plaintext secrets in prompt

Allowed tools and parameters only

Exactly one terminal reply/fail path
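
The checks listed above can be illustrated with a small validator. This is a minimal sketch, not PlanVault's PlanValidator: the flat step list, the `ALLOWED_TOOLS` map, and `validate_plan` are hypothetical names chosen for illustration, and only straight-line plans are modelled (no loops or branches).

```python
from dataclasses import dataclass, field

# Hypothetical allow-list: tool name -> permitted parameter names.
ALLOWED_TOOLS = {
    "list_orders": {"limit"},
    "get_field": {"record", "name"},
    "notify_customer": {"order_id", "email"},
}
TERMINALS = {"reply", "fail"}


@dataclass
class Step:
    name: str
    params: dict = field(default_factory=dict)


def validate_plan(steps: list[Step]) -> list[str]:
    """Return validation errors; an empty list means the plan passed."""
    errors = []
    terminal_count = sum(1 for s in steps if s.name in TERMINALS)
    if terminal_count != 1:
        errors.append(f"expected exactly one terminal step, found {terminal_count}")
    elif steps[-1].name not in TERMINALS:
        errors.append("terminal step must be the last step")
    for step in steps:
        if step.name in TERMINALS:
            continue
        allowed = ALLOWED_TOOLS.get(step.name)
        if allowed is None:
            errors.append(f"tool not allowed: {step.name}")
            continue
        unknown = sorted(set(step.params) - allowed)
        if unknown:
            errors.append(f"unknown params for {step.name}: {unknown}")
    return errors
```

Running the validator before any tool is called means a rejected plan costs nothing downstream: no request has been sent and no state has changed.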

Validation + runtime

PlanVault validates and executes

tool shortlist

done

Adaptive retrieval selected the allowed tools.

PlanValidator

done

Tools, params, variables, and terminal state checked.

approval gate

waiting

Policy pauses side effects before customer notification.

queued

Final response waits for controlled execution state.

Audit / replay

Every step leaves evidence

Approval event

Approval policy, actor, and decision are preserved with the run.

Secret boundary

Secrets resolve inside runtime and never become prompt text.

Replay trail

Diagnostics keep the trace inspectable without exposing raw output by default.

Replay available for this run

Technical differentiators

Controls for risks you cannot leave to the model

Data boundaries, routing, recovery, secrets, and large responses are controlled by the system, not prompt instructions.

Data boundary

Data stays under your control

Deploy inside your VPC or an air-gap-capable environment, connect local models through LiteLLM, and keep data inside your own perimeter.

Tool routing

Tool scale without prompt chaos

PlanVault narrows a large service catalog to the relevant tools before planning, so the model does not see thousands of descriptions or spend tokens on routing. A Semantic Routing Cache accumulates anonymized embedding vectors from successful runs and improves retrieval accuracy over time, without ever storing raw query text.

Recovery

Recovery without uncontrolled side effects

Every execution step is recorded as an event. After a failure, a run can recover or stop safely without zombie processes or duplicated mutations.
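
One way to picture recovery from recorded events is sketched below. It assumes an in-memory event log and hypothetical names (`EventLog`, `run_step`, a key built from run and step IDs); it is not PlanVault's engine, only an illustration of how a replayed step can reuse a recorded result instead of repeating a mutation.

```python
from typing import Any, Callable, Optional


class EventLog:
    """Append-only in-memory log; a real system would persist this durably."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    def append(self, event: dict) -> None:
        self.events.append(event)

    def completed(self, key: str) -> Optional[dict]:
        """Return the completion event for a step key, or None if it never completed."""
        for event in self.events:
            if event["type"] == "step_completed" and event["key"] == key:
                return event
        return None


def run_step(log: EventLog, run_id: str, step_id: str, action: Callable[[], Any]) -> Any:
    """Execute a step at most once per (run_id, step_id), using the log as the source of truth."""
    key = f"{run_id}:{step_id}"
    done = log.completed(key)
    if done is not None:
        # Recovery path: reuse the recorded result instead of repeating the side effect.
        return done["result"]
    log.append({"type": "step_started", "key": key})
    result = action()
    log.append({"type": "step_completed", "key": key, "result": result})
    return result
```

The sketch does not cover a crash between the side effect and the `step_completed` event; closing that window usually also requires an idempotency key that the downstream service honours.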

Secrets

Late-bound secrets, encrypted at rest

Secrets are late-bound through scoped references; prompts receive variable names, not plaintext credentials.
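
Late binding can be sketched as a substitution that happens only at call time. The `{{secret:NAME}}` placeholder syntax, the dictionary used as a vault, and `resolve_secrets` are assumptions made for this example, not PlanVault's actual reference format or API.

```python
import re

# Placeholder syntax is an assumption for this sketch: {{secret:NAME}}
SECRET_REF = re.compile(r"\{\{secret:([A-Za-z0-9_]+)\}\}")


def resolve_secrets(params: dict[str, str], vault: dict[str, str], scope: set[str]) -> dict[str, str]:
    """Replace secret references with values just before the tool call.

    The planner only ever sees `params` with the references intact.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in scope:
            # A reference outside the caller's scope is rejected, not resolved.
            raise PermissionError(f"secret not in scope: {name}")
        return vault[name]

    return {key: SECRET_REF.sub(substitute, value) for key, value in params.items()}
```

Because the original `params` dictionary is never mutated, the version that reaches prompts and logs still contains only the variable name.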

Large outputs

Large responses without context overflow

The runtime works with large service responses without pushing raw payloads into the LLM by default; read-only tools can explicitly opt into a bounded evidence excerpt for replanning.
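
The default-off, bounded opt-in described above might look like the following. This is a sketch under assumptions: `planner_view`, its flags, and the character-based limit are hypothetical, and a real runtime may bound evidence differently (for example by tokens or fields).

```python
import json


def planner_view(tool_name: str, payload: object, read_only: bool,
                 evidence_enabled: bool = False, max_chars: int = 200) -> dict:
    """Build what the planner sees for a tool result.

    By default only metadata is exposed. A bounded excerpt is added only
    when evidence is explicitly enabled AND the tool is read-only.
    """
    raw = json.dumps(payload, sort_keys=True)
    view = {"tool": tool_name, "size_chars": len(raw)}
    if evidence_enabled and read_only:
        view["evidence"] = raw[:max_chars]
        view["truncated"] = len(raw) > max_chars
    return view
```

The full payload stays in the runtime, so later plan steps can still operate on it without it ever entering the model context.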

Outcomes

Named outcomes for stakeholders

Short phrases teams can reuse when explaining the governed layer between models and APIs.

Controlled side effects

Approvals, policies, and runtime pauses stop unintended changes until the next step is explicitly safe.

Less context bloat and cost

Relevant tool selection and compact context reduce wasted tokens on large catalogs and schemas.

Faster integration work

OpenAPI, MCP, and webhooks attach to a catalog without rewriting every agent by hand.

Security review support

Approvals, policies, and execution records stay in one journal for security reviews.

Incident investigation

Replay and event logs help teams understand what actually ran after an outage or ticket.

GDPR and user-data operations

Export, deletion, and pseudonymisation workflows for AI run history, aligned with personal-data obligations.

Head-to-head

Where PlanVault sits next to your agent stack

PlanVault is the governance and execution layer that connects to your agent stack. Existing LangChain, LangGraph, CrewAI, AutoGen, DSPy, OpenAI agent API, or MCP agents can run behind a PlanVault tool boundary. Tool calls, secrets, approvals, and recovery are handled outside the model, and every run is recorded for audit and replay.

Full category comparison

Execution journey

From existing services to governed AI execution

Five steps show where the model stops and the governed runtime takes over: tool selection, validation, execution, audit, and replay.

Connect · 01 / 05

Connect the services that already do the work

Start with what your team already has: APIs, MCP servers, webhooks, session context, and correlation metadata. PlanVault turns them into a tool catalog that is easier to select from and safer to plan against.

OpenAPI/Swagger import runs asynchronously and versions tool definitions.

MCP servers and outbound webhooks become normal tools in the same catalog.

Flattened schemas, descriptions, and search documents make tools easier for retrieval and planner prompts.

X-Request-Id, traceparent, tags, and metadata help correlate runs with your systems.
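
A client could attach the correlation headers named above along these lines. The `traceparent` value follows the W3C Trace Context layout (version, 16-byte trace ID, 8-byte parent ID, flags); the helper name `correlation_headers` and the choice of a UUID for `X-Request-Id` are assumptions for this sketch.

```python
import re
import secrets
import uuid
from typing import Optional

# W3C Trace Context, version 00: 00-<32 hex>-<16 hex>-<2 hex>
TRACEPARENT = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")


def correlation_headers(request_id: Optional[str] = None) -> dict[str, str]:
    """Build headers that let a run be matched to the caller's own logs and traces."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    return {
        "X-Request-Id": request_id or str(uuid.uuid4()),
        "traceparent": f"00-{trace_id}-{parent_id}-01",  # flags 01 = sampled
    }
```

In a service that already participates in a trace, the existing trace ID would be propagated rather than generated fresh.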

Read integration docs

Prepare · 02 / 05

PlanVault prepares the shortlist and context before the LLM

Before the model is called, PlanVault narrows a large catalog into a relevant shortlist. Selection combines retrieval, scenarios, usage signals, and project limits so the model sees prepared context instead of the whole enterprise catalog.

Manual and auto-recorded scenarios provide boosts and optional planner guidance.

Adaptive retrieval moves from direct selection to full-text (FTS), vector, and hierarchical search as the catalog grows. A Semantic Routing Cache accumulates anonymized embedding vectors after successful runs, improving retrieval precision over time without storing any raw query text.

Hybrid fusion combines retrieval score, scenario boost, and capped usage signal.

Secrets and raw large outputs do not become planner prompt content by default; bounded evidence replan is explicit for read-only tools.
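
The hybrid fusion mentioned in the list above can be pictured as a score built from three signals, with the usage signal capped so popularity cannot outweigh relevance. The additive form, the weights, and the names `fuse` and `shortlist` are assumptions for illustration; the source does not specify the actual formula.

```python
def fuse(retrieval_score: float, scenario_boost: float, usage_count: int,
         usage_weight: float = 0.01, usage_cap: float = 0.1) -> float:
    """Combine retrieval score, scenario boost, and a capped usage signal."""
    usage_signal = min(usage_count * usage_weight, usage_cap)
    return retrieval_score + scenario_boost + usage_signal


def shortlist(candidates: list[dict], limit: int) -> list[str]:
    """Rank candidate tools by fused score and keep the top `limit` names."""
    ranked = sorted(
        candidates,
        key=lambda c: fuse(c["score"], c.get("boost", 0.0), c.get("uses", 0)),
        reverse=True,
    )
    return [c["name"] for c in ranked[:limit]]
```

With the cap in place, a heavily used but irrelevant tool gains at most `usage_cap` and stays below tools that actually match the request.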

Read tool selection docs

Plan · 03 / 05

The model plans, but does not execute side effects

The planner can use different providers through LiteLLM, and the planning mode depends on the selected model. The LLM creates a plan from allowed context; execution, secrets, and side effects stay in runtime.

The planner model is resolved at the session, project, organization, or global default level.

Structured JSON and Python-like DSL planning modes are supported.

A utility model can handle short auxiliary responses such as slot summaries.

An embedding model builds vectors for tools and scenarios in the background so future requests can find relevant tools faster.
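
Model resolution across levels can be sketched as a first-match lookup. The precedence shown (most specific level wins) is an assumption read from the order in which the levels are listed, and `resolve_model` and the settings dictionary are hypothetical names.

```python
from typing import Optional

# Assumed precedence: the most specific level that has a value wins.
PRECEDENCE = ("session", "project", "organization", "global")


def resolve_model(settings: dict[str, Optional[str]]) -> tuple[str, str]:
    """Return (level, model) for the first level that configures a model."""
    for level in PRECEDENCE:
        model = settings.get(level)
        if model:
            return level, model
    raise LookupError("no model configured at any level")
```

Returning the level alongside the model name makes it easy to record in run history why a given model was used.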

Read planner architecture

Execute · 04 / 05

PlanVault validates the plan and executes it deterministically

The plan is validated and executed through a reactive JVM actor backend. Plan approval, runtime policy for concrete tool calls, idempotency safeguards, error handling, recovery, and event sourcing live here, instead of letting the model directly control tool calls.

PlanValidator checks tools, params, variables, and exactly one terminal reply/fail path.

The execution graph runs through a crash-resilient engine with event-sourced state.

Runtime gates can wait for approval or block a tool call after live parameters are evaluated.

Retry and replan paths are bounded by configuration and runtime control.
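
A runtime gate that decides on a concrete call, once live parameters are known, might be sketched as follows. The `Decision` values, the `gate` function, and the recipient-count rule are hypothetical; they illustrate the idea of evaluating policy against real values rather than against the plan text.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WAIT_FOR_APPROVAL = "wait_for_approval"
    BLOCK = "block"


def gate(tool: str, params: dict, side_effect_tools: set[str],
         max_recipients: int = 50) -> Decision:
    """Decide on one concrete tool call using its live parameter values."""
    if tool not in side_effect_tools:
        # Tools without side effects pass through without approval.
        return Decision.ALLOW
    recipients = params.get("recipients", [])
    if len(recipients) > max_recipients:
        # Only visible at runtime: the plan text cannot know the list size.
        return Decision.BLOCK
    return Decision.WAIT_FOR_APPROVAL
```

Plan-time validation cannot catch the blocked case here, because the recipient list only exists after an earlier step has run.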

Read runtime controls

Operate · 05 / 05

Get traceable output, audit, and production workflow integration

The result is more than a user-facing answer. PlanVault leaves an operational trail: SSE, history, diagnostics, replay, audit logs, encrypted data, and lifecycle hooks for your systems.

SSE and history expose planGraph, tool events, approvals, errors, and terminal states.

Diagnostics and replay help investigate runs without unsafe raw-content access by default.

Envelope encryption, retention, export/erasure, and pseudonymisation support GDPR-oriented operations.

Lifecycle webhooks and APIs let you integrate incrementally with the team and services you already have.
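
Pseudonymisation of run history, mentioned in the list above, is often done with a keyed hash so that records stay linkable without exposing the original identifier. The sketch below uses HMAC-SHA256 from the Python standard library; nothing in the source says PlanVault uses this technique, and `pseudonymise` and the `user_` prefix are assumptions.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash.

    The same (user_id, key) pair always yields the same pseudonym;
    without the key, the mapping cannot be recomputed.
    """
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:16]}"
```

Destroying or rotating the key breaks the link between pseudonyms and identifiers, which is one reason a keyed hash is preferred over a plain hash for this purpose.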

Read security and operations

Integration surfaces

How systems connect, get governed, and emit signals

One catalog and one runtime for inbound integrations, internal routing, and outbound observability and compliance signals.

Inputs and existing surfaces

REST / OpenAPI
MCP servers
Inbound webhooks
Outbound webhooks
Knowledge Base
Context and session metadata

PlanVault control layer

Tool catalog
Adaptive retrieval
Semantic Routing Cache
Manual scenarios
Runtime policy gates
Secrets boundary
Execution FSM
Idempotency
Audit / replay
Cost tracking

Outputs and operations

Controlled API calls
Lifecycle webhooks
SSE progress
Diagnostics
Replay
Audit logs
Spend, token limits, and budget caps
Granular permissions
GDPR export / delete

For engineering teams

Architecture and developer workflow for governed AI execution

One section for the team that integrates and operates the platform: service contracts, tool catalog, runtime policy gates, diagnostics, replay, and the docs-to-production path.

Integrate

Connect existing service contracts

Import OpenAPI, connect MCP servers, or add webhooks without rewriting services around an agent framework.

Inspect

See what the model is allowed to use

Review flattened schemas, search documents, scenario guidance, and the shortlist that will become planner context.

Debug

Find integration failures faster

Diagnostics surface contract drift, failed tool calls, validation errors, and runtime state without manual payload archaeology.

Replay

Recover from persisted execution state

Replay and history let teams reproduce a run from the controlled event log instead of reconstructing it from chat.

Engineering handoff

Debug the system boundary, not the model transcript

Instead of debugging agent behavior from model logs, teams inspect service contracts, selected context, runtime state, and the replay path.

Catalog

OpenAPI + MCP + webhooks indexed

Context

Shortlist, schemas, and scenario guidance visible

Failure

Contract drift marked for shadow version

Recovery

Replay from persisted run events

✓ Versioned tool definitions

✓ Diagnostics without raw content by default

✓ Replay from event-sourced state

✓ Docs/API path for production integration

Read developer docs · Open API docs