Skip to content

Laws of AI Agents Audit your agent

Audit Kit · Free during launch

AI Agent Audit Intake Checklist

Paste what you have. Missing items are fine; the audit will mark confidence and unknowns.

Pick The Surface

Code repo: local files, GitHub repo, or uploaded source.
Workflow builder: n8n, Zapier, Make, Retool, Voiceflow, Botpress, or similar.
SDK/API: OpenAI Agents SDK, Assistants, LangGraph, LangChain, CrewAI, AutoGen, Semantic Kernel, or custom stack.
Black-box: transcripts, screenshots, demos, or observed behavior only.
Client report: audit output needs to become a formal deliverable.

If you are unsure, start with the general sections below and include whatever screenshots, exports, or traces you have.

Agent Overview

What does the agent do?
Who uses it?
What actions can it take?
What would a bad failure look like?
Where does it run?
What is the highest-stakes thing it can do without a human?

Prompts

System prompt:
Developer prompt:
User prompt template:
Tool-selection instructions:
Refusal/escalation instructions:

Tools

For each tool:

Name:
Description shown to the model:
Inputs:
Outputs:
Permissions:
Side effects:
Can it read private data?
Can it send data outside the system?
Is there a human approval step?

Retrieval / Memory

What sources can the agent retrieve from?
How are documents chunked?
How are chunks ranked or reranked?
How is freshness handled?
Are citations required?
Can user-supplied or web content enter the same context as privileged instructions?

Evals

What test cases exist?
What metrics are tracked?
Are failures sliced by task type, customer, language, or risk?
Is there a regression suite?
Are real production traces reviewed?

Logs / Traces

Include 1-3 successful runs and 1-3 failed runs:

User input:
Retrieved context:
Model output:
Tool calls:
Final result:
What went wrong or right:

Constraints

Latency target:
Cost target:
Autonomy level:
Human review requirements:
Sensitive data involved:
Compliance or legal constraints:

n8n / Workflow Builder Exports

Exported workflow JSON or blueprint:
Screenshots of AI/prompt nodes:
Screenshots of action/tool nodes:
Credential scopes, described without secrets:
Error branches and retry settings:
Sample execution log:

Code Repo / SDK Artifacts

Agent entrypoint files:
Tool/function schema files:
Retrieval or memory files:
Eval/test files:
Trace/log examples:
Deployment config:

Black-Box Evidence

Public docs or screenshots:
5-10 transcripts:
Known complaints or incidents:
Probe questions you are allowed to run:
What you cannot inspect:

View raw Download zip

All kit files