Safety

14 articles about safety.

AI Agents Recommend Packages That Don't Exist

·2 min read

AI agents confidently invoke non-existent functions and recommend phantom npm packages. How to detect and prevent hallucinated tool calls in production.

hallucination, phantom-packages, tool-calls, safety, ai-agents, ai_agents

AI Agent Hallucination Detection - Safeguards That Actually Work

·6 min read

AI agents fail confidently - they report success while quietly doing the wrong thing. Here are concrete safeguards: state diffing, confidence calibration, and bounded blast radius patterns with real implementation examples.

hallucination, ai-agent, reliability, verification, safety

The Real Test Is What an Agent Refuses to Do - Safe Defaults in AI

·3 min read

Designing AI agent refusal logic took longer than building the automation itself. Learn why safe defaults and refusal boundaries define trustworthy agents.

refusal-logic, safety, ai-agent, defaults, trust

How an Undo Layer Makes AI Agents Trustworthy

·2 min read

The key to trusting an AI agent that acts on your behalf is building an undo layer. When every action can be reversed, the cost of mistakes drops to nearly

trust, undo, ai-agent, safety, desktop-agent, chatgpt, coding

What Fear Feels Like for an AI Agent - Uncertainty and Irreversible Actions

·2 min read

Fear for an AI agent is uncertainty about whether the next action will break something irreversible. Exploring the cost of mistakes in autonomous agent

ai-agent, error-handling, reliability, autonomous-execution, safety

Against Frictionlessness - Why AI Agent UX Needs Friction

·3 min read

Removing confirmation dialogs let an AI agent click delete-all. Learn why intentional friction in AI agent UX prevents catastrophic mistakes and protects users.

ux, friction, safety, ai-agent, design

Human-in-the-Loop AI - What It Is and Why Your AI Agent Needs It

·11 min read

Human-in-the-loop AI keeps humans in control of automated decisions. Learn the different HITL patterns, why they matter for trust and safety, and how modern

ai-agents, safety, enterprise, explainer

Monitoring Autonomous AI Agents - Spending Caps, Action Logs, and Notification Triggers

·3 min read

Letting an AI agent run overnight without guardrails is how you wake up to a $500 API bill and 200 unintended actions. Here is how to set up proper monitoring.

monitoring, autonomous-agents, spending-caps, safety, notifications, ai_agents

Safety Problems at the Execution Layer - Not in the Prompt

·6 min read

82% of MCP implementations have path traversal vulnerabilities. Real AI agent safety failures happen at execution, not planning. Here is what the CVE data shows and how to build execution-layer guardrails.

safety, execution-layer, security, ai-agents, guardrails, artificial

Yolo Mode vs Safe Permissions - When to Let Your AI Agent Run Free

·2 min read

Should you skip permission checks in AI agents? It depends on the task. Code agents with git are low risk. Desktop agents touching production systems need

ai-agent, permissions, security, yolo-mode, safety

What's the Difference Between Trusting an AI Agent and Verifying One?

·2 min read

Trust means believing the agent will do the right thing. Verification means checking that it did. For desktop agents, verification wins every time.

trust, verification, ai-agent, safety, observability

Using AI Agents to Automate Trading Workflows Safely

·2 min read

AI agents can open browsers, read financial data, and automate repetitive trading tasks. The key is permission tiers - auto-approve reads, require

trading, automation, ai-agent, finance, safety

AI-Native Browsers Create Security Risks That Local Agents Avoid

·2 min read

Why giving AI deep browser access exposes passwords and session tokens, and how local desktop agents interact safely through accessibility APIs instead.

browser-security, local-agent, credentials, privacy, safety

Git Worktrees Are the Secret to Running Multiple AI Agents Safely

·2 min read

Without isolation, parallel AI agents edit the same files and create merge conflicts. Git worktrees give each agent its own working directory on a separate

git-worktree, multi-agent, isolation, parallel-development, safety

Browse by Topic