AI Safety

6 articles about AI safety.

The Smart Knife Problem - Why AI Agents Should Be Tools, Not Autonomous Weapons

·2 min read

AI agents work best as tools with clear boundaries, not autonomous systems making decisions without oversight. The smart knife problem explained.

ai-safety, agent-boundaries, ai-agent, trust, desktop-automation

AI Agent Failure Rates and the Desktop Permissions Problem

·3 min read

AI agents fail more often than people think. When desktop agents can click anything and type anywhere, one hallucinated action can send emails or delete files.

ai-safety, permissions, desktop-agent, failure-rate, risk-management

AI Agent Security Is Backwards - Why Input Validation Matters More Than Output Verification

·2 min read

Most AI agent security focuses on verifying outputs - did the click land correctly? But unsigned, unvalidated inputs are the real attack surface.

ai-safety, agent-security, input-validation, desktop-agent, prompt-injection

Designing a Tiered Permission System for AI Desktop Agents

·3 min read

Full YOLO mode is dangerous and full approval mode is unusable. Tiered permissions with allowlists per action type hit the sweet spot.

permissions, ai-safety, ux-design, desktop-agent, architecture

How to Build AI Agents You Can Actually Trust - Bounded Tools and Approval UX

·3 min read

Giving AI agents broad system access is a recipe for disaster. How bounded tool interfaces and smart approval flows make desktop agents safe to use.

ai-safety, agent-design, trust, ux, desktop-agent

Prompt Injection and AI Agents - Why Browser-Based Agents Have a Bigger Attack Surface

·3 min read

AI agents that run inside the browser inherit whatever the page feeds them, including injection payloads. Native agents that interact from outside have a smaller attack surface.

security, prompt-injection, browser-agents, native-agents, ai-safety

Browse by Topic