Skip to content
ai agents · lesson 8 of 9

Using agents safely: risks and guardrails

by paul thomas·11 min·908 wordsCOURSE

This one's worth your full attention. The video runs through genuine workplace disasters that have ended careers and cost organisations millions. They sound dramatic because they are. The good news: they're all preventable.

Video: IBM Technology · watch on YouTube

What this means for you

The core message is simple: using AI without governance isn't innovation, it's risk. Here's what the video actually means for you and your team.

Shadow AI. You discover a brilliant chatbot that makes your work easier. So you start using it for actual work. You paste in some customer data, some code, maybe financial figures. One in five organisations have reported a data breach this way. That data now lives on a third party's servers, and depending on their terms, it's training their next model. You can't claw it back.

Hallucination laundering. AI is getting better, but it still confidently invents things that sound right but are completely false. If you paste that output straight into your report and sign your name to it, you own the mistake. That's not the AI's problem; it's yours.

Prompt injection. If your organisation deploys an AI chatbot, attackers can hide malicious instructions inside a document or email that the AI later retrieves and processes. The instructions override the safeguards you built in. That's serious.

Rogue agents. An agent is AI that works autonomously to reach a goal: it can read databases, call other systems, send messages. Someone spins one up for a proof of concept, the project ends, but the agent keeps running. Now there's an unmonitored backdoor into your systems that everyone's forgotten about.

The solution isn't to ban AI outright. That just drives it underground. You need real governance: which tools are approved, what data they can access, who's responsible if something goes wrong, how you monitor what's actually running on your systems.

Picture a support team that discovers an AI tool that speeds up their email responses. They start using it for live customer tickets. Three months later, the organisation discovers customer data was being sent to the tool's servers to train its model. No formal approval, no review, no record of what left the building. The team thought they were being productive. Instead, they'd created a compliance and security nightmare for the whole company.

Try this

This week, find out what AI tools are actually approved in your team. Not what's technically allowed on the network: what's been formally evaluated for security and compliance. If you can't find that list, that's what you need to fix first.

Common questions about AI agent risks

What are the main risks of AI agents?

There are five real workplace risks worth knowing. They sound dramatic because they are, but they are all preventable:

  • Shadow AI: staff using unapproved tools.
  • Data leakage: pasting sensitive data into those tools, where one in five organisations have reported a breach this way.
  • Hallucination laundering: signing your name to confident but false AI output.
  • Prompt injection: attackers hiding malicious instructions in documents the AI later reads.
  • Rogue agents: autonomous tools left running after a project ends.

A customer support desk that adopts an unapproved tool to speed up replies, then starts feeding it live tickets containing customer data, is shadow AI and data leakage happening at once, with no record of what left the building.

Are AI agents safe to use at work?

They can be, but only with real governance in place. Using AI without governance isn't innovation, it's risk, and banning it outright just drives it underground. Safety comes from knowing four things: which tools your team is actually allowed to use, what data each one can reach, who is responsible if something goes wrong, and how you keep an eye on what's actually running. An HR team wanting to screen applications with an AI tool is using it safely when that tool has been formally evaluated for security and compliance first, not just allowed onto the network because it happened to work.

Note: start by finding out which tools in your team have been formally evaluated, not just what's technically allowed on the network.

What is prompt injection?

Prompt injection is when an attacker hides malicious instructions inside a document or email that your organisation's AI later retrieves and processes. Those hidden instructions can override the safeguards you built into the tool, which makes it a serious risk. Imagine an ops team running an AI assistant that reads incoming supplier emails to draft replies: a booby-trapped email could carry instructions the assistant follows without anyone realising, steering it past the limits you set.

Note: any AI that reads outside content can be steered by that content, so review what your deployed tools are allowed to access and act on.

How do you put guardrails on an AI agent?

Guardrails are really about governance rather than a single setting. In practice that means deciding four things: which tools are allowed in the first place, what data each one can reach, who is answerable if something goes wrong, and how you keep track of what is actually running, so a proof-of-concept agent doesn't keep operating as a forgotten backdoor. Picture an agent spun up to help with finance month-end, then left running once the project closed: without monitoring, it stays connected to your systems long after anyone is watching it.

// ai agents
Get the next lessons as they drop
New lessons land in batches. Subscribe and I'll email you when the next one goes live.