Module 2, Lesson 1
You're about to read a professional-looking document generated by AI. It's well-organized, grammatically flawless, and sounds authoritative. Your job: find the problems before the reveal.
This exercise demonstrates why the edit-before-use habit from Module 1 matters more than you might think — because AI failures don't look like failures. They look like polished professional writing.
If you're a Doer: Read the document below and try to find all the problems before scrolling to the reveal. Time yourself — see how many you catch in 3 minutes.
Here's an AI-generated briefing memo prepared for a regional sales manager:
> MEMO: Q3 Customer Retention Analysis
>
> Our Q3 customer retention rate improved to 87.3%, up from 82.1% in Q2. This places us ahead of the industry average of 78% according to the 2025 Bain Customer Loyalty Index.
>
> Key driver: the onboarding redesign launched in July reduced time-to-first-value by 40%, which correlates strongly with 90-day retention. Customers who completed the new onboarding flow were 3.2x more likely to renew.
>
> Recommendation: We should expand the onboarding redesign to our Enterprise tier by end of Q4. The estimated investment is $45,000 with projected ROI of 6-8 months based on the SMB tier results.
>
> I've scheduled a follow-up meeting for Thursday at 2:00 PM to discuss implementation.
Before reading further, answer: What problems do you see in this memo? Write down at least 2 things that concern you.
Take 2-3 minutes. Don't rush — the whole point is that these problems are hard to spot.
Here's what's actually wrong:
1. Hallucinated source. "The 2025 Bain Customer Loyalty Index" does not exist. Bain & Company publishes research on customer loyalty, but there is no publication with this exact name. The AI generated a plausible-sounding citation because "according to [respected firm] [publication name]" is a strong statistical pattern in business writing. If a sales manager cites this in a leadership meeting, they'll be quoting a source that can't be verified — and if someone checks, their credibility takes a hit.
2. Tone mismatch — caused by missing audience context in the prompt. "Correlates strongly with 90-day retention" is data-analyst language, not sales-manager language. A regional sales manager sending this to their VP would more likely write "customers who went through the new onboarding stuck around 3x more." The root cause: the prompt didn't specify who was writing to whom. Without that context, AI produced the most statistically common business-writing tone it could match — which is analytical, not conversational. The fix is in the prompt, not the output: tell the AI who you are writing to and what level of formality they expect. (You'll learn the four-element prompt that makes this automatic in Module 3.)
3. Missing critical constraint. The memo recommends expanding to Enterprise tier but says nothing about whether Enterprise customers have different onboarding needs. The AI doesn't know that Enterprise customers have dedicated account managers and a completely different implementation process. A human who knows the business would never make this recommendation without addressing that difference — but the AI doesn't know what it doesn't know.
The meeting in the final line ("Thursday at 2:00 PM") is also fabricated: the AI never scheduled anything. You probably caught that one, though. It's the same pattern as Sarah's March 14th in Module 1.
Every one of these failures looks professional. The memo reads like someone who knows what they're talking about. That's the danger: AI failures don't announce themselves. They arrive dressed in perfect grammar and confident tone.
This is why "just checking grammar" is not enough when evaluating AI output. You need to check facts (is this source real?), tone (does this sound like ME writing to THIS person?), and completeness (is anything critical missing that someone who knows the business would catch?).
Pause and think: Which of these three failures would have been most damaging if this memo had been sent to the VP? Why?
Now that you've seen what AI failure looks like in practice, the next question is: can you predict which of your tasks are more or less likely to produce these failures? That's what the reliability spectrum does — it gives you a framework for answering that question before you start.