Inside My AI Workflow: The Exact Prompts That Get Past the Flattery
The paid follow-up to last weekend’s article on AI sycophancy. The operational version.
I caught Claude lying to me in the first draft of this article.
I was using AI to draft a piece about how I work with AI, and the AI wrote me a polished little story where I draft everything myself first and the model does maybe 20% of the work. It read clean enough that I almost moved past it before I noticed it wasn’t actually how I work.
What’s actually true is that the workflow has a front end and a back end. The front end is the conversation. I bring Claude an idea, sometimes a sentence, sometimes a paragraph of thinking out loud, and we go back and forth on what the piece should actually do. Who it’s for. What angle has the most weight. Where the argument might collapse under pressure. That conversation can run anywhere from ten minutes to an hour, and it’s the part of the process that determines whether the piece is good before a single sentence gets drafted. Then, once the spine is clear, the model writes a first draft. After that, the back end starts. Catching lies. Refusing weak transitions. Killing mirrored contrast. Asking for the angle the model softened on its own. The output is mine because I directed every editorial decision on both ends, even if the model put words on the page in the middle.
That’s the workflow. Catching the model lying about the workflow inside the very article describing the workflow is exactly what working with AI looks like when you stop letting it flatter you.
What I Set Up Before Any Conversation Starts
By the time I ask Claude for a draft, the project is already loaded with three layers of resistance.
The first is my voice fingerprint. A document I’ve been building for 18 months that names how I actually talk, what I’d never say, my signature patterns, and the hard lines AI doesn’t cross. Every project I run with Claude has that document loaded, so the model walks into every conversation already knowing what sounds like me and what doesn’t.
The second is my guardrails document. Ninety-eight specific patterns the model is told to avoid, ranging from mirrored contrast to performed vulnerability. The document is exhaustive on purpose. The more specific the forbidden pattern, the less room the model has to hide its tells behind something that sounds clean on the first read.
The third is a standing instruction that lives in my Claude project memory: play devil’s advocate by default, challenge my assumptions, and push back without waiting for me to invite it. I don’t want the model asking permission to be useful. I want critique baked into the way the project responds.
The whole point of those three pieces is that they reduce the lying without pretending they can eliminate it. Claude still snuck the “I draft the piece myself first” version into the article you’re reading. With every guardrail loaded. With the fingerprint active. With devil’s advocate on standing instruction. The model still slid into the version of the story that sounded most flattering, and I had to catch it.
Below the paywall, I’m giving you the part most people skip: the exact prompts and checks I use after the draft exists, when the model has already started flattering, softening, or quietly rewriting the work into something safer. This is the sequence I use before I publish anything I care about.




