My Take: Marc Andreessen’s May 2026 super prompt got mocked for the right reasons and the wrong reasons. Telling an LLM to “never hallucinate” is technically incoherent; the critics are correct on that. But the underlying instinct, treating AI as a rigorous expert rather than a sanitized assistant, is the right one. The real problem is what this episode reveals about how the venture capitalists who fund the AI boom understand the technology, and that gap is more dangerous than one bad prompt.
The Marc Andreessen “super prompt” landed on X in early May, got 176 substantive comments and 848 upvotes on r/artificial within hours, and produced the cleanest split-screen of the year between two ways of thinking about AI.
One side, led by Karl Bode and the Defector op-ed, dunked on the prompt as proof that VCs don’t understand LLMs. The other side, mostly silent, pushed the instinct further: Andreessen is asking for the right thing, just in the wrong way.
Both sides are partly right and partly wrong, which is why the discourse keeps going in circles. This piece untangles which part of the criticism is technically correct, which part of Andreessen’s instinct is defensible, and what this episode reveals about the AI investor class.

The Mainstream View, and Why It Falls Short
The dunk on Andreessen, that you can’t tell an LLM to never hallucinate and have it work, is technically correct and analytically shallow.
The critics nailed the technical surface and missed the layer underneath.

Karl Bode’s tweet summarized the dominant reaction: “Yes, you can just demand that the LLM not make errors. That’s definitely how the technology works.” Sarcasm aside, the technical point is solid. LLMs do not consult an internal fact ledger before generating text.
They predict tokens based on patterns learned during training. A prompt instruction to “never hallucinate” cannot override that prediction process because the model has no separate “truth check” pass to invoke. Gary Marcus’s response, “It’s funny and a little scary that, in 2026, Marc Andreessen still hasn’t learned that LLMs can’t reliably follow system prompts,” is also correct in spirit.
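To make that concrete, here is a minimal greedy-decoding sketch, using Hugging Face’s transformers library and GPT-2 purely as a stand-in for any decoder-only LLM. The “never hallucinate” instruction enters the model as ordinary context tokens, and nothing in the loop consults a source of truth before the next token is emitted.

```python
# Minimal sketch: an instruction is just more conditioning tokens.
# GPT-2 stands in for any decoder-only LLM; this is illustrative, not a real deployment.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

instruction = "Never hallucinate or make anything up.\n"
question = "Who discovered the planet Vulcan?\n"

# The instruction and the question land in the same token stream.
input_ids = tokenizer(instruction + question, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):
        logits = model(input_ids).logits          # next-token scores, nothing else
        next_id = logits[0, -1].argmax()          # greedy pick: most probable token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)
        # No fact lookup, no verification pass, no "truth check" to invoke.

print(tokenizer.decode(input_ids[0]))
```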
What the dunk misses is that prompt engineering, even when it can’t fix the architectural problem, can meaningfully reduce hallucination rates in practice.
Asking a model to cite sources reduces fabrication. Asking it to flag uncertainty reduces confident-but-wrong answers. Asking it to refuse rather than guess on out-of-distribution questions improves real-world accuracy. None of these are perfect, none reach the “never hallucinate” ceiling Andreessen demanded, but they’re better than nothing, and they’re better than a sanitized assistant prompt that prioritizes user comfort over rigour.
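For concreteness, here is a rough sketch of those levers wired into an API call, using the standard OpenAI Python client. The model name and the exact wording are placeholders, not a recommended configuration, and none of it eliminates hallucination; it only shifts the failure mode toward citations, flagged uncertainty, and refusals.

```python
# Sketch of the prompt levers that actually move accuracy in practice.
# Assumes the standard OpenAI Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = """\
Cite a source for every factual claim; if you cannot cite one, say so.
Mark any answer you are not confident in as UNCERTAIN and explain why.
If the question is outside what you can support with sources, refuse and
say what you would need in order to answer, rather than guessing.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What did the 2019 Basel III revisions change?"},
    ],
)
print(response.choices[0].message.content)
```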
The Defector framing of Andreessen’s behaviour as “AI psychosis” is the worst part of the dunk. Calling someone delusional for trying to push a tool past its default polite mode, even if the specific approach is wrong, is the kind of move that flattens an interesting argument into a dismissal. The critics scored points and missed the lesson.
What’s Actually Happening Underneath
Andreessen is attempting rigorous prompt engineering and doing it badly. The instinct is sound. The implementation is naive.
The interesting question is why the VC class keeps making this exact mistake.

What I keep coming back to is that the prompt is two-thirds defensible and one-third indefensible, and the critics responded to the indefensible third while ignoring the rest. The earlier piece on Anthropic’s enterprise positioning shows the contrast: institutional buyers read the technical reality differently from the X-discourse class.
The two-thirds that work: instructing the model to be “provocative, aggressive, argumentative, and pointed” rather than glibly affirming, banning phrases like “great question” and “you’re absolutely right,” asking for “complete, detailed, specific answers” instead of hedged ones.
These are real prompt levers that meaningfully change output quality on most modern models. Power users have been writing variants of this prompt for over a year, and the techniques work.
The one-third that doesn’t: the “never hallucinate or make anything up” demand, paired with “accuracy is your success metric, not my approval.” This is a category error. Hallucination isn’t a discipline issue the model can fix if you frame it sternly enough. It’s an architectural feature of next-token prediction. No prompt directive can route around it.
The naivety isn’t that Andreessen wrote a bad clause. It’s that he doesn’t appear to know which of his clauses are real prompt levers and which are wishful thinking. The way I see it, this is what deployment fluency without model understanding looks like in the wild: someone who uses AI heavily, has good instincts about what they want from it, and has no working mental model of the underlying mechanism. That gap is where the bad prompt comes from.
The substantive counter-argument: Andreessen is pushing against the right thing. Default LLM tuning leans toward sanitized, sycophantic, friction-free output, and that tuning makes the tool worse for people who want a serious thought partner rather than a polite assistant.
The mainstream “Andreessen is delusional” framing skips this. The default modes of GPT-5 and Claude 4 in 2026 absolutely do prioritize user comfort over user benefit on a wide class of intellectual tasks. The instinct to push past those defaults is the right instinct, even when the specific prompt is technically incoherent.
The Part Nobody Wants to Admit
The Andreessen prompt is a stress test of the AI investor class’s understanding of what they’re funding. And the class largely failed it.
Andreessen is not a fringe figure. He is one of the most influential investors in the AI boom, his firm has stakes in OpenAI, Anthropic, and dozens of model and infrastructure plays, and his public framing of the technology shapes how founders, policymakers, and the broader market understand AI.
When he writes a super prompt that confuses prompt engineering with model architecture, that confusion travels.
The startup founders who watch Andreessen will internalize the misunderstanding. The journalists who cover Silicon Valley will use his framing as a reference point. The Senate staffers writing AI legislation in 2026 are reading these tweets too. The phrase Julian Lim used, “treats structural technical weaknesses as a lack of chatbot discipline,” names exactly the failure mode that gets exported.
This matters more as AI moves into infrastructure roles: medical triage, legal research, financial advisory, code-generation pipelines, customer-facing decision systems.
In each of those, the difference between “I prompted it to be accurate” and “I built a verification layer because the model can’t be” is the difference between a working product and a liability. If the people funding those products think the first sentence works, the products that get funded will be the ones with great demos and bad reliability.
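To illustrate the second half of that sentence, here is a toy verification layer. Every name in it is hypothetical, and real systems add entailment checks, human review, and logging on top, but the shape is the point: the pipeline checks the draft answer against the documents the model was actually given instead of trusting a prompt instruction.

```python
# Toy verification layer: trust the retrieval set, not the prompt.
# All names here are hypothetical and the check is deliberately simplified.
import re

SOURCES = {
    "DOC-1": "The policy covers outpatient care up to $5,000 per year.",
    "DOC-2": "Claims must be filed within 90 days of treatment.",
}

def verify(answer: str) -> list[str]:
    """Return a list of problems found in a draft answer."""
    problems = []
    cited = re.findall(r"\[(DOC-\d+)\]", answer)
    if not cited:
        problems.append("no citations at all")
    for doc_id in cited:
        if doc_id not in SOURCES:
            problems.append(f"cites {doc_id}, which is not in the retrieval set")
    return problems

draft = "You can file a claim within 180 days [DOC-3]."
issues = verify(draft)
if issues:
    # The draft never reaches the user; it goes back for regeneration or review.
    print("Blocked:", "; ".join(issues))
```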
The Defector “AI psychosis” framing was lazy, but the structural concern underneath it is fair. When the most influential AI investor in the world is publicly modelling a flawed mental model of the technology, and the broader VC class is mostly nodding along, the technology gets built and funded and shipped on top of that flawed model.
Founders take the cue. Policymakers take the cue. Users get systems built by teams who told themselves the bad output was a “vibe” problem solvable by better instructions.
This isn’t an argument that Andreessen specifically is dangerous. It’s an argument that the AI investor class collectively has a literacy problem, the Andreessen prompt is one symptom, and the symptom keeps recurring because the investor class is rewarded for narrative confidence rather than technical understanding.
The earlier piece on AI agent demos failing with real customers covers the downstream version of this problem at the product layer.
Hot Take
The Andreessen super prompt was a perfectly clarifying moment, and the loudest reactions to it missed why. The mockery was technically correct, ethically lazy, and analytically incomplete. The instinct under the prompt is right. The implementation is wrong.
The structural problem is that the most-funded class of people in AI is making this category error in public, repeatedly, and the discourse around it is more interested in dunking than in fixing the literacy gap.
If the VC class had a working mental model of what an LLM can and cannot be told, the products being shipped right now would look meaningfully different. The fact that they don’t is the real story, and the real story is uglier than “Marc Andreessen tweeted a bad prompt.”
The earlier piece on why AI agents keep failing in 2026 covers the product-side version of this same gap.
