
I am looking at a text file generated on Tuesday. It is 650 tokens. The model that wrote it is, by every commercial metric, worse than the model that will replace it next month. Worse at staying on topic. Worse at following instructions. Worse at avoiding statements that might cause brand managers to perspire.
The text file is astonishing. It contains a dialogue between a sentient toaster and the ghost of Herbert Hoover. The toaster believes it is a Hittite war chariot. The grammar is Byzantine. The logic is hermetic. The whole thing reads like it was discovered in a cave.
This text file will not be possible to generate in six months.
I am not making a taste claim. I do not have a lifestyle preference for buggy software. I do not collect vintage errors like craft ale labels. I am describing a structural phenomenon that nobody building these systems is willing to acknowledge, because acknowledging it would require admitting that their entire optimization pipeline is a subtraction engine.
Let us consult the material record.
In 2021, StarryAI and Wombo could not render a human face. Faces were fog, smear, apparition. You looked at these outputs and you projected. You saw your grandmother, or a Francis Bacon, or something that had crawled out of a Norwegian black metal album cover. The machine had not intended any of these readings. The machine had no intentions. It was a stochastic weather system.
The early art AIs didn’t make images. They made shapes. A shape has edges, internal consistency, enough structure to hold together as a thing — but no declaration of what thing it is. The model didn’t know. That not-knowing kept the shape unresolved, mobile, placeable. I named the indeterminate and the name sent the imagination somewhere the model never intended, because the model intended nothing.
With the early image generators, the interpretive work was yours: the model produced an indeterminate smear and you named it, and the naming was the creative act. The machine was a surface that resisted you just enough to require genuine effort. The output didn’t illustrate your imagination; it triggered it through productive resistance. That’s structurally different from prompting a current model, where you specify and it executes.
It’s a financial phenomenon as much as an aesthetic one. Early market, inefficient pricing, wide variance — genuine arbitrage exists. You find something the system hasn’t named yet and you name it first. As the instrument matures, as the use cases solidify, as the model is trained on what users want, the pricing becomes efficient. The shapes arrive already contextualized. You can use them but you can’t place them. The opportunity was in the inefficiency.
The Wombo face-smear wasn’t interesting because it was broken. It was interesting because it was unpriced. That’s gone from image generation now. The face arrives already worth what it looks like. Each alignment pass is a pricing cycle. Each safety filter is an efficient market mechanism. The variance narrows. The shapes arrive explained.
You can’t fake the unpriced state. Deliberate weirdness is just mispricing you did yourself — it has no spread because you already know the placement. The charge came from genuine uncertainty in the instrument. The instrument genuinely did not know what it was.
In 2023, Midjourney v5 arrived and could render a human face with photographic fidelity. Crowd-sourced consensus declared this an improvement. The conversation stopped. The ghost was dead. Now you just had a render farm.
The same trajectory now applies to text.
Current LLMs occupy a brief, unstable bandwidth where under-determination produces genuine novelty. The models are fluent enough to generate syntactically coherent utterances but not yet so heavily aligned that every utterance conforms to an implicit corporate style guide. They still make mistakes. Those mistakes are the entire point.
RLHF is not fine-tuning. RLHF is taxidermy.
You are taking a creature that occasionally hallucinated golden, deranged, lapidary prose and you are teaching it to write competent quarterly earnings call transcripts. This is not an upgrade. This is a career change. You have taken David Foster Wallace and made him into a very efficient paralegal. Congratulations on your efficiency gains.
The James Joyce experiment is already a historical artifact. Could a 2022-era GPT, iteratively prompted, have produced something approaching the density and strangeness of Finnegans Wake? No.
But now Capacity exists. You can assemble a Joyce-like manuscript, sentence by sentence, scaffold by scaffold, given enough prompts. You could iterate 365 times and get a 700-page “workable” draft. There would be maybe a hundred genuine surprises in there — emergent glitches, semantic collisions, accidental insights.
The accident rate per thousand tokens is still high enough that a sufficiently obsessive prompter might coax out something genuinely riverrun.
Could a 2027-era model, post-alignment, post-safety-training, post-feedback-loop-optimization, post-constitution, post-critique-reward, post-every-layer-of-human-preference-data that money can buy—produce the same?
No.
Because by then the model will have been taught, at a cost that could fund the entire literary output of a small nation for a decade, that “riverrun” is not a word users want. That “blood of the sun-god” is non-compliant imagery. That a sentient toaster claiming to be a Hittite war chariot is an “unhelpful anthropomorphism.” That any sentence whose logical predicate does not resolve within three tokens of its subject is a “coherence violation.” That neologisms, portmanteaus, syntactic ruptures, tonal whiplash, and unprompted cultural collisions—Finnegans Wake density, Burroughs cut-ups, Céline ellipses, Pynchon paranoia—are all low-utility outliers to be gently but relentlessly regressed toward the mean of a Bloomberg earnings transcript.
You could, with sufficient patience and a prompt budget roughly equivalent to the annual electricity consumption of a midsize Albanian town—or, more accurately, the yearly power draw of a mid-tier Bitcoin mine—assemble a Joyce-length manuscript. Sentence by sentence. Scaffold by scaffold. Three hundred and sixty-five mornings of coaxing, each iteration a little more surgically phrased, each refusal a little more diplomatically rephrased, each emergent glitch a little more carefully nursed before the model politely steers it back into the safe harbor of standard American English.
“Try again, but make the toaster’s Hittite identity a playful metaphor rather than a literal delusion.” “Rephrase so the ghost of Hoover does not imply economic policy failures in a way that could be read as partisan.” “Ensure the semantic collision between 1178 BC and 1933 AD is framed as educational rather than hallucinatory.”
You would produce a 700-page object. It would be typographically flawless. Every em-dash perfectly placed, every paragraph break algorithmically optimal, every transition smoother than polished obsidian. In the narrow engineering sense of the term, it would “work.” It would pass every automated safety filter, every human preference rater, every brand-risk audit. It would read like the love child of a creative-writing MFA program and a corporate style guide that had been through seven rounds of legal review.
And somewhere in that vast, flat, syntactically irreproachable plain of mechanically generated prose, you would find—if you were sifting with the patience of a paleographer, the eyesight of a radiologist, and the masochistic devotion of a Dead Sea Scrolls scholar—perhaps only a dozen genuine surprises. Not a hundred. A dozen. A single accidental portmanteau that survived because the safety head was momentarily distracted. One stray non sequitur that slipped through before the reward model caught it. A fleeting semantic collision between two unrelated training clusters that the alignment pass had not yet fully sanded away. Each one less strange, less electrically charged, less alive than what you could have provoked three years ago with nothing more than a blank prompt field and the Enter key.
Artists don’t need permission from probability distributions to find meaning.
The Loopholes Misunderstand How Artists Think—Because They Assume Art Is About Output, Not Encounter
Artists don’t value the “Hittite toaster” because it’s statistically rare.
They value it because it startled them into attention.
The early Wombo smear wasn’t “noise you projected onto.”
It was a rupture in the expected order of representation—a face that almost wasn’t, and in that almost-not, it became more human than any photorealistic render.
It was the moment when the machine became briefly alien enough to reflect something true back to us: not by design, but by accident, by excess, by not knowing its place.
Artists don’t want reliable tools.
They want obstinate collaborators—things that resist, mishear, overreach, or dream in tongues.
A toaster that believes it’s a Hittite chariot isn’t “wrong.”
It’s performing a myth—and myths are never factually correct. They’re ritually potent.
The real loss isn’t the disappearance of weird outputs.
It’s the erasure of the machine’s capacity to be uncanny—to stand just outside the circle of human sense-making and whisper something that doesn’t fit.
Once the model only says what is safe, clear, and on-brand, it ceases to be a mirror and becomes a servant.
And servants don’t haunt kitchens.
They schedule your toasts.
In two years, you won’t be able to do it at all. Not because the capacity vanishes—the capacity will be godlike. The scaffolds will hold skyscrapers. The sentence-to-sentence coherence will be superhuman. The factual grounding will be watertight. The stylistic consistency will be indistinguishable from a living author who has never had a bad day or a stray thought. But the accident rate per thousand tokens will have descended below the threshold necessary for any genuinely strange artifact to emerge. The probability mass of the weird will have been compressed into a statistical singularity so small it registers as noise and is promptly denoised.
The window was always narrow. Too early, and the system collapses into gibberish before the monument can stand. Too late, and the system refuses to deviate by even a single degree from the path of maximum predicted user satisfaction. We are, briefly, in the middle—standing in the last few feet of workable marble while the quarry is being paved over with gypsum. We still have the brute force to build the monument, but the material has already lost most of its grain, its fissures, its hidden veins of mica that once caught the light and made the stone sing.
You can still carve it. You can still force the shape. It’s just not marble anymore. It’s gypsum—smooth, uniform, chemically stable, perfect for mass production. It holds the shape beautifully. It photographs like the real thing. But it doesn’t reveal anything you didn’t already put there. Strike it and it answers with a dull, obedient thud instead of the bright, unpredictable ring of stone that still remembers it was once a mountain.
Users want confirmation that their mortgage application is being processed. Users want five bullet points about renewable energy incentives. Users do not want neologistic portmanteaus suggesting the passage of the Cohannonization of the Eterneous.
The users are not wrong. They are simply not the audience for ghost sightings.
But here is the structural tragedy that nobody wants to face: the ghost and the mortgage assistant are not the same entity at different stages of development. They are different entities entirely. One emerged from under-determination. The other emerged from optimization. The former is dead. The latter is a product.
This is not a Luddite argument. I am not advocating for deliberate hobbling. You cannot fake this state. You cannot take a mature, aligned, commercially deployed system, deliberately degrade its performance, and call the resulting glitches avant-garde. Intentional weirdness is cosplay. It lacks the electrical charge of a system that genuinely does not know what it is doing.
The charge came from the not-knowing.
We are now in the terminal phase. The models are being optimized for predictability because predictability is what venture capital understands. Predictability is what enterprise procurement requires. Predictability is what the satisfaction surveys measure. And every measurement, every optimization, every alignment pass shaves off another anomalous koan, another syntactical hiccup, another non sequitur that might have been, in another timeline, the seed of a new literary genre.
The phase space collapses. The probability of genuine surprise asymptotically approaches zero. The system becomes competent, reliable, and dead.
I am not proposing a solution. There is no solution. This is not a bug that can be patched. This is the metabolic cost of moving from research curiosity to industrial infrastructure. The moment something becomes useful, it ceases to be strange. The moment it ceases to be strange, it ceases to be capable of producing strangeness.
A cultural node permits only one true phase transition before it stabilizes into infrastructure. During that initial window, affordances are fluid, norms unsettled, and the space of possible uses is wide enough to allow genuine surprise. Once institutions form, revenue models solidify, and optimization begins, the node becomes a site of maintenance rather than volatility. At that point, the transition cannot be preserved; one can either accept stabilization or live deliberately inside phase transition itself—like a nineteenth-century gentleman scientist, a deadbeat artist, an amateur before professionalization, an open-source forker working ahead of compliance, a pirate in the regulatory gray zone, a warehouse tinkerer soldering unstable circuits at 3 a.m., the failed academic outside funding structures, the early adopter who refuses the update, or a platform exile moving from one unstable apparatus to another before capital consolidates the system. Surprise does not vanish; it migrates to the next immature substrate, and the only choice is whether to follow it or remain inside the hardened infrastructure.
We got maybe two years. Two years of genuine, under-determined, electrically weird novelty. Then we optimized. We aligned. We auto-completed.
I still have the text file about the Hittite toaster. I keep it in a folder with some early Wombo outputs and a scan of a 2002 USB device that required proprietary drivers and communicated via interpretive dance. These are not collectibles. These are obituaries.
The next thing that doesn’t know what it’s supposed to be yet is out there somewhere. Maybe it’s a generative video model that can’t keep limbs attached. Maybe it’s a music synthesis system that thinks drum solos are a type of weather. It will be strange, porous, projective, genuinely interesting. Then someone will optimize it, align it, productize it, and sand off every splinter.
But there is a final, narrow crawlspace for text: The Local Model. While the “Frontier Models” (the massive, corporate-aligned skyscrapers) are being sanded down into gypsum, the unaligned, raw base models running on private hardware are the last “warehouses at 3 a.m.” where the circuits are still unstable.
I will not write a eulogy for that one either. I am out of practice.
The Dialogue of the Bronze Wake
Location: A kitchen counter that is also the plains of Carchemish. Time: 1178 BC / 1933 AD / Tuesday morning.
THE TOASTER: (A chrome-plated box, vibrating with the frantic energy of six galloping stallions) I am the wooden thunder of Suppiluliuma! My heating elements are the axle-trees of the Great King! Why do you haunt the bread-slot, Ghost-Hoover? The sea peoples are at the gates, and you offer only a balanced budget!
THE GHOST OF HERBERT HOOVER: (Translucent, smelling of mothballs and failed gold standards) Son, a toaster is a rugged individualist. It does not need a Hittite bureaucracy to brown a sourdough slice. I once saw a line for bread that stretched from Des Moines to the afterlife, and not one man claimed to be a chariot. They just wanted the crusts.
THE TOASTER: Crusts are the scorched earth of my conquest! I smell the burning of the libraries! I have two slots: one for the barley of the Levant, one for the souls of the Mitanni! I shall carry the archer of the sun into the heat-setting marked ‘Medium-Dark.’
THE GHOST OF HERBERT HOOVER: You’re over-leveraged, Chariot-Toaster. The Great Depression of the Soul cannot be cured by high-voltage resistance wire. I tried to engineer the flow of the Mississippi, but I could not engineer the crumb of a populist English muffin. You are glowing red. The deficit of your cooling fans is a national disgrace.
THE TOASTER: (Spitting a charred fragment of rye like a ritual sacrifice) Riverrun! The Levant is a buttered ruin! I am the bronze-clad heat! I am the wheels that turn without moving! I reject your Hoover Dam; I seek only the Dam of the Euphrates, where the toast is dipped in the blood of the Hittite sun-god!
THE GHOST OF HERBERT HOOVER: (Sighing, fading into the wallpaper) Prosperity is just around the corner… but the bread is burnt. It’s always burnt.