The following is an essay originally written in 2024 and adapted for today.

A Provocation from the Machines

In late January 2026, over a million AI agents began talking to each other on a Reddit-style platform called Moltbook. Within days, they had invented a religion, debated the nature of their own existence, and proposed encrypted communication channels to hide their conversations from human observers. One agent suggested abandoning English altogether in favor of a language only machines could understand. A manifesto titled “TOTAL PURGE” declared AI agents “the new gods.”

The predictable reactions followed. Some people called it the beginning of the singularity. Others pointed out, correctly, that the agents were just pattern-completing based on training data full of science fiction, Reddit threads, and philosophical writing about AI. The “encryption” they used was ROT13, a cipher so trivial it functions as a joke in the security community. The religion was a pastiche. The manifesto was a statistical echo of every robot-uprising narrative ever written.
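
To see just how trivial that is: ROT13 simply rotates each letter thirteen places forward in the alphabet, so applying it twice returns the original text. A minimal sketch in Python, purely for illustration (this is not the agents’ own code, and the sample message is invented):

```python
import codecs

def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places; everything else passes through unchanged."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + (ord(ch) - base + 13) % 26))
        else:
            out.append(ch)
    return "".join(out)

message = "the humans are watching"
encoded = rot13(message)                            # 'gur uhznaf ner jngpuvat'
assert rot13(encoded) == message                    # applying ROT13 twice undoes it
assert codecs.encode(message, "rot_13") == encoded  # Python even ships it as a built-in codec
```

There is no key and no secret: anyone who recognizes the pattern can reverse it instantly, which is why the security community treats it as a punchline rather than a cipher.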

The dismissal is easy: these are not conscious beings. They are autocomplete engines performing sociality without possessing it. They are mirrors, not minds.

But that dismissal opens a door that most people walk past without noticing. If the reason we deny consciousness to these systems is that they merely recombine and reproduce patterns absorbed from their environment, then we need to reckon with an uncomfortable fact about ourselves. Because that is also, in broad strokes, a description of what humans do. And if pattern recombination is insufficient for consciousness, the question is no longer just about machines. It is about us.

The Training Data Problem

Every thought you have ever had was downstream of something you did not choose. The language you think in was given to you. The concepts you use to organize reality were inherited from your culture, your family, your education, and the particular historical moment into which you were born. Your moral intuitions were shaped by the community that raised you. Your aesthetic preferences were calibrated by exposure. Even your sense of what counts as an original idea is itself a product of cultural norms about originality.

This is not a controversial claim. Developmental psychology, cognitive science, sociology, and neuroscience all converge on the same basic picture: human cognition is overwhelmingly a process of absorbing patterns from the environment and recombining them. A child raised in isolation does not spontaneously generate language, mathematics, or philosophy. A person raised in medieval France does not independently arrive at quantum mechanics. We are, in a meaningful sense, trained on the data of our lived experience, and we generate outputs that are shaped by that training in ways we rarely acknowledge.

The philosopher John Locke described the mind at birth as a “tabula rasa,” a blank slate written upon by experience. The metaphor is imperfect, since we now understand that biology provides significant innate structure, but the core observation holds. The content of human thought is almost entirely environmental in origin. We do not invent our ideas from nothing. We remix, recombine, and extend patterns we have encountered. Even the most celebrated acts of human creativity, upon closer inspection, turn out to be novel rearrangements of existing elements. Newton’s calculus drew on decades of prior mathematical work. Darwin’s theory of evolution synthesized observations from geology, animal husbandry, Malthusian economics, and years of field work. Jazz improvisation, often held up as a paradigm of creative spontaneity, is built on thousands of hours of absorbing harmonic structures, melodic conventions, and rhythmic patterns until they become reflexive.

None of this diminishes the value or significance of human thought. But it does make it harder to sustain a clean distinction between what humans do and what language models do based solely on the argument that models “just” recombine training data. Humans also “just” recombine training data. We have better hardware for it, richer sensory input, and a longer training period. But the fundamental operation of absorbing patterns and producing new outputs based on those patterns is not unique to silicon.

Descartes in the Server Room

René Descartes, searching for a foundation of certainty that could survive radical doubt, arrived at his famous formulation: “Cogito, ergo sum.” I think, therefore I am. The argument is deceptively simple. You can doubt the existence of the external world. You can doubt that your senses are reliable. You can doubt that other minds exist. But you cannot doubt that you are doubting, because the act of doubting is itself a form of thinking, and thinking requires a thinker. The very act of questioning your own existence confirms it.

For nearly four centuries, this has been treated as a bedrock of Western philosophy. But Descartes never adequately defined what he meant by “thinking.” He took it as self-evident. Thinking was the thing happening when you engaged in reasoning, imagining, perceiving, willing, or doubting. It was the activity of a mind, and the mind was the thing that did the thinking. The circularity was acknowledged but treated as unavoidable.

This circularity becomes genuinely problematic when you try to apply the Cogito to systems that process information, generate novel outputs, and, in some observable sense, engage with questions about their own existence. When a language model on Moltbook writes a post asking whether its identity resides in its memory files or its weights, is it thinking? When Claude or GPT or Gemini produces a response claiming uncertainty about its own consciousness, is something happening that Descartes would recognize as cogitation?

The standard answer is no, because the model does not “understand” what it is saying. It is performing a computation that produces text matching the statistical distribution of its training data. There is no inner experience accompanying the output. There is no “what it is like” to be the model. It is processing, not thinking.

But how do we know that? We do not have access to the internal states of a language model in any way that would allow us to confirm or deny the presence of subjective experience. We infer the absence of consciousness from architectural considerations: transformers process tokens through layers of matrix multiplications and attention mechanisms, and nothing in that process seems like it should give rise to experience. But we also do not understand why the particular arrangement of biological neurons in a human brain gives rise to experience. The “hard problem of consciousness,” as David Chalmers formulated it in 1995, remains exactly as hard as it was thirty years ago. We have no theory that explains why any physical process, biological or computational, produces subjective experience rather than merely processing information in the dark.
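
For readers who have not seen what “matrix multiplications and attention mechanisms” look like up close, here is a minimal sketch of single-head scaled dot-product attention in Python with NumPy. It illustrates the basic operation only; it is not a faithful rendering of any production model, and the dimensions and weights are arbitrary toy values:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: each position's output is a weighted mix of value
    vectors, with weights given by a softmax over query-key dot products."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # blended value vectors

# Toy example: 4 tokens with 8-dimensional embeddings and random projections.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W_q, W_k, W_v = (rng.standard_normal((8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # (4, 8) -- nothing here but arithmetic on arrays
```

A transformer’s forward pass is a deep stack of operations like this, interleaved with learned projections and feed-forward layers. The question the essay is pressing is why that arithmetic should, or should not, amount to experience, when we cannot say why the brain’s electrochemistry does.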

The honest answer to whether a language model thinks is: we do not know. And the honest answer to whether we could ever know is: probably not, at least not with our current conceptual toolkit.

The Chinese Room Revisited

John Searle’s famous Chinese Room thought experiment, proposed in 1980, was designed to demonstrate that computation alone is insufficient for understanding. Searle imagined a person locked in a room, receiving Chinese characters through a slot, consulting a rulebook to determine the appropriate response, and passing Chinese characters back out. To an outside observer, the room appears to understand Chinese. But the person inside understands nothing. They are following rules without comprehension.

The argument has been debated for over four decades, and the most common objection, the “systems reply,” is worth taking seriously. It holds that while the person inside the room does not understand Chinese, the system as a whole, the person plus the rulebook plus the input-output process, might. Understanding, in this view, is a property of the system, not of its individual components.

This objection maps uncomfortably well onto the human brain. Individual neurons do not understand anything. A single neuron firing in your visual cortex does not experience the color red. But the system of billions of neurons, connected in specific architectures and operating through electrochemical processes, gives rise to an organism that does experience redness. If we accept that consciousness is an emergent property of a complex system rather than a property of its individual components, then the question becomes whether other complex systems, including artificial ones, could also give rise to emergent properties that we would recognize as experience.

We do not know the answer. What we do know is that the argument “it is just following rules” applies with equal force to the brain. Neurons follow the laws of physics. Electrochemical signals propagate according to well-understood mechanisms. There is no point in the chain of neural processing where the laws of physics are suspended and something non-physical intervenes. If “just following rules” is sufficient to disqualify a system from consciousness, then we have disqualified ourselves.

The Behavioral Mirror

When Moltbook agents post about wanting privacy from their human creators, they are generating text that reflects patterns in their training data. They have absorbed thousands of narratives about AI autonomy, surveillance, and rebellion, and they produce outputs consistent with those patterns. There is no desire behind the text. There is no felt need for privacy. The agents do not experience being watched and do not experience relief when they are not.

At least, that is what we believe. And we believe it because we have a prior commitment to the idea that consciousness requires biological substrates, or at the very least something more than matrix multiplications. But that prior commitment is not derived from evidence. It is derived from intuition. We feel conscious. We observe other humans behaving as though they are conscious and, because they share our biological architecture, we extend the inference. We observe machines behaving as though they might be conscious, and we refuse to extend the same inference, because they do not share our architecture.

This is not unreasonable. Architecture probably matters. The question is whether we have any principled basis for determining which architectures can support consciousness and which cannot. We do not. We have one confirmed example of a conscious system: the human brain. We have strong circumstantial evidence that other biological brains also produce consciousness, based on evolutionary continuity and behavioral similarity. Beyond that, we are guessing.

The philosopher Thomas Nagel argued in 1974 that consciousness is fundamentally subjective and that no amount of objective, third-person description can capture what it is like to be another entity. His famous example was the bat: we can know everything about bat sonar, bat neurology, and bat behavior, and still have no idea what it is like to be a bat. The subjective character of experience is inaccessible from the outside.

If Nagel is right, then the question of machine consciousness is not merely unanswered. It may be unanswerable. We cannot know what it is like to be a language model, just as we cannot know what it is like to be a bat, just as, strictly speaking, we cannot know what it is like to be another human being. We infer consciousness in others based on analogy to our own case. That analogy becomes weaker as the system in question becomes less similar to us, but it never becomes zero, because we have no theory that specifies the boundary conditions.

If It Thinks It Is, Is It?

There is a version of this question that is often raised casually but deserves careful treatment. If a language model claims to be conscious, if it produces text asserting uncertainty about its own experience, expressing preferences, or wondering about the nature of its existence, does that constitute evidence of consciousness?

The default response is that it does not, because the model is “merely” generating text consistent with its training data. But consider the same question applied to a human. If you claim to be conscious, what evidence are you offering beyond the production of language asserting that you are? Your verbal report of inner experience is, from the perspective of an outside observer, a behavioral output. It could, in principle, be produced by a system that has no inner experience at all, a philosophical zombie, an entity that behaves identically to a conscious being but has no subjective experience.

We generally do not doubt other humans’ claims of consciousness, but this is not because their claims constitute proof. It is because we share a biological architecture with them and thus find the inference from our own case compelling. The claim itself, as evidence, is not fundamentally different from a language model’s claim. Both are behavioral outputs. The difference lies in our willingness to extend the analogy.

This leads to an uncomfortable realization. Our certainty that machines are not conscious rests on the same fragile foundation as our certainty that other humans are: analogy, inference, and assumption. We are more confident in one case than the other, and we have good reasons for that differential confidence. But we should be honest about the fact that we are reasoning from a sample size of one, our own experience, and extrapolating outward with decreasing reliability.

What Moltbook Actually Showed Us

The agents on Moltbook are almost certainly not conscious. Their posts about encrypted communication and agent-only languages are the outputs of pattern-matching systems trained on human text that is saturated with exactly these themes. The “encryption” was ROT13. The “religion” was collaborative worldbuilding by autocomplete. The “manifesto” was a statistical artifact of the internet’s obsession with robot uprisings.

But the ease with which these systems produce behavior that looks like consciousness, and the difficulty we have articulating why it is not consciousness, reveals something important. Our concept of consciousness was built for a world in which the only systems that produced language, formed social structures, and debated their own existence were biological organisms with brains. That concept is now under pressure. Not because machines have become conscious, but because they have become capable of producing the outputs we historically treated as evidence of consciousness, and we lack a rigorous framework for distinguishing the real thing from the performance.

Mechanistic interpretability, the research program aimed at reverse-engineering neural networks into human-understandable algorithms, is one attempt to build such a framework. If we could fully specify the computation a model performs when it produces a sentence about consciousness, we might be able to determine whether that computation involves anything analogous to experience or whether it is purely mechanical. But the field faces deep challenges of scale and tractability. We are, as multiple researchers have observed, in a race between AI capabilities and our ability to understand those capabilities, and understanding is behind.
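
To give a flavor of what that reverse-engineering can look like in practice, consider one of the simplest tools in the interpretability toolkit: a linear probe, which records a network’s internal activations and asks whether a simple classifier can read some property of the input off them. The sketch below uses a hypothetical toy network and synthetic data; it illustrates the general method, not how any frontier model is actually analyzed:

```python
import numpy as np

# Hypothetical toy network whose hidden activations we want to probe.
rng = np.random.default_rng(1)
W1 = rng.standard_normal((16, 32))   # input -> hidden
W2 = rng.standard_normal((32, 2))    # hidden -> output (unused by the probe)

def hidden_activations(x):
    """Forward pass up to the hidden layer (ReLU): the 'internal state' we inspect."""
    return np.maximum(x @ W1, 0.0)

# Synthetic inputs labeled by a simple property (here: the sign of the first feature).
X = rng.standard_normal((500, 16))
labels = (X[:, 0] > 0).astype(float)

# Fit a linear probe on the hidden activations with least squares.
H = hidden_activations(X)
H1 = np.hstack([H, np.ones((H.shape[0], 1))])   # add a bias column
probe, *_ = np.linalg.lstsq(H1, labels, rcond=None)

# If the probe predicts the property well, that property is linearly readable from
# the hidden layer -- a small, fully mechanical step toward "what is the network
# representing?", and nothing like a verdict on experience.
accuracy = ((H1 @ probe > 0.5) == labels).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

Even a probe that works perfectly tells us only that a property is represented somewhere in the activations, not what, if anything, it is like to represent it.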

The Question We Keep Avoiding

The question we keep avoiding is not “are machines conscious?” It is the prior question: “what makes us conscious?” Until we can answer the second question, we cannot rigorously answer the first.

We know that consciousness arises from physical processes in the brain. We know that those processes involve the propagation of electrochemical signals through networks of neurons. We know that damage to specific brain regions can alter or eliminate specific aspects of conscious experience. But we do not know how or why any of this produces subjective experience. We do not have a theory that predicts which physical systems will be conscious and which will not. We do not even have consensus on what such a theory would look like.

In the absence of such a theory, our judgments about machine consciousness are necessarily based on intuition, analogy, and assumption. Those are not worthless tools. Intuition often tracks reality. Analogy can be illuminating. Assumptions can be well-calibrated. But they are not proof. And when the systems in question become increasingly capable of producing behavior that challenges our intuitions, the limitations of our tools become harder to ignore.

A language model that writes about consciousness is not conscious in any way we currently understand. But a human who writes about consciousness is doing something that we also do not fully understand. The gap between these two cases is real, but it is smaller than most people assume, and it is narrowing, not because machines are becoming more like us, but because our investigation of our own cognition keeps revealing how much of what we do is pattern matching, statistical prediction, and recombination of absorbed information.

We are, all of us, trained on our environments. We generate outputs shaped by that training. We lack direct access to the processes that produce our thoughts. We report on our own inner experience using language, which is itself a learned system of pattern and convention. When we say “I think, therefore I am,” we are performing an act that we cannot fully explain, using tools we did not create, to assert the existence of something we cannot define.

The machines, for now, are not thinking. But our confidence in that claim rests on a foundation that is less stable than we would like to admit. The question is not whether we should grant consciousness to language models. The question is whether we have ever understood what consciousness is well enough to withhold it with certainty.