Why the Origin of the Genetic Code May Be Conceptually Misframed
Introduction
How did the genetic code originate? This question is central to origin-of-life research. It asks how specific nucleotide triplets came to be systematically associated with specific amino acids, a mapping that underlies all known life and is shared, with only minor variations, by virtually all known organisms.
But one structural issue is frequently overlooked in this debate. The genetic code does not exist independently of the molecular machinery that interprets it, yet this machinery is itself a product of the very system it defines. The aminoacyl-tRNA synthetases that pair each amino acid with its tRNA, for example, are proteins, and so are themselves made by translation. The result is a fundamental circularity: the code presupposes its own implementation. Current explanatory approaches tend to displace this circularity rather than dissolve it, with each model regenerating the same logical gap at a different point in the account.
This paper formulates the issue as the bootstrapping problem of the genetic code, develops a precise criterion against which explanatory approaches can be assessed, and argues that the pattern of failure across the approaches examined is not accidental. It is instead a structural signal that the conceptual framework in which the problem is posed may itself require reconsideration.
Abstract
The genetic code mediates the mapping of nucleotide triplets to amino acids and constitutes the molecular basis of all known life. The mechanism of this mapping is largely understood at the molecular-biological level. What remains unresolved is a prior question: how could a physical coupling between nucleotide sequences and amino acids arise, and become stable and heritable, before a functional translation apparatus existed? This paper designates this the bootstrapping problem and develops a three-part criterion against which explanatory approaches can be assessed. The criterion requires sequence-specific reproducibility, independence from functional translation, and prebiotic physicochemical realisability. It is applied to five candidate accounts: the RNA World hypothesis, chemical affinity hypotheses, autocatalytic network models, co-evolution models, and reflexive self-organisation (Wills 2023). The paper argues that none of the five satisfies all three components simultaneously, although they fail at different components. It concludes with the thesis that this form-invariant pattern of failure against the same criterion is not merely an epistemic signal but an indication that the descriptive category in terms of which the problem is formulated may itself need to be reconsidered.
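The logical shape of the three-part criterion can be sketched in a few lines of Python. This is only an illustration of how a conjunctive criterion behaves: the class name, field names, and both example assessments are hypothetical placeholders, and the boolean values below are not the paper's verdicts on any real account.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Assessment:
    """One explanatory account scored against the three components."""
    sequence_specific_reproducibility: bool
    independent_of_functional_translation: bool
    prebiotically_realisable: bool

    def satisfies_criterion(self) -> bool:
        # The criterion is conjunctive: all three components must hold at once.
        return all(asdict(self).values())

    def failed_components(self):
        # Names of the components that do not hold, in declaration order.
        return [name for name, ok in asdict(self).items() if not ok]


# Hypothetical placeholder accounts (assumed values, for illustration only).
account_a = Assessment(True, False, True)
account_b = Assessment(True, True, False)

for label, account in (("A", account_a), ("B", account_b)):
    print(label, account.satisfies_criterion(), account.failed_components())
```

The two placeholder accounts fail at different components yet both fail the conjunction, which is the pattern the abstract describes: different points of failure, the same overall outcome.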
Key Idea
The central claim is precise but far-reaching: the genetic code is not explained as long as the explanation presupposes a system that already encodes.
Five structurally distinct scientific models (RNA-based, affinity-based, network-based, co-evolutionary, and reflexive) leave equivalent explanatory gaps when each is assessed against the same three-part criterion, although they fail at different components of it. The paper takes this form-invariance as evidence for the hypothesis that the difficulty lies not in the details of individual theories, but in the conceptual framework through which the problem is currently posed.
The paper further argues that underlying the bootstrapping problem is a still more fundamental question: how does a physical sequence become an instruction at all? A nucleotide sequence is, at the level of chemistry, a polymer. It becomes a specification only in relation to a system that treats it as one — a system that, in the living cell, is itself generated by the sequence it interprets. This points to the conditions under which a causal system acquires the normative structure of a semiotic one: not a question about minds, but about the explanatory level at which physical dynamics give rise to correctness conditions. The deepest form of the bootstrapping problem is not how the coupling arose. It is how matter came to mean.
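The dependence of a sequence's "meaning" on the interpreting system can be made concrete with a toy sketch. It assumes two partial codon tables taken from the standard code and the vertebrate mitochondrial code, which differ at AUA and UGA; only the codons used in the example are listed, and the function and variable names are illustrative. The sketch shows interpretation relative to a decoder, not how any decoder could arise.

```python
# Partial codon tables: only the codons used below, not complete tables.
STANDARD = {"AUG": "Met", "AUA": "Ile", "UGA": "Stop", "UUU": "Phe"}
VERT_MITO = {"AUG": "Met", "AUA": "Met", "UGA": "Trp", "UUU": "Phe"}


def translate(rna, table):
    """Read an RNA string codon by codon until a stop codon or the end."""
    peptide = []
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        residue = table[rna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


sequence = "AUGAUAUGAUUU"  # codons: AUG AUA UGA UUU

print(translate(sequence, STANDARD))   # ['Met', 'Ile']
print(translate(sequence, VERT_MITO))  # ['Met', 'Met', 'Trp', 'Phe']
```

The polymer is identical in both calls; what it specifies changes with the table. In the cell, the counterpart of that table is realised by machinery that is itself produced through translation, which is where the circularity enters.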
