Ask HN: Can we think of DNA as Infrastructure as Code?
And is there any study on the parallels between human biology and computer science?

I've both worked with infrastructure as code (Pulumi) and been a grad student researcher in bioinformatics for several years, and I've developed the following take: biology is a messy, tangled, and sloppy system built over a billion years under evolutionary pressure. There's no clear analogy to intelligently designed software, and any time you make an analogy, like DNA == source code, there's a mechanism that destroys its power to explain or predict biological phenomena. Unlike DNA, computer software doesn't create the machine it's executed on. Code is 1D, while DNA is definitely multi-dimensional: its folding, epigenetic modifications, and other modifications matter a lot. All the interesting biology for complex animals happens during the first few stages of development. There's no computational equivalent to this recursively constructive process. Additionally, biology has a single guiding principle through which we understand everything: evolution, and using computer analogies really diminishes that. Therefore, biology is biology. It's not analogous to a von Neumann architecture machine, or any other computing device we've created. The first principles are simply different.

Thank you @wespiser_2018, spot on. My career is in both fields + PhD in one. Tortured analogies of biology as computers make me cringe; they're misleading at best. Sure, the ribosome superficially looks like an FSM, but that gets you basically nowhere. Comp-sci people: if you're curious about biology, spend some quality time at Khan Academy or edX, watch some university intro-bio lectures on YouTube, read an intro-level undergrad biology textbook, etc.

I was reading "The Origins of Knowledge and Imagination" and the author was reflecting on the same thing. The mind is not a computer; you do not think like a computer. In fact, there is no separate concept called "the mind".
The whole body is part of our perception and action ecosystem. That makes sense, since software, while imperfect, is too orderly and simplistic.

Chromatin, not just DNA.
Anyway, the "code" category is too simple to describe what is actually happening.
There is a sort of nanoscale industrial cyber-bio-physical-chemical plant/factory/machinery/complex within each cell.
If you consider CS stuff as an analogy, you might miss the 'physical-chemical' aspect of it all. Like, you know, you would not call an autonomous android with an embedded micro chemical facility just a computer or server. Yes, it is also a computer, yes, it runs some code, but it is moving, it is chemically very active, and it has a sort of capacity for initiative.
CS analogies are way too barren.

You might be interested in reading some of Bert Hubert's articles, specifically "DNA seen through the eyes of a coder": https://berthub.eu/articles/posts/amazing-dna/

There are no direct parallels. And guess what - the brain is not even a computer. All of this was just metaphors for non-experts, irrelevant contexts, and making money on pseudoscience books.

I think that very much depends on how you define "computer". The brain is doing computations, so I think it's safe to think of it as a computer. It's just a completely alien computer compared to what we build ourselves.

By computation, do you mean problem solving? The most simplistic model I have of the brain (not a psychologist, just from things I've read) is that the brain is an advanced pattern matcher. I don't think we even do linear "computation", just because of the quantity of data we are ingesting, storing and retrieving.

How would you do pattern matching without computations? I think anything that could be described through logic gates or Turing machines would be a "computation". And any object or organ that processes information would be a "computer".

Because things just are. An analogy I can come up with is a graph like irrigation canals. You fill them with water and they just are. Or something like an adder circuit. You "input" the numbers and the result is just there. There isn't an algorithmic process with loops, conditions and recursion. We don't store information, we store its patterns and link them to other patterns. We don't process information as much as we filter it through patterns we've stored. Just as logic gates do not process electricity, our brains do not process information. (This is very much armchair psychology theory.)

I'm not seeing the distinction. All computation "just is". Computation happens through the physical world. Adder circuits are performing a computation. An x86 CPU is nothing but a bunch of dumb circuits.
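The adder mentioned above makes the disagreement concrete. Below is a minimal sketch in Python that builds addition out of nothing but AND, OR, and XOR on single bits. The names `full_adder` and `ripple_add` are illustrative, and the 8-bit default width is an assumption chosen for the example. There is a loop in the Python, but only because software has to visit the bits one at a time; in hardware all the gates simply exist side by side and the result "is just there".

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three single bits using only gate-level operations.

    Returns (sum_bit, carry_out).
    """
    partial = a ^ b                              # XOR gate
    sum_bit = partial ^ carry_in                 # XOR gate
    carry_out = (a & b) | (partial & carry_in)   # two AND gates, one OR gate
    return sum_bit, carry_out


def ripple_add(x: int, y: int, width: int = 8) -> int:
    """Chain `width` full adders, the way a ripple-carry adder is wired.

    Any carry out of the top bit is dropped, as in fixed-width hardware.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


if __name__ == "__main__":
    print(ripple_add(23, 42))
```

Whether one calls this "computation" or "things that just are" is exactly the terminology question the two commenters are arguing about; the sketch only shows that the two descriptions apply to the same object.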
Water canals can do computations just like electrical circuits, as can biological neurons. This is a computer with all the abstractions stripped away [1]. It's hard to see the loops and conditionals in the falling marbles, but it is processing information. And electrical computers are essentially the same thing, just scaled up to an unimaginable degree.

At every point in history, people thought that they could describe the brain using whatever advanced technology they had. <https://aeon.co/essays/your-brain-does-not-process-informati...>

Read the comments on that link for a summary of my thoughts.

> But neither the song nor the poem has been ‘stored’ in it. The brain has simply changed in an orderly way that now allows us to sing the song or recite the poem under certain conditions

Oof

> the brain is doing computations

Nope. We don't really know yet what exactly the brain is "doing". But again, in _some_ contexts it is better than nothing. Unfortunately good metaphors stick, regardless of their relevance and accuracy :(

There are a lot of mysteries around the brain, but at a high level, I think you'd have a hard time finding a neuroscientist who doesn't believe the brain processes information. What's the alternative?

Processing information is a pretty vague and generic idea. You can label virtually anything as “processing information” and be more or less right. But I doubt I'd find any neuroscientist worth anything who would say “the brain is doing computations almost like a computer”.

Like I said, it depends very much on how you define computer. To me "processing information" == "computation". And a physical system that performs "computations" is a computer. Therefore, a brain is just one type of biological computer. In a computer science class, the first thing they should teach is that the word "computer" in the course name is abstract, and not just about the metal slab on your desk.
There are fundamental laws that apply to all information processing systems, whether electrical, mechanical, or biological. So it makes sense to put them in the same category at times. But obviously words mean different things in different contexts, and "computer" might mean something entirely different to a neuroscientist than to a computer scientist. But I don't think a neuroscientist would disagree that a brain is a computer using the loose definition I described above.

Of course, you can set up a definition space so that a puppy is a steam engine. But if we stick to commonly adopted definitions, like https://en.wikipedia.org/wiki/Computer or "Turing machine" ones - then NO, sorry, the brain and a computer have barely anything in common. Besides being made of atoms and being complex )

So it sounds like we both agree the brain is at least a physical object that performs computations. I'm surprised you don't see similarities with other physical systems that perform computations, but it sounds like a terminology difference.

Here is the heart of our disagreement:

> perform computations

The brain doesn’t perform computations in any CS-accepted sense (it is neither a Turing machine nor does it have any encoded program to execute any defined algorithm). It processes information, yes. But anything more specific than that is full of unknowns, unconfirmed hypotheses and speculation. Would you call an ant colony a computer? A tree? A government? But they all obviously process information and seemingly perform computations, don’t they?

> The brain doesn’t perform computations in any CS-accepted sense (it is neither a Turing machine nor does it have any encoded program to execute any defined algorithm).

That's a big statement. Anything that can be effectively described with math can be described using Turing machines and algorithms. All Turing-complete systems are equivalent in what they can compute. As far as we know, all of physics can be described by equations, and is therefore (theoretically) computable.
What makes you think brains are special? Are there any other physical systems that you think are uncomputable? The only arguments I've found for why the brain can't be described by algorithms rely on pseudo-scientific claims about the magic of quantum mechanics, which I find very unconvincing (quantum algorithms are still algorithms, and describable through math after all). Do you have a better one?

> Would you call an ant colony a computer? A tree? A government? But they all obviously process information and seemingly perform computations, don’t they?

Yes, absolutely. You can use ant colonies or slime molds as biological computers to solve real-world path-finding problems: https://www.youtube.com/watch?v=BZUQQmcR5-g&t=1s

Some of your other examples are more complicated, but all of them can be described through the computational lens, and modeled as Turing machines. You will find many scientific papers filled with equations trying to describe the algorithms behind each of them.

Okay, we came to the conclusion that almost any living organism can be called a computer. Then what's the point in using this term if it doesn't differentiate anything? It becomes meaningless, therefore there is no value in calling the brain a computer - it gives zero information about how the brain actually works. Case closed )

Because calling it a computer means you can apply computer science concepts to it. That's the interesting part. You can talk about the classes of algorithms the brain uses to solve difficult problems, the ways it tries to conserve energy, the architecture trade-offs, how to model it with math, etc. That stuff is non-obvious if you treat it like a magic black box. It's not a meaningless insight.

Can the brain solve NP-complete problems in polynomial time? The brain itself is a mystery, but we know some things about computation, so it's pretty safe to say no, it cannot.
Maybe on HN it's consensus that all living things are computers in this way, but plenty of people think the brain is literally supernatural. When you say "the brain has nothing in common with a computer. We don't know what it's doing", those people will be nodding their heads. But we do know some things about computation, so we can apply those insights to the brain.

> You can talk about the classes of algorithms the brain uses to solve difficult problems

> But we do know some things about computation, so we can apply those insights to the brain

You can. But it is snake oil; it doesn't give you any actual insight into how the brain really works. Being able to model something with a computer doesn't mean it is like a computer. But at the same time, such thinking can really mislead an unprepared mind into believing that the brain is modeled by nature as some sort of sophisticated Turing machine. This is misleading.

At the same time, pretending that brains don't obey the laws of computation has its own implications, which can confuse unprepared minds into thinking:

1) That brains are something supernatural, instead of physical information processing systems that can be analyzed and understood as such.
2) That digital Turing-complete computers are limited and can't possibly do the kinds of complex processing that biological computers can do.

Which is also misleading.

A brain is not a Turing machine, but a Turing machine can be a brain. Biological brains are just a subset in the space of all possible computers.

But I don't think either of us is confused, so this is just a conversation about semantics, which I'm not very interested in.

> A brain is not a Turing machine, but a Turing machine can be a brain.

At this point I regret I wasted my time explaining what's wrong with your reasoning :) Peace!

Not as Infra as Code, but you could think of it as another stack with its own architecture and programming language. Also runs on jelly instead of electronics.

Have we figured out how to program it?
Nope :'(. And malware is literally viruses.

What always surprised me is not DNA, but how non-living things can exist at all without DNA. It makes sense to me that I have green eyes since that's what my DNA says... but where is it defined that a rock has a particular color or weight or texture? Whenever I think of how planets move around a star, I always think: "Ok. So, I imagine that at some point the universe should know the distance between planet X and star Y, and also, probably, maybe, it should know their masses... but that data is nowhere defined (that we know of). So, is the data computed 'on the fly' every $minimal-unit-of-time?" Perhaps the universe doesn't need to know distances or masses, though.

I think you might be trying to form a model based on false premises here. The universe doesn't know anything. Or to put it another way, the universe doesn't know anything because it is everything. You seem to be assuming a sort of director-and-actor model wherein the universe (the director) needs to be telling the planets and stars (the actors) what to do, but that's not the case. The planets and stars interact with each other via the various forces that they're subject to, and their state may or may not change as a result. I guess, to generalize, my point is that things are as they are as a consequence of the physics that they're subject to. A rock has its particular color, weight, and texture due to its elemental composition. A molecule doesn't need to know what it is in order to reflect or absorb certain wavelengths of light; that's what naturally occurs when those wavelengths interact with particles that have a certain composition and state. There's no conscious direction happening at any point, nor is there any data being computed. The universe is essentially a medium that exists with certain properties, and everything in it is stuff that also exists with certain properties. We call the manner in which these properties interact physics.
So in summation I'd say that there are intrinsic properties and emergent phenomena based on those properties; that's where a rock gets its color, weight, and texture.

Yes, but the computer that runs the code is the body that grows it, and each one is different.

Computation is everywhere. Read about CRISPR and tape-based Turing machines.

We can if we'd like to, but remember that it's an analogy, no matter how useful, inspiring or beautiful the connections might be. I don't like this analogy too much because it collapses "DNA" into a singular thing that's somehow related to classical computing, even though there's no indication that DNA and the multiple layers of systems interacting with it are limited in the same ways as classical computers. I prefer to think of DNA as a medium for memory, one of many mediums for memory. Memory is everywhere, whether or not anyone or anything remembers it.

> Memory is everywhere, whether or not anyone or anything remembers it.

That one is beautiful.

A better analogy for DNA in computer science would be LLMs. Each organism's DNA represents an experiment being performed in service of training a model. If that experiment manages to procreate, its successful mutations graduate into another round of experiments. This has gone on for several billion years, resulting in a largely stable model within which experiments are continuing to be run. As with LLMs, DNA doesn't know anything about the data it is being trained on ("Nature"). And that model continues to change even as the experiments are run. So DNA is a "blockchain" record of previous and current hypotheses on which traits enable an organism to live to viability. Some of these hypotheses are "dead code," as the environment no longer contains the pressure which made them critical. Some of them are essential to viability. Some of them are experiments whose value has not yet been determined.
Assuming your question is whether IaC could learn from patterns in DNA, I think that's a very interesting idea. Certainly we desire that every load balancer, database, and IAM policy be capable of self-defense, and be the hardiest, most fit version of itself possible. Where the analogy struggles is that people writing IaC are more in the business of designing "natures" than they are designing individual organisms which would survive a chaotic and hostile "nature" being enforced on them. And people who write IaC might be unhappy to hear that getting to a "viable" database would require launching several thousand databases in an environment and, after some period of changes, seeing which one is performing best so they can clone that "best" database configuration when new databases are needed.

I apologize in advance for criticism you only half deserve. I know this is HN, but not every single topic needs to be compared to LLMs or blockchain. Concepts like fitness functions, selective pressures, feedback mechanisms, etc. -- as applied to both embodied organisms and constructs like cultures and ideologies -- predate not just the last two hype cycles but artificial computation altogether. Referring to such deeply established foundations is normal -- I'm sure many of us remember when novel AI approaches like perceptrons and evolutionary algorithms were described by analogy to biology -- but skipping over the established literature to talk about nascent concepts makes you sound less like you're contributing to an educated discussion and more like you're courting angel investors with buzzwords. At the very least it could be "here's the analogy to <established thing>, and to help round out the concept, <new thing> is also analogous to <established thing> with <important differences>". That makes both a stronger argument and a more educational essay than skipping straight to the new thing with no foundation.
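The "launch many, keep the best, clone it with changes" loop described above is what evolutionary algorithms formalize with a fitness function and selective pressure. Here is a minimal sketch in Python. Everything in it is a toy assumption made for illustration: the 20-bit `TARGET` standing in for the environment, the names `fitness`, `mutate`, and `evolve`, and the mutation rate. It is not a model of real genetics or of any IaC tool.

```python
import random

# Hypothetical "environment": the bit pattern that scores best.
TARGET = [1] * 20


def fitness(genome: list[int]) -> int:
    """Count positions where the genome matches the environment's target."""
    return sum(1 for g, t in zip(genome, TARGET) if g == t)


def mutate(genome: list[int], rng: random.Random, rate: float = 0.05) -> list[int]:
    """Copy a genome, flipping each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(generations: int = 60, population_size: int = 30, seed: int = 0):
    """Run select-and-clone for a fixed number of generations (at least 1).

    Returns the best genome of the last evaluated generation and the
    best fitness recorded in each generation.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in TARGET] for _ in range(population_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)      # selective pressure
        history.append(fitness(best))
        # Keep the winner unchanged and fill the rest with mutated clones.
        population = [best] + [mutate(best, rng) for _ in range(population_size - 1)]
    return best, history


if __name__ == "__main__":
    winner, scores = evolve()
    print(scores[0], scores[-1])
```

Because the winner is carried over unchanged, the best score can never drop from one generation to the next. Note also that the loop keeps no record of losing variants, which echoes the point that DNA loses history in a way a ledger does not.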
For example, one very important difference is that DNA can be edited and lose history, unlike blockchain, but an intact fossil record could be used to infer how those edits came in over time and space, kind of like an incomplete distributed ledger.

> For example, one very important difference is that DNA can be edited and lose history, unlike blockchain, but an intact fossil record could be used to infer how those edits came in over time and space, kind of like an incomplete distributed ledger.

That is a very good point. It is true that editing a blockchain completely destroys the chain, and editing DNA in very specific ways does not. I was thinking (when I wrote it) that because of the way the genes interact it can be completely destructive to take only a single piece of DNA. But, as you correctly point out, we do that all the time. So, yes, the blockchain analogy only partially fits. Thank you.

You could, but DNA is a terrible "language." It's unreliable, degrades over time, accumulates and perpetuates errors, and if you use instructions in an order the compiler doesn't like, you'll get origami (loops, hairpins) instead of a program.

It replicates for millennia, it uses air to build plants, it self-repairs, it contains the code for its operating system.

Unfortunately, the world didn't need a first Gödel, Escher, Bach pseudoscientific conspiracy theory pin board that falls in love with the idea of beautiful ideas over the nuances of reality and the incomparable contrasts of entirely different things.

I learned about DNA transcription and translation while learning about mRNA vaccines in 2020.
Here's how I explained the process to myself:

1) DNA = source code on disk
2) RNA polymerase = disk read head
3) RNA = source code / functions loaded to memory
4) Ribosome = JIT compiler
5) Proteins = small, single purpose executables (like unix commands)
6) Proteins once outside the cell = execution

If you think of the body as the hardware, then yes, there is some merit to thinking of DNA as infrastructure as code, operating system and application software.

I think Michael Levin would disagree: genetic algorithms were stolen by the latter from the former

definitely not, go and look up epigenetics

DNA is a form of code, but it doesn't encode programs. Instead, like an STL or STEP file, it encodes HARDWARE designs. While you could think of it as encoding infrastructure AND code (as in IaC), you'd need to go beyond that to include the hardware for computing AND physical function (like a whole car + computer) in that conception, which is not what IaC means. The hardware side of DNA is easy to overlook since we don't yet have the necessary (CAD) design tools to easily understand the shape and mechanics of proteins just from reading a DNA sequence, like we do for macroscale 3D models. But there are hard technological reasons for this. DNA encodes information, but instead of binary digits organized into 8- to 64-bit words (e.g. 10010110), it uses four bases (A, T, C, G) organized into 3-letter codons, each of which represents one amino acid (or a stop signal). The cell assembles chains of amino acids which are then placed in an "oven" where the string of molecules folds back on itself to assemble a complicated and functional 3D shape. When we say complicated, we really do mean complicated. Even the fastest modern supercomputers are unable to determine the shape of these proteins based only on the DNA sequence input. Further, we are unable to reliably simulate the way that a folded protein will interact with other molecules.
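The codon encoding described above is one part of the picture that really does map cleanly onto a lookup table. Below is a minimal sketch in Python that reads a DNA string three letters at a time. It assumes the standard genetic code, the coding (sense) strand as input, and a reading frame that starts at position 0. It ignores everything else the thread points out: the mRNA intermediate, start-codon search, splicing, and above all folding. The function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order:
# the first base varies slowest and the third base fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to a one-letter amino acid code ('*' = stop).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence into one-letter amino acids.

    Reads from position 0 in steps of three, stops at the first stop codon,
    and ignores any incomplete trailing codon.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # ATG (Met) GCT (Ala) AAA (Lys) TGG (Trp) TAA (stop)
    print(translate("ATGGCTAAATGGTAA"))  # prints "MAKW"
```

The table is the easy part. The output is only a string of letters, and nothing in it says what shape the chain folds into or what it does, which is where the "source code" analogy stops helping.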
Fortunately these kinds of problems will someday be easily solved by quantum computers, but for now we are stuck with approximations of questionable accuracy. But there are very computer-code-like elements to how cells work. Unfortunately it is all spaghetti code. One section of DNA often codes for proteins which bind to one or more other sections of DNA, either increasing or decreasing the production of the proteins from those locations. Additionally, some DNA sections code not for protein but for RNA strings which are used mechanically by themselves or as part of protein complexes like CRISPR. RNA is always created as an intermediate step between DNA and protein, but in this case it is used directly as fRNA (functional RNA). RNA can even fold on itself and act similarly to proteins, though it is much more fragile. The many interactions between protein, DNA and RNA perform a kind of computation, but it is very obfuscated. The following are generalized interactions that take place in a cell (perhaps analogous to machine instructions), written in a kind of pseudocode to help illustrate the recursive functions involved. DNA + Protein = RNA; RNA + Protein = Protein; Protein = Protein++; Protein = Protein--; Protein = RNA++; Protein = RNA--; RNA = RNA++; RNA = RNA--; RNA = Protein++; RNA = Protein--; Protein + RNA = DNA; Any protein or fRNA can have multiple functions in a cell and affect the production of other proteins and fRNAs by interacting with DNA or RNA or with other proteins involved in the production chain. In addition to this, proteins and fRNA also physically move other proteins and molecules around and make up the structure and machinery of a cell. Untangling it all is close to impossible currently. There are several billion years' worth of tech debt and zero documentation.

This looks like it was written by generative AI but I can't really say for sure.
BTW: protein structure prediction didn't need supercomputers (in the traditional sense), and the PSP problem wasn't solved using supercomputers applying a high-quality physics function to simulate folding. Instead, it was solved using a combination of ML supercomputers, a really good algorithm (transformers), and a couple of really good data sets: the known structures of proteins, and the known relationships between proteins. Instead of running a simulation on a huge supercomputer to predict a single structure, they trained a model which approximates structure well enough to beat every competitor. From what I can tell, most of the resulting quality doesn't come from their force field but from the distance constraints that are mostly derived from historical relationships between proteins, and the coevolution of their sequences.

Came here to say this.

It is extremely over-simplistic to think of DNA as Infrastructure as Code.