Natural Computers

Biology and computer science are rarely mentioned in the same sentence. Biology works with a massive array of complex molecules and their interactions. As our technology and computer science progress, though, we're starting to see similarities between our own ideas and the solutions nature has found for its information-processing needs.

Let's start with the maths of computing. We model computation with theories such as automata and formal languages. Both rely on a few basic concepts: sets, used mainly for specifying the allowed alphabet and states; linear sequences of symbols from the alphabet; and functions on the symbols and the state of the system.
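To make those three ingredients concrete, here is a minimal sketch of a finite automaton. The alphabet, the state names and the `accepts` helper are all made up for illustration; the machine accepts strings that contain an even number of the symbol "b".

```python
# A toy deterministic finite automaton (all names here are illustrative).
ALPHABET = {"a", "b"}          # the set of allowed symbols
STATES = {"even", "odd"}       # the set of allowed states
START = "even"
ACCEPTING = {"even"}

# The transition function: (state, symbol) -> next state.
TRANSITIONS = {
    ("even", "a"): "even",
    ("even", "b"): "odd",
    ("odd", "a"): "odd",
    ("odd", "b"): "even",
}


def accepts(sequence):
    """Run the automaton over a linear sequence of symbols."""
    state = START
    for symbol in sequence:
        if symbol not in ALPHABET:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING
```

Calling `accepts("abba")` walks the states even, even, odd, even, even and ends in an accepting state, while `accepts("ab")` ends in "odd" and is rejected.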

Let's similarly start at a low level in biology. We're now at the subcellular scale of chemistry, governed by the electron configurations of atoms and molecules. Here life relies heavily on polymers. A polymer is a linear sequence of certain kinds of complex molecules. The alphabet, or set of molecules allowed to occur in the polymer, is fixed, just like in our systems for computation.

Polymers can play many roles. One of them is information storage. They store information so well and so compactly that life has ended up storing our very own source code in one: DNA, a polymer of four kinds of nucleotides. In other words, it is a linear sequence of symbols carrying 2 bits of information per symbol.
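The 2 bits come straight from the alphabet size, since log2(4) = 2. As a rough sketch, a DNA string can be packed into an integer using two bits per nucleotide. The particular mapping A=0, C=1, G=2, T=3 and the `pack`/`unpack` names are arbitrary choices made for this example, not anything nature prescribes.

```python
import math

NUCLEOTIDES = "ACGT"  # assumed ordering: A=0, C=1, G=2, T=3

# Information per symbol for a four-letter alphabet.
BITS_PER_SYMBOL = int(math.log2(len(NUCLEOTIDES)))


def pack(sequence):
    """Pack a DNA string into an integer, two bits per nucleotide."""
    value = 0
    for base in sequence:
        value = (value << BITS_PER_SYMBOL) | NUCLEOTIDES.index(base)
    return value


def unpack(value, length):
    """Recover a DNA string of the given length from a packed integer."""
    bases = []
    for _ in range(length):
        bases.append(NUCLEOTIDES[value & 0b11])
        value >>= BITS_PER_SYMBOL
    return "".join(reversed(bases))
```

With this mapping `pack("ACGT")` gives `0b00011011`, and `unpack` with length 4 turns it back into `"ACGT"`. The length has to be passed separately because leading A's are zeros and would otherwise be lost.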

If DNA is the source code, where is the computer? There doesn't seem to be one, but it's not because nature hasn't figured out computers yet. It's us who are still in the stone age of computing. Alan Kay said two decades ago that the computer revolution hasn't happened yet. It still hasn't. The style of computing done in cells is closer to the state transition functions of automata theory than to our present-day computers. Just better.

A computer is a physical interpreter for a programming language. We typically use register machines with random access memory and a simple associated machine code, but this is an implementation detail. The crucial part is that a computer implements a universal language. This means a language which, according to the Church-Turing thesis, lets us program the machine to carry out any computation we want. We can also use whatever programming language we like, and write a compiler to translate the code to the language implemented by our computer.
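To illustrate the idea of an interpreter for a simple machine code, here is a toy register machine. The instruction set (`set`, `add`, `dec`, `jnz`, `halt`) and the `run` function are invented for this sketch and do not correspond to any real hardware.

```python
def run(program, registers=None):
    """Interpret a list of instruction tuples and return the final registers."""
    regs = dict(registers or {})
    pc = 0  # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "set":        # set reg, constant
            regs[args[0]] = args[1]
        elif op == "add":      # add dst, src   (dst += src)
            regs[args[0]] = regs.get(args[0], 0) + regs.get(args[1], 0)
        elif op == "dec":      # dec reg        (reg -= 1)
            regs[args[0]] = regs.get(args[0], 0) - 1
        elif op == "jnz":      # jnz reg, target (jump if reg is non-zero)
            if regs.get(args[0], 0) != 0:
                pc = args[1]
                continue
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction {op!r}")
        pc += 1
    return regs


# Multiply a by b using repeated addition (assumes b >= 1).
MULTIPLY = [
    ("set", "acc", 0),     # 0: acc = 0
    ("add", "acc", "a"),   # 1: acc += a
    ("dec", "b"),          # 2: b -= 1
    ("jnz", "b", 1),       # 3: loop back while b != 0
    ("halt",),             # 4: stop
]
```

Running `run(MULTIPLY, {"a": 6, "b": 7})` leaves 42 in the `acc` register. The same few lines of interpreter would run any other program written in this instruction set, which is the property the paragraph above is about.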

Universal computers are necessary because the ability to manufacture computers is significantly scarcer than the need to write and run programs. A universal computer can be manufactured once and used to run various programs. If it were possible, we could gain speed, energy efficiency and security by compiling all software to instructions for building an ASIC. Compilation would include something like 3D circuit printing, so that compiling a program would produce a running piece of hardware integrated into the system. Then you could compile the compiler itself as an ASIC and end up with a computer with no central processing unit at all. ASICs all the way down!

Now we're starting to approach the level of awesomeness happening in cells. Recall that DNA has 2 bits per symbol. DNA is usually processed in chunks of 3 letters, called codons, so we have something like a bytecode with 64 opcodes. Some of the opcodes and sequences are used for interesting kinds of control structures, but here we're mainly interested in the coding regions of DNA. Such sequences of DNA encode proteins. A protein is yet another polymer essential to life. In this case we are building polymers out of 20 standard amino acids (21 in humans, if you count selenocysteine). Each word in DNA encodes a specific amino acid, so we have some room for redundancy and for opcodes such as start and stop symbols.
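The bytecode analogy can be sketched in a few lines. The table below is the standard genetic code written in its usual compact form (bases in the order T, C, A, G, amino acids as one-letter codes, `*` for stop). The `translate` function is a heavy simplification made up for this example: it reads DNA directly, starts at the first ATG and stops at the first stop codon, ignoring RNA, splicing and everything else a real cell does.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# 4 * 4 * 4 = 64 three-letter "opcodes".
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first ATG up to the first stop codon (simplified)."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":   # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For instance, `translate("CCATGGCCTAA")` skips the leading "CC", reads ATG (M) and GCC (A), and stops at TAA, giving `"MA"`. The redundancy is visible in the table itself: 64 codons map onto only 20 amino-acid letters plus the stop marker.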

Proteins are linear sequences, but while they are built, they naturally form a sequence-dependent 3D structure in a process called folding. This makes it possible for cells to 3D-print complex organic molecules. Proteins are also active. They can be built to perform a specific task, such as binding to a specific other protein, binding to a site in DNA, transcribing DNA to RNA, translating RNA to proteins, etc. You could think of these as the processes or threads started by DNA. In a typical cell there are tens of millions of proteins whizzing around and working at dizzying speed. An average human uses around 100 W and has around 4*10^13 cells, so you can infer that a single cell achieves these millions of running threads, and everything else, using just a few picowatts of power. You'd need the power used by thousands of cells just to power a single PIC microcontroller, in sleep mode.
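The per-cell figure is just a division, spelled out below. The 100 W and 4*10^13 values are the ones used above. The sleep-mode figure for the microcontroller is a placeholder assumption chosen for illustration, not a datasheet value, so substitute the number for whatever part you have in mind.

```python
BODY_POWER_W = 100.0   # approximate power use of a human, from the text
CELL_COUNT = 4e13      # approximate number of cells, from the text

# Power budget of a single cell, in watts.
POWER_PER_CELL_W = BODY_POWER_W / CELL_COUNT

# ASSUMPTION: illustrative sleep-mode draw of a small microcontroller.
ASSUMED_SLEEP_POWER_W = 10e-9


def cells_needed(device_power_w):
    """How many cells' worth of power a device drawing device_power_w uses."""
    return device_power_w / POWER_PER_CELL_W


print(f"power per cell: {POWER_PER_CELL_W * 1e12:.1f} pW")
print(f"cells per sleeping device: {cells_needed(ASSUMED_SLEEP_POWER_W):.0f}")
```

The division gives about 2.5 picowatts per cell. Under the assumed 10 nW sleep draw, the device would use the power budget of roughly four thousand cells.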

And that's just the beginning. Nature based computation on carbon, instead of the similarly abundant silicon, for a reason. The solutions it has developed for trust zones, signaling and energy, the ways code builds the infrastructure, and how security works are all amazingly awesome. We have a thing or two to learn from nature when building robust, scalable and power-efficient systems.

Posted: 9.10.2020 by aoh

#post #programming #biology