The algorithmic principles of adaptation.
Hill-climbing is a simple form of optimisation:
Generate a modification of the current best solution;
If it provides an improvement then make this the current best solution;
Repeat.
It’s a simple, easy-to-understand algorithm, providing incremental improvement in the local neighbourhood of the current solution where available. Many natural processes can be understood as hill-climbing processes: e.g. physical systems ‘seeking’ higher entropy configurations; maybe even the advancement of a scientific discipline where modifications of existing methods and ideas are generated, and useful modifications are retained.
It is not surprising then that the first plausible scientific explanation for
the complexity of living things,
Recently, there has been increasing awareness of some previously under-researched biological phenomena (discussed below) – including mechanisms that have been instrumental in the major transitions in evolution. But, although these phenomena act within natural selection, they present a problem for the hill-climbing model of adaptation. One feature found in several of these phenomena is that genetic material that has been pre-adapted in different populations is subsequently brought together into a single population. The evolutionary significance of this is that it potentially allows innovation that neither ancestral population could achieve via random modifications within its own local neighbourhood of genotypes.
Meanwhile, with the development of computer science, scientific thinking about the space of possible algorithms has become more sophisticated. We know that hill-climbing is but the simplest form of heuristic search, and that other methods, when applicable, can be vastly superior. Notably, divide and conquer algorithms are a large important class of algorithms where a problem is decomposed into more manageable parts, and a solution to the whole is assembled from sub-solutions. Historically, this type of problem solving has been associated only with human problem solving and design, using top-down knowledge of the problem domain, and not with non-teleological processes such as evolution. However, the biological phenomena mentioned above do indeed fit better within a divide and conquer framework, and cause us to move our understanding outside the hill-climbing paradigm.
This observation provides an opportunity which is itself an example of the principle: opportunity arises for interdisciplinary exchange between computer science and biology that will allow innovation that neither discipline could achieve via small modifications within its own local neighbourhood of theories and methods.
Algorithmic Biology
|
This page describes my current research program and the significance of this work to interdisciplinary research between computing and biology. My past research has included publications on genetic algorithm theory, evolutionary robotics, artificial life, dynamical systems, and computational biology. Those explorations have focused my interests within algorithmic biology (see inset). My future research program will continue to be diverse and foster exchange between biology and computer science wherever opportunity arises.
|
Algorithmic biology and other interfaces In the last 50 years, the tools and methods provided by computer science have become enormously important in nearly all disciplines of modern science. In biology, computational methods have been used to model and simulate biological systems (computational biology), and more recently, to manipulate and process the flood of data coming from genetics and genomics (bioinformatics). In turn, computer science has taken inspiration from biology to develop new algorithmic techniques (nature inspired computation) such as neural networks and evolutionary algorithms, and even new computational substrates (bio-computation) such as DNA computing. One aspect featured in various subfields of these different areas has a particular emphasis: not merely to take inspiration from biology for new computational methods, nor merely to apply existing computational methods to biological systems, but rather to use algorithmics and complexity theory to understand the underlying algorithmic principles of biological systems (algorithmic biology) – to get to the science that is common to both disciplines.
|
Impact
The above subjects describe a broad education in algorithmic biology. This program addresses the science of both biological adaptation and computational algorithms. As such it impacts the basic science of both computer science and biology disciplines (see goals). Developing a broader model of biological evolution has the potential to change the way we think about some fundamental issues in biology: for example, the kinds of systems that are evolvable or unevolvable, and the spontaneous formation of new levels of organization in natural systems. Developing new models for artificial evolution will also increase the suite of computational tools available for various applications – from the processing of genomic data, to design automation in various domains such as circuit design, controller design, robotics, and in any number of optimization domains from scheduling to routing problems. All these domains have partially decomposable problem structure that could be exploited by a method that is able to explicitly identify modules and manipulate their assembly hierarchically, as developed in preliminary work and discussed below.
Building bridges between the disciplines capitalizes on the expertise the university already holds in each independently. By providing new models that expand evolutionary computation and evolutionary biology toward one another, and challenging deeply rooted assumptions in both disciplines, new routes of exchange can be forged which allow the wealth of existing results and expertise to flow from one discipline to the other. This research program provides the opportunity to move beyond the application of computational techniques to biological systems, and beyond the use of biology to inspire computational techniques, and to get beneath to the scientific principles that underlie both interesting biological mechanisms and interesting algorithms.
Research goals
In brief, the objective of this research is to identify and exploit the underlying algorithmic principles of adaptation in biological systems – in particular evolutionary adaptation. This involves two complementary subgoals:
· To use algorithmics and complexity theory to expand our understanding of biological evolution;
· To utilize these algorithmic principles in computational methods for design automation and optimization.
The remaining sections of this statement briefly describe more specific research objectives:
· The motivations for this research (the shortcomings of existing algorithmic notions in evolutionary biology, and failure of artificial evolution techniques to scale),
· Some main research questions for this program,
· The progress thus far in this research (both in expanding evolutionary computation and in expanding models in theoretical population genetics).
Motivations
Motivations from Biology:
The algorithmic principles of natural evolution
|
No algorithmic notion has
had more impact on biology and the way we view ourselves and the world around
us than the theory that complex life could result from random variation and
selection.
|
Evolution that is not hill-climbing Molecular sequence data shows that the pattern of
evolution “is not as linear and treelike as
|
Motivations from Computing:
The scalability (or not) of artificial evolution
|
Biological systems posses numerous properties that are desirable for engineered systems such as adaptability, robustness, self-repair, and decentralized control. All of these properties can be incorporated to some extent in engineered systems, but the sheer scale of the complexity in biological systems – the number of parts and the richness of their interaction – is far beyond that which is attainable in hand-designed systems. Computational methods for design automation and optimization of engineered systems could in principle take us beyond the limitations of hand-designed systems. Evolutionary algorithms, based loosely on natural evolution, have been employed for this purpose. However, although there has been some success in automating design with evolutionary algorithms there are problems too. Specifically, the improvement in a design seems to come rapidly at first and quickly plateau – approaching an apparent ceiling in complexity rather quickly. One possibility for this failure to scale–up in complexity is simply a lack of computing power (see inset). But we must also consider the possibility that the underlying algorithm is inadequate – that we are lacking something fundamental in our algorithmic abstraction of evolution by natural selection. |
How long should it take? The disparity between the
complexity of biological systems and that of artificially evolved engineered
systems has many possible causes. Perhaps most obvious is that biological
evolution has simply had more time, and has been able to exploit the massive
parallelism of physical matter in exploring alternatives. However, this
excuse is not wholly convincing – we are not really asking artificial
evolution to produce the kind of complexity that biology has produced,
starting from individual molecules to a multi-celled organism – only to
produce the kind of complexity that is plausible given the scope of the
system and the primitives we provide. For example, we can evolve an
electronic circuit given gates, a robot morphology given struts and
actuators, or a neural network given neurons, with say 100 primitive parts
very quickly. But even in cases where we know that superior solutions are
available in the design space, and computable in the time allowed, we are
lucky if two orders of magnitude more computing time will get us a system
that uses a few percent more parts in a useful way or shows a few percent improvement in fitness. Faced with these observations, a
good computer scientist should not be content to simply wait longer. |
Summary
In summary, the proposed research program is motivated by significant gaps in our understanding of how biological complexity arose, and additionally, significant gaps in our ability to automatically design and optimise complex engineered systems, as outlined above. These observations are just an example of the kind of issues that might be tackled in this research program in algorithmic biology. However, these particular issues provide a particularly promising direction for future research: a model of natural evolution that moves beyond the hill-climbing paradigm radically alters our perception of natural evolution, of what is evolvable and unevolvable, and how evolution works; and a model of ‘natural evolution writ large’, including the major evolutionary transitions, the formation of coalitions (the creation of new levels of organization and new levels of selection) has not been attempted, but is desperately needed. Moreover, a computational method that is able to discover and exploit collections of extant entities to form new levels of organization spontaneously would provide a means to overcome the complexity ceilings witnessed in artificial optimization methods to date. The intuition that natural evolution utilizes algorithmic principles as yet not fully understood, and that such principles might provide new computational methods for design automation and optimization, is very appealing. This research program is poised to exploit this: to expand biological models to incorporate evolution over multiple scales; and to develop computational methods that exploit the algorithmic principles these processes use.
Appendix
Below I outline some of relevant research questions for evolutionary biology and the corresponding questions for computer science. Appropriately delimited, some of these could form the basis of thesis/project work for students.
|
Biological issues |
Computational issues |
|
Lateral gene transfer, symbiogenesis and the major evolutionary transitions are not accommodated by a hill-climbing perception of evolutionary change. Are these events rare aberrations on the hill-climbing model or are they fundamentally significant, providing the new units within which hill-climbing-change takes place? |
Is a hill-climbing mechanism of optimization the best we can do for design automation and optimization? How can we accommodate the human designers ability to make and manipulate abstractions of functional subsystems in a stochastic unsupervised search process? |
|
If the formation of new levels of organization, of coalitions of co-adapting entities, is involved in the major evolutionary transitions what does this mean for the assumption that competitive exclusion is the ‘prime mover’ in evolutionary change? How does biology mediate between the formation of coalitions and the exclusion of competitors? |
When selecting among design alternatives some will be better than others as they stand, but some may offer different potentials for further modification. How do we identify alternatives with different potentials that might work well together? |
|
When an organism acquires a gene or suite of genes from another species by lateral transfer, the new genetic material represents a large non-random change for the recipient species. Under what circumstances, if any, will this change be fitter than a random change of the same scale? |
When will a large change produced by exchanging sub-solutions among competing candidates be more effective than a higher mutation rate? |
|
What is the relationship between various mechanisms of genetic exchange (bacterial conjugation, lateral gene transfer, true sexual recombination, hybridization, symbiogenesis) in terms of their adaptive affordances? |
Can we devise an optimization scheme that allows us to tune the optimization process from a hill-climber to divide-and-conquer approach? |
|
In systems biology we seek to identify semi-independent functional subsystems. How might different genetic mechanisms, different genetic architectures, and different adaptive scenarios influence our ability to discover such biological modules? |
Human-designed systems are nearly always modular in construction, often hierarchically modular. Is this an artifact of our inability to think about a complex system without approximating it as a number of semi-independent sub-systems, or are common engineering domains inherently modular in nature? |
|
How does the presence of modularity in adaptive domains, the semi-independence of adaptive features, effect the utility of different genetic mechanisms? And how does the architecture of the genome (nucleotides, codons, exons, genes, chromosomes, genotypes) effect the efficacy of these mechanisms? |
Exactly what features of a problem domain affect the suitability of a hill-climbing vs divide-and-conquer approach, and how would we identify them? How does the choice of representation interact with the efficacy of different approaches? |