In: Riegler, Alexander & Peschl, Markus (Eds.)

Does Representation need Reality? - Proceedings of the International Conference 'New Trends in Cognitive Science' (NTCS 97)

Perspectives from Cognitive Science, Neuroscience, Epistemology, and Artificial Life, pp. 87-94,

Austrian Society for Cognitive Science, ASoCS Technical Report 97-01, Vienna, Austria, May 1997.


Rethinking Grounding

Tom Ziemke

Connectionist Research Group, Dept. of Computer Science, University of Skövde1

Box 408, 54128 Skövde, Sweden





The 'grounding problem' poses the question of how the function and internal mechanisms of a machine, natural or artificial, can be intrinsic to the machine itself, i.e. independent of an external designer or observer. Searle's and Harnad's analyses of the grounding problem are briefly reviewed as well as different approaches to solving it, based on the cognitivist and the enactive paradigms in cognitive science. It is argued that, although the two categories of grounding approaches differ in their nature and the problems they have to face, both, so far, fail to provide fully grounded systems for similar reasons: Only isolated parts of systems are grounded, whereas other, essential, parts are left ungrounded. Hence, it is further argued that grounding should instead be understood and approached as radical bottom-up development of complete robotic agents in interaction with their environment.





The purpose of this paper is to re-examine the problem of 'grounding'. To the fields of artificial intelligence (AI) and cognitive modelling the grounding problem poses the question of how the function and internal mechanisms of an artefact (e.g. internal representations referring to external objects) can be (made) intrinsic to the artefact itself, as opposed to being imposed by an external designer and/or dependent on interpretation by an external observer. For example, it is rather obvious that your thoughts and actions are in fact intrinsic to yourself, whereas the operation and internal representations of a pocket calculator are extrinsic and ungrounded, i.e. their meaning is parasitic on their interpretation by an external observer/user. Nevertheless, the fact that the lack of grounding poses a serious problem for the synthesis and modelling of intelligent behaviour has been somewhat underestimated, not to say ignored, for a long time. Recent interest in the issue has been mainly triggered by the arguments of Searle (1980) and Harnad (1990).

The following section will briefly recapitulate Searle's and Harnad's formulations of the grounding problem.


  Different approaches to overcoming the problem are then reviewed, in particular Regier's (1992) work on perceptual grounding of spatial semantics and Brooks' (1989, 1991, 1993) work on 'physical grounding'. It will be argued that none of these approaches offers a satisfactory solution to the grounding problem, since all of them address only part of the problem. The notion of radical bottom-up grounding of complete agents will then be discussed as a route towards the development of fully grounded artefacts, i.e. systems whose function and all of whose elements and internal mechanisms are in fact intrinsic to the whole and in dynamical coherence with their environment.



The Grounding Problem


In 1980 Searle put forward his 'Chinese Room Argument' in order to contradict the notion (which he referred to as 'strong AI') of intelligent behaviour and mind as the product of purely computational, i.e. formal and implementation-independent, processes in physical symbol systems, as put forward in the 'Physical Symbol Systems Hypothesis' (PSSH) (Newell & Simon, 1976; Newell, 1980).


  In particular Searle considered work by Schank and Abelson (1977), who claimed their programs (using so-called 'scripts') to be models of story understanding. To test these claims Searle suggested a thought experiment: Imagine a person sitting in a room, who is passed (e.g. under the door) sequences of symbols that are meaningless to him/her. The person processes these symbols according to formal rules which are given in his/her native language (e.g. written on the room's walls), and returns a sequence of resulting symbols. As Searle pointed out, the symbols could, unknown to the person in the room, in fact be a story, questions and answers in written Chinese. Hence, Chinese-speaking observers outside the room could very well conclude that the person in the room does in fact understand Chinese (the symbols do have meaning to them, and the answers returned from the room might be fully correct), whereas in reality he/she of course does not.
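The purely formal character of the procedure in the room can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the rule book, its placeholder tokens and the function name `room` are hypothetical and are not taken from Searle or from Schank and Abelson's programs.

```python
# Hypothetical rule book: maps incoming symbol sequences to outgoing ones.
# The tokens are arbitrary placeholders, not real Chinese characters.
RULE_BOOK = {
    ("S1", "S2", "Q1"): ("A7",),
    ("S1", "S3", "Q1"): ("A2", "A9"),
}


def room(symbols):
    """Return the symbols the rule book prescribes, matching by shape only."""
    # Nothing here depends on what the symbols refer to: an unknown
    # sequence simply yields an empty reply.
    return RULE_BOOK.get(tuple(symbols), ())


reply = room(["S1", "S2", "Q1"])
```

Whatever meaning an outside observer attaches to `reply`, the lookup itself operates on the form of the tokens alone.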


  Hence, Searle concluded that AI programs, operating in a purely formally defined manner, similar to the person in the room, could neither be said to 'understand' what they are doing or processing, nor to be models of human story understanding. According to Searle, this is mostly due to their lack of intentionality, i.e. their inability to relate their arbitrary internal symbols to external objects or states of affairs. Nevertheless, Searle did not suggest giving up on the idea of intelligent machines; in fact he concluded


... that only a machine could think, and indeed only very special kinds of machines, namely brains and machines that had the same causal powers as brains. And that is the main reason strong AI has had little to tell us about thinking, since it has nothing to tell us about machines.  (Searle, 1980)


Harnad (1990) basically extended and refined Searle's analysis of the problem, but also proposed a possible solution (cf. following section). In his formulation of the 'Symbol Grounding Problem' Harnad compared purely symbolic models of mind to the attempt to learn Chinese as a first language from a Chinese-Chinese dictionary. Accordingly, he also concluded that "cognition cannot be just symbol manipulation" since the symbols in such a model, as the symbols processed in Searle's Chinese Room, could very well be


... systematically interpretable as having meaning ... [b]ut the interpretation will not be intrinsic to the symbol system itself: It will be parasitic on the fact that the symbols have meaning for us [i.e. the observers], in exactly the same way that the meaning of the symbols in a book are not intrinsic, but derive from the meaning in our heads.  (Harnad, 1990, p. 339)


A number of authors have pointed out that the grounding problem is not limited to symbolic representations, and therefore referred to it more generally as the problem of 'representation grounding' (Chalmers, 1992) or 'concept grounding' (Dorffner & Prem, 1993), or the 'internalist trap' (Sharkey & Jackson, 1994).



Approaches to Grounding


A number of approaches to grounding have been proposed, all of which basically agree on two points:


·         Escaping the internalist trap has to be considered "crucial to the development of truly intelligent behaviour" (Law & Miikkulainen, 1994).

·         In order to do so, machines have to be 'hooked' (Sharkey & Jackson, 1996) to the external world in some way, i.e. there have to be causal connections, which allow the internal mechanisms to interact with their environment directly and without being mediated by an external observer.


  The question of what exactly has to be hooked to what and how, however, divides the different approaches, as will be discussed in this section. For the purpose of this paper different approaches to grounding can be categorized into two groups according to whether they follow the cognitivist or the enactive paradigm in cognitive science. This rough distinction basically follows that made by Varela et al. (1991). It should however be noted that the enactive paradigm is to a large extent compatible with the dynamical hypothesis in cognitive science (e.g. van Gelder, 1995, 1997; Port & van Gelder, 1995), constructivist views such as Piaget's genetic epistemology (cf. Rutkowska, 1996), as well as much of the recent work on autonomous systems, behaviour-based robotics, artificial life, etc.


  For the discussion in this paper it will further be useful to view both types of grounding approaches as being based on transductions of (parts/elements of) the outside world/environment onto what is to be grounded.



Cognitivism vs. Enaction


Cognitivism, as exemplified by the aforementioned PSSH, can be said to be "dominated by a 'between the ears', centralized and disembodied focus on the mind" (Rutkowska, 1996). In particular, cognitivism is based on the traditional notion of representationalism (Fodor, 1981; Fodor & Pylyshyn, 1988), characterized by the assumption of a stable relation between manipulable agent-internal representations ('knowledge') and agent-external entities in a pre-given external world. Hence, the cognitivist notion of cognition is that of computational, i.e. formal and implementation-independent, processes manipulating the above representational knowledge internally.


  The enaction paradigm (Varela et al., 1991), on the other hand, emphasizes the relevance of action, embodiment and agent-environment mutuality. Thus, in the enactivist framework, cognition is not considered an abstract agent-internal process, but rather embodied action, being the outcome of the dynamical interaction between agent and environment and their mutual specification during the course of evolution and individual development. Hence, the enactive approach


... provides a view of cognitive capacities as inextricably linked to histories that are lived, much like paths that only exist as they are laid down in walking. Consequently, cognition is no longer seen as problem solving on the basis of representations; instead, cognition in its most encompassing sense consists in the enactment or bringing forth of a world by a viable history of structural coupling. (Varela et al., 1991)


  This de-emphasis of representation in the traditional sense, in particular Brooks' (1991a) paper 'Intelligence without Representation', has often been interpreted as denying any need for representation. Enaction is however compatible with the notion of 'indexical-functional' or 'deictic' representations (e.g. Agre & Chapman, 1987; Brooks, 1991b), i.e. representations of entities in terms of their functional or spatial relation to the agent.

Hence, this notion of representations as "behaviour-generating patterns" (Peschl, 1996) without a stable relation to environmental entities is


... quite compatible with viewing representation in terms of mechanisms that establish selective correspondence with the environment, rather than as internal models that substitute for things in the world in the overlayed traditional sense of re­presentation. (Rutkowska, 1996)2


Cognitivist Grounding


Typical for the cognitivist paradigm is a perception-cognition distinction (cf. Rutkowska, 1996), such as Fodor's (1980, 1983) distinction between input systems (e.g. low-level visual and auditory perception) and central systems (e.g. thought and problem solving). Input systems are typically considered responsible for transducing percepts onto internal representations, whereas the central systems manipulate/reason with the representational model/knowledge only.


Grounding Atomic Representations: In general, cognitivist grounding approaches typically focus on input systems grounding atomic representations in sensory / sensorimotor invariants. That means, here the required causal connection between agent and environment is made by hooking atomic internal representations to external entities or object categories. Such grounded atomic representations are then considered to be the building blocks from which complex representational expressions ('inheriting' the grounding of their constituents) can be constructed and a coherent representational world model can be built.


Harnad's Proposal: Harnad (1990) himself suggested a possible solution to the symbol grounding problem which mostly fits into the cognitivist framework3. Harnad proposed a hybrid symbolic/connectionist system in which symbolic representations (used in the central systems, in Fodorian terms) are grounded in non-symbolic representations of two types: Iconic representations, which basically are analog transforms of sensory percepts, and categorical representations, which exploit sensorimotor invariants to transduce sensory percepts to elementary symbols (e.g. 'horse' or 'striped') from which again complex symbolic representations could be constructed (e.g. 'zebra' = 'horse' + 'striped'). As a natural 'candidate component' for this bottom-up transduction (from real world objects via non-symbolic representations onto atomic symbolic representations) Harnad mentions connectionist networks (1990, 1993).
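The compositional step ('zebra' = 'horse' + 'striped') can be sketched as follows. This is a minimal illustration, not Harnad's model: the feature names, the thresholds and the hand-written detectors are hypothetical stand-ins for categorical representations that would, in his proposal, be learned (e.g. by connectionist networks).

```python
from typing import Callable, Dict, List

Percept = Dict[str, float]  # hypothetical sensory feature vector

# Elementary symbols, each tied to a detector over the percept.
# Feature names and thresholds are invented for illustration only.
DETECTORS: Dict[str, Callable[[Percept], bool]] = {
    "horse": lambda p: p.get("horse_shape", 0.0) > 0.5,
    "striped": lambda p: p.get("stripe_contrast", 0.0) > 0.5,
}

# Complex symbols are defined purely in terms of other symbols.
COMPOSITES: Dict[str, List[str]] = {"zebra": ["horse", "striped"]}


def applies(symbol: str, percept: Percept) -> bool:
    """Check whether a symbol applies, bottoming out in the detectors."""
    if symbol in DETECTORS:
        return DETECTORS[symbol](percept)
    return all(applies(part, percept) for part in COMPOSITES[symbol])


striped_animal = {"horse_shape": 0.9, "stripe_contrast": 0.8}
plain_animal = {"horse_shape": 0.9, "stripe_contrast": 0.1}
```

In this sketch 'zebra' has no detector of its own; whatever grounding it has is inherited from its constituents.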


  A number of approaches to grounding have followed lines similar to those proposed by Harnad. Some of them, however, deny the need for symbolic representations (e.g. see Lakoff's (1993) interpretation/evaluation of Regier's (1992) work), and accordingly transduce sensory percepts onto non-symbolic (typically connectionist) representations. For a detailed account of the differences between symbolic and connectionist computational engines and grounding approaches see (Sharkey & Jackson, 1996). The symbolic/connectionist distinction will not be further elaborated in this paper, since the more relevant distinction here is that between cognitivism and enaction (connectionist approaches can be found on both sides), and the associated types of representation (traditional representations vs. behaviour-generating patterns).


  Although Harnad's grounding theory is based on a 'robotic functionalism' (1989, 1995) rather than pure cognitivism, and he has repeatedly pointed out (1993, 1995) that categorical invariants have to be grounded in robotic capacity, i.e. in sensorimotor interaction with the environment, most cognitivist approaches follow the tradition of neglecting action and attempt to ground internal representations in sensory invariants alone. Hence, most cognitivist approaches aim at grounding object categories (and thereby the crucial atomic representations) in perception.


Regier's Perceptually Grounded Semantics: A typical example is the work of Regier (1992) (cf. also Lakoff's (1993) and Harnad's (1993) discussion of Regier's work), who trained structured connectionist networks to label sequences of two-dimensional scenes, each containing a landmark and an object, with appropriate spatial terms expressing the spatial relation of the two (e.g. 'on', 'into', etc.). Or, in Regier's words: "the model learns perceptually grounded semantics". Another example is the work by Cottrell et al. (1990) who trained connectionist networks (a) to label visual images (associate faces with names), and (b) to associate simple sequences of visual images with simple sentences.


  This transduction of percepts onto manipulable internal representations could be argued to solve the problem of representation grounding (at least partly), since it does offer a pathway from real world objects to internal representations, thereby giving grounding to the latter.


  Let us have a closer look at Regier's system though (very similar observations can be made in the case of Cottrell et al. (1990)). Do we have a fully grounded system here, i.e. a system whose function and all of whose internal mechanisms, elements, etc. are intrinsic to the system itself? Of course, we don't. Anything that goes on in the system, except for the produced labels, is still completely ungrounded: the system has no concept of what it is doing or what to use the produced labels for. That means, for Regier's system to be considered fully grounded, there are at least two things missing, which will be discussed in the following.


  Firstly, the created labels (i.e. the results of the transduction) could possibly be considered grounded (see however Harnad's (1993) argument that a feature detector alone cannot provide semantics). The act of labelling (transduction) itself however, since it does not have any functional value for the labelling system, surely cannot be intrinsic to it. That means, a semantic interpretation of the system's behaviour is of course possible ('this system labels spatial scenes'), but it is definitely not intrinsic to the system itself; it is just parasitic on the interpretation in our (i.e. the observers') heads.


  Hence, for a system's behaviour, whatever it is the system does, to be intrinsic to the system, the behaviour has to be grounded in agent-environment interaction, just as it was argued earlier (following Harnad) that representations have to be. Accordingly, for the above labelling act to make sense to an agent, that agent would have to be able to at least use its spatial labels in some way, to profit in some way from developing the capacity to do so, etc. (note that 'etc.' here stands for a rather long list of requirements).


  Cognitivists could of course (rightly) argue that the functional value of the transduction/labelling act, and thereby its intrinsicality to the overall system, lies in its support of hypothetical central computational systems which could make use of the resulting representation of the labelled object/scene. In Regier's system however, as discussed above, there just is no such overall system to which the labelling could be intrinsic.


  Secondly, and more importantly, assuming there were such central systems, that made the act of transduction intrinsic to the overall system (consisting of central systems and transducing input system), could we then speak of a fully grounded system? No, we still could not, since the transducer (Regier's labelling system) itself (its structure, organization, internal mechanisms, etc., basically all of it except for the networks' connection weights) is not grounded in anything but Regier's design ideas (however good and psychologically or neurobiologically plausible they might be).


  In this particular case the transducing labelling system is a structured connectionist model using two topographic maps dedicated to processing input for the two objects, and a number of further layers/networks to process the output of these maps. Regier (1992) himself argues that his system is a pre-adapted structured device that basically finds itself confronted with a task similar to the one an infant faces when acquiring lexical semantics for spatial terms. There is however one major difference, and that is the fact that the corresponding subsystem in humans (to the extent that it is innate) has been pre-adapted, i.e. developed and tested, during the course of evolution, such that it very well could be said to be intrinsic to the human species (or genotype), and thereby to the individual (or phenotype) as an 'instantiation' of it. Obviously, this natural pre-adaptation is very different from the type of pre-adaptation that Regier's system has. This point will be further elaborated in the discussion of 'complete-agent-grounding'. In summary, Regier's transducer could be compared to an artificial heart: its use could still be intrinsic to an overall system, i.e. a human (to the extent that it offers the same functionality as a natural heart), whereas the artificial heart itself probably never could be.


  It should be noted that the point of the above argument is neither that cognitivism is wrong nor that cognitivist grounding along the above lines is impossible (as Harnad (1993) points out: (symbol) grounding is an empirical issue). A cognitivist grounding theory can, however, not be considered complete as long as it only explains the grounding of individual atomic representations but neither the transducing input system itself, nor its interdependence with its environment and the computational central systems.



Enactivist Grounding


In contrast to cognitivism, the enactivist framework is characterized by its focus on agent-environment mutuality and embodied action, which Varela et al. (1991) explain as follows:


By using the term embodied we mean to highlight two points: first, that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological, and cultural context.  By using the term action we mean to emphasize ... that sensory and motor processes, perception and action, are fundamentally inseparable in lived cognition. (Varela et al., 1991)


Hence, the preferred objects of study are typically robotic agents, situated in some environment and causally connected to it via sensory input and motor output ("immediately grounded representations" (Dorffner & Prem, 1993)), instead of dealing with abstract models of it ("The world is its own best model." (Brooks, 1991)).


Grounding Behaviour: Robotic agents can be considered to have a certain degree of 'physical grounding' (Brooks, 1993) due to the fact that they are physically connected to their environment by means of sensors and actuators. Physical grounding, however, only offers a pathway for hooking an agent to its environment; it does not, by itself, ground behaviour or internal mechanisms4.

  Instead of the central modelling and control typical for the cognitivist paradigm, enactive systems typically consist of a number of behavioural subsystems or components working in parallel from whose interaction the overall behaviour of a system emerges. Hence, each of these subsystems (as well as the overall system) can be viewed as transducing sensory input onto motor output, typically more or less directly, i.e. without being mediated by internal world models. Accordingly, it is not traditional representations, but rather an agent's behaviour5 that has to be grounded in its environment (cf. e.g. Law & Miikkulainen, 1994; Beer, 1996).


  This lack of traditional representations in enactive systems might at first appear to simplify grounding, since it is exactly these representations that require grounding in the cognitivist framework. This does however also pose a serious problem, since 'knowledge' in the enactivist paradigm is typically considered to be embodied in a distributed fashion (body, sensors, actuators, nervous / control system, structure, etc.) or partly even to lie in the environment (cf. Brooks, 1991; Varela et al., 1991). If an agent's behaviour requires grounding, then obviously the 'behaviour-generating patterns' it results from do so too. The list of elements that participate in generating behaviour, however, basically contains all mechanisms which, in one way or another, participate in the flow of activation from sensors to actuators. Hence, the question here is where to start grounding and where to end it.


  Most commonly the grounding of behaviour is approached as a matter of finding the right agent function, i.e. a mapping from sensory input history to motor outputs that allows effective self-preservation. There are basically two different ways of achieving this, which will be discussed in the following:


·         engineering/designing the agent function

·         learning the agent function


Engineering Agent Functions: The classical example for the engineering of agent functions is Brooks' subsumption architecture (1986), in which the overall control emerges from the interaction of a number of hierarchically organized behaviour-producing modules, e.g. the control of a simple robot that wanders around avoiding obstacles could emerge from one module making the robot go forward and a second module which, any time the robot encounters an obstacle, overrides the first module and makes the robot turn instead.
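The two-module example can be sketched as follows. This is a minimal illustration, not Brooks' implementation: the module names, the boolean obstacle signal and the priority-ordered arbitration loop are assumptions made here, and actual subsumption architectures wire suppression and inhibition between finite-state machines rather than looping over a list.

```python
from typing import Callable, List, Optional

# A module maps the (hypothetical) obstacle signal to a motor command,
# or returns None when it has nothing to say.
Module = Callable[[bool], Optional[str]]


def go_forward(obstacle_ahead: bool) -> Optional[str]:
    """Lower-level competence: always propose driving forward."""
    return "forward"


def avoid(obstacle_ahead: bool) -> Optional[str]:
    """Higher-level competence: propose turning only when blocked."""
    return "turn" if obstacle_ahead else None


# Highest priority first: an active module overrides those after it.
LAYERS: List[Module] = [avoid, go_forward]


def control(obstacle_ahead: bool) -> str:
    """Return the command of the first module that produces one."""
    for module in LAYERS:
        command = module(obstacle_ahead)
        if command is not None:
            return command
    return "stop"
```

Note that the ordering of `LAYERS`, like the modules themselves, is fixed by the programmer rather than acquired by the robot.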


  In Brooks' own work (see (Brooks, 1989) for a detailed example) typically each of the behavioural modules is implemented as a finite-state automaton (FSA), and behavioural competences are carefully and incrementally layered bottom-up in a process which is supposed to mimic, to some degree, the evolution of biological organisms. Less euphemistically speaking however, this approach to constructing the agent function could as well be characterized as incremental trial-and-error engineering, bringing with it, no matter how carefully it is carried out, the limitations of designing/engineering the transducer which we already noted in the discussion of Regier's work: The result of the transduction (i.e. the system's actions) could be considered grounded, the transducer itself however (i.e. the agent function as composed of the behavioural modules and their interconnection) is in no way intrinsic to the system.


  The same problem was noticed earlier in Regier's case; the consequences are, however, more dramatic here: The ungrounded transducer in Regier's case was an input system of (arguably) peripheral relevance, whereas here the ungrounded transducer is the complete agent function itself. Hence, the problem here is analogous to that in the case of the Chinese room (as well as that of the pocket calculator): the system might exhibit the 'right' behaviour, its internal mechanisms (its modularisation and the resulting task decomposition, the FSA, etc.) however are not intrinsic to the system, but are only grounded in careful engineering.


Learning Agent Functions: The typical approach to learning an agent function is to connect sensors and actuators with a connectionist network (e.g. Tani, 1996). The approach has some obvious advantages, since the agent function can now be learned through adjustment of connection weights instead of having to be programmed entirely, i.e. the weights in a trained network and the resulting behaviour-generating patterns could be considered grounded. The problem of design, however, remains to some degree, since by choice of architecture (including number of hidden units, layers, etc.) the designer will necessarily impose extrinsic constraints on the system, in particular when designing modular or structured connectionist networks (cf. Ziemke, 1996b).
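Learning a sensorimotor mapping by weight adjustment can be sketched with a single-layer network. This is a toy illustration under stated assumptions and not the method of any cited work: the sensor layout (left obstacle, right obstacle, bias), the target wheel speeds and the delta-rule training are all hypothetical choices made here.

```python
import math
import random


def forward(weights, sensors):
    """Single-layer net: motor_j = tanh(sum_i weights[j][i] * sensors[i])."""
    return [math.tanh(sum(w * s for w, s in zip(row, sensors)))
            for row in weights]


def train(samples, n_sensors, n_motors, rate=0.2, epochs=1000, seed=0):
    """Adjust weights with the delta rule on (sensors, target) pairs."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_sensors)]
               for _ in range(n_motors)]
    for _ in range(epochs):
        for sensors, target in samples:
            out = forward(weights, sensors)
            for j in range(n_motors):
                # Error scaled by the tanh derivative (1 - out^2).
                delta = (target[j] - out[j]) * (1.0 - out[j] ** 2)
                for i in range(n_sensors):
                    weights[j][i] += rate * delta * sensors[i]
    return weights


# Hypothetical training data: [left obstacle, right obstacle, bias]
# mapped to [left wheel, right wheel] speeds.
SAMPLES = [
    ([0.0, 0.0, 1.0], [0.8, 0.8]),    # free path: drive forward
    ([1.0, 0.0, 1.0], [0.8, -0.8]),   # obstacle on the left: turn right
    ([0.0, 1.0, 1.0], [-0.8, 0.8]),   # obstacle on the right: turn left
]

learned = train(SAMPLES, n_sensors=3, n_motors=2)
```

Only the weights are acquired here; the number of units, the activation function and the training targets are all chosen by the designer.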


  An approach that partly addresses the latter problem is the author's work on 'self-adapting' recurrent connectionist robot controllers (Ziemke, 1996a, 1996c) in which the sensorimotor mapping (the 'function net', or to be exact: the connection weights establishing this mapping) is actively (re-)constructed in every time step by a second connectionist net (the 'context net'). This enables the overall controller to exhibit an emergent (grounded) task decomposition (cf. also Nolfi, 1996) and autonomously acquire a corresponding intrinsic virtual modularisation (in the form of different state subspaces with corresponding behaviours as evoked by the context net). This allows the controlled robot to exhibit different behaviours at different points in time, without these behaviours or their relation/organization being built into the system. Experiments with 'infinite' (Ziemke, 1996a) and finite state automata behaviour (Ziemke, 1996c) have provided a proof of concept, although at toy model scale (see also (Ziemke, 1997) for a discussion of the relation to other approaches).
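The idea of one net rebuilding the weights of another in every time step can be sketched as follows. This is a rough illustration under stated assumptions rather than the architecture of the cited work: the layer sizes, the linear context net, the hand-set context weights and the use of the motor outputs (plus a bias) as the next state are all hypothetical.

```python
import math


def function_net(weights, sensors):
    """Sensorimotor mapping: motor_j = tanh(sum_i weights[j][i] * sensors[i])."""
    return [math.tanh(sum(w * s for w, s in zip(row, sensors)))
            for row in weights]


def context_net(context_weights, state, shape):
    """Map the state vector to a full weight matrix for the function net."""
    n_out, n_in = shape
    flat = [sum(c * s for c, s in zip(row, state)) for row in context_weights]
    return [flat[j * n_in:(j + 1) * n_in] for j in range(n_out)]


def step(context_weights, state, sensors, shape):
    """One time step: rebuild the function net, then act with it."""
    weights = context_net(context_weights, state, shape)
    motors = function_net(weights, sensors)
    next_state = motors + [1.0]  # assumption: motor outputs plus a bias
    return motors, next_state


# Hand-set context weights (one row per function-net weight), illustrative only.
CONTEXT_WEIGHTS = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [-1.0, 0.0, 0.5],
    [0.0, -1.0, 0.5],
]
SHAPE = (2, 2)  # two motors, two sensors

motors_a, _ = step(CONTEXT_WEIGHTS, [1.0, 0.0, 1.0], [1.0, 0.0], SHAPE)
motors_b, _ = step(CONTEXT_WEIGHTS, [-1.0, 0.0, 1.0], [1.0, 0.0], SHAPE)
```

The same sensor input yields different motor outputs in `motors_a` and `motors_b`, because the state selects a different sensorimotor mapping in each case.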


  Another approach to ensure grounding of robotic control while limiting restrictions imposed through design to a minimum is the work by Law & Miikkulainen (1994), who let connectionist architectures (to be exact: the connectivity in a given architecture) evolve, thereby grounding the actual network architecture (i.e. structure of the transducer) to some extent. Law and Miikkulainen argue that


... the agents that are the product of this system will be undeniably grounded in their simulated world, since they will have begun from ground zero, knowing [almost] nothing at all.6 (Law & Miikkulainen, 1994, footnote added)





This paper has so far given a 'guided tour' around the grounding problem and a number of approaches aimed at solving it, all of which however, at least in the author's opinion, have their problems and shortcomings. Hence, let us briefly recapitulate the major points so far.


Summary So Far


Searle's (1980) and Harnad's (1990) analyses of work in traditional, purely computational AI showed that programming knowledge into a system alone can never make a system intelligent, since the knowledge will always remain extrinsic to the system, i.e. only be actual 'knowledge' to external observers. Hence, a natural conclusion is that knowledge must only enter a system from its environment in a grounded fashion.  


  In the cognitivist framework, where (a) 'knowledge' by definition consists of explicit, manipulable, internal representations, and (b) a distinction is made between perceptual input systems (transducing sensory percepts onto internal representations) and central systems (manipulating internal representations), this means (cf. Harnad's (1990) proposal) that any new internal representation must be


·         either definable by sensory or sensorimotor invariants (in the case of atomic representations such as 'striped', cf. above)

·         or constructable from already existing atomic or complex representations (in the case of complex representations such as 'zebra', cf. above).


  Typically cognitivist grounding approaches, here exemplified with Regier's (1992) work, therefore count on transducing sensory percepts, typically through connectionist networks, onto categorical representations which can then have a 1:1 relation to internal symbolic representations (cf. also Harnad, 1990). Problems typically ignored in this approach are that


·         the transducing input system, since alone it cannot provide grounding to more than the result of the transduction, has to be embedded in its usage through central systems, and

·         more importantly, it cannot be denied that in Regier's case a lot of his knowledge went into the design of his transducer (a structured connectionist net), which therefore (according to the above line of reasoning) has to be said to be extrinsic to the overall system.


In the enactivist framework, where the agent as a whole must be considered to embody the transformation knowledge for the transduction of percepts onto actions, it is more difficult to pin down what exactly has to be grounded. Some degree of 'physical grounding' can be said to come with the sensorimotor embedding of robotic agents in their environment. Further grounding of (effective) behaviour is achieved by adequately transducing sensory percepts onto motor output. Here (transformation) knowledge needs to be embodied in the transducing agent function in order to ensure adequate action: In Brooks' subsumption architecture (1986, 1989) this knowledge is designed/programmed into the system (resulting in the disadvantages discussed above), whereas using connectionist networks it can partly be learned in a grounded fashion, i.e. acquired in interaction with the environment.


Grounding Complete Agents


If we aim for fully grounded systems, i.e. systems in which every aspect, element or internal mechanism is intrinsic to the whole, then we have to start looking at systems which as a whole have developed in interaction with their environment.


  In fact, the only truly intelligent systems we know of are (higher) animals, i.e. biological systems whose genotype has evolved over millions of years, and who in many cases undergo years of individual development before achieving full intelligence. Thus, animals are grounded in their environments in a multitude of ways, whereas most grounding approaches merely aim at hooking pre-given agents to pre-given environments, by means of representations or effective behaviour.


  AI and cognitive science, in their attempt to synthesize and model intelligent behaviour, have always been based on high-level abstractions from the biological originals (disembodiment, the 'information processing metaphor', the 'brain metaphor', etc.). The grounding problem, in its broad interpretation as discussed in this paper, seems to suggest that in fact


·         we have to be very careful about such abstractions, since any abstraction imposes extrinsic (which however does not necessarily equal 'wrong') design constraints on the artefact we develop, and

·         we will have to re-examine some of the 'details' which have, perhaps prematurely, been abstracted away earlier.


One of these 'details' is what might be called physiological grounding as provided through the co-evolution and mutual determination of agents / species and their environments. Two simple examples:


·         As Varela et al. (1991) note, there is a perfect match/correspondence between the ultraviolet vision of bees and the ultraviolet reflectance patterns of flowers.

·         Similarly, the sounds your ears can pick up are exactly those sound frequencies which are relevant for you in order to be able to interact with your environment (including, e.g., those that other people's vocal cords produce).


Compare this natural pre-adaptation (sensorimotor capacities which are the way they are and contribute to the interaction of agent and environment due to their intrinsic value to the agent/species) to that of the typical robot which is rather arbitrarily equipped with ultrasonic and infrared sensors all around its body, because its designers or buyers considered that useful (i.e. a judgement entirely extrinsic to the robot).


Despite the emphasis on embodiment in the enaction paradigm, this type of physiological grounding through co-evolution/development and mutual determination of body, nervous/control system and environment has been largely neglected so far. One of the few approaches in this direction is the work of Cliff & Miller (1996), in which co-evolution of 'eyes' (optical sensors) and 'brains' (connectionist control networks) has been applied (in simulation) to pursuing and evading agents.
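The idea of evolving sensors and controllers together can be illustrated with a minimal sketch. It is not Cliff & Miller's simulation: it assumes a hypothetical one-dimensional setting in which a genome holds one sensor gene (`tuning`) and one controller gene (`gain`), and both are exposed to the same selection pressure, so that the sensor's sensitivity is shaped by its value to the agent rather than fixed by a designer. All names and constants (`ENV_FREQUENCY`, `TARGET_ACTION`, `SENSOR_WIDTH`, the Gaussian tuning curve, truncation selection) are illustrative assumptions.

```python
import math
import random

# Hypothetical environment: one signal frequency matters to the agent,
# and one motor output counts as the adequate response to it.
ENV_FREQUENCY = 0.7
TARGET_ACTION = 1.0
SENSOR_WIDTH = 0.2  # assumed width of the sensor's tuning curve


def sensor_response(tuning, signal):
    """Gaussian tuning curve: strongest when tuning matches the signal."""
    return math.exp(-((signal - tuning) / SENSOR_WIDTH) ** 2)


def fitness(genome):
    """Negative squared error between the agent's action and the adequate one."""
    tuning, gain = genome
    action = gain * sensor_response(tuning, ENV_FREQUENCY)
    return -(action - TARGET_ACTION) ** 2


def mutate(genome, rng, sigma=0.05):
    """Perturb the sensor gene and the controller gene with Gaussian noise."""
    return tuple(gene + rng.gauss(0.0, sigma) for gene in genome)


def evolve(generations=200, pop_size=30, seed=0):
    """Evolve (tuning, gain) genomes; sensor and controller adapt together."""
    rng = random.Random(seed)
    population = [
        (rng.uniform(0.0, 1.0), rng.uniform(0.0, 2.0)) for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, parents survive
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

Calling `evolve()` returns the best `(tuning, gain)` pair found. Because the selected parents survive unchanged, the best fitness in the population never decreases from one generation to the next. Nothing in the sketch prescribes the sensor's tuning in advance; it is selected only through its contribution to adequate action.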





Both cognitivist and enactive approaches to grounding, although in different ways, to some extent follow Searle's conclusion that intelligence is a property of machines, i.e. embodied systems, causally connected with their environment, rather than disembodied programs. The enactive approach certainly follows this route more wholeheartedly, with embodiment and agent-environment interaction being at the core of the enactive view of cognition. In the cognitivist approach, on the other hand, grounding is rather considered to supply the necessary interface between the external world and the internal cognitive processes, which, if cognitivist grounding worked, could still be purely computational.


The question whether cognitivism or enaction is 'right' is, of course, beyond the scope of this paper. In their respective approaches to grounding, however, and despite their differences, a number of points of 'convergence' can be noted:


·         Both approaches require fully grounded systems to be 'complete agents': In the cognitivist approach grounding requires input and central systems embedded in their environment. In the enactive framework full grounding requires agents to have developed as a whole in interaction with their environment.

·         Both approaches require a certain degree of bottom-up development / evolution: In the cognitivist approach both development and evolution are required to account for grounding of both innate and learned representations, input systems, etc. In the enactive framework radical bottom-up development at both individual and species level is essential to grounding.

·         Both grounding approaches require their agents to have robotic capacities: In the cognitivist framework these are somewhat peripheral but necessary. In the enactive view robotic capacities (embodied action) are at the core of cognition.


Hence, the natural conclusion of the above points and the arguments put forward in this paper is that the most promising path toward successful synthesis / modelling of fully grounded and truly intelligent agents will probably be what might be called 'evolutionary and developmental situated robotics', i.e. the study of embodied agents / species developing robotic intelligence bottom-up in interaction with their environment, and possibly on top of that a 'mind' and 'higher-level' cognitive capacities.





1. The author is also with the Neurocomputing & Robotics Group, Dept. of Computer Science, University of Sheffield, UK.


2. See also Peschl's (1996) account of a "system relative concept of representation" in (recurrent) neural systems, and Globus' (1992) non-computational, non-cognitivist account of neuroscience.


3. In fact, Harnad's proposal has been referred to as "a face-saving enterprise" (Sharkey & Jackson, 1996) for symbolic theories of mind.


4. Cf. Searle's (1980) 'Robot Reply'.


5. Note, however, that if 'behaviour-generating patterns' are considered representations (cf. the earlier quote from Peschl, 1996), then 'behaviour grounding' also amounts to 'representation grounding', although of a different type.


6. Note, however, that some knowledge of available sensors and motors is still built into the network architecture.





Agre, P. E. & Chapman, D. (1987) Pengi: An Implementation of a Theory of Activity. Proceedings of AAAI-87, pp. 268-272. Menlo Park, CA: AAAI.


Beer, R. A. (1996) Toward the Evolution of Dynamical Neural Networks for Minimally Cognitive Behaviour. From Animals to Animats 4 - Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pp. 421-429. Cambridge, MA: MIT Press / Bradford Books.


Brooks, R. A. (1986) A Robust Layered Control System for a Mobile Robot. IEEE Journal of Robotics and Automation vol. 2, 14-23.


Brooks, R. A. (1989) A Robot that Walks: Emergent Behavior from a Carefully Evolved Network. Neural Computation vol. 1:2, 253-262.


Brooks, R. A. (1991a) Intelligence Without Representation. Artificial Intelligence vol. 47, 139-160.


Brooks, R. A. (1991b) Intelligence Without Reason. Proceedings of IJCAI-91, pp. 569-595. Sydney, Australia.


Brooks, R. A. (1993) The Engineering of Physical Grounding. Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society, pp. 153-154. Hillsdale, NJ: Lawrence Erlbaum.


Chalmers, D. J. (1992) Subsymbolic computation and the Chinese room. In: Dinsmore, J. (ed.) The Symbolic and Connectionist Paradigms: Closing the Gap. Hillsdale, NJ: Lawrence Erlbaum.


Cliff, D. & Miller, G. F. (1996) Co-evolution of Pursuit and Evasion II: Simulation Methods and Results. From Animals to Animats 4 - Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pp. 506-515. Cambridge, MA: MIT Press / Bradford Books.


Cottrell, G. W., Bartell, B. & Haupt, C. (1990) Grounding Meaning in Perception. Proceedings of the German Workshop on Artificial Intelligence (GWAI), pp. 307-321.


Dorffner, G. & Prem, E. (1993) Connectionism, Symbol Grounding, and Autonomous Agents. Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society, pp. 144-148. Hillsdale, NJ: Lawrence Erlbaum.


Fodor, J. A. (1980) Methodological solipsism considered as a research strategy in cognitive science. Behavioral and Brain Sciences vol. 3, 63-110.


Fodor, J. A. (1981) Representations: philosophical essays on the foundations of cognitive science. Cambridge, MA: MIT Press.


Fodor, J. A. (1983) The Modularity of Mind. Cambridge, MA: MIT Press / Bradford Books.


Fodor, J. A. & Pylyshyn, Z. (1988) Connectionism and cognitive architecture: A critical analysis. Cognition vol. 28, 3-71.


Globus, G. G. (1992) Toward a Noncomputational Cognitive Neuroscience. Journal of Cognitive Neuroscience vol. 4(4).


Harnad, S. (1989) Minds, machines and Searle. Journal of Experimental and Theoretical Artificial Intelligence vol. 1, 5-25.


Harnad, S. (1990) The Symbol Grounding Problem. Physica D vol. 42, 335-346.


Harnad, S. (1993) Symbol Grounding is an Empirical Problem: Neural Nets are Just a Candidate Component. Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society, pp. 169-174. Hillsdale, NJ: Lawrence Erlbaum.


Harnad, S. (1995) Grounding Symbolic Capacity in Robotic Capacity. In: Steels, L. & Brooks, R. A. (eds.) The "artificial life" route to "artificial intelligence" - Building Situated Embodied Agents, pp. 276-286. New Haven: Lawrence Erlbaum.


Lakoff, G. (1993) Grounded Concepts Without Symbols. Proceedings of the Fifteenth Annual Meeting of the Cognitive Science Society, pp. 161-164. Hillsdale, NJ: Lawrence Erlbaum.


Law, D. & Miikkulainen, R. (1994) Grounding Robotic Control with Genetic Neural Networks. Tech. Rep. AI94-223. Austin: Dept. of Computer Sciences, The University of Texas at Austin.


Newell, A. (1980) Physical Symbol Systems. Cognitive Science vol. 4, 135-183.


Newell, A. & Simon, H. (1976) Computer science as empirical inquiry: Symbols and search. Communications of the ACM vol. 19, 113-126.


Nolfi, S. (1996) Using emergent modularity to develop control systems for mobile robots. Tech. Rep. 96-04. Rome, Italy: Dept. of Neural Systems and Artificial Life, Institute of Psychology, National Research Council.


Peschl, M. F. (1996) The Representational Relation Between Environmental Structures and Neural Systems: Autonomy and Environmental Dependency in Neural Knowledge Representation. Nonlinear Dynamics, Psychology and Life Sciences vol. 1(3).


Port, R. & van Gelder, T. (1995) Mind as Motion: Explorations in the Dynamics of Cognition. Cambridge, MA: MIT Press.


Regier, T. (1992) The Acquisition of Lexical Semantics for Spatial Terms: A Connectionist Model of Perceptual Categorization. PhD Thesis / Tech. Rep. TR-92-062. Berkeley: Dept. of Computer Science, University of California at Berkeley.


Rutkowska, J. C. (1996) Reassessing Piaget's Theory of Sensorimotor Intelligence: A View from Cognitive Science. In: Bremner, J. G. (ed.) Infant Development: Recent Advances. Hillsdale, NJ: Lawrence Erlbaum.


Schank, R. C. & Abelson, R. P. (1977) Scripts, Plans, Goals, and Understanding. Lawrence Erlbaum.


Searle, J. (1980) Minds, brains and programs. Behavioral and Brain Sciences vol. 3, 417-457.


Sharkey, N. E. & Jackson, S. A. (1994) Three Horns of the Representational Trilemma. In: Honavar, V. & Uhr, L. (eds.) Symbol Processing and Connectionist Models for Artificial Intelligence and Cognitive Modeling: Steps towards Integration, pp. 155-189. Academic Press.


Sharkey, N. E. & Jackson, S. A. (1996) Grounding Computational Engines. Artificial Intelligence Review vol. 10, 65-82.


Tani, J. (1996) Does Dynamics Solve the Symbol Grounding Problem of Robots? An Experiment in Navigation Learning. Learning in Robots and Animals - Working Notes. AISB'96 workshop, Brighton, UK.


van Gelder, T. J. (1995) What might cognition be if not computation? Journal of Philosophy vol. 91, 345-381.


van Gelder, T. J. (1997) The Dynamical Hypothesis in Cognitive Science. Behavioral and Brain Sciences. To appear.


Varela, F., Thompson, E. & Rosch, E. (1991) The Embodied Mind - Cognitive Science and Human Experience. Cambridge, MA: MIT Press / Bradford Books.


Ziemke, T. (1996a) Towards Adaptive Behaviour System Integration using Connectionist Infinite State Automata. From Animals to Animats 4 - Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pp. 145-154. Cambridge, MA: MIT Press / Bradford Books.


Ziemke, T. (1996b) Towards Autonomous Robot Control via Self-Adapting Recurrent Networks. Artificial Neural Networks - ICANN 96, pp. 611-616. Berlin / Heidelberg, Germany: Springer Verlag.


Ziemke, T. (1996c) Towards Adaptive Perception in Autonomous Robots using Second-Order Recurrent Networks. Proceedings of the First Euromicro Workshop on Advanced Mobile Robots (EUROBOT '96), pp. 89-98. Los Alamitos, CA: IEEE Computer Society Press.


Ziemke, T. (1997) Adaptive Behaviour in Autonomous Agents. Presence. To appear.