Thursday, November 24. 2016
1. Distinguish scientific/mathematical/engineering creativity, whose outputs are “objective,” from artistic/musical/aesthetic creativity, whose outputs are “subjective,” i.e., they depend on the feelings and judgment of people (human brains).
We call successes in either field — objective or subjective — “creative” if they are done by human brains. But the subjective ones can only be judged by people’s senses. We don’t care how scientific advances come, from a person or an algorithm: the result is just as good and valid, if it works. But for artistic works, one of the features of our aesthetic tastes is that we dislike or quickly tire of something that is detectably mechanical or algorithmic. (Just as my cat tires of commercial “cat toys.”)
2. Deep learning algorithms are very promising, but so far they have not yet duplicated ordinary (“noncreative”) human capacity, so it’s a bit premature to expect them to be creative. So far, their mechanical nature is eventually obvious, just like the style-checker algorithm that can improve bad prose to average, but that also reduces good, creative prose to average. A lot of creativity (both objective and subjective) involves rule-breaking (i.e., violation of algorithms, rather than following them). Algorithms can produce mediocre Bach-like work, but not masterpieces — and, like my cat, we eventually detect and tire of the algorithms...
Rule-breaking can of course be dictated by rules too, but that’s still mechanics. What isn’t? Randomness, chance. And some have emphasized that factor in human creativity. But it’s not the whole story and it’s not enough. And here subjective creativity is a better model: Another way to make something feel non-mechanical is to make it more organic, more like the movements and sounds and feelings of a real biological body rather than a computational, algorithmic machine.
Now I don’t doubt that the body itself, including the brain, is a causal system of some sort, but not necessarily just an algorithmic one. (It is certainly partly algorithmic: reasoning, for example.) But the brain is also a dynamical system. Dynamics includes things like heat and liquidity, which are not computational. They don’t follow computational algorithms; they obey differential equations, of the kind that describe a waterfall rather than the solution to a quadratic equation (the algorithm/recipe we all learned in high school, x = (-b ± √(b² - 4ac)) / 2a, which is certainly not creative — though the one who first discovered it was creative).
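The high-school recipe just mentioned really is a pure algorithm: a fixed sequence of steps mapping inputs to outputs, with no room for anything but mechanics. A minimal Python sketch (illustrative only):

```python
import math

def quadratic_roots(a, b, c):
    """Solve a*x**2 + b*x + c = 0 by the high-school recipe:
    x = (-b ± sqrt(b**2 - 4*a*c)) / (2*a)."""
    disc = b**2 - 4*a*c
    if disc < 0:
        return None  # no real roots
    root = math.sqrt(disc)
    return ((-b + root) / (2*a), (-b - root) / (2*a))

# The same inputs always yield the same outputs, step by deterministic step:
print(quadratic_roots(1, -3, 2))  # roots of x**2 - 3x + 2 = 0: (2.0, 1.0)
```

A waterfall, by contrast, is described by differential equations that the water obeys but does not compute; that is the sense in which dynamics is not (or not just) algorithmic.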
Harnad, Stevan (2006) "Creativity: method or magic?" Hungarian Studies 20, no. 1 (2006): 163-177.
Monday, July 7. 2014
Koubeissi, M. Z., Bartolomei, F., Beltagy, A., & Picard, F. (2014). Electrical stimulation of a small brain area reversibly disrupts consciousness. Epilepsy & Behavior, 37, 32-35.

This center cuts off awakeness, not (just) consciousness. Inactivating the claustrum seems to put the subject into an immobile trance that is not sleep (which is an active dynamical state) but a kind of “suspended animation.”
But consciousness means feeling — feeling anything at all. It is not (just) awakeness.
If something could "cut off" feeling while leaving “doing” intact (moving, talking, etc.), then it would make us into the Zombies that we would have been if we were not conscious. (Now that would be a real “on-off” switch!)
But there is no such center, or switch. Because consciousness is much more fundamental and pervasive than mere awakeness.
And for some reason that no one can understand or explain, there (probably) cannot be Zombies — at least not with human-scale (or probably even any biological-scale) doing-capacity. To be able to explain how and why that is the case would require solving the mind/body problem (the “hard” problem).
By the way, like claustrum inhibition, general anaesthesia too cuts off awakeness, but it also induces a lot of other accompanying changes of state along with it. (Maybe, if it is not harmful, claustrum inhibition could be used for surgery instead of pharmacologically inducing sleep or coma?)
And local anaesthesia merely cuts off sensation (which also happens to be felt): It makes the stimulation of the anesthetized location unfelt (but of course it leaves all other feeling intact).
Sunday, June 8. 2014
It has been reported that the Turing Test has been passed, 64 years after it was first proposed by Turing, because 32% of judges mistook a computer programme for a real 13-year-old Ukrainian boy called Eugene Goostman in a 5-minute test.
Nothing of the sort. Really passing the Turing Test would require designing a system that has real, lifelong verbal capacity, indistinguishable from a real pen-pal, not just fooling an arbitrary percentage of interrogators in a series of 5-minute exchanges!
At the beginning of his 1950 paper Turing had written:
Turing: “[A] statistical survey such as a Gallup poll [would be] absurd [as a way to define or determine whether a machine can think]” (Turing 1950)

Taking a statistical survey like a Gallup Poll — to find out people's opinions of what thinking is — would indeed be a waste of time, as Turing points out. Later in the paper, however, in a throwaway remark that is merely his personal prediction about progress in attempts to pass his Test, he mentions the equivalent of a statistical survey in which 30% of interrogators will be successfully fooled for five minutes:
Turing: "I believe that in about fifty years' time it will be possible to programme computers... [to] play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.” (Turing 1950)

No doubt this party-game/Gallup-Poll criterion can be met by today's computer programs -- but that remains as meaningless a demographic fact today as it was when predicted 64 years ago: Like any other science, cognitive science is not the art of fooling some or most of the people for some or most of the time! The candidate must really have the generic performance capacity of a real human being -- capacity that is totally indistinguishable from that of a real human being to any real human being (for a lifetime, if need be!). No tricks: real performance capacity (Harnad 2008).
Turing was not only the co-inventor of the computer and the code-breaker of the Nazis’ Enigma Machine, thereby helping the Allies win World War II, but with what came to be called the “Turing Test” he also set the agenda for what would eventually come to be called “cognitive science”: the science of explaining how the mind works.
Turing’s idea was simple: Stop worrying about what the mind “is” and explain instead what the mind does. If you can design a system that can do everything that a person with a mind can do – and can do it so that people cannot tell it apart from a real person – then that system will have passed the Turing Test, and the explanation of how that system works will be the explanation of how the mind works.
But the lion’s share of the enormous research agenda proposed by Turing for cognitive science is getting the system to be able to do everything a person with a mind can do. Testing whether people can tell the candidate apart from a real person only becomes relevant at the endgame, once the system already has our generic performance capacities. And we are nowhere near having designed a system that can do everything a person with a mind can do. Not even if we restrict the test to everything a mind can do verbally. (The real Turing Test will of course have to be robotic, not just verbal, because what we can do is not just what we can do with our mouths! But let’s set aside for another discussion the “symbol grounding problem” of whether computation alone can indeed do everything the mind can do.)
It should be obvious from all this that the Turing Test is not – and never was – about fooling anyone, let alone fooling some people, some of the time. It is about designing – indeed “reverse-engineering” -- a system that is really able to do anything an ordinary person can do, any time, as long as you like, indistinguishably from the way a real person does it. Nothing about 5-minute tests and percentages of judges that think the candidate is or isn’t a real person (although obviously eventual success can only be achieved by degrees).
The Turing Test has captured the imagination of the general public partly because of our interactions with computers that are able to do more and more things that only people with minds had been able to do. Another reason has been the growth in the number of science fiction books and movies about computers and robots that have – or seem to have – minds. But the biggest reason for the fascination is the “other minds problem” itself – the very problem that the Turing Test is meant to resolve:
We are not mind-readers. The only one I can know has a mind is myself; we’ve known that since at least Descartes’ famous “I think therefore I am.” For all bodies other than my own, the only way I can infer whether they indeed have a mind is if they can do what minds can do. I can’t observe other minds, but I can observe what they can do. So Turing’s real insight was that Turing-testing is -- and always has been -- our only means of mind-reading. Hence once we have designed a system that can do anything a person with a mind can do, indistinguishably from a person with a mind, not only will we be in no better or worse a position to know whether that system really has a mind than with any other person, but we will come as close as it is possible to come to having explained how the mind works.
But that’s certainly not where we are when we have a system that can fool 30% of people for 5 minutes. And Turing certainly never said, implied or intended any such thing.
Harnad, S. (1992) The Turing Test Is Not A Trick: Turing Indistinguishability Is A Scientific Criterion. SIGART Bulletin 3(4) (October 1992) pp. 9 - 10.
Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence. In: Epstein, Robert & Peters, Grace (Eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 433-460.
Saturday, February 18. 2012
I don't think that Bernie Baars -- in "The Biological Cost of Consciousness" -- has succeeded in explaining the biological function of consciousness -- i.e., what is it for? what does it do? what could not get done without it, and why? He has simply reaffirmed that consciousness is indeed there, and correlated with a number of biological functions -- so far inexplicably.
The problem (a "hard" one) is always the same: How and why is some given biological function executed consciously rather than unconsciously? It is "easy" to explain why and how the function itself (seeing, attending, remembering, reporting, etc.) is biologically adaptive, but it is "hard" to explain how and why it is conscious, hence how and why its being conscious is biologically adaptive.
In considering consciousness Bernie also falls into the very common conflation between (1) the accessibility of information and (2) consciousness of the information. Information is just data, whether in a brain or in a radio, computer, or robot. To explain the function of the fact that information is accessible (hence reportable) is not to explain the function of the fact that the access is conscious access.
What -- besides accessibility -- is the "mark" of information being conscious? The fact that it is felt: it feels like something to have access to some information. And it feels like nothing to have access to other information. The information to which a computer or robot has access, be it ever so useful to whatever it is that the computer or robot can or does do, is not conscious. It does not feel like anything to have access to that information. The same is true for the information to which our cerebellum has access when it keeps our balance upright, or the information to which our medulla has access when it keeps us breathing, or keeps our hearts beating, especially while we are in deep (delta) sleep. When we are awake, sometimes some of that information does become conscious, in that we feel it, and then usually some further functional flexibility is correlated with it too (including reportability, in the human case). But the question remains: why and how are some states of informational access felt and some not? and what further functional benefit is conferred by the fact that the felt ones are felt? What is the causal function of the (unexplained) correlation?
Limited resources are limited resources, and resource costs are just resource costs. The fact that our brains can have access to -- and can process -- only a limited amount of information and not more is not an explanation of why and how having and processing (some of) that information is felt. Access and processing limitations, in and of themselves, have nothing to do with consciousness -- except that they are correlated with it, so far still inexplicably.
That was the fact that was (and still is) to be explained.
Saturday, November 12. 2011
Letter to TLS, Nov 4 2011:

David Auerbach (TLS Letters, November 4) is quite right that in his original 1950 formulation, what Turing had called the "Imitation Game" (since dubbed the "Turing Test") tested only verbal capacity, not robotic (sensory/motor) capacity: only symbols in and symbols out, as in today's email exchanges. Turing's idea was that if people were completely unable to tell a computer apart from a real, live pen-pal through verbal exchanges alone, the computer would really be thinking. Auerbach is also right that -- in principle -- if the verbal test could indeed be successfully passed through internal computation (symbol-manipulation) alone, then there may be no need to test with robotic interactions whether the computer's symbols were "grounded" in the things in the world to which they referred.

But 2012 is Alan Turing Year, the centenary of his birth. And 62 years after his paper was published, his original agenda for what is now called "cognitive science" has been evolving. Contrary to Turing's predictions, we are still nowhere near passing his test, and there are by now many reasons to believe that although being able to pass the verbal version might indeed be evidence enough that thinking is going on, robotic grounding will be needed in order to actually be able to pass the verbal test, even if the underlying robotic capacity is not tested directly. To believe otherwise is to imagine that it would be possible to talk coherently about the things in the world without ever being able to see, hear, touch, taste or smell any of them (or anything at all).

Auerbach's letter had read: "Stevan Harnad misstates the criteria for the Turing Test when he describes a sensing robot that could pass the test by recognizing and interacting with people and objects in the same way that a human can (October 21). Alan Turing’s formulation of the Turing Test specifies a computer with no sensors or robotic apparatus. Such a computer passes the test by successfully imitating a human in text-only conversation over a terminal." -- David Auerbach, 472 9th Street, New York 11215
Harnad, S. (1989) Minds, Machines and Searle. Journal of Theoretical and Experimental Artificial Intelligence 1: 5-25.
Harnad, S. (1990) The Symbol Grounding Problem Physica D 42: 335-346. http://cogprints.org/0615/
Harnad, S. (1992) The Turing Test Is Not A Trick: Turing Indistinguishability Is A Scientific Criterion. SIGART Bulletin 3(4): 9-10.
Harnad, S. (1994) Levels of Functional Equivalence in Reverse Bioengineering: The Darwinian Turing Test for Artificial Life. Artificial Life 1(3): 293-301.
Harnad, S. (2000) Minds, Machines, and Turing: The Indistinguishability of Indistinguishables. Journal of Logic, Language, and Information 9(4): 425-445. (special issue on "Alan Turing and Artificial Intelligence")
Harnad, S. (2001) Minds, Machines and Searle II: What's Wrong and Right About Searle's Chinese Room Argument? In: M. Bishop & J. Preston (eds.) Essays on Searle's Chinese Room Argument. Oxford University Press.
Harnad, S. (2002) Darwin, Skinner, Turing and the Mind. (Inaugural Address. Hungarian Academy of Science.) Magyar Pszichologiai Szemle LVII (4) 521-528.
Harnad, S. (2002) Turing Indistinguishability and the Blind Watchmaker. In: J. Fetzer (ed.) Evolving Consciousness. Amsterdam: John Benjamins. Pp. 3-18.
Harnad, S. and Scherzer, P. (2008) First, Scale Up to the Robotic Turing Test, Then Worry About Feeling. Artificial Intelligence in Medicine 44(2): 83-89
Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence. In: Epstein, Robert & Peters, Grace (Eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer
Harnad, S. (2011) Minds, Brains and Turing. Consciousness Online 3.
Friday, November 11. 2011
The recent findings of Watanabe and Logothetis on dissociating attention and awareness are interesting:

Watanabe, M., Cheng, K., Murayama, Y., Ueno, K., Asamizuya, T., Tanaka, K. & Logothetis, N. (2011) Attention but not Awareness Modulates the BOLD Signal in Human V1 During Binocular Suppression. Science, 11 November 2011. DOI: 10.1126/science.1203161

But a more conservative interpretation might be that when there is divided stimulation and divided attention, stimulation and attention both contribute to awareness (of stimulation), with attention selectively enhancing the effects of the stimulation.
Harnad, S. (1969) The effects of fixation, attention, and report on the frequency and duration of visual disappearances. Masters thesis, McGill University.
Saturday, April 16. 2011
(1) The Dunn et al article in Nature is not about language evolution (in the Darwinian sense); it is about language history.
(2) Universal grammar (UG) is a complex set of rules, discovered by Chomsky and his co-workers. UG turns out to be universal (i.e., all known languages are governed by its rules) and its rules turn out to be unlearnable on the basis of what the child says and hears, so they must be inborn in the human brain and genome.
(3) Although UG itself is universal, it has some free parameters that are set by learning. Word-order (subject-object vs. object-subject) is one of those learned parameters. The parameter-settings themselves differ for different language families, and are hence, of course, not universal, but cultural.
(4) Hence the Dunn et al results on the history of word-order are not, as claimed, refutations of UG.
Harnad, S. (2008) Why and How the Problem of the Evolution of Universal Grammar (UG) is Hard. Behavioral and Brain Sciences 31: 524-525
Saturday, October 16. 2010
Pinker & Bloom's 1990 BBS paper was timely and influential but it begged the most controversial question about language evolution.
It is uncontroversial that "language" (whatever that may be) evolved through natural selection as well as through learning, culture and historical change. What was controversial was whether the specific capacity underlying universal grammar (UG) could have evolved in that way.
The problem P & B overlooked was the "poverty of the stimulus" -- the fact that what the language learning child hears and says does not provide enough data for either the child or any learning device to learn to recognize and produce all and only the utterances that conform to the rules of UG.
The rules of UG are not known (except by linguists) and are hence not taught explicitly to the child; and they are too complex for the child to learn by trial and error induction from the impoverished evidence available to it during its language learning years. How would evolution have "induced" the rules of UG, or the mechanism for recognizing and producing all and only the utterances that conform to those rules?
This question is especially troubling since there has so far been no evidence or argument suggesting that the rules of UG are somehow optimal or even necessary for language in principle. It seems they are just necessary (and universal) in practice.
But none of this was addressed in P & B's 1990 BBS paper. Their basic suggestion was that the evolution of language was an unproblematic combination of traits that could plausibly have evolved biologically, and other traits that evolved culturally.
Now, if worries about the problematic evolutionary status of UG were holding language researchers back from working on the unproblematic evolutionary and cultural aspects of language, then P & B performed a great service by dispelling those worries.
But they certainly did not solve or dispel the problem of the origin of UG.
Thursday, August 5. 2010
Before consulting Ernst Moerk for "solid empirical evidence against the validity of the poverty of the stimulus argument," please first make sure you know what the poverty of the stimulus means:
According to UG theorists, all human (linguistic) utterances, actual and potential, can be divided into two kinds: those that we (all) judge as grammatically well-formed, and those that we (all) judge as ill-formed.
From the ill-formed ones, we can eliminate all those that are ill-formed merely because of the arbitrary conventions of man-made grammars ("I ain't got none" -- "Between you and I", etc.).
That still leaves all the ill-formed utterances for which we don't really know why we perceive them as ill-formed, but we do.
Chomsky set out to explain the implicit rules underlying the difference between those ill-formed utterances (there is an infinity of them) and the ones (likewise infinite) that we perceive as well-formed (or just ill-formed because they violate conventions we have learned and agreed to).
These universally ill-formed utterances -- let us call them "starred utterances" because of the convention of preceding them by an asterisk (*) to distinguish them from the well-formed (unstarred) utterances -- turn out to be governed by a highly abstract set of underlying rules. These are the rules of universal grammar (UG), because they turn out also to be universal across all languages (which differ only in their [learned] parameter-settings for these rules).
Now before we even get to the poverty of the stimulus, there are those who deny the validity or the universality of our perceptions of grammaticality. They may or may not be right. But if they are right, then the problem of the poverty of the stimulus does not even come up, for it means that UG theorists are looking for rules underlying distinctions that don't really exist.
So in formulating the problem of the poverty of the stimulus, we will assume that these objections are invalid, that grammatical judgments are indeed reliable, valid, and universal, and that the rules to explain them are not arbitrary but explain a genuine cognitive distinction.
Well, it turns out that the rules that will generate and recognize all and only the unstarred utterances, and recognize and reject the starred ones, are so complicated and abstract that they cannot be learned by the child (or by any inductive learning mechanism) on the basis of the utterances that the child actually hears -- or produces, or gets corrective feedback on -- during its (brief) language learning period. That is the poverty of the stimulus (POS).
Here too there is a lot of misunderstanding to get out of the way: As already noted, we are not talking here about hearing, producing and getting corrected for conventional grammatical errors. There's plenty of inductive data available to the child for learning those. We are talking about hearing, producing or getting corrected for violations of UG: utterances that do not conform to the rules that linguists (through laborious decades of collective, cumulative induction -- very unlike what the child does in its few language learning years) have discovered, through trial and error, to be the rules underlying UG.
The methodology that linguists use, across the decades, to infer the rules of UG is to hypothesize that something is a rule of UG, and then test it to see whether it produces starred utterances. If it does, it is rejected. This is exactly what the child never does, because it neither hears nor produces starred utterances (hence also never gets corrected on them), except a very few, by chance. There are an infinity of starred and unstarred utterances, and many complex rules. The rules can be learned, but not by the child, because the child lacks this crucial positive vs. negative evidence: plenty of positive evidence, but next to no negative evidence.
It is as if you had to learn what a "laylek" was (and, of course, what a laylek wasn't), and I told you that I'm a laylek, you're a laylek, things are layleks, ideas are layleks -- in fact everything you ever see, hear, taste, smell or think is a laylek. So what's a laylek? You can't say, because you haven't got any examples of what is not a laylek. Hence you have no basis for telling them apart.
That is the poverty of the stimulus.
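The "laylek" predicament can be made concrete with a toy induction problem (the candidate "grammars" and example strings below are invented for illustration; they have nothing to do with the actual rules of UG): given positive examples alone, several mutually incompatible rules remain equally consistent, and only negative (starred) evidence can eliminate the wrong ones.

```python
import re

# Hypothetical candidate "grammars" over the alphabet {a, b} -- toy stand-ins, not UG.
candidates = {
    "a-block then b-block": r"a+b+",
    "anything at all":      r"[ab]+",
    "even length":          r"(?:[ab][ab])+",
}

def consistent(grammar, positives, negatives):
    """A grammar survives if it accepts every positive (heard) example
    and rejects every negative (starred) example."""
    accepts_all = all(re.fullmatch(grammar, s) for s in positives)
    accepts_any_starred = any(re.fullmatch(grammar, s) for s in negatives)
    return accepts_all and not accepts_any_starred

positives = ["ab", "aabb"]  # the only evidence the learner gets

# With positive evidence alone, all three rules remain live:
survivors_pos_only = [name for name, g in candidates.items()
                      if consistent(g, positives, [])]

# A single starred example ("ba") eliminates two of the three:
survivors_with_neg = [name for name, g in candidates.items()
                      if consistent(g, positives, ["ba"])]
```

The child's situation is the first case: abundant positive evidence, next to no negative evidence, and hence no basis for singling out the right rule.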
But before you start coming up with hypothetical counterexamples in principle, remember that we are talking about UG in particular, not just some hypothetical something. So the counterexamples to POS have to be in the form of actual productions by children (and corrections by adults) that are enough to serve as the database for inferring all the actual rules of UG to date (or enough of them to show that it works). It's not enough to do it for one or a few isolated cases.
The reason no one has done this successfully is that those who try to show the incorrectness of POS either (1) reject UG to begin with (e.g., by rejecting the reliability of grammaticality judgments), in which case they are not really addressing the POS problem but simply rejecting a branch of empirical linguistics out of hand, or (2) try to show the incorrectness of POS without actually having mastered what the rules of UG actually are (according to their current formulation). Hence they are not showing that the inductive database is not really impoverished for UG, but merely looking at a few isolated special cases, with no evidence that more of the same would scale up to doing the whole job, namely, enabling the child to learn the rules of UG from positive and negative evidence.
A parting word about "nativism": What makes the special case of UG so hard is not the fact that the POS entails that UG has to be inborn, because the child's database is far too weak to serve as a basis for learning it. What makes it so hard is that -- unlike with other inborn traits, whether structural or behavioural -- the usual explanation for the origin of inborn traits, namely, Darwinian evolution, does not look to be plausible in the special case of UG.
For exactly the same thing that makes the rules of UG unlearnable on the basis of the child's database -- but learnable (painfully and slowly) on the basis of the collective, cumulative database of generations of UG linguists -- also makes it just as implausible that UG was "learned" via the usual trial-and-error process of biological evolution either. This is for reasons partly based, again, on POS -- but partly also on the fact that no one has yet shown that the rules of UG can be either learned or evolved consecutively, by trial and error, unguided by already inborn UG perceptions, but guided instead by the adaptive advantages of unstarred utterances over starred ones!
For, as far as can be seen so far, unstarred utterances do not have any evolutionary advantages over starred ones (except if you already have UG). Logical (but artificial) alternatives with the same expressive power as UG-governed natural languages are possible. UG is not a matter of logical necessity for the expressive power of language. But it seems it just happens to be there.
That -- and not something else, that would simply amount to changing the subject and begging the question -- is the problem of the poverty of the stimulus. And that is why UG "nativism" is such a particular puzzle (and hence such a hard problem, evolutionarily).
The only avenues for hope, as far as I can see, are two. One is if it turns out that grammar cannot be done autonomously, from form alone, without recourse to meaning (i.e., if the "autonomy of syntax" fails), and hence that the rules of UG turn out to be epiphenomena, superfluous for distinguishing unstarred from starred utterances, because the distinction is somehow guided by meaning rather than just formal syntactic structure. (But if this is to turn out to be true, a lot of work remains to be done to show why and how it is true.)
The other avenue would be to prove that UG is in fact a logically necessary property of any symbol system with the expressive power of natural language. (From what I understand of Chomsky's position on this, he does not go this far. He does indeed think that UG-compliant language is the only possible way for us to think, but he does not give a proof, nor does he indicate that he believes that it is a matter of logical necessity, hence in principle provable.)
[JanetK asked: "How can POS prove that UG is necessary, if you have to accept UG in order to have POS."]
POS does not prove UG necessary. POS proves UG is unlearnable by the child (given the child's actual database).
If you want to prove UG is fiction, explain how children and adults are capable of producing all and only the UG-compliant utterances, and rejecting the rest as ungrammatical.
None of this can be done at this level of vagueness. Most critics of UG haven't the faintest idea of what the rules of UG are, nor what the capacity is that those rules were adduced to explain. But that does not stop them from expressing objections to UG with vehemence and conviction. I have always found that extremely puzzling.
I am not a linguist, technically capable of saying, one way or the other. But I have become quite adept at spotting question-begging and non sequiturs across the years, on the part of others who are as ignorant of the technical specifics of UG as I am, yet think they have some sort of refutation...
The question of "necessity" only concerned whether or not UG was a logical necessity in order to have the expressive power of natural language. (It could merely be necessary as a matter of historical/evolutionary fact, rather than logical necessity.) Chomsky thinks UG is necessary in order to think, but I don't think he means logical necessity.
Yes, the origin of UG is a problem for evolutionary explanation, but it is not clear that it is part of the empirical burden of syntacticians either to explain the evolutionary origins of the rules they discover, or to restrict their research to rules that are either learnable or have a ready evolutionary explanation.
This does not mean that I do not remain perplexed by the evolutionary origins of UG…
The Power (and Poverty) of Words
Based on the latest posting of Edmund Blair Bolles on this topic thread, I think this will be my last contribution, because I am beginning to suspect that Edmund is not really serious about trying to address the problem of UG and POS.
The examples Edmund gives in his posting -- examples of uses of the word "run" -- have nothing at all to do with UG. They are examples of the constraints of conventional grammar, which are learnable, and learned.
As I said, across the decades, many, many would-be critics of Chomsky and UG have aired their opinions without ever confronting, let alone understanding the actual empirical evidence (in all the systematic starred and unstarred utterances that every human being can produce and distinguish, the ones that are actually at issue -- not arbitrary, isolated ones of our own choosing.)
The utterly pointless example of "run" is a case in point.
Nor is the question about whether the rules are "in the stimulus," to be "perceived." The question is whether the examples of utterances that the child hears and says, and the corrections that the child receives, are sufficient for the child to learn the rules of UG from them -- as they are indeed sufficient, for example, in the case of learning the rules of chess, or of arithmetic.
To show that the Brown database is sufficient for this, you have to actually show how those data are enough so that the child can figure out from them all the rules that it has been taking teams of linguists decades to begin to piece together.
The problem of POS is that the child's database (e.g., the data in the Brown corpus) is not sufficient. There is nowhere near enough there for the child -- or any learning system -- to induce the rules of UG on the basis of those data alone. (The rules of UG -- not a layman's pet proxy for that complex set of rules on which linguists are still working. Otherwise I could state with confidence that, say, both Goedel's theorem and Quantum Mechanics are false, based on my own pet examples.)
(I expressed no objection whatsoever that Edmund did not review my paper. I just said that he had missed the point about UG. And he is still missing it. I have great faith in words for bridging any and every conceptual gap -- but for there to be a way, there has to be a will!)
Saturday, June 19. 2010
Re: "Music and speech share a code for communicating sadness in the minor third"
Interesting finding, but so many questions arise, the foremost being cause/effect:
We live in a world of film-music where "emotions" are punctuated by certain musical clichés (dissonance: tension; consonance: relief; major: cheery; minor: teary).
Of course the clichés may have been chosen because of their innate expressive meaning, but the reverse is also possible, perhaps just as possible: the Frenchman for whom an Olympic gold win by a countryman immediately evokes the Marseillaise...
So, yes, the cultural universality of all this remains to be tested, as the author notes; otherwise we are just dealing with self-reinforcing habitual associations in Western pop "culture."
(I would be more inclined to believe in the inborn affective connotations of simpler acoustic properties, such as volume (loud: alarm; quiet: calm), tempo (fast: agitated; slow: calm) and timbre (some vocalizations are shrill and grating, so sound urgent; others sound soothing). Also, there seems to be a lot of scope for looking at movement and dance, where some movements look intrinsically menacing, conciliatory, beckoning, agitated, etc., although there is plenty of room for effects of culture, convention and habit there too.)
I of course do believe that music expresses some deep universals in human affect, but it would take far stronger evidence than this author's findings to show that that belief is right, especially for major/minor! (By the way, I think a minor 6th is even more of a tear-jerker than a minor 3rd...)
If the affective connotations of some vocal or other bodily gestures (and states: let's not forget facial expressions, trembling and tears) are hard-wired into our brains, that does not make them part of "language." Language is language even if it is written, unpunctuated by movement or intonation. The vocal and visual aspects of language are simply somatic: they come with the territory whenever we do anything with our bodies, especially inasmuch as it is being transmitted to or received by other bodies like our own.
With the globalization of Hollywood's tear-jerkers, I doubt it is still possible to test whether or not certain acoustic clichés are cultural, let alone evolutionary universals, except maybe with isolated infants and hunter-gatherers! And the "Mozart effects" we seek are likely to turn out to be just as much MoTown effects. Rather like the Schenkerian reduction of all harmony to 5/1... ;>)
Saturday, December 20. 2008
From the first time I heard it misused by computer scientists, the term "ontology," used in their intended sense, has rankled, since it is virtually the opposite of its normal meaning. (And although terms are arbitrary, and their meanings do change, if you're going to coin a term for "X," it is a bit perverse to co-opt for it the term that currently means "not-X"!)
Ontology is that branch of philosophy that studies what exists, what there is. (Ontology is not science, which likewise studies what there is; ontology is metaphysics: It studies what goes beyond physics, or what underlies it.)
Some have rejected metaphysics and some have defended it. (In "Appearance and Reality," Bradley (1897/2002) wrote, as if anticipating Ayer, that "the man who is ready to prove that metaphysics is wholly impossible ... is a brother metaphysician with a rival theory.")
Be that as it may, there is no dispute about the fact that "ontology," whatever its merits, is distinct from -- indeed the complement of -- "epistemology," which is the study of how and what we know about what exists. In fact, one of the most common philosophical errors -- a special favorite of undertutored novices and overconfident amateurs dabbling in philosophy -- is the tendency to confuse or conflate the ontic with the epistemic, talking about what we do and can know as if it somehow constrained what there is and can be (rather than just what we can know about what there can be).
Well, knowledge engineering's misappropriation of "ontology" -- to denote (in the wiseling words of Wikipedia) "a 'formal, explicit specification of a shared conceptualisation'... provid[ing] a shared vocabulary, which can be used to model a domain... that is, the type of objects and/or concepts that exist, and their properties and relations" -- is a paradigmatic example of that very confusion.
What knowledge engineers mean is not ontology at all, but "epistemonomy" (although the credit for the coinage must alas go to Foucault).
Tuesday, July 15. 2008
Re: Blondin Masse, A., G. Chicoisne, Y. Gargouri, S. Harnad, O. Picard, O. Marcotte (2008) How Is Meaning Grounded in Dictionary Definitions? TextGraphs-3 Workshop, 22nd International Conference on Computational Linguistics, Coling 2008, Manchester, 18-22 August 2008
Many thanks to Peter Turney for his close and thoughtful reading of our paper on extracting the grounding kernel of a dictionary.
Peter raises 3 questions. Let me answer them in order of complexity, from the simplest to the most complex:
-- PT: "(1) Is this paper accepted for Coling08?"
Yes. Apparently there are different sectors of the program, and this paper was accepted for the Textgraphs workshop, listed on the workshop webpage.
-- PT: "(2) How come we claim the grounding kernel (GK) words are more concrete, whereas in our example, they are more abstract?"
The example was just a contrived one, designed only to illustrate the algorithm. It was not actually taken from a dictionary.
When we do the MRC correlations using the two actual dictionaries (LDOCE and CIDE), reduced to their GK by the algorithm, GK words turn out to be acquired at a younger age, more imageable, and (less consistently) more concrete and more frequent.
However, these are separate pairwise correlations. We have since extended the analysis to a third dictionary, WordNet, and found the same pairwise correlations. But when we put them together in a stepwise hierarchical multiple regression analysis, looking at the independent contribution of each factor, the biggest effect turns out to be age of acquisition (GK words being acquired earlier) -- and then the residual correlation with concreteness reverses polarity: concreteness is positively correlated with earlier age of acquisition across all words in the MRC database, but once the GK correlation with age is partialled out, the remaining GK words tend to be more abstract!
This obviously needs more testing and confirmation, but if reliable, it has a plausible explanation: the GK words that are acquired earlier are more concrete, but the GK also contains a subset of abstract words, either learned later in life, or learned through early abstraction, and these early abstract words are also important for the compositional power of dictionary definitions in reaching other words through definition alone.
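This kind of sign reversal can be illustrated with synthetic data (the variables and numbers below are invented for illustration, not the MRC data): a "kernel-ness" score can correlate positively with concreteness pairwise, yet negatively once age of acquisition is partialled out, because the pairwise correlation is carried by the shared covariance with age.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical synthetic variables (illustrative only; not the MRC data):
aoa = rng.normal(size=n)                       # age of acquisition
gk = -aoa + 0.5 * rng.normal(size=n)           # "kernel-ness": earlier-acquired words score higher
conc = -aoa - 0.4 * gk + rng.normal(size=n)    # concreteness tracks early AoA, but the direct GK link is negative

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

def residual(y, x):
    """Residual of y after regressing out x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

r_pairwise = corr(gk, conc)                               # positive: GK looks "more concrete"
r_partial = corr(residual(gk, aoa), residual(conc, aoa))  # negative once AoA is partialled out
```

The pairwise correlation is positive only because both variables covary with age of acquisition; removing that shared variance exposes the residual negative relation.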
The next step would be to begin to look at what the GK words, concrete and abstract, actually are, and the extent to which they may tend to be unique and universal across dictionaries.
-- PT: "(3) Does our analysis overlook the process of abstraction in its focus on acquiring meaning by composition (through dictionary definition)?"
Quite the contrary. We stress that word meanings must be grounded in prior sensorimotor learning, which is in fact the process of (sensorimotor) abstraction!
Peter writes: "we may understand 'yellow' as the abstraction of all of our experiences, verbal and perceptual, with yellow things (bananas, lemons, daffodils, etc.). When we are children, we build a vocabulary of increasingly abstract words through the process of abstraction."
But we would agree with that completely! The crucial thing to note, however, is that abstraction, at least initially, is sensorimotor, not linguistic. We learn to categorize by abstracting, through trial and error experience and feedback, the invariant sensorimotor features ("affordances") of the members of a category (e.g., bananas, lemons, daffodils, and eventually also yellow), learning to distinguish the members from the nonmembers, based on what they look and feel like, and what we can and cannot do with them. Once we have acquired the category in this instrumental, sensorimotor way, because our brains have abstracted its sensorimotor invariants, then we can attach an arbitrary label to that category -- "yellow" -- and use it not only to refer to the category, but to define further categories compositionally (including, importantly, the definition through description of their invariants, once those have been named).
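A toy sketch of that kind of trial-and-error abstraction (invented feature names and numbers, and of course not a model of the brain): a simple perceptron, given corrective feedback on labeled exemplars, comes to weight the one "sensorimotor" feature that is invariant across the category's members.

```python
import numpy as np

# Toy "sensorimotor" feature vectors: [yellowness, roundness, size]
# (hypothetical numbers, purely illustrative)
X = np.array([
    [0.90, 0.10, 0.50],   # member of the "yellow" category
    [0.80, 0.90, 0.20],   # member
    [0.95, 0.50, 0.80],   # member
    [0.10, 0.90, 0.50],   # non-member
    [0.20, 0.10, 0.80],   # non-member
    [0.05, 0.50, 0.20],   # non-member
])
y = np.array([1, 1, 1, 0, 0, 0])

# Perceptron: weights change only when feedback signals a miscategorization.
w = np.zeros(3)
b = 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        pred = 1 if w @ xi + b > 0 else 0
        w += 0.1 * (yi - pred) * xi
        b += 0.1 * (yi - pred)

preds = [1 if w @ xi + b > 0 else 0 for xi in X]
# After training, the "yellowness" weight dominates: the invariant
# feature has been abstracted from the variable ones.
```

Only after such a detector exists can an arbitrary label ("yellow") be attached to the category it picks out.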
This is in agreement with Peter's further point that "As that abstract vocabulary grows, we then have the words that we need to form compositions."
And all of this is compatible with finding that although the GK is both acquired earlier and more concrete, overall, than the rest of our vocabulary, it also contains abstract words (possibly early abstract words, or words that are acquired later yet important for the GK).
-- PT: "The process of abstraction takes us from concrete (bananas and lemons) to abstract (yellow). The process of composition takes us from abstract (yellow and fruit) to concrete (banana)."
The process of abstraction certainly takes us from concrete to abstract. (That's what "abstract" means: selecting out some invariant property shared by many variable things.)
The process of "composition" does many things; among them it can define words. But composition can also describe things (including their invariant properties); composition also generates every expression of natural language other than isolated words, as well as every expression of formal languages such as logic, mathematics and computer programming.
A dictionary defines every word, from the most concrete to the most abstract. Being a definition, it is composite. But it can describe the rule for abstracting an invariant too. An extensional definition defines something by listing all (or enough of) its instances; an intensional definition defines something by stating (abstracting) the invariant property shared by all its instances.
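The extensional/intensional contrast can be put in miniature with a toy category (my example, not from the paper):

```python
# Extensional definition of "even number": list (enough of) the instances.
EVENS_EXT = {0, 2, 4, 6, 8}          # a finite sample of an infinite category

# Intensional definition: state the invariant property shared by all instances.
def is_even(n: int) -> bool:
    return n % 2 == 0

# The intensional definition covers instances the extensional list omits:
is_even(1234)   # True
```

The intensional rule subsumes every member, including ones never listed; the extensional list can only gesture at the category.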
-- PT: "Dictionary definitions are largely based on composition; only rarely do they use abstraction."
All definitions are compositional, because they are sentences. We have not taken an inventory (though we eventually will), but I suspect there are many different kinds of definitions, some intensional, some extensional, some defining more concrete things, some defining more abstract things -- but all compositional.
-- PT: "If these claims are both correct, then it follows that your grounding kernel words will tend to be more abstract than your higher-level words, due to the design of your algorithm. That is, your simple example dictionary is not a rare exception."
The example dictionary, as I said, was just arbitrarily constructed.
Your first claim, about the directionality of abstraction, is certainly correct. Your second claim that all definitions are compositional is also correct.
Whether the words out of which all other words can be defined are necessarily more abstract than the rest of the words is an empirical hypothesis. Our data do not, in fact, support the hypothesis, because, as I said, the strongest correlate of being in the grounding kernel is being acquired at an earlier age -- and that in turn is correlated, in the MRC corpus, with being more concrete. It is only after we partial out the correlation of the grounding kernel with age of acquisition (along with all the covariance that shares with concreteness) that the correlation with concreteness reverses sign. We still have to do the count, but the obvious implication is that the part of the grounding kernel that is correlated with age of acquisition is more concrete, and the part that is independent of age of acquisition is more abstract.
None of this is derived from or inherent in our arbitrary, artificial example, constructed purely to illustrate the algorithm. Nor is any of it necessarily true. It remains to be seen what the words in the grounding kernel turn out to be, whether they are unique and universal, and which ones are more concrete and which ones are more abstract.
(Nor, by the way, was it necessarily true that the words in the grounding kernel would prove to have been acquired earlier; but if that proves reliable, then it implies that a good number of them are likely to be more concrete.)
-- PT: "As I understand your reply, you are not disagreeing with my claims; instead, you are backing away from your own claim that the grounding kernel words will tend to be more concrete. But it seems to me that this is backing away from having a testable hypothesis."
Actually, we are not backing away from anything. These results are fairly new. In the original text we reported the direct pairwise correlations between being in the grounding kernel and, respectively, age of acquisition, concreteness, imageability and frequency. All these pairwise correlations turned out to be positive. Since then we have extended the findings to WordNet (likewise all positive) and gone on to do a stepwise hierarchical multiple regression analysis, which reveals that age of acquisition is the strongest correlate, and, when it is partialled out, the sign of the correlation with concreteness reverses for the residual variance.
The hypothesis was that all these correlations would be positive, but we did not anticipate that removing age of acquisition would reverse the sign of the residual correlation. That is a data-driven finding (and we think it is both interesting, and compatible with the grounding hypothesis).
-- PT: "There is an intuitive appeal to the idea that grounding words are concrete words. How do you justify calling your kernel words “grounding” when they are a mix of concrete and abstract? What independent test of “groundingness” do we have, aside from the output of your algorithm?"
The criterion is and has always been: reachability of the rest of the lexicon from the grounding kernel alone. That was why we first chose to analyze the LDOCE and CIDE dictionaries: because they each allegedly had a "control vocabulary" out of which all the rest of the words were defined. Unfortunately, neither dictionary proved to be consistent in ensuring that all the other words (including the words of the control vocabulary itself) were defined out of the control vocabulary, so that is why Alexandre Blondin-Massé designed our algorithm.
The definition of symbol grounding preceded these dictionary analyses, and it was not at all a certainty that the "grounding kernel" of the dictionary would turn out to be the words we learn earliest, nor that it would be more concrete or abstract than the rest of the words. That too was an empirical outcome (and much work remains to be done before we know how reliable and general it is, and what the blend of abstract and concrete turns out to be).
I would add that "abstract" is a matter of degree, and no word -- not even a proper name -- is "non-abstract," just more or less abstract. In naming objects, events, actions, states and properties, we necessarily abstract from the particular instances -- in time and space and properties and experience -- that make (for example) all bananas "bananas" and all lemons "lemons." The same is true of what makes all yellows "yellows," except that (inasmuch as vocabulary is hierarchical -- which it is not, entirely), "yellows" are more abstract than "bananas" (so are "fruit," and so are "colors").
(There are still unresolved methodological and conceptual issues about how to sort words for degree of abstractness. Like others, we rely on human judgments, but what are those judgments really based on?)
(Nor are all the (content) words of a language ranged along a strict hierarchy of abstractness. Indeed, our overall goal is to determine the actual graphic structure of dictionary definition space, whatever it turns out to be, and to see whether some of its properties are reflected also in the mental lexicon, i.e., not only our mental vocabulary, but how word meanings are represented in our brains.)
-- PT: "You suggest a variety of factors, including concreteness, imageability, and age of acquisition. You are now fitting a multilinear combination of these factors to the output of your algorithm. Of course, if you have enough factors, you can usually fit a multilinear model to your data. But this fitting is not the same as making a prediction and then seeing whether an experiment confirms the prediction."
I am not at all confident that the grounding kernel, extracted by our algorithm, was bound to be positively correlated, pairwise, with age of acquisition, concreteness, imageability and frequency, but we predicted it would be. We did not predict the change in sign of the correlation in the multiple regression, but it seems an interesting, interpretable and promising result, worthy of further analysis.
-- PT: "I am willing to make a testable prediction: If my claims (1) and (2) are true, then you should be able to modify your algorithm so that the kernel words are indeed more concrete. You just need to 'turn around your operation'."
I am not quite sure what you mean by "turn around your operation," but we would be more than happy to test your prediction, once we understand it. Currently, the "operation" is just to systematically set aside words that can be reached (via definition) from other words, iteratively narrowing the other words to the grounding kernel that can only be reached from itself. This operation moves steadily inward. I am not sure what moving steadily outward would amount to: Would it be setting aside words that cannot be reached via definition? Would that not amount to a more awkward way of generating the same partition (grounding kernel vs. rest of dictionary)?
Please do correct me if I have misunderstood.
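For concreteness, the inward-moving operation can be sketched greedily on a toy dictionary. (This is my simplified reconstruction for illustration only: the actual graph-theoretic algorithm designed by Alexandre Blondin-Massé is different, and a greedy pass of this kind can depend on the removal order.)

```python
def reaches_all(kernel, defs):
    """Can every word in the dictionary be reached from `kernel` by
    repeatedly learning words whose definitions use only known words?"""
    known = set(kernel)
    changed = True
    while changed:
        changed = False
        for word, definition in defs.items():
            if word not in known and all(t in known for t in definition):
                known.add(word)
                changed = True
    return known >= defs.keys()

def grounding_kernel(defs):
    """Greedily set aside words reachable (via definition) from the rest,
    narrowing inward to a kernel from which the whole lexicon is reachable."""
    kernel = set(defs)
    for word in sorted(defs):  # fixed order, for determinism
        if reaches_all(kernel - {word}, defs):
            kernel.discard(word)
    return kernel

# Toy dictionary (invented): each word maps to the words in its definition.
toy = {
    "banana": ["yellow", "fruit"],
    "lemon":  ["yellow", "fruit"],
    "yellow": ["color"],
    "fruit":  ["food"],
    "color":  ["light"],
    "light":  ["color"],
    "food":   ["thing"],
    "thing":  ["food"],
}
# The circular pairs (color/light, food/thing) can only be entered from
# within, so one word of each pair must remain in the kernel.
```

On this toy dictionary the narrowing discards every word definable from the remainder, leaving one word from each definitional cycle.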
Sunday, April 16. 2006
In reality the "semantic web" is, and can only ever be, a "syntactic web." Syntax is merely form -- the shape of arbitrary objects called symbols, within a formal notational system adopted by an agreed and shared convention. Computation is the rule-based manipulation of those symbols, with the rules and manipulations ("algorithms") based purely and mechanically on the shapes of the symbols, not their meaning -- even though most of the individual symbols as well as the combinations of symbols are systematically interpretable (by human minds) as having meaning.
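A toy example (mine) of such purely shape-based rule-following: a rewrite rule for tally-notation addition that manipulates marks with no access to what they mean, even though we can systematically interpret the output as arithmetic.

```python
def rewrite_add(s: str) -> str:
    """Purely syntactic rule: delete every '+' between tally strings.
    The rule consults only the shapes of the symbols, never their meaning."""
    assert set(s) <= {"|", "+"}, "only tally marks and '+' allowed"
    return s.replace("+", "")

# We (not the machine) interpret '|||' as 3, '||' as 2, and the output as 5:
rewrite_add("|||+||")   # '|||||'
```

The rule is sound under our interpretation, but the interpretation lives entirely in our heads, not in the symbol system.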
Semantics, in contrast, concerns the meanings of the symbols, not their shape, or the syntactic manipulation of their shapes. The "symbol grounding problem" is the problem of how symbols get their meanings, i.e., their semantics, and the problem is not yet solved. It is clear that symbols in the brain are grounded, but we do not yet know how. It is likely that grounding is related to our sensorimotor capacity (how we are able to perceive, recognise and manipulate objects and states), but so far that looks as if it will only connect symbols to their referents, not yet to their meaning. Frege's notion of "sense", which is again just syntactic, because it consists of syntactic rules, still does not capture meaning. Nor does formal model-theoretic semantics, which likewise merely finds another syntactic object or system that follows the same rules as those of the syntactic object or system for which we are seeking the meaning.
So whereas sensorimotor grounding -- as in a robot that can pass the Turing Test -- does break out of the syntactic circle, it does not really get us to meaning (though it may be as far as cognitive science will ever be able to get us, because meaning may be related to the perhaps insoluble problem of consciousness).
Where does that leave the "semantic web"? As merely an ungrounded syntactic network. Like many useful symbol systems and artificial "neural networks", the network of labels, links and connectivity of the web can compute useful answers for us, has interesting, systematic correlates (e.g., as in latent "semantic" analysis), and can be given a systematic semantic interpretation (by our minds). But it remains merely a syntactic web, not a semantic one.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.