Jackendoff, Ray (2002) Foundations of Language. Oxford University Press.

Stevan Harnad, 9 Oct 2002

Chapter 4: Universal Grammar (UG)

With this chapter we enter into the heart of the matter, for the nature of Universal Grammar (UG) and its implications for learning, learnability, the "poverty of the stimulus," innateness, evolution -- as well as the relation of syntax to meaning, and of linguistic research to the rest of cognitive science research -- are what all the substance, and all the controversy, are really about. This chapter is still quite general, though, for Jackendoff's own particular view on these matters has not yet been differentiated from the generic Chomskian approach.

(1) "Purging intentional vocabulary" by speaking of "having rules" rather than "representing rules" (fine: "representation" was vague and equivocal anyway) and replacing locutions like "knowing" by "f-knowing" (not fine, in fact nonsense).

(2) "[I]t is important to ask how the grammar [UG] got there." Indeed it is. Hence all the focus on learnability, learning, innateness, evolution, etc. See: http://cogprints.ecs.soton.ac.uk/archive/00000863/index.html

(3) Primary linguistic data are what the child hears (including corrective feedback on what the child says). That is the database from which the rules of UG must be learned, if they are indeed learned; otherwise, they must be innate.

(4) According to the "poverty of the stimulus" argument, the primary linguistic database is too impoverished (in fact far too impoverished) for the child to be able to learn UG from it (by trial, error, and corrective feedback, mediated by analogies and hypotheses -- the usual resources of inductive learning). For one thing, nontrivial learning requires both positive and negative examples in order to learn categories and rules, yet most of the child's data are positive only: The child virtually never hears or generates violations of UG, hence never receives (or needs) corrective feedback either.
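The point about positive-only data can be made concrete with a toy sketch. Everything in it is a hypothetical illustration, not anything from Jackendoff or from UG itself: two made-up "grammars" over strings of a's and b's, one a proper subset of the other. Positive examples drawn from the narrower language can never rule out the broader one; a single starred example can.

```python
import re

def narrow(s: str) -> bool:
    """Hypothetical grammar: accepts a^n b^n with n >= 1."""
    m = re.fullmatch(r"(a+)(b+)", s)
    return m is not None and len(m.group(1)) == len(m.group(2))

def broad(s: str) -> bool:
    """Hypothetical grammar: accepts a^m b^n with m, n >= 1 (a superset of narrow)."""
    return re.fullmatch(r"a+b+", s) is not None

def consistent(hypothesis, positives, negatives=()):
    """A hypothesis survives if it accepts every positive example
    and rejects every negative (starred) example."""
    return (all(hypothesis(s) for s in positives)
            and not any(hypothesis(s) for s in negatives))

positives = ["ab", "aabb", "aaabbb"]  # heard utterances, all from the narrow language
negatives = ["aab"]                   # corrective feedback: "aab" is starred

# With positive data only, both hypotheses survive.
print(consistent(narrow, positives), consistent(broad, positives))
# With one negative example, only the narrow hypothesis survives.
print(consistent(narrow, positives, negatives),
      consistent(broad, positives, negatives))
```

The sketch shows only the logical shape of the problem: without negative evidence, the data underdetermine the choice between a language and its supersets.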

After a brief "trigger experience" period, the child already "has" UG, as evidenced by the fact that he (virtually) never violates UG. That very capacity is what needs to be explained. The rules of UG are the explanation of the capacity. The fact that those rules cannot have been learned from what the child has said and heard is an empirical fact about the nonlearnability of UG from the available database.

Further evidence for the nonlearnability of UG from the child's database is the fact that decades of attempts by professional linguists to "learn" UG from a much larger database, and one that does include negative examples, have so far produced only a partial hypothesis of the structure of UG. Yet the child "has" UG already, from its much more impoverished database. (Jackendoff calls this "The Paradox of Language Acquisition.")

(Linguists are of course trying to learn UG explicitly, so that they can describe, test and implement it, whereas the child merely needs to learn UG implicitly, so he can use it to generate and recognize all and only UG-compliant utterances, but this implicit/explicit difference is a red herring, insofar as the validity of UG is concerned. No need to say the child only "f-learns" UG, whereas linguists "learn-learn" it.)

Anyone who has UG can judge (just as a subject in a psychophysical discrimination experiment can judge) whether an utterance is or is not UG-compliant, and can generate all (and only) the infinity of UG-compliant utterances at will. This is our behavioral capacity to make the starred/unstarred (X vs *X) distinction that Jackendoff illustrates in the first 3 chapters. This is the behavioral capacity (syntactic capacity) that Chomskian linguistics made it its empirical task to explain. The explanation -- a hypothetical set of rules constructed and tested across the decades -- is UG. It turns out to be complex, despite all efforts to make it minimal (i.e., no more complex than necessary), and far too complex to be learned from the child's database.

Jackendoff is at pains to point out that UG corresponds to whatever capacity the child must have in order to "learn" -- via the "trigger experience," based on its impoverished database -- any and every natural language, as well as the capacity to generate all and only the unstarred (UG-compliant) utterances in any of those languages. UG is the "initial state" for our full syntactic capacity, and it turns out to be a highly structured initial state.

(5) Ecumenically, Jackendoff mentions many misunderstandings and red herrings that have occurred across the decades: gestalt laws, relevance to other areas of cognitive science and neuroscience, "meaning."

But there is still a question to be asked about something Chomsky never actually proposed, namely, the "autonomy of syntax." UG is syntactic structure, a set of rules governing the form that UG-compliant utterances must take. In a formal domain, such as mathematics and logic, syntax is most definitely autonomous: Not only are the rules of well-formedness in, say, arithmetic, a purely syntactic matter, completely independent of what an arithmetic expression means ("a+b=c" is well-formed whereas *"a+=bc" is ill-formed [in the standard notational system]), but even the provability, proof, and truth of arithmetic expressions are merely formal (i.e., syntactic) matters: hence "2+2=4" is "unstarred" in standard arithmetic, whereas *"2+2=5" is "starred" (where unstarred = TRUE, and provably derivable, syntactically, from the axioms, and * = FALSE, and not derivable, syntactically, from the axioms, in fact, contradicting them).
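The two kinds of "star" in the arithmetic case can be kept apart in a minimal sketch. It assumes a toy notation of my own choosing (unsigned numeral, "+", numeral, "=", numeral), and it uses ordinary integer addition as a stand-in for derivation from the axioms; it is not a proof system. The function names are hypothetical.

```python
import re

# Assumed toy notation: NUMERAL "+" NUMERAL "=" NUMERAL.
PATTERN = re.compile(r"([0-9]+)\+([0-9]+)=([0-9]+)")

def is_well_formed(expr: str) -> bool:
    """Purely formal check: does the string fit the notational pattern?"""
    return PATTERN.fullmatch(expr) is not None

def is_true(expr: str) -> bool:
    """Truth check for well-formed strings only. Integer addition stands in
    for syntactic derivation from the axioms (an assumption of this sketch)."""
    m = PATTERN.fullmatch(expr)
    if m is None:
        raise ValueError(f"ill-formed expression: {expr!r}")
    left, right, total = (int(g) for g in m.groups())
    return left + right == total

print(is_well_formed("2+2=4"), is_true("2+2=4"))  # well-formed and true
print(is_well_formed("2+2=5"), is_true("2+2=5"))  # well-formed but false
print(is_well_formed("2+=24"))                    # ill-formed: truth does not arise
```

Neither function consults what the symbols mean to anyone; both operate on form alone, which is the sense in which the formal case is autonomous.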

In linguistic syntax, there is not this complete autonomy of the syntax of an utterance from its meaning. Meaning does not enter into the mathematician's derivation of the truth or falsity of a well-formed proposition (and, a fortiori, also not into the rules for its well-formedness). But meaning certainly enters into the judgment of both the language speaker/hearer (whether child or adult) and the linguist as to whether an utterance is or is not well-formed. So natural-language syntactic capacity is not just formal; the rules of UG are based on both form and meaning. This is why Chomsky has come to refer to them as the constraints on "logical form" -- that is, constraints on the way that a meaning can take a form at all in a natural language.

I suspect that this hybrid formal/semantic nature of UG syntax might have some problematic consequences. We will return to this.

(6) Tomasello complains that UG is "very difficult to relate to cognition in other psychological domains." (!) Clark thinks UG might have arisen from "some small tweak" in the architecture of the brain that we had already -- "some additional feedback loops." (!) (Now why didn't those 3 decades of Chomskian linguists think of that! Some feedback loops; that's all it takes...) Deacon thinks language "adapted" to the brain, rather than the other way round, and that that explains UG. (!)

(7) Again, in Jackendoff's ecumenism (and desire to make linguistics [UG?] less "syntactocentric"), he mentions optimality theory (but not whether it takes any of the burden off UG, whether it makes something learnable after all, or at least plausibly evolvable).

Jackendoff, like Chomsky, points out that many other cognitive capacities "grow" out of a highly structured initial state that is inborn (e.g., vision, movement, etc.). But here a question is consistently begged (and as far as I know, no UG theorist has actually faced it squarely -- although failure to answer it would not count as empirical evidence against UG): For all these other cases of complex inborn structure (vision, movement, etc.), the evolutionary "shaping" of the inborn structure on the basis of the adaptive advantages it would confer over alternatives is quite transparent. But in the case of UG, it is not, not even to Chomsky:

"Universal grammar on his conception is so tightly organized that its incremental development through natural selection looks on the face of it unlikely."

Unfortunately, this leaves us with only the uncomfortable option of the "Big Bang Theory of the Origin of UG." And it begs the question of whether there would be any adaptive advantage at all in evolving UG-compliant rather than non-UG-compliant language capacity.

(8) A possible sample of the potential problem arising from the permeable barrier between syntax and semantics in determining the structure of UG:

No problem with the starred sentences from earlier chapters (e.g. *"What did Beth eat peanut butter and for dinner?"). They feel like mostly syntactic judgments, but what about:

a. Every acorn grew into an oak.
b. Every oak grew out of an acorn.
c. An oak grew out of every acorn.
d. *An acorn grew into every oak.

First: Is d syntactically ill-formed (i.e., ungrammatical) or just false? (I think that's a problem.)

Second, if the problem is that it does not have the paradigmatic meaning one would expect from the formal pattern above, then what about the following rival semantic paraphrases, which could be what is indirectly constraining it, rather than syntax:

a'. Every acorn grew into an oak.   = Every acorn grew into SOME-OR-OTHER oak.   = True
b'. Every oak grew out of an acorn. = Every oak grew out of SOME-OR-OTHER acorn. = True
c'. An oak grew out of every acorn. = SOME-OR-OTHER oak grew out of every acorn. = True
d'. An acorn grew into every oak.   = SOME-OR-OTHER acorn grew into every oak.   = True

a''. Every acorn grew into an oak.   = Every acorn grew into THAT PARTICULAR oak.   = False
b''. Every oak grew out of an acorn. = Every oak grew out of THAT PARTICULAR acorn. = False
c''. An oak grew out of every acorn. = THAT PARTICULAR oak grew out of every acorn. = False
d''. An acorn grew into every oak.   = THAT PARTICULAR acorn grew into every oak.   = False

So what's at issue is truth, not grammaticality, and what determines it is polysemy, with two different possible construals of "an". Is the fact that the default construal for c happens to be c' and the default construal for d happens to be d'' a property of UG? Could some intensive local usage or stage-setting switch the defaults? Would that be a parameter-setting change in UG?
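The two construals of "an" can be checked against a small model. This is a sketch under assumptions of my own: a hypothetical world of three acorns and three oaks in which each acorn grew into its own oak, with the SOME-OR-OTHER reading rendered as "for every x there is some y" and the THAT PARTICULAR reading as "there is one y for every x."

```python
# Hypothetical toy world: each acorn grew into its own oak.
acorns = {"a1", "a2", "a3"}
oaks = {"o1", "o2", "o3"}
grew_into = {("a1", "o1"), ("a2", "o2"), ("a3", "o3")}

def some_or_other(outer, inner, related):
    """SOME-OR-OTHER reading: every x in outer is related to some y in inner."""
    return all(any(related(x, y) for y in inner) for x in outer)

def that_particular(outer, inner, related):
    """THAT PARTICULAR reading: one single y in inner is related to every x in outer."""
    return any(all(related(x, y) for x in outer) for y in inner)

def acorn_to_oak(acorn, oak):
    return (acorn, oak) in grew_into

def oak_from_acorn(oak, acorn):
    return (acorn, oak) in grew_into

# Sentences a and c quantify over every acorn; b and d over every oak.
cases = {
    "a/c (every acorn, an oak)": (acorns, oaks, acorn_to_oak),
    "b/d (every oak, an acorn)": (oaks, acorns, oak_from_acorn),
}
for label, (outer, inner, related) in cases.items():
    print(label,
          some_or_other(outer, inner, related),
          that_particular(outer, inner, related))
```

In this model every SOME-OR-OTHER reading comes out true and every THAT PARTICULAR reading false, as in the paraphrase table: d differs from c only in which construal is the default, not in what either construal would say about the world.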

(9) Jackendoff provides a good criterion for critics of UG and its awkward unlearnability/unevolvability: (i) offer a simpler alternative to UG that predicts and explains the data (capacity to learn to generate all and only unstarred utterances in any human language) at least as well as UG does, yet is learnable/evolvable or (ii) show that UG is learnable/evolvable after all. Anything else is just nondemonstrative skepticism.

(10) Poverty of the stimulus in word learning (!)

This is not UG's empirical domain, hence not its strength either. Vocabulary learning is part of cognitive development, and continuous with sensorimotor category learning. No evidence for poverty of the stimulus there -- apart from certain "prepared" parts of form perception and locomotion that were shaped by our mammalian sensorimotor and ecological history. As a working hypothesis, it is much more prudent to assume that both sensorimotor categories and the words that label them are learned and learnable for the most part. No poverty of the stimulus problem here. Plenty of positive and negative instances, trial and error, and feedback, and time.

The 8000-word vocabulary of the 6-year-old ("5 words a day") is a piece of cake, considering the number of thing-sorting and corrective-feedback days and hours there are as of birth and the number of things the child has already learned to categorize before even giving them a name.
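A back-of-the-envelope calculation gives a feel for the days and hours available. The vocabulary size and the age come from the text; the 12 waking hours per day is purely an assumption for illustration, and leap days are ignored.

```python
VOCABULARY = 8000          # words, from the text
YEARS = 6                  # age in years, from the text
WAKING_HOURS_PER_DAY = 12  # assumption for illustration only

days = YEARS * 365                       # leap days ignored
words_per_day = VOCABULARY / days        # average counted from birth
waking_hours_per_word = days * WAKING_HOURS_PER_DAY / VOCABULARY

print(f"{days} days since birth")
print(f"{words_per_day:.1f} words per day on average")
print(f"{waking_hours_per_word:.1f} waking hours per new word")
```

Counted over every day from birth, the required average comes out lower than the quoted five words a day, with several waking hours available per new word under the stated assumption.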

Quine's "indeterminacy of radical translation." No such thing: Just what computational learning theorists call the "credit/blame assignment problem" and developmental psychologists call the problem of "over- and under-generalization of categories." A problem, but solvable, solved, learnable, and, though underdetermined, not radically underdetermined (nor indeterminate).

Jerry Fodor on the "Big Bang Theory of the Origin of Word Meanings" (by analogy with the "Big Bang Theory of the Origin of UG"): The less said about that the better: http://cogprints.ecs.soton.ac.uk/archive/00001624/index.html

(Pinker's use of past-tense learning to illustrate poverty of the stimulus and unlearnability was also a singularly bad choice, because there is no poverty of the stimulus there, and the past tense is learnable from the available data, by machines and nets as well as by people.)

(11) Genetics: Like the question of the brain implementation of UG, the question of the genetic coding of UG is a red herring. It is not incumbent on a theorist who is trying to explain the rules that give us our star/non-star capacity to also explain how they are encoded in either our brains or our genes. The rules (UG) are quite enough, thank you!

On the other hand, there is still the niggling question of the evolvability of UG, for the same things that make it unlearnable surely also make it unevolvable. And there is also still the problem of explaining what, if anything, is the adaptive advantage of UG over *UG!

(12) Elman's learning systems learn the learnable, from sufficient data. They simply bypass the question of UG, because what they learn is not posited to be unlearnable! (Other red herrings: individual differences, brain plasticity, "overlearned capacities," etc. And Fodor's criteria for "modularity" are as arbitrary as they come; the very concept of "modularity" is probably incoherent, as no brain system or capacity is causally isolated from all the rest. Modularity can only be an approximation.)

(13) Chomsky seems ambivalent about evolution. Jackendoff quotes him as saying, on the one hand, that:

"There is surely no reason today for taking seriously a position that attributes a complex human achievement entirely to months (or at most years) of experience, rather than millions of years of evolution..."

Fair enough, but then what about:

"Universal grammar on his conception is so tightly organized that its incremental development through natural selection looks on the face of it unlikely."

He still prefers:

"principles of neural organization that may be even more deeply grounded in physical law."

In other words, the "Big-Bang Theory of the Origin of UG..."
http://cogprints.ecs.soton.ac.uk/archive/00000863/index.html

(14) Other red herrings: animal language, precursors, deficits, savant skills, specific language impairment.

(15) Do adult 2nd-language-learners "use" UG? (Obviously not if they learn a *UG language!)

(16) The "overwhelming case for some degree of biological specialization for language learning in humans" is equivocal: Yes, there is biological evidence, but it has no specific bearing on UG!

The verdict on UG? It still is the only game in town -- and has been since the 1950s, when Chomsky invented the game (a feat comparable to inventing a whole new branch of mathematics, but more like discovering a whole new empirical vein in science). Poverty of the stimulus (and probably unevolvability) come with the territory. When there is no (non-question-begging) alternative in sight, there is no need to resort to f-words to describe the status of the UG hypothesis: It is taken to be valid until/unless it is unseated by contrary data or a rival hypothesis.

On the permeability of syntax to semantics, nolo contendere; on the star/non-star methodology for theory-construction and theory-testing (explicit induction from apparent regularities to starred utterances and whatever would rule them out) likewise.

And even should this remarkable edifice (built mainly by one man) one day be overturned, it will always remain true of it that "se non è vero, è ben trovato!"