Intro. to Syntax Lecture Notes


G61.1310 Introductory Syntax Lectures - Mark R. Baltin

Lecture #1- Preliminaries

This course is about syntax: the principles by which words combine to form sentences. The study of syntax tries to answer two main questions:

(i) what are the principles particular to syntax for a particular language?

(ii) what are the principles of syntax for any human language?

The study of syntax is a branch of the field of linguistics, which has as its main goal a characterization of human language. As such, linguistics can be distinguished from the field of semiotics, which studies the properties of symbolic systems in general. For example, the system of traffic lights in the U.S. is a sort of symbolic system. There are essentially three symbols: red light, yellow light, and green light. We call these three states of a traffic light symbols because each condition symbolizes a different meaning---a red light signals that the approaching traveler is to stop, and not cross the intersection, a yellow light indicates that the approaching traveler is to stop before reaching the intersection because the light is about to turn red, and a green light indicates that the approaching traveler is free to cross the intersection.

We can say that the system of traffic lights has a grammar, which can be defined as a specification of the possible expressions in the symbolic system, together with a pairing of expressions with meanings. In this case, the grammar of traffic lights has three expressions, and three pairings with meanings.

In this case, the grammar of traffic lights is extremely simple. It can be presented as follows:

(1) [[Green]] -----> Go

[[Yellow]] -----> Stop if before the intersection

[[Red]] -----> Stop

The pairing in (1) is a specific type of relation in mathematics, known as a function. A function is a pairing of elements from two sets, such that each element in the first set is paired with no more than one element in the second set. If the first set has additional elements that are not paired with any elements in the second set, the function is said to be a partial function. If every element in the first set is paired, the function is said to be a total function. Let us assume that the function that pairs the expressions of a natural language, such as English, Chinese, Welsh, Papago, etc., with their meanings is a total function.
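As a minimal sketch, the pairing in (1) can be written as a Python dictionary, which by construction pairs each key with no more than one value. The names TRAFFIC_GRAMMAR and is_total, and the extra "blue" expression, are hypothetical and introduced only for illustration.

```python
# Hypothetical encoding of the pairing in (1): expression -> meaning.
# A dict maps each key to at most one value, as a function requires.
TRAFFIC_GRAMMAR = {
    "green": "go",
    "yellow": "stop if before the intersection",
    "red": "stop",
}


def is_total(pairing, expressions):
    """Return True if every expression in the first set is paired with a meaning."""
    return all(expression in pairing for expression in expressions)


expressions = {"green", "yellow", "red"}

# Every expression of traffic-lightese is paired: the function is total.
print(is_total(TRAFFIC_GRAMMAR, expressions))

# With an unpaired expression in the first set (an invented "blue" light),
# the same pairing is only a partial function.
print(is_total(TRAFFIC_GRAMMAR, expressions | {"blue"}))
```

On this encoding, a partial function is simply one for which is_total returns False.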

Question: What would it mean for the function that pairs the expressions of, e.g., English with their meanings to be a partial function?

I. Can the Grammar of English Be Described as Easily as the Grammar of Traffic Lights?

The grammar of traffic lights has some noteworthy features that are useful to consider when thinking about what counts as a human language. For one thing, you can count up the number of sentences that the grammar of traffic lights allows. There are three. For another, we cannot really say that the grammar of traffic lights has a syntax. It has a list of "words" (red, green, and yellow), but each of these words comprises a complete expression in "traffic-lightese", and none of the expressions can be combined with any other expressions.

To introduce some jargon, we would say that (i) traffic-lightese is finite, and that (ii) the sentences of traffic-lightese are bounded in length. When we say that a language is finite, we mean that there is a fixed number of expressions in the language. When we say that the sentences of the language are bounded in length, we mean that we can precisely define how long the sentences of the language can be.

And to complete the circle, you can see what we mean by the term language. A language is simply the set of sentences that a grammar generates.

To learn traffic-lightese (necessary, possibly, although that's unclear if you live in New York City), you simply had to memorize the "sentences" of traffic-lightese and learn the function given in (1) that pairs each sentence with its meaning. Could you memorize the individual sentences of English the way that you memorized the sentences of traffic-lightese?

Consider the following strings, modeled on lines from Lewis Carroll's poem "Jabberwocky":

(2) The blithy toves did gyre and gimble.

(3) The blithy toves karulized elatically.

Sequences of elements in a language are called strings. The reaction of native speakers of English to the strings in (2) and (3) is rather interesting. The strings are recognized as being English-like, so that these strings are felt to be sentences of English, even though the words in these "sentences" have never been encountered before. We have a feeling that "toves" is a plural (i.e. a form denoting more than one) of "tove", which is what we learned in school as a noun (we will soon learn what the basis of notions like noun and verb is). Furthermore, even though we have never encountered the "words" "gyre", "gimble", and "karulized", we perceive them to be verbs, and we perceive "karulized" to be the past tense of "karulize". Finally, we recognize "elatically" as an adverb.

The way in which we deal with strings such as (2) and (3) illustrates an important difference between English and other languages that are termed natural languages, on the one hand, and traffic-lightese, on the other. That difference has been termed linguistic creativity: the ability to produce and understand strings of a language that have never been previously encountered. Traffic-lightese is a language with a fixed number of sentences, what is termed a finite language. Natural languages such as English, on the other hand, are infinite languages, in the sense that there are an infinite number of sentences in each natural language. What is the source of this infinity?

Well, for one thing, the words of English seem to be grouped into classes, so that we can recognize new words coming in as members of these classes. Unlike traffic-lightese, in which there are a fixed number of words (three, to be precise), natural languages have an unlimited number of words that simply have to be fitted into word-classes. The traditional grammar term for a word-class is part-of-speech. We term a word-class a grammatical category. We will soon be examining the basis for the notion of a grammatical category, and contrasting two views of grammatical categories: the notional view, in which each grammatical category has a particular meaning, and the distributional view, in which each grammatical category has a unique distribution. The Jabberwocky example already bears on the comparison of these two views. Let’s see why.

The reason that the Jabberwocky example is so striking is that the words are nonsense words. We’ve never encountered them before, so we can’t possibly know what they mean. Nevertheless, we feel that (2) and (3) are English sentences with unfamiliar words. The basis for our feeling is that the words are in the right places for words of the appropriate word-classes (let’s call them grammatical categories from now on). To see this more carefully, let’s systematically deform, for example, (2), and see if, at each stage of the deformation, we still have the feeling that the string of words is an English sentence.

Let’s start by removing the [-s] from the example in (2), and see if the [-s]’s removal changes our perceptions of the status of the string:

(2)’ The blithy tove did gyre and gimble.

(2)’ does seem to be English: it’s talking about a single tove, who performed a compound action in the past of gyring and gimbling. Now, let us remove “did”:

?(2)’’ The blithy tove gyre and gimble.

This has a somewhat shakier status as an English sentence, and the sense that I have gotten in the past, when I’ve performed this experiment in classes, is that speakers of English are split. Some people find this sentence to be non-English, while others find it to be English if “tove” is taken to be an irregular plural of some sort, like “children” or “cattle”. Removing “the” makes the string still harder to recognize as English:

?(2)’’’ Blithy tove gyre and gimble.

Finally, removing “and” causes the sequence of words to be felt by all speakers as being simply a string of words, with the character of a list:

*(2)’’’’ Blithy tove gyre gimble.

An asterisk before a set of words is taken by convention to mean that the sequence of words is an ungrammatical sentence.

Let us stop for a minute and think about how we dealt with this example. We couldn’t have known the words. Rather, we took some words that we knew (and parts of words, such as the [-s]), and figured out details about the unfamiliar words from how they were positioned with respect to the familiar parts of English. In this sense, the distributional account of what grammatical categories are seems to fare better than the notional account. We had to be figuring out what kind of structure to assign the string based on the sequencing of its parts, looking at the unfamiliar parts and seeing where they were relative to the familiar ones.
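The reasoning above can be sketched as a toy procedure that guesses the category of an unfamiliar word purely from its position relative to familiar words. The category labels, the KNOWN table, and the four positional rules are assumptions made for illustration; they are not a proposal about the actual grammar of English.

```python
# Familiar words and their categories (a deliberately tiny, hypothetical lexicon).
KNOWN = {"the": "Det", "did": "Aux", "and": "Conj"}


def guess_categories(words):
    """Guess a category for each word from its position alone.

    Unfamiliar words get a category only from the familiar material around
    them; "?" means that no positional cue was available.
    """
    tags = []
    for i, word in enumerate(words):
        word = word.lower()
        if word in KNOWN:
            tags.append(KNOWN[word])
            continue
        prev = tags[i - 1] if i > 0 else None
        nxt = words[i + 1].lower() if i + 1 < len(words) else None
        if prev == "Det":
            # After a determiner: a modifier if another unfamiliar word follows,
            # otherwise the noun itself.
            tags.append("Adj" if nxt is not None and nxt not in KNOWN else "N")
        elif prev == "Adj":
            tags.append("N")
        elif prev == "Aux":
            tags.append("V")
        elif prev == "Conj" and i >= 2:
            # A conjunction joins like with like: copy the category before it.
            tags.append(tags[i - 2])
        else:
            tags.append("?")
    return tags


print(guess_categories("The blithy toves did gyre and gimble".split()))
print(guess_categories("Blithy tove gyre gimble".split()))
```

With the familiar words in place, every nonsense word receives a category; once they are all stripped away, as in the final list-like string, the procedure has nothing to go on.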

It is important to see what we’ve just done. We’ve taken two a priori plausible views of what a grammatical category is, and we’ve tested them by seeing what predictions they each make about phenomena in the part of the world that we’re investigating (i.e., sentences).

In any event, we’ve seen one reason for the “open-endedness” of a language such as English, as opposed to traffic-lightese, and that is the fact that natural languages (human languages, for our purposes) have a syntax: a set of rules for arranging elements into more complex units of language (i.e., words into sentences). Traffic-lightese does not have these principles. Every word is a complete sentence, and there are no principles for stringing words together to form more complex sentences.

As we’ll see very shortly, there are two ways in which natural languages are infinite, meaning that there’s an infinity of sentences in the language. We have seen the first way, in which sentences are said to be made up of members of grammatical categories, and new words can enter the language to instantiate these grammatical categories.

A second way in which natural languages are infinite is that, as opposed to traffic-lightese, in which you can specify the length of each sentence (because each sentence is composed of one word and there are no procedures in traffic-lightese for combining words), there is no specifiable bound on the length of a sentence in a natural language. To see this, consider the following:

(4) a. The teacher left.

b. The teacher’s mother left.

c. The teacher’s mother’s friend left.

d. The teacher’s mother’s friend’s sister left.

e. The teacher’s mother’s friend’s sister’s boss left.

f. The teacher’s mother’s friend’s sister’s boss’s mother left.

I could have kept going with this type of example, and the sentence would have gotten continually longer. In English, as indeed in all natural languages, the grammar must contain methods or devices to create sentences of any length. Obviously, for a sentence to be a sentence of a language, it must stop at some point, but the grammar of English must allow that point to come after any conceivable number of words. In technical parlance, the grammar of English must generate an infinite language.
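The pattern in (4) can be sketched as a small procedure that builds a sentence from a list of nouns of any length. The function name possessive_sentence is hypothetical, and the sketch covers only this one possessive pattern, not English in general.

```python
def possessive_sentence(nouns):
    """Build "The N1's N2's ... Nk left." from a non-empty list of nouns."""
    *possessors, head = nouns
    chain = "".join(f"{noun}'s " for noun in possessors)
    return f"The {chain}{head} left."


nouns = ["teacher", "mother", "friend", "sister", "boss", "mother"]

# Reproduce (4a) through (4f): each step adds one more possessor.
for k in range(1, len(nouns) + 1):
    print(possessive_sentence(nouns[:k]))
```

Nothing in the procedure fixes a maximum length: any list of nouns, however long, yields a string of the same shape, although each individual string is finite.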

Competence and Performance

At this point, we must step back for a minute and consider what we are trying to account for. We have been trying to account for what it means to know English (just as an example; we could have picked any language to try to account for). In order to get our data for English, we have relied on the intuitions of speakers of English, that is, how speakers of English feel about the strings that are presented to them. However, it seems that we cannot go directly from our intuitions about English to inferences about whether particular strings are in the language. The reason is that the properties of particular strings may be due to factors that are not, properly speaking, part of the language at all. For example, suppose we had continued to elaborate (4)(f) by continuing to add [’s] plus a noun, as in (4)(f)’:

(4)(f)’ The teacher’s mother’s friend’s sister’s boss’s mother’s cousin’s sister’s doctor’s father’s neighbor’s daughter’s friend’s teacher’s cousin’s niece’s accountant left.

This string would be felt to be unacceptable, but not, it is usually thought, because of our knowledge of English. To understand a string such as (4)(f)’, and to make sense of it, we have to integrate what we know about the individual words with a structure for the whole sentence, and a run-on sentence such as this taxes our ability to remember everything that has come before when we get to the end of the sentence. There are a number of studies of memory, and one thing that we know about human memory is that it is limited (there’s a classic paper by the psychologist George Miller, entitled “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information”, Psychological Review (1956), that proposes a specific bound on short-term memory across a wide variety of perceptual domains).

In any event, if what is wrong with (4)(f)’ is due to a memory problem in understanding the whole sentence, this problem would not be felt to be a problem with the English of the string, but rather with the fact that people put their knowledge of English to use by employing the rest of their mental resources. In other words, our knowledge of English is embedded in the rest of our capacities, such as memory, limitations on articulation (the fact that our vocal tract can do some things but not others), etc.

Chomsky, in Aspects of the Theory of Syntax (MIT Press, 1965), made a distinction between what he calls competence and performance. Competence is our knowledge of language, and performance comprises the mechanisms by which our knowledge of language is put to use. As linguists, particularly as syntacticians, we are interested in specifying competence in a particular language, rather than performance. However, when we try to determine what constitutes knowledge of, e.g., English, the raw data that we start with is people’s intuitions about the language, and we don’t know what status to ascribe to those intuitions. When somebody says that a string “sounds funny”, is it because of a property of English (competence), or because of some factor other than language (performance)?

There are really two parts to deciding to relegate the factor to the domain of performance:

Rationale for Performance Explanation

i) Putting it into competence would cause competence to have complicated restrictions that are seemingly arbitrary from the point of view of competence.

ii) One can come up with a performance account that is plausible within the domain of performance.

We’ve seen the two parts of the argument for using performance as the account already in the discussion of (4)(f)’. Putting a limit on the length of an English sentence would complicate our account of English, and we would have to explain why the limit exists within our description of English. Furthermore, the explanation of the restriction in terms of a limitation on memory is a natural one. It has to be the case that we have memory limitations. Try remembering a sequence of 20 digits.

There are two other cases of strings that would seem to be unacceptable for performance reasons. One has to do with (2)’’, repeated here:

?(2)’’ The blithy tove gyre and gimble.

Recall that this string was not considered to be word-salad, but was not as acceptable as (2). It was intermediate in acceptability, and its acceptability hinged upon whether “tove” was taken to be an irregular plural. It is instructive to consider this intermediate unacceptability further. Obviously, English contains irregular plurals, and we are free to coin new lexical items. This has happened several times in the history of English, as it surely has in all languages. However, we don’t tend to assume that a new word is irregular in what is called its morphology (the distribution of meaningful forms). Let us state a principle like the following:

3) Assume initially that a novel form is regular.

The question is: What status does a statement like “Assume initially…” have within a grammar? Grammars tend to rule things in or out. (2)’’, however, gets better if we decide that our initial assumption was wrong.

It seems plausible to take an example such as (2)’’ to be due to the application of what the psychologist Thomas Bever has called a perceptual strategy (T.G. Bever (1970), “The Cognitive Basis for Linguistic Structures”, in J.R. Hayes, ed., Cognition and the Development of Language, Wiley & Sons). A perceptual strategy is a sort of heuristic that hearers develop that enables them to assign a structure to, and hence to understand, a sequence of elements that they are encountering. Bever illustrated the application of a perceptual strategy with a now-famous example. Consider a non-nonsense string such as (4):

4) The horse raced past the barn fell.

Does this sound like an English sentence? Most people would say that it doesn’t; it sounds like a main clause, “The horse raced past the barn”, but then we have no way of integrating the word “fell”. This analysis of the string (called parsing, the assignment of a structure to a string) relies on analyzing “raced” as the past tense of “race”.

However, the past tense of race is homophonous with another form of race, called a participle form of race. The participle form of race is shown in examples such as (5):

5) ?? [1]The horse was raced past the barn.

Keep in mind the two uses of the word raced- the past tense form and the participle form. Now, let us alter (4) by substituting the word driven for the word raced:

6) The horse driven past the barn fell.

This sentence is perfectly acceptable, as is its paraphrase (7):

(7) The horse which was driven past the barn fell.

In (7), the sequence “which was driven past the barn” is an instance of what is known as a relative clause; strictly speaking, a restrictive relative clause. Restrictive relative clauses have the function of limiting the class of objects to which the noun that precedes the relative clause can refer. When a restrictive relative clause begins with a word like “who” or “which” (known as wh-words, which we’ll talk more about later on) and is followed by a form of the verb “be”, there is under most conditions a synonymous sentence that simply omits the wh-word plus “be”. This construction is known as the reduced relative construction. Further examples:

7) a. The girl who was sitting on the stoop was studying for her finals.

b. The girl sitting on the stoop was studying for her finals.

8) a. People who are angry about this issue should write their elected representatives.

b. People angry about this issue should write their elected representatives.

If we decide that (4) is ungrammatical, the question that we would ask is why. Why is there no reduced relative counterpart to (9), which is perfectly acceptable?

9) The horse which was raced past the barn fell.

We have a textbook example here of the Rationale for Performance Explanation. Obviously, (4) is unacceptable as a paraphrase of (9) because the word “raced” is taken to be the verb of a main clause. We could of course say that (4) is ungrammatical. However, in considering the implications of this decision, we would be saying that the reduced relative clause construction does not occur in English when the first word of the reduced relative clause would create a sequence that is homophonous with a simple main clause. This restriction is mysterious from the standpoint of grammar, but is explained naturally from the vantage point of perception, given that we, as speakers of English, must have a psychological mechanism for understanding sentences as they come in. In a sense, placing the restriction within the grammar of English would make the restriction look bizarre; the restriction as an instance of what Bever calls a perceptual strategy is quite natural.

All of this is intended as a cautionary note that, paradoxically, the raw data for syntactic analysis is speech, which is an instance of performance, but what we are trying to construct is a model of competence, reflected psychologically as our knowledge of language (as opposed to performance, which is how that knowledge is put to use on particular occasions).

As a historical note, the competence-performance distinction that Chomsky makes is a quite traditional one within linguistics, but under different names for these concepts: the late Swiss linguist Ferdinand de Saussure coined the terms langue (for ‘language’) and parole (‘speech’).

The Role of Formalism

As linguists, we are trying to mimic the task of children in learning their native languages. It is uncontroversial that children learn the rules of their languages without explicit instruction, as can be seen by, e.g. the innovations and over-regularizations (making forms regular that are irregular in the adult language, such as goed and buyed as the past tenses of go and buy respectively).
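Over-regularization can be sketched as the difference between a rule applied across the board and a rule that is blocked by memorized exceptions. The names regular_past and adult_past, the two-entry exception table, and the simplified spelling rule are all assumptions made for illustration.

```python
# Memorized exceptions in the adult language (a tiny, hypothetical sample).
IRREGULAR_PAST = {"go": "went", "buy": "bought"}


def regular_past(verb):
    """Apply the regular past-tense rule (spelling simplified to -d / -ed)."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def adult_past(verb):
    """Use a memorized irregular form if there is one, else the regular rule."""
    return IRREGULAR_PAST.get(verb, regular_past(verb))


for verb in ["go", "buy", "walk", "karulize"]:
    # The first form is what the rule alone produces; the second is the adult form.
    print(verb, regular_past(verb), adult_past(verb))
```

The rule on its own yields "goed" and "buyed", the forms that children produce, and it extends without instruction to a nonsense verb such as "karulize".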

It is clear that children are constructing a grammar, but what form does this grammar take? It cannot be expressed in, e.g., English, because they don’t know English yet. To borrow a term from the philosophy of language, we would say that English in this case is the object language, the language being described, while the rules of English are being formulated by children in what is known as a meta-language, a language that is outside of the language being described.

A good deal of what we will be doing will involve discovering the nature of this meta-language. We will be posing hypotheses about the rules of grammar, and the way that they interact, by viewing the grammar as what is known as a formal system, a system in which all of the concepts have a precise definition. It is by making this assumption that we can make testable predictions about the grammar. Additionally, formalism has the advantage of ensuring that the terms that are used in an account have the same meaning to all parties.

Criteria of Adequacy for a Grammar

There is an old saying, “If you stand for nothing, you’ll fall for everything”. How do you decide whether the set of rules that you’ve proposed for some set of sentences in a language is the right one, or the best one, given that there are an infinite set of possible grammatical descriptions?

In formulating a grammar, or a set of rules that we assume models the language user’s knowledge of language, we must first decide how we are going to evaluate the grammar that we propose. Chomsky (1965) proposed three criteria of adequacy for a grammatical description, which he dubbed:

i) observational adequacy;

ii) descriptive adequacy;

iii) explanatory adequacy.

I will now discuss these concepts.

A. Observational Adequacy

Linguists who formulate grammars of languages that are not their own, and who work with native speakers of those languages, are called field linguists. They typically start by finding out the words for various concepts in the target language, and eventually ask the speaker if (s)he can put the words together in this way or that way to form an acceptable sentence in the language. After collecting the responses for some time period (say, an hour and a half), the field linguist leaves and analyzes the responses, trying to figure out the rules that generate all of the acceptable strings and none of the unacceptable strings. A set of rules, or grammar, that achieves this, is said to be observationally adequate. Hence, observational adequacy can be defined as follows:

Observational adequacy: the ability of a grammar to generate all and only the grammatical sentences of a language in a fixed body of data (called a corpus).

B. Descriptive Adequacy

As we saw from the Jabberwocky example, natural languages are infinite, and hence a grammar of a natural language must be able to generate an infinite number of sentences. To take the example of the field linguist above, after the field linguist has formulated a grammar that is observationally adequate, he tests the grammar against the native speaker’s intuitions by asking the native speaker if some further set of sentences that are not in the original corpus are acceptable sentences in the language.

If a grammar generates all and only the set of grammatical sentences in the language, it is said to be descriptively adequate.

Descriptive adequacy: the ability of a grammar to generate all and only the grammatical sentences of the language.
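The "all and only" requirement can be sketched as a check of a candidate grammar against recorded judgments. The toy grammar below (a regular expression for strings of the form "The N('s N)* left."), the function names, and the three-item corpus are all hypothetical. The sketch covers only the generation of strings, not their meanings.

```python
import re

# Toy candidate grammar (an assumption for illustration): "The N('s N)* left."
NOUN = r"(?:teacher|mother|friend|sister|boss)"
PATTERN = re.compile(rf"The {NOUN}(?:'s {NOUN})* left\.")


def generates(sentence):
    """Return True if the toy grammar generates the string."""
    return PATTERN.fullmatch(sentence) is not None


def observationally_adequate(generates, corpus):
    """Check "all and only": the grammar must accept every string judged
    grammatical and reject every string judged ungrammatical.

    corpus is a list of (string, judged_grammatical) pairs.
    """
    return all(generates(sentence) == judged for sentence, judged in corpus)


corpus = [
    ("The teacher left.", True),
    ("The teacher's mother left.", True),
    ("Teacher the left.", False),
]

print(observationally_adequate(generates, corpus))
```

A finite check of this kind can only establish observational adequacy for the corpus at hand. Descriptive adequacy concerns the whole language, so in practice it is probed by adding new judged strings to the corpus and running the same check again.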

However, we are not only interested in generating the right set of strings. Remember, what we are really interested in modeling is the full set of abilities that native speakers have, and one of those abilities is the ability to recognize the meanings of sentences. For example, we know that The cat is on the mat does not mean that John saw Mary. We therefore have to build in this ability as well. A traditional way of describing a grammar is as an infinite set of pairings of form and meaning. Let us therefore revise our definitions of observational and descriptive adequacy as follows:

Observational adequacy (Final Version): the ability of a grammar to generate all and only the grammatical sentences of a language in a fixed body of data (called a corpus), and to pair each grammatical sentence with its meaning.

Descriptive adequacy (Final Version): the ability of a grammar to generate all and only the grammatical sentences of the language, and to pair each grammatical sentence with its meaning.

C. Explanatory Adequacy

The third requirement is not, strictly speaking, a requirement on grammars, but, rather, a requirement on the account that underlies the construction of a particular grammar, i.e. an account of what a possible grammar of a human language can be. This needs a little more explanation.

When we formulate a grammar, we must have, at some level, a set of assumptions as to what a possible grammar can be- there are certain possibilities for rules that don’t even occur to us. We therefore have, if only implicitly, a theory of possible grammars.

It is commonplace to view the task of a linguist, in discovering a descriptively adequate grammar of a language, as being identical to the task of a child, who is trying to discover the descriptively adequate (adult) grammar of the language of her or his community. Because linguists are trying to model the abilities of native speakers, one of their goals is to try to formulate this theory, called a theory of universal grammar, as well as the grammars of particular languages. We would therefore say that the relation of the theory of grammar to grammars of particular languages could be described as follows:

10) Theory of Grammar (Universal Grammar) = {G1, …, Gn}

In other words, a theory of grammar is a specification of the possible grammars, which we’re calling G1 through Gn (instead of French, English, Ewe, Chinese, etc.). Now, a theory of grammar that is the correct account of what a possible grammar of a natural language is should predict only the set of actually possible grammars, and should not predict that some grammar is the grammar of a natural language that is never realized in fact. In other words, a theory of grammar should not over-predict.

There is a distinction between “actual grammars of human languages” and “possible grammars of human languages”. As Chomsky & Halle put it in the preface to their classic work in phonology, The Sound Pattern of English (Chomsky & Halle (1968)), “If a nuclear explosion were to wipe out everybody on earth except for the inhabitants of Tanzania, we would not want to say that pitch is a linguistic universal.” (the assumption being that the language spoken in Tanzania is what is known as a pitch-accent, or tone, language). External circumstances would cause only one language to be spoken in the world, the speakers of all of the others having been wiped out, but the speakers of that language would have the capacity to learn other languages that are not pitch-accent languages.

Remember, in constructing an account of what a possible human language is, we are modeling actual human capacities, and the assumption is that humans will only consider a certain set of grammars as possible grammars of human languages. Explanatory adequacy, therefore, can be defined as a requirement that a theory of grammar only allow for the possible grammars of human languages.

Explanatory Adequacy = the ability of a theory of grammar to predict only the set of possible grammars of human languages.


The goal of the study of syntax is two-fold:

i) characterization of what it means to know a particular language (i.e. formulation of descriptively adequate grammars of specific languages);

ii) determination of the range of possible grammars of human languages.

You can’t do one in the absence of the other. Rather, syntacticians are always working heuristically (i.e., back-and-forth) between the two accounts, universal grammar and particular grammar.

Lecture #2- Parts of Speech

We can think of our task, in characterizing the syntax of English, as the task of answering the following question:

1) S=?

with S being the set of English sentences. A more usual way of expressing the question is by using an arrow, so that (1) would be expressed as (2):

2) S --> ?

The difference is that the arrow expresses the idea that sentences are structured, so that the arrow means roughly “consists of”.

As we saw last time, knowing a language must be more than knowing a fixed number of sentences, since we have what is known as linguistic creativity- the ability to create and recognize new forms all of the time. We must therefore have some rules to fit the new forms into patterns that we already recognize. The Jabberwocky example last time illustrated this point. While the particular words were new, we were able to recognize the stringing together (technically known as concatenation) of them, because we were able to fit the concatenation into a pattern, and the environment of each word enabled us to recognize it as part of a word-class.

So, we can say that part of what allows us to see particular sequences of words as sentences, and not other sequences, is that we state the rules for stringing together words not in terms of particular words, but word-classes, known traditionally as parts of speech, or grammatical categories.

As I mentioned briefly last time, there are two views of the basis of grammatical categories: the notional and the distributional. The notional view, which comes from traditional grammar, holds that each grammatical category is distinguished from the others by a particular meaning, so that each grammatical category is associated with a unique meaning. This is the basis for the view that a noun is the name of a person, place, or thing, that a verb denotes an action, etc.

If we consider this view, we quickly find that there are nouns that do not fit this criterion. For example, what about nouns such as destruction or attempt (known as nominalizations, in that they are nouns that are thought to be derived morphologically from verbs)?

1) Rome’s destruction of Carthage.

2) John’s attempt to frame Susan.

Also interesting in this connection is a grammatical construction that was dubbed the light verb construction by the late Danish grammarian Otto Jespersen. This construction is exemplified by a sentence-type that has the appearance of containing a transitive verb; however, the verb does not seem to carry any meaning, and the main “predicate” of the sentence is carried by the noun. English shows it in the “make the claim” construction. Note that (3) and (4) are synonymous.

3) John made the claim that he was descended from Thomas Jefferson.

4) John claimed that he was descended from Thomas Jefferson.

Japanese shows it as well (Grimshaw & Mester (1988)). An example is (5) (their (2)(a)):

(5) John-wa Bill-to AISEKI-o shita.

John-Top Bill-with table-sharing-Acc suru-Past.

John shared a table with Bill.

AISEKI (table-sharing) is a noun, and has the characteristics of nouns. For example, it takes a particle known as a Case-marking particle, Case being a flag, roughly, of the grammatical or semantic function of a noun. Semantically, however, AISEKI (capitalized here simply for typographical emphasis) functions as the main predicate in the sentence, as can be seen from the translation, and shita , a form of the element suru, has all of the characteristics of a verb in Japanese-for example, the fact that it is inflected for Tense.

If we assume that parts of speech are defined in terms of meaning, we assume, as a null hypothesis, that this form-meaning correspondence is universal (i.e., holding across all languages). The reason is that it is hard to see how principles of form-meaning correspondence that differed from language to language, and that would have to be very abstract, could readily be learned by children who are acquiring their first language. Therefore, we would assume that Japanese and English would have the same principles relating parts of speech to meaning.

The light verb construction is interesting because it highlights a mismatch between form and meaning. The object noun is the main predicate in the sentence, and the verb, which is usually thought to be the main predicate (usually described as “denoting action”-more on this below), does not seem to have any semantic content at all; rather, it seems to function to carry Tense and other information which is exclusively associated with verbs, and which every sentence needs. The verb is called a “light verb” because it is semantically light, carrying little, if any, meaning.

Verbs, as I mentioned above, are often said to “denote actions”. However, what do we then make of the underlined elements here (endured, underwent, and suffered), all of which are thought to be verbs, and none of which seems to describe an action by the subject?

5) a. Germany endured a crushing defeat.

b. Jones underwent surgery.

c. Bill suffered a fatal blow to the head.

The underlined elements above all have an understood beginning and end, a characteristic of actions, but they do not denote the volition, or free will, on the part of the subject that is characteristic of actions. For example, the subject of run chooses to run, while the subject of each of the above elements doesn’t choose to perform the action that the element denotes. You don’t normally choose to endure something, or suffer something, and while you can choose to undergo surgery (if it’s elective), you can also undergo surgery that is totally involuntary.

In fact, the philosopher Zeno Vendler came up with a classification for verbs in 1967 (Zeno Vendler, Linguistics in Philosophy, Cornell University Press). He classified verbs into accomplishments, achievements, activities, and states. Accomplishments are actions that have a definite result, which the subject intends to bring about. An example is (6):

6) John built a house.

Achievements are events that have a definite result, but the subject does not intend to bring this about, such as dying or being born:

7) John died. ( a rather dubious achievement, but an achievement nevertheless!)

Activities are actions that do not necessarily have a result, so that the notion of success is not an integral part of the felicitous use of the verb. An example is walking:

8) John walked.

States are timeless, and, unlike the other three semantic types, don’t have a beginning and end. Examples:

9) a. John knows French.

b. John understands this point.

For our purposes, we see a real heterogeneity in Vendler’s classification, such that it is difficult if not impossible to pick out a single aspect of meaning that unites all four types of verbs.

If we drop the idea that we classify words into parts of speech based on meaning, we are left with a distributional basis for parts of speech, and this was the idea of the structuralists, who essentially founded American descriptive linguistics (Leonard Bloomfield (1933), Language, Holt, Rinehart, & Winston; Zellig Harris (1951), Structural Linguistics, University of Chicago Press). In this view, elements are classified into word classes on the basis of the environments in which they can appear in sentences. To illustrate, consider the class of elements that can appear in the slots in (10) , (11), and (12). Think of ten elements that can appear in (10), and ten elements that cannot.

10) _____interest me.

11) I talked about ___.

12) John likes ___.

Restricting our attention to single words, ten elements that can appear in the underlined slot in (10) are: ideas, people, cards, sheep, pencils, lectures, classes, dogs, journals, computers. Ten elements that cannot appear there are: at, laugh, from, angry, yellow, grow, because, incidentally, never, not. We set up classes, then, of elements that can appear in a large number of identical environments, and which are said (to introduce a somewhat technical term) to be mutually substitutable. As a matter of historical accident, we call these classes nouns, verbs, adjectives, etc., but they could have been called anything. In any event, a grammatical category is defined as follows:

13) A grammatical category = a class of elements whose members are mutually substitutable (i.e., interchangeable without any diminution of acceptability of the resulting string) in a sufficiently wide range of environments.
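The definition in (13) can be made concrete with a small sketch. The frames are (10)-(12) from the text; everything else (the judgment table, the names `JUDGMENTS`, `substitutable`, and `classes`, and the threshold `min_shared` standing in for "a sufficiently wide range") is an assumption made purely for illustration, not a claim about how speakers actually compute categories.

```python
# Frames (10)-(12) from the text, referred to by index below.
FRAMES = ["___ interest me", "I talked about ___", "John likes ___"]

# Toy acceptability judgments: the set of frame indices each word can fill.
# These judgments are illustrative assumptions, not survey data.
JUDGMENTS = {
    "ideas": {0, 1, 2},
    "people": {0, 1, 2},
    "dogs": {0, 1, 2},
    "at": set(),
    "laugh": set(),
    "angry": set(),
}

def substitutable(word_a, word_b, min_shared=2):
    """Two words count as mutually substitutable if both are acceptable
    in at least `min_shared` of the same frames."""
    shared = JUDGMENTS[word_a] & JUDGMENTS[word_b]
    return len(shared) >= min_shared

def classes(words, min_shared=2):
    """Greedily group words: a word joins the first group whose first
    member it is substitutable with, otherwise it starts a new group."""
    groups = []
    for word in words:
        for group in groups:
            if substitutable(word, group[0], min_shared):
                group.append(word)
                break
        else:
            groups.append([word])
    return groups

print(classes(list(JUDGMENTS)))
```

With these three frames the noun-like words fall into one class, while at, laugh, and angry are left unclassified singletons: these frames tell us only that they are not in the first class, and other frames would be needed to group them further.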

You will recall that we are trying to model as linguists the linguistic abilities of fluent native speakers of the languages that we are describing, and it is useful in this connection to return to the Jabberwocky example that I discussed in the first lecture. You didn’t know the meanings of the words there, by design, since they were nonsense words, and yet you were able to understand the string The slithy toves did gyre and gimble by virtue of the environments in which the words appeared, and were able to assign the words to grammatical categories (known as parts of speech). The only basis for this assignment had to be by applying the definition of grammatical category given in (13), since there was no other.

The definition in (13) has a caveat, however-namely, the phrase “in a sufficiently wide range of environments”. This is the point at which science becomes art, in that we really have no way of determining in advance which set of environments is the right one to pick out the correct set of grammatical categories. We will now consider the pitfalls of two polar extremes in applying the mutual substitutability criterion- requiring mutual substitutability in all environments, and requiring mutual substitutability in only one environment.

Let us first look at the consequences of requiring mutual substitutability in all environments, so that we are considering definition (14), which we will call Straw Man I, as a replacement for (13):

14) Straw Man I:

Grammatical category= a class of elements whose members are mutually substitutable in all environments.

Let us now consider the following words:

(15) like, put, elapse, dash, grow (meaning become), become, persuade.

These words all have somewhat different distributions, and do not occur in all of the same environments. The word like, for example (in the sense of being fond of), only occurs before a noun:

15) a. John likes pizza.

b. * John likes.

The word put only occurs before a noun and a preposition:

16) a. John put books on tables.

b. *John put books.

c. *John put.

The word elapse cannot occur directly before a noun, and may occur finally in the sentence.

(17) a. Time elapsed.

b. *Time elapsed the day.

The word grow can occur before an adjective, but not a noun:

17) a. John grew despondent.

b. *John grew a lawyer.

The word become can occur before an adjective, or before a noun:

18) a. John became despondent.

b. John became a lawyer.

The word persuade occurs before a noun, and, optionally, a sentence.

19) a. John persuaded Sally.

b. John persuaded Sally that Clinton should be impeached.

The word dash requires a following P, as in (20):

20) a. He dashed into the room.

b. *He dashed.

The point is that all of these words differ from one another in their distributions; none of them can be said to be mutually substitutable in all environments. Can they therefore be members of the same grammatical category if we adopt Straw Man I, which would require that members of the same grammatical category be mutually substitutable in all environments?

Obviously not, so we’d have to set up seven totally distinct parts of speech (call them Thelma, Louise, Bob, Carol, Ted, Alice, Mortimer), so that we would have the following assignment:

(21) a. like is of category Thelma.

b. put “ “ “ Louise.

c. elapse “ “ “ Bob.

d. grow “ “ “ Carol.

e. become “ “ “ Ted.

f. persuade “ “ “ Alice.

g. dash “ “ “ Mortimer.
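The way Straw Man I forces this proliferation of categories can be sketched as follows. The environments are those described in the text for each word; the string encoding of environments and the names `ENVIRONMENTS` and `straw_man_one` are assumptions of the sketch.

```python
from collections import defaultdict

# Environments in which each word may appear, as described in the text:
# "N" = before a noun, "N P" = before a noun and a preposition,
# "#" = sentence-final, "A" = before an adjective, and so on.
ENVIRONMENTS = {
    "like": frozenset({"N"}),
    "put": frozenset({"N P"}),
    "elapse": frozenset({"#"}),
    "grow": frozenset({"A"}),
    "become": frozenset({"A", "N"}),
    "persuade": frozenset({"N", "N S"}),
    "dash": frozenset({"P"}),
}

def straw_man_one(environments):
    """Group words only if their sets of environments are identical,
    i.e. if they are mutually substitutable in ALL environments."""
    groups = defaultdict(list)
    for word, envs in environments.items():
        groups[envs].append(word)
    return list(groups.values())

categories = straw_man_one(ENVIRONMENTS)
print(len(categories), categories)
```

Because no two of the seven words have exactly the same set of environments, every word ends up alone in its own category.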

What’s wrong with having all of these words as separate parts of speech?

Recall that in our last lecture, we talked about three criteria by which we could evaluate proposed grammars: observational adequacy, descriptive adequacy, and explanatory adequacy.

Descriptive adequacy involves a grammar not only generating the right forms, but doing so in such a way as to capture the generalizations that speakers make about their language.

While the seven words above have differences in their distributions, they also have similarities. For example, they all agree with the preceding nouns:

22 a. John likes pizza.

b. People like pizza.

23.a. John puts books on tables.

b. People put books on tables.

24.a. John grows despondent.

b. People grow despondent.

25. a. John becomes despondent.

b. People become despondent.

26. a. John persuades us that Clinton will be impeached.

b. People persuade us that Clinton will be impeached.

27. a. John dashes into rooms.

b. People dash into rooms.

The (a) forms agree with the singular noun John, while the (b) forms agree with the (irregular) plural form people.

A second similarity that all of the words above share can be stated as follows:

28. Every English declarative sentence must contain an element which is capable of agreeing with a noun that is at or near the beginning of the sentence.

When I say “is capable of agreeing”, I mean to say that the form does not always have to show agreement. For example, there is a class of elements that we will discuss soon, known as helping verbs, such as can, could, shall, should, may, might, have, and be, and when these forms occur, the words above do not agree. Example:

29) a. John would become despondent.

b. People would become despondent.

However, in the absence of helping verbs, the words above do show the normal agreement pattern.

There are, then, two similarities that are shared by the seven words given in (21). Suppose we were to specify our syntax of English as in (30):

(30) S --> N  { Thelma N     }
              { Louise N P   }
              { Bob          }
              { Carol A      }
              { Ted {A}      }
              {     {N}      }
              { Alice N (S)  }
              { Mortimer P   }

A word about notation. The symbols “{ “and “} “ are known as curly brackets or braces. Their use signifies that one must choose one of the rows that they enclose. Parentheses (“(“ and “)”) in grammatical descriptions indicate that the element is optional-it can be present but need not.

It is clear that the use of the curly brackets in (30) to describe the structure of English sentences misses the fact that all of the elements have something in common, and we would have to use the same set of elements within curly brackets to formulate agreement in English, along the following lines:

30) The first noun in the sentence agrees with the first instance of:

{ Thelma }

{ Louise }

{ Bob }

{ Carol }

{ Ted }

{ Alice }

{ Mortimer }

Because we linguists are trying to model the knowledge and abilities of native speakers, our descriptions are really models of what goes on in language users’ minds. In other words, the notation of our grammar makes claims, and the curly brackets notation essentially makes the claim that the elements within the curly brackets are just an unrelated set of elements. We are not reflecting the fact that the words have an intrinsic relationship to one another, a fact that gains prominence when we see that we would have to use the curly brackets with the same set of elements at more than one place in the grammar.

Therefore, we would say that a grammar that puts these words into totally distinct classes fails on the grounds of descriptive adequacy, because it doesn’t capture a linguistically significant generalization-in this case, that all of these elements are verbs.

We cannot require mutual substitutability in all environments, then, because to do so would be to have an extremely large number of grammatical categories, and we would have to formulate the processes of grammar in terms of large disjunctions of categories that would keep re-appearing, leading to the question of why the same list of elements keeps appearing in curly brackets. We would say, then, that a given grammatical category does not have all of its members appearing in exactly the same set of environments, but in enough of the same environments to warrant putting them in the same class.

However, we must have some way of capturing the differences in distribution that these members exhibit.

A useful way to do so is to make a distinction in a grammar between the syntactic rules, or rules of formation, and the lexicon, or dictionary. To do so, let us set up our first syntactic rule as follows:

(31) S --> N V (N) ( { A } )
                   ( { P } )
                   ( { S } )

As I mentioned earlier, the arrow means “consists of”, and furthermore, the parentheses (“(“ and “)”) and braces (“{“ and “}”) are abbreviations for optionality (parentheses) and choice or disjunction (braces). Therefore, the rule (31) is really an abbreviation for 8 different rules (the N after the verb may be present or absent, and it may be followed by A, by P, by S, or by nothing, giving 2 x 4 = 8 combinations), which are unpacked in (32):

(32) a. S --> N V

b. S --> N V N

c. S --> N V N A

d. S --> N V N P

e. S --> N V N S

f. S --> N V S

g. S --> N V P

h. S --> N V A

The technical term for a rule such as (31), which uses abbreviatory conventions to collapse a number of rules, is rule schema: an abstraction over more than one rule which has the appearance of being only one rule.
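The unpacking of a rule schema into the rules it abbreviates can be sketched mechanically. The encoding below is an assumption of the sketch: a plain string is an obligatory symbol, `("opt", [...])` stands for parentheses, and `("choice", [[...], ...])` stands for braces, with one inner list per row. The names `SCHEMA_31` and `expand` are likewise hypothetical.

```python
from itertools import product

# Right-hand side of schema (31): N V (N) ({A / P / S})
SCHEMA_31 = [
    "N",
    "V",
    ("opt", ["N"]),
    ("opt", [("choice", [["A"], ["P"], ["S"]])]),
]

def expand(seq):
    """Return every plain symbol sequence that `seq` abbreviates."""
    alternatives = []
    for item in seq:
        if isinstance(item, str):
            # An obligatory symbol contributes exactly one possibility.
            alternatives.append([[item]])
        elif item[0] == "opt":
            # Parentheses: either nothing, or any expansion of the contents.
            alternatives.append([[]] + expand(item[1]))
        elif item[0] == "choice":
            # Braces: the expansions of exactly one of the rows.
            options = []
            for row in item[1]:
                options.extend(expand(row))
            alternatives.append(options)
    # Pick one possibility per position and concatenate them.
    return [sum(combo, []) for combo in product(*alternatives)]

rules = ["S --> " + " ".join(rhs) for rhs in expand(SCHEMA_31)]
for rule in rules:
    print(rule)
```

The eight rules printed correspond to (32a-h), though not in the same order.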

Furthermore, application of one of the rules in (32) will create a structured representation, so that application of, for example, (32)(b), will create a representation as in (33):

(33) S

N V N


And we could show that the second line was formed from the first by drawing lines from the first symbol to the elements that are introduced in the second line, as in (34):

34)     S
      / | \
     N  V  N


Rules as in (32a-h) are known as phrase-structure rules, because they show how phrases are structured. We can say that the phrase-structure rules generate, or create, structured representations of sentences, which are known as phrase-markers.

Phrase-structure rules have at most one symbol to the left of the arrow, a necessary restriction, as we will see. The arrow is said to be an instruction to rewrite the symbol to its left as a sequence of symbols on the right, and the symbols on the right of the arrow are called the expansion of the symbol on the left.

However, notice that (34) does not contain any words, and we still do not have any device for registering the fact that not all verbs can, for example, occur in the phrase-marker in (34).

Suppose that we extend our definition of grammatical categories, based on mutual substitutability in a sufficiently wide range of environments, and say that a grammatical category can be composed of different subcategories, categories that are members of some larger category but which have some further characteristic in common.

We now say that phrase-structure rules or, more precisely, phrase-structure rule schemata, only take us as far as generating the phrase-markers up to, but not including, the point at which the words are inserted into the phrase-markers. The words are said to be drawn from the lexicon, or dictionary. A lexicon contains a list of lexical items (the words found in the dictionaries with which we are all familiar), each of which has a lexical entry, that is, information about that lexical item which is not predictable by more general rule.

Consider again the words in (15), repeated here:

(15) like, put, elapse, dash, grow (meaning become), become, persuade.

So, a sample lexicon, to take account of the subcategory membership that each of the seven words given in (15) would have, would be as in (35):

(35) like, V, +[____ N]

elapse, V, +[____ #]

put, V, +[____ N P]

grow, V, +[____ A]

become, V, +[____ {N} ]
                  {A}

persuade, V, +[____ N (S)]

dash, V, +[____ P]

The part of the lexical entry in (35) that follows the lexical item and its category, beginning with the + sign, is called the subcategorization frame.

We can think of the subcategorization frame, which shows the subcategory membership of the particular lexical item, as the information that encodes the restriction as to the particular environment in which the lexical item may be inserted into the phrase-marker.

Hence, if we add (36) to our lexicon, we can generate a sentence such as (37).

(36) people, N

pizza, N

(37) People like pizza.

The sentence would be generated as follows. First, our phrase-structure rule would generate (34), after which we would insert the two nouns, and the verb, into the phrase-marker. On the other hand, we could not generate (38) because elapse’s subcategorization frame would only allow it to be inserted into a V position that was final in the phrase-marker.

(38) * People elapse pizza.
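The division of labor between phrase-structure rules and the lexicon can be sketched as a lookup. The entries follow (35) and (36); representing a frame as a list of alternative category sequences (so that the optional S of persuade becomes two alternatives, and the sentence-final frame of elapse becomes the empty sequence) is an assumption of the sketch, as are the names `LEXICON` and `can_insert`.

```python
# Each entry: word -> (category, list of alternative frames).
# A frame lists the categories that must follow the verb within S.
LEXICON = {
    "like": ("V", [["N"]]),
    "elapse": ("V", [[]]),                  # +[___ #]: nothing may follow
    "put": ("V", [["N", "P"]]),
    "grow": ("V", [["A"]]),
    "become": ("V", [["N"], ["A"]]),        # braces: either N or A
    "persuade": ("V", [["N"], ["N", "S"]]), # parentheses: S is optional
    "people": ("N", None),
    "pizza": ("N", None),
}

def can_insert(verb, following):
    """True if `verb` is a V whose subcategorization frame matches the
    categories that follow the V position in the phrase-marker."""
    category, frames = LEXICON[verb]
    return category == "V" and list(following) in frames

# Phrase-marker (34) is N V N, so the V position is followed by one N.
print(can_insert("like", ["N"]))    # People like pizza.
print(can_insert("elapse", ["N"]))  # *People elapse pizza.
```

The phrase-structure rules stay general (any V may in principle occupy the V position), while the lexical entries carry the word-by-word restrictions.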

Lecture #3- Phrase-Structure

Let us consider the definition of grammatical category that was given in the last lecture as (13), repeated here:

(13) A grammatical category = a class of elements whose members are mutually substitutable (i.e., interchangeable without any diminution of acceptability of the resulting string) in a sufficiently wide range of environments.

Last time, we considered single words that are interchangeable in many of the same places, placing them into a single category. However, the same definition also shows us that sequences of words are mutually substitutable for single words.

To see this, let us again consider the set of environments that we used to test for noun-hood in the last lecture, (10-12), repeated here:

(10)_____interest me.

11) I talked about ___.

12) John likes ___.

Last time, we saw that words can be divided up into classes, so that a certain class of words can fill in the blanks in (10-12), and we call that class the class of nouns. However, just as single words can be substituted for other single words in many environments, justifying grouping them into a class, sequences of words can be substituted for single words. By parity of reasoning, we would therefore say that such sequences are members of the same classes as the single words.

For example, if we look at the environments in (10-12), we see that all of the elements in (1) are mutually substitutable:

1) a. books

b. big books

c. the books

d. the big books

e. the big books about Nixon

We therefore must allow these sequences of words to form a single class, and we would say that, e.g., the big books about Nixon is of the same grammatical category as books, i.e. a noun. However, we must revise our phrase-structure schema in (31) of Lecture #2 in order to allow for nouns to be rewritten as sequences of words. It would seem, then, that we must add (2) to our phrase-structure component:

2) N --> (Art) (Adj) N (P N)

in which “Art” stands for the category Article (including the, a, many, some, few, etc.).

So, let us now generate sentence (3):

3) The big books about Nixon interest me.

We start with the symbol S (for sentence), and we apply one of the phrase-structure rules that is included in the phrase-structure rule schema, to yield the following sequence:

4)      S
      / | \
     N  V  N


We have two options now for the symbol N. We could simply go to the lexicon and insert two Ns and a V, or we could now apply our new phrase-structure rule given in (2), to yield the sequence Art Adj N P N, so that we would have the following sequence:

5) S


Art Adj N P N V N

We now go to our lexicon, which includes the following:

the, Art

big, Adj

books, N

about, P

Nixon, N

interest, V, +[___ N]

me, N

Let us now introduce some terminology. We would say that the grammar generates, or creates, a set of sentences by allowing a set of derivations of those sentences. A derivation is a sequence of representations such that each representation, except for the initial representation, is formed from the preceding representation by a rule of grammar. One symbol is designated as the initial symbol of the grammar, and every derivation must therefore begin with this symbol. In this case, the designated initial symbol is S, and so every derivation must begin with S.

A symbol which appears to the left of an arrow in a phrase-structure rule is said to be a non-terminal symbol, and a symbol which does not appear to the left of any arrow is said to be a pre-terminal symbol. Lexical items, which are introduced by the lexicon, are said to be terminal symbols.

Therefore, the grammar so far, with the phrase-structure rules in (6) and the lexicon above, will generate the phrase-marker in (7):

(6) a. S --> N V (N) ( { A } )
                     ( { P } )
                     ( { S } )

b. N --> (Art) (Adj) N (P N)



The problem with the grammar in (6), however, is that, while it will generate grammatical sentences such as (3), it will also generate many sentences that we will want to say are ungrammatical. In this sense, the grammar is said to be too powerful, in that it does more than we want it to be able to do.

We can see this by considering the symbol N. The phrase-structure rules operate in a top-down fashion, beginning with the designated initial symbol S. Therefore, whenever we reach the symbol N, we can, by the phrase-structure schema expanding N in (6), take any of the options for expanding N that the schema permits. For instance, the first N in the sequence Art Adj N P N V N could itself be expanded as Art N, generating such strings as The big the books about Nixon interest me.

The grammar in (6) exhibits a property that is known in mathematics as recursion, the ability of a device to re-apply to its own output an infinite number of times. The recursion in the phrase-structure component of the grammar results from the same symbol that appears to the left of the arrow appearing in an expansion of that symbol.
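The overgeneration that this recursion produces can be checked with a small sketch. It enumerates the category sequences derivable from N by rule (6b) when the rule is allowed to re-apply to its own output a limited number of times; the function name `n_expansions` and the depth bound are assumptions introduced only to keep the enumeration finite.

```python
from itertools import product

def n_expansions(depth):
    """Category sequences derivable from N by rule (6b),
    N --> (Art) (Adj) N (P N), re-applying the rule to its own
    output at most `depth` times."""
    if depth == 0:
        return {("N",)}
    inner = n_expansions(depth - 1)
    results = set(inner)
    choices = product([(), ("Art",)], [(), ("Adj",)], [False, True])
    for art, adj, with_pp in choices:
        for head in inner:
            # The optional (P N) may itself contain an expanded N.
            tails = [("P",) + noun for noun in inner] if with_pp else [()]
            for tail in tails:
                results.add(art + adj + head + tail)
    return results

# The category sequence behind "the big the books about Nixon".
bad = ("Art", "Adj", "Art", "N", "P", "N")
print(bad in n_expansions(1), bad in n_expansions(2))
```

The offending sequence is unavailable after one application of the rule and becomes available as soon as the rule is allowed to apply to the N that it has itself introduced.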

Exercise: Generate five ungrammatical sentences using the recursive power of N.

A way to solve this problem was suggested by Zellig Harris in his book Structural Linguistics ((1951), University of Chicago Press). Harris suggested assigning integers to the different occurrences of N, so that the symbol N that appears to the left of the arrow would be notated as N1, and the symbol N that is the “simple” instance of N would be notated as N0. Hence, the phrase-structure rule that expands N would be formulated as in (8):

(8) N1 --> (Art) (Adj) N0 (P N1)

and the rule that expands S would be reformulated as in (9):

(9) S --> N1 V (N1) ( { A    } )
                    ( { P N1 } )
                    ( { S    } )
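The effect of Harris's integers on rule (8) can be sketched in the same enumerative style. The function name `n1_expansions` and the depth bound are assumptions of the sketch; the rule itself is (8).

```python
from itertools import product

def n1_expansions(depth):
    """Category sequences derivable from N1 by rule (8),
    N1 --> (Art) (Adj) N0 (P N1), nesting at most `depth` N1s."""
    if depth == 0:
        return set()
    inner = n1_expansions(depth - 1)
    results = set()
    for art, adj in product([(), ("Art",)], [(), ("Adj",)]):
        base = art + adj + ("N0",)   # N0 is never expanded further
        results.add(base)
        for n1 in inner:             # the only recursion left: after P
            results.add(base + ("P",) + n1)
    return results

def articles_well_placed(seq):
    """True if every Art is phrase-initial: first, or right after P."""
    return all(i == 0 or seq[i - 1] == "P"
               for i, symbol in enumerate(seq) if symbol == "Art")

sequences = n1_expansions(3)
print(all(articles_well_placed(s) for s in sequences))
print(("Art", "Adj", "Art", "N0", "P", "N0") in sequences)
```

An article can now appear only at the start of a noun phrase, so sequences of the type The big the books are excluded, while the recursion through P N1 (as in the books about the books about Nixon) is preserved.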

I should emphasize that while the recursion is unwanted in this instance, it is not always unwanted. Indeed, if we look at the phrase-structure rule schema in (9), it still contains some recursion.

Question: Where is the recursion in (9)? Can you think of an instance in which it correctly describes an aspect of English syntax?

Returning to Harris’s superscript notation, we note that it does two things: (i) it reflects the similarity between N0 and N1 by giving them the same category label (N); (ii) it differentiates them by the superscripted integer. We would interpret the superscript notation by saying that the category bearing the higher integer is a projection of the category bearing the lower integer.

This is another way of saying that there are simple nouns (N0) and noun phrases (N1). Are there higher-level projections of other categories?

It seems that there are. We will now look at the evidence for phrasal projections of V, A, and P. Let us take these in turn, utilizing our definition of a grammatical category as a class of elements whose members are mutually substitutable in a sufficiently wide range of environments.

A. The category V

Let us consider the environment in (10):

10) John___.

We can substitute the following for one another in that environment:

11) laughs, plays the harmonica, puts his coat on the rack, feels angry.

So again, we see that we can substitute sequences of words for single words (i.e. plays the harmonica for laughs). However, we need to test for substitutability in a number of different environments; a single environment will not do. Otherwise, the underlined elements in (12), angry and lawyers, would be members of the same grammatical category.

(12)a.They became angry.

b. They became lawyers.

Were we to place angry and lawyers into the same grammatical category, we would then have to complicate our grammar by having extremely complicated subcategorization frames, in order to account for why, e.g., angry does not occur in so many of the environments in which lawyers appears, and vice versa. It seems that, for become, we need instead to have a disjunctive subcategorization frame, as follows:

become, V, +[____ {N1} ]
                  {A}

We will return to this subcategorization frame in the next section, when we see the need for an A1.

With respect to verbal units that are larger than simple verbs, however, we can find environments that take verbs as well as such larger units. Specifically, consider subordinate clauses introduced by the word though, as in (13):

13) Though he may cry, it won’t matter.

The verb may also appear before the word though, as in (14):

14) Cry though he may, it won’t matter.

We now have another environment for verbs, other than the “normal” environment at or near the end of the sentence. Notice that when the verb appears before though, the position of the verb after the subject is not occupied by the verb.

The position before though can be occupied by sequences that include verbs plus additional material, and when the sequence precedes though, none of the material can appear after the subject:

(15)a. Play the harmonica though he may, it won’t matter.

b. Put his coat on the rack though he may, it won’t matter.

c. Feel angry though he may, it won’t matter.

Another environment occurs in coordinate sentences, as in (16):

(16)a.He said he would laugh, and he did laugh.

b. He said he would play the harmonica, and he did play the harmonica.

c. He said he would put his coat on the rack, and he did put his coat on the rack.

d. He said he would feel angry, and he did feel angry.

The verb can appear before the subject in the second conjunct:

17) He said he would laugh, and laugh he did.

However, the sequences that consist of the verb plus additional material also appear before the subject in the second conjunct:

18) a. He said he would play the harmonica, and play the harmonica he did.

b. He said he would put his coat on the rack, and put his coat on the rack he did.

c. He said he would feel angry, and feel angry he did.

We thus find that the sequences play the harmonica, put his coat on the rack, and feel angry are members of the same grammatical category as cry or laugh, and so we would be justified in calling them verb phrases. We would therefore revise our rule (9), repeated here, as (19):

(9) S --> N1 V (N1) ( { A    } )
                    ( { P N1 } )
                    ( { S    } )

(19) S --> N1 V

V --> V (N1) ( { A    } )
             ( { P N1 } )
             ( { S    } )

Hence, we have a major division of the sentence into two parts, consisting of a noun phrase and a verb phrase, so that a sentence such as (20) would have the phrase-marker in (21):

(20)The man read the book.

21)              S
            /         \
          N1            V
         /  \         /   \
       Art   N0      V     N1
        |    |       |    /  \
       The  man    read Art   N0
                         |    |
                        the  book
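A phrase-marker such as (21) can be represented as a nested structure in which each node is a label followed by its daughters. The tuple encoding and the names `TREE_21`, `terminal_yield`, and `constituents` are assumptions of this sketch; the labels are those of (21), where the verb phrase is still simply labeled V.

```python
# Each node is (label, daughter, daughter, ...); a bare string is a word.
TREE_21 = (
    "S",
    ("N1", ("Art", "The"), ("N0", "man")),
    ("V",
     ("V", "read"),
     ("N1", ("Art", "the"), ("N0", "book"))),
)

def terminal_yield(node):
    """Return the words dominated by `node`, in left-to-right order."""
    label, *daughters = node
    words = []
    for daughter in daughters:
        if isinstance(daughter, str):
            words.append(daughter)
        else:
            words.extend(terminal_yield(daughter))
    return words

def constituents(node):
    """List (label, word string) for `node` and every node below it."""
    label, *daughters = node
    found = [(label, " ".join(terminal_yield(node)))]
    for daughter in daughters:
        if not isinstance(daughter, str):
            found.extend(constituents(daughter))
    return found

print(" ".join(terminal_yield(TREE_21)))
for label, words in constituents(TREE_21):
    print(label, ":", words)
```

Reading off the constituents shows the major division of the sentence directly: The man is an N1, and read the book is a single unit of category V.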

At this point, a question arises. We established that it was necessary to distinguish levels of projection for nouns, via the superscript notation. Is it necessary to distinguish levels of projection in the same way for other categories, such as verbs?

Recall that we motivated the device of assigning integers to grammatical categories as a method of preventing unwanted recursion. If we look at the rule for expanding Vs in (19), it would permit a potentially infinite sequence of verbs, as in, for example, (22):

22)                 V
                  /   \
                 V     N1
               /   \
              V     N1
            /   \
           V     N1
         /   \
        V     N1

We can get around this problem by simply using the superscript notation for Vs as well, so that the phrase-structure component in (19) would be revised to include the rules in (23):

23) S --> N1 V1

V1 --> V0 (N1) ( { P N1 } )
               ( { A    } )
               ( { S    } )

B. The Category A1

Let us look at some environments for simple adjectives. We have seen one, following the verbs become and grow. In addition to this environment, we can find adjectives occurring after the verb consider followed by a noun phrase:

24) I consider the man crazy.

We can also substitute adjectives that are modified in that environment:

25) a. I consider him fond of chocolate.

b. I consider him partial to vanilla.

Furthermore, just as we can question simple adjectives, in which case they appear at the front of the sentence (we will return to question formation later), modified adjectives can occur in that position:

(26)a. How angry are you?

b. How fond of Sally are you?

c. How partial to vanilla are you?

Hence, we can justify a phrasal projection of A as well.

C. The Category P1

In a sense, the category P1 is the easiest category to motivate, since prepositions usually occur with following Ns, as in the following example:

26) John ran to Mary.

However, Joseph Emonds has argued (in “Evidence that Indirect Object Movement is a Structure-Preserving Transformation”, Foundations of Language (1972)) that, just as certain verbs, such as elapse, are intransitive (i.e., don’t take objects), or are optionally intransitive, such as eat, there are intransitive prepositions as well. For example, consider the verb put, which requires a locative prepositional phrase (see the subcategorization frame for put in (35) of Lecture #2). Hence, (27)(a) is unacceptable:

(27) a. *John put the book.

b. John put the book on the table.

However, certain single words can satisfy put’s requirement of having an element after the object:

27) John put the book on.

Aside from Emonds’ view of words such as on when they occur without a following noun phrase, there has been another view--- that of Bruce Fraser, who analyzed such words in Fraser (1965) (An Examination of the Verb-Particle Construction, MIT Doctoral Dissertation) as particles. Hence, Fraser posited the category Prt.

Let us consider the two views more closely, and formalize them in our phrase-structure grammar. Emonds posited a set of phrase-structure rules that included (28):

28) a. V1 --> V0 (N1) (P1)

b. P1 --> P0 (N1)

Fraser’s view can be described as follows:

29) a. V1 --> V0 (N1) ( { P1  } )
                      ( { Prt } )

b. P1 --> P0 N1

Let us compare the two views. Morris Halle, in a 1962 paper entitled “Phonology In Generative Grammar” (published in the journal Word), proposed what he called a simplicity metric for comparing two grammars. Simplicity was measured in terms of the number of symbols in the grammar, and the idea was that if two grammatical descriptions of the same outputs were compared in terms of the number of symbols in each, and Grammar A had fewer symbols than Grammar B, Grammar A was simpler, and hence to be preferred.

Suppose we apply the simplicity metric to (28) and (29), counting grammatical categories and abbreviatory conventions (parentheses and curly brackets). Notice that Fraser’s analysis also requires an expansion for P1. He would eliminate the parentheses around the N1 that follows P0, because he is analyzing these words as particles.

By this count, Emonds’ analysis, which has ten symbols, is simpler than Fraser’s, which has eleven. The same conclusion emerges if we look at a wider fragment of both grammars. Let us consider the subcategorization frame for put in both grammars. Emonds would posit the subcategorization frame in (35) of Lecture #2, repeated here:

30) put, V, +[____ N1 P1]

and P1 could introduce a P0 that was either transitive, intransitive, or optionally transitive, depending on the subcategorization frame of the P0.

Fraser’s subcategorization frame would be as in (31):

(31) put, V, +[____ N1 { P1 } ]

{ Prt }

Clearly, (30) is a simpler subcategorization frame than (31), and is hence to be preferred.

Another piece of evidence that Emonds adduces is based on the distribution of the word right, which modifies some prepositions, as in (31) and (32):

31) He’ll send it right up the stairs.

32) I’ll send it right to you.

This word doesn’t modify all prepositions:

(33)* He was working right at Citibank.

(34) *He was talking right about Sally.

This word, however, can also modify the class of elements that Fraser calls “particles”, and, by our definition of a grammatical category, the elements that right can modify would constitute a single grammatical category:

(35) He’ll send it right up.

We could simply say that the word right modifies words that express direction, and not have a syntactic statement of the distribution of this word. However, the contrast between (35) and (31) is instructive in the comparison between Emonds’ view and Fraser’s. Clearly, (35) should receive much the same analysis in terms of structure as (31). For one thing, the sentences mean much the same thing. Let us see what the structure of (35) and (31) would be under Fraser’s analysis (we will omit the helping verb will, expressed here as ‘ll, because we will come back to it in the next lecture, and it doesn’t affect the choice between Emonds’ analysis and Fraser’s). Fraser would assign (35) the structure in (36), and (31) would receive the structure in (37):


(36)           S
             /   \
           N1      V1
           |    /  |   \
           N0  V0  N1    Prt1
           |   |   |     /  \
           He send N0   ?    Prt0
                   |    |     |
                   it right   up

37)            S
             /   \
           N1      V1
           |    /  |    \
           N0  V0  N1      P1
           |   |   |    /  |   \
           He send N0  ?   P0    N1
                   |   |   |    /  \
                   it right up Art   N0
                                |    |
                               the stairs

Under Emonds’ analysis, (35) would receive the phrase-marker in (38), and (31) would receive the phrase-marker in (39):

38)            S
             /   \
           N1      V1
           |    /  |   \
           N0  V0  N1    P1
           |   |   |    /  \
           He send N0  ?    P0
                   |   |    |
                   it right up

39)            S
             /   \
           N1      V1
           |    /  |    \
           N0  V0  N1      P1
           |   |   |    /  |   \
           He send N0  ?   P0    N1
                   |   |   |    /  \
                   it right up Art   N0
                                |    |
                               the stairs

Notice that the view of these single-word categories as particles forces us to assign radically different structures in (36) and (37), while the view of these categories as intransitive prepositions, in this case optionally intransitive, makes the structures in (38) and (39) minimally different. Notice that the analysis of these words as optionally transitive prepositions places them on a footing parallel to that of verbs, which can, in some instances, be optionally transitive, as in the case of the verb eat:

40) a. He ate something.

b. He ate.

or adjectives:

41) a. John is angry with Sally.

b. John is angry.

Hence, we will assume that what have been called “particles” are really nothing more than intransitive prepositions.

D. The Parallelism of Grammatical Categories and How to Reflect It In the Grammar

Toward the end of the discussion of prepositions, we appealed to the notion that there is a certain symmetry in the way that grammatical categories are constructed. Zellig Harris originally noted this, and Chomsky developed this idea into what is now known as “X-bar theory” (N. Chomsky (1970), “Remarks on Nominalization”, in R. Jacobs and P. Rosenbaum, eds. , Readings in English Transformational Grammar, Ginn-Blaisdell). The idea is that grammatical categories are constructed according to a fixed template. It will be noted that all of the grammatical phrasal categories that we have discussed so far are of the form in (42):

42) X1 --> (Y) X0 Z1*

The asterisk is known as the “Kleene star” (after the mathematician Stephen Kleene), and it means “zero or more occurrences of the symbol to which it is attached”.

X, Y, and Z stand for arbitrary categories, with the only understanding of this notation being that each symbol stands for the same category in all of its occurrences in a given statement. It is what is known as a variable, in this case ranging over categories. In other words, all phrasal categories of N, V, A, and P have the same arrangement, and are constructed the same way, so that a P1 would have the same structure as an A1, for instance.

(42) is another instance of a rule schema. It is an abbreviation for a number of different rules. Note that each phrase-structure rule has an obligatory element in the expansion, while all of the other elements in the expansion are optional. The obligatory element in the expansion is called the head, so that N0 is the head of N1, A0 is the head of A1, P0 is the head of P1, and V0 is the head of V1.
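To make the idea of a rule schema concrete, here is a minimal sketch of how (42) can be instantiated for each lexical category. The function names instantiate and head_of, and the particular specifiers and complements chosen for each category, are illustrative assumptions and not part of the schema itself.

```python
def instantiate(category, specifier=None, complements=()):
    """Return one instance of X1 --> (Y) X0 Z1* as a pair (lhs, rhs)."""
    rhs = []
    if specifier is not None:      # (Y) is optional
        rhs.append(specifier)
    rhs.append(category + "0")     # X0, the obligatory head
    rhs.extend(complements)        # Z1*: zero or more phrases
    return category + "1", rhs

def head_of(rule):
    """The head is the X0 whose category matches the X1 being expanded."""
    lhs, rhs = rule
    return next(sym for sym in rhs if sym == lhs[:-1] + "0")

# Illustrative choices of specifier and complements for each category.
rules = [
    instantiate("N", specifier="Art"),
    instantiate("V", complements=["N1", "P1"]),
    instantiate("A", complements=["P1"]),
    instantiate("P", complements=["N1"]),
]
for lhs, rhs in rules:
    print(lhs, "-->", " ".join(rhs), "   head:", head_of((lhs, rhs)))
```

Whatever optional material is chosen, the head X0 is the one symbol that every instance of the schema shares with the category being expanded.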

The notion of all categories being constructed in the same way is standard in modern syntactic theories of all stripes. It is also well-supported in studies of language typology, the study of the ways in which languages may be said to differ from one another. In a classic study by Joseph Greenberg (1963) (“Some Universals of Grammar with Particular Reference to the Order of Meaningful Elements”, in J. Greenberg, ed., Universals of Language, MIT Press), for example, Greenberg classified languages into three main types: V(erb)S(ubject)O(bject), SVO, and SOV. Describing SOV as verb-final, and the other two as non-verb-final, he noted some striking correlations between verb-finality and, for example, the choice between postpositions and prepositions: SOV languages tended to have postpositions, while VSO and SVO languages had prepositions. He also noted that VSO languages always had SVO word order as an alternative, and we will discuss this more later.

However, Greenberg’s correlations have suggested to people that languages can be distinguished along the dimension of head position, so that languages such as, e.g., Japanese, which is SOV, are really head-final, while languages such as English are head-initial.

E. Some Terminology

At this point, it would be useful to review where we’ve come to.

A grammar is a sequence of rules that generates a set of phrase-markers, which are structured representations of sentences. For natural languages, the set of phrase-markers is infinite, even though each phrase-marker is finite in length. There is only a finite set of rules, and so the rules that generate the phrase-markers for natural languages must be recursive –i.e., have the ability to reapply to their own output a potentially infinite number of times, although there must also be non-recursive rules in the grammars of natural languages, otherwise phrase-markers would never terminate.

So far, we have only seen one type of grammatical rule- a phrase-structure rule, which determines how sentences are composed. Phrase-structure rules have the formal requirement that they can have at most one symbol to the left of the arrow in the phrase-structure rule (called the symbol to be expanded), and can have, in principle, any number of symbols to the right of the arrow (called the expansion).

A symbol that appears to the left of the arrow in a phrase-structure rule is called a non-terminal symbol. A symbol that is introduced by the phrase-structure rules, but does not appear to the left of the arrow, is called a pre-terminal symbol. A grammar is said to generate a set of derivations. A derivation is a sequence of representations such that each representation is formed from the immediately preceding representation by a rule of grammar. The one exception is the first representation, a distinguished symbol that is said to be the designated initial symbol; every derivation must start with this symbol. For our grammar so far, the designated initial symbol is S.
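The notion of a derivation can be illustrated with a small sketch. The three-rule toy grammar below, with exactly one expansion per symbol, and the leftmost-first order of rewriting are simplifying assumptions made only for the illustration; the name derive is likewise hypothetical.

```python
# Toy grammar: one expansion per non-terminal (a simplification).
RULES = {
    "S":  ["N1", "V1"],
    "N1": ["N0"],
    "V1": ["V0", "N1"],
}

def derive(start="S"):
    """Rewrite the leftmost expandable symbol until none remains.

    Returns the whole derivation: a list of representations, each
    formed from the preceding one by a single rule application.
    """
    line = [start]
    derivation = [line]
    while any(sym in RULES for sym in line):
        i = next(k for k, sym in enumerate(line) if sym in RULES)
        line = line[:i] + RULES[line[i]] + line[i + 1:]
        derivation.append(line)
    return derivation

for step in derive():
    print(" ".join(step))
```

The first line printed is the designated initial symbol S, and the last line contains only symbols that never appear to the left of an arrow.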

We also assume that, apart from S, all phrasal categories are expanded according to a particular template, called an X-bar schema.

The levels of complexity of the various grammatical categories, such as N0 versus N1, A0 versus A1, etc., are called the levels of projection of the grammatical categories. So far, our X-bar schema has claimed that all grammatical categories project up to level 1. The level 1 projection of the category is said to be the maximal projection of the category (so far).

It is also useful to note some of the relations that are defined on phrase-markers. Phrase-markers can be represented in any one of a number of ways, just so long as the groupings are represented. For example, we have given phrase-markers as trees, but they could also be represented as labelled bracketings. I will represent them as trees, because I feel that they are easier to inspect that way, but this is simply an expository convenience.

The labelled points in the tree diagram are called nodes. A node A that is above another node B in the phrase-marker, such that A contains B, is said to dominate node B. If node A is the first node above node B, node A is said to immediately dominate node B.

Phrase-markers show the groupings, and these groupings of elements are called constituents. A constituent is a sequence of nodes that are all immediately dominated by the same node, such that the immediately dominating node exhaustively dominates the sequence (i.e. immediately dominates the sequence and nothing else).
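These relations are easy to state over a tree data structure. The sketch below assumes a hypothetical Node class and function names of our own choosing; it simply restates the definitions of domination, immediate domination, and constituency given above.

```python
class Node:
    """A labelled node of a phrase-marker with an ordered list of daughters."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def immediately_dominates(a, b):
    """a is the first node above b."""
    return any(child is b for child in a.children)

def dominates(a, b):
    """a is above b and contains b."""
    return any(child is b or dominates(child, b) for child in a.children)

def is_constituent(nodes, parent):
    """nodes are exactly the daughters of parent, in order, and nothing else."""
    return len(nodes) == len(parent.children) and all(
        n is c for n, c in zip(nodes, parent.children))

# A fragment: S --> N1 V1, V1 --> V0 N1
subj = Node("N1", [Node("N0")])
verb = Node("V0")
obj = Node("N1", [Node("N0")])
vp = Node("V1", [verb, obj])
s = Node("S", [subj, vp])

print(dominates(s, verb), immediately_dominates(s, verb))
print(is_constituent([verb, obj], vp), is_constituent([subj, verb], s))
```

Here S dominates V0 without immediately dominating it, and the sequence V0 N1 is a constituent while the sequence N1 V0 is not.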

In the next lecture, we will examine the X-bar schema more closely, and refine the structures of NP, VP, AP, and PP.

Lecture 4: Levels of Projection

So far, our X-bar schema has posited only one level of projection above the X0 level (essentially, the word level). Hence, our schema has all of the grammatical categories (other than S, to which we shall return) fitting into the template of (42) of lecture 3, repeated here:

43) X1 --> (Y) X0 Z1*

This analysis makes the claim that, while the sequence Y - X0 - Z1 acts as a constituent, there is no further constituency among these three elements. C.L. Baker (1978), in his textbook Introduction to Generative-Transformational Syntax (Prentice-Hall), presented evidence, however, that X0 and Z1 form a constituent, if we take X0 to be N0 and Z1 to be P1. The evidence comes from nominals such as those in (44):

1) a. the king of England

b. the picture of Fred

c. the destruction of Rome

An assumption that is made in constructing grammars is that grammatical processes only operate on constituents. With this in mind, Baker considered a phenomenon known as ones-pronominalization, an example of which is given in (2):

2) This picture is bigger than that one.

Notice that the word “one” refers to “picture”. It is said to be an “anaphoric element”, and anaphora is defined as in (3):

(3) Anaphora—the grammatical reflection of the identity of two elements.

We will be returning to other grammatical constructions that are said to be anaphoric later on, but the point about anaphoric elements is that they cannot be said to refer to anything on their own, but, rather, get their reference from some other element that is expressed, which does have an independent reference. So, in this case, “this picture” can be said to be referentially independent, and “that one” gets its reference from “this picture”.

It turns out that there are syntactic conditions on the anaphora relation between a word such as “one” and the element from which it gets its meaning. In particular, Baker notes that a PP headed by “of” cannot follow “one”, as in (4):

3) a. *The king of England is taller than the one of France.

b.* The destruction of Rome was more horrifying than the one of Carthage.

c. *The picture of Fred was clearer than the one of Bill.

Interestingly enough, however, “one” can be understood as the noun followed by the PP headed by “of”:

4) a. This king of England was taller than that one.

b. The prospect of a slow trial of the President is more damaging than a quick one.

c. This picture of Fred is clearer than that one.

It would seem, therefore, that one can refer to a noun and a PP headed by of. One way to account for this would be to ascribe a structure such as (5) to a noun phrase such as the king of England:

5)        N2
        /    \
      Det     N1
       |     /   \
      The  N0     P1
            |    /  \
          king  P0   N1
                |    |
                of   N0
                     |
                  England


We then have a unit that consists of the noun and the following prepositional phrase, namely the constituent N1. We would then say that one must be anaphoric to an N1. Because the N1 in, e.g., (5) comprises both the noun and the following PP, the ungrammaticality of (3) stems from the violation of the requirement that one “replace” an N1, rather than just an N0.

Interestingly, not all sequences of nouns followed by a PP disallow one followed by a PP, as pointed out by Radford (1988), (Transformational Grammar: A First Course, Cambridge University Press). For example, (6) is perfectly acceptable:

6) The man with Sally left, but the one with Susan stayed.

However, we can also allow one to be interpreted as a noun plus a PP headed by with:

7) One man with Susan left, but that one stayed.

We can account for this if we posit a structure in which two things happen: (i) the sequence consisting of the noun itself is an N1, to the exclusion of the with-PP; (ii) the sequence consisting of the noun itself plus the with-PP is an N1.

In short, we would need the following phrase-structure rule for noun phrases:

8) N2 --> (Det) N1

N1 --> { N0 (P1) }

       { N1 P1   }

Hence, the structure of the man with Sally would be as in (9):

9)        N2
        /    \
      Det     N1
       |     /   \
      the  N1     P1
            |    /  \
           N0   P0   N2
            |   |    |
          man  with  N1
                     |
                     N0
                     |
                   Sally



In short, a recursive expansion of N1 allows, in some instances, a simple N0 to also be an N1, and allows a sequence of a noun followed by a PP to contain two N1s: a simple N1 that is followed by the PP, and a larger N1 that consists of that N1 and the PP.
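The contrast between the two structures can be made explicit by listing every N1 in each noun phrase, since only an N1 is a possible antecedent for one. In the sketch below, the nested-tuple encoding of trees and the function names are assumptions made for illustration, and the object of each preposition is encoded as an N2 containing its own N1.

```python
def words(tree):
    """Terminal string of a (label, child, ...) tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

def n1_strings(tree):
    """Collect the word strings of every N1 node, top-down."""
    if isinstance(tree, str):
        return []
    found = [" ".join(words(tree))] if tree[0] == "N1" else []
    for child in tree[1:]:
        found += n1_strings(child)
    return found

# "the king of England": the of-PP is a sister of N0 inside a single N1.
king = ("N2", ("Det", "the"),
        ("N1", ("N0", "king"),
               ("P1", ("P0", "of"), ("N2", ("N1", ("N0", "England"))))))

# "the man with Sally": "man" is an N1 by itself, and so is "man with Sally".
man = ("N2", ("Det", "the"),
       ("N1", ("N1", ("N0", "man")),
              ("P1", ("P0", "with"), ("N2", ("N1", ("N0", "Sally"))))))

print(n1_strings(king))
print(n1_strings(man))
```

In the first structure king alone is never an N1, which is why the one of France is excluded; in the second, man alone is an N1, which is why the one with Susan is possible.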

Exercise: Show the phrase-marker for the underlined noun phrase in (a), using the grammar that we have developed so far:

(a) The picture of Sally with the green frame is bigger than the one with the blue frame.

A. Adjectives within the Noun Phrase

It will also be noted that the assumption that one replaces an N1 tells us about the hierarchical position of adjectives within the noun phrase, as in the blue car. Consider sentences such as (10) and (11):

10) The blue car was prettier than the green one.

11) This blue car was prettier than that one.

Our previous reasoning forces us to posit a phrase-structure rule as in (12) for N1s:

12) N1 --> A N1

Hence, the structure of , e.g., the blue car, would be as in (13):

13)       N2
        /    \
      Det     N1
       |     /  \
      the   A    N1
            |    |
          blue   N0
                 |
                car


Exercise: What is the structure of the big blue car , given the following sentences?

b) The big blue car was faster than the small green one.

c) The big blue car was faster than the small one (can mean either the small blue car or the small car).

d) This big blue car was smaller than that one (can mean either that big blue car, that blue car, or that car).

Summary: We have motivated the following phrase-structure rules for the noun phrase:

(14) N2 --> Det N1

N1 --> { A N1    }

       { N1 P1   }

       { N0 (P1) }

We have not yet talked about genitives, such as John’s mother’s boyfriend’s sister’s teacher. Notice that genitives seem to occur in the same position as the determiner. Therefore, we can modify the rule for expanding N2 above as in (15), putting aside for the moment the mechanism by which the possessive ‘s is introduced:

(15) N2 --> { N2  } N1

            { Det }

B. Implications of the structure of the Noun Phrase for Other Categories

In the last section, we examined the structure of the noun phrase, and argued for two levels of projection of the noun above the N0. If we assume that all categories are created by the same X-bar schema, this would indicate that our X-bar schema of (42) of the last section, repeated here, should be revised to (16):

(42) X1 --> (Y) X0 Z1*

15) X2 --> (Z2) X1

X1 --> { Y2 X1  }

       { X1 Y2  }

       { X0 Y2* }

We might then say that Z2 is instantiated by articles in noun phrases (as well as possessives, to which we shall return).

Notice, first of all, that the determiner that is found in noun phrases, a word that appears before the head, is paralleled by degree words that precede adjectives in adjective phrases and adverbs that precede verbs and prepositions. Examples are as in (17):

16) a. The pictures of Sally.

b. quite fond of Sally.

c. completely lost his mind.

d. right up the stairs.

It will be noted that we have direct evidence, from ones-pronominalization, for a projection that includes the noun and a following PP (in some instances, specifically PPs headed by of), and we have generalized from that evidence to saying that all categories have two levels of projection. Are we warranted in leaping from evidence for two levels of projection in N to two levels of projection in all categories, when we have no evidence for the latter?

Recall that we are assuming that the grammar is as simple and as general as possible. We have no direct evidence for two levels of projection in adjectives, verbs, and prepositions, but we have no evidence against two levels, either. If we were to assume that there is only one level of projection for these categories, but two for the noun, we would clearly be assuming a more complicated grammar than if we assumed two levels of projection for all the categories. For this reason, we assume that there is a single X-bar schema for all grammatical categories, given in (15). Much of what we will be doing in the succeeding weeks is finding evidence for the various options given by (15) for particular sentences.

It will be noted that we have not fit S into the X-bar schema. We have not defined S so far as the maximal projection of any category. That will change in the next lecture.

B. Grammatical Relations and Grammatical Categories

Traditional grammar speaks of such notions as subject and object. Do these notions play a role in grammar? Notice that our grammar generates partial phrase-markers such as (18):

17)        S
         /   \
       N”     V”
              |
              V’
             /  \
            V    N”

The phrase-marker shows the hierarchical (up-and-down dominance) and linear (left-to-right precedence) relations of grammatical categories, in this case S, N”, V”, V’, and V. What is the difference between grammatical categories and grammatical relations?

There is a once-and-forever characteristic of grammatical categories that is not present in grammatical relations. An element or sequence of elements either is or is not a given category depending on the nature of its head. Grammatical relations, on the other hand, can be deduced from phrase-markers depending on the positions of the various grammatical categories. Hence, in the partial phrase-marker in (17), an N” is a subject if it is immediately dominated by S, and an object if it is immediately dominated by V’. If the N”s were in different structural positions, they would bear different grammatical relations. This was the point made by Chomsky in Aspects of the Theory of Syntax (1965, MIT Press), who introduced the following formulations of subject and object (updated for X-bar theory):

18) subject = [N”, S] (meaning the N” immediately dominated by S)

object = [ N”, V’] (meaning the N” immediately dominated by V’)

If grammatical relations are predictable from the positions of the elements that bear them, the reasoning goes, they should not be represented in phrase-markers, for the same reason that we do not represent regular plurals in the lexical entries of nouns. We only represent in representations what we cannot predict from something else.
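To see that the grammatical relations really are recoverable from position alone, consider the following sketch. The nested-tuple encoding of the phrase-marker and the function name daughters_of are assumptions made here, and the ASCII marks ' and " stand in for the bar levels.

```python
def daughters_of(tree, mother, label):
    """Return subtrees labelled `label` that are immediately dominated
    by a node labelled `mother`, i.e. the configuration [label, mother]."""
    if isinstance(tree, str):
        return []
    hits = []
    for child in tree[1:]:
        if not isinstance(child, str):
            if tree[0] == mother and child[0] == label:
                hits.append(child)
            hits += daughters_of(child, mother, label)
    return hits

# Phrase-marker for "Rome destroyed Carthage" (tense ignored).
sentence = ("S",
            ('N"', "Rome"),
            ('V"', ("V'", ("V", "destroyed"), ('N"', "Carthage"))))

subject = daughters_of(sentence, "S", 'N"')    # [N", S]
obj = daughters_of(sentence, "V'", 'N"')       # [N", V']
print("subject:", subject)
print("object:", obj)
```

Nothing in the tree itself is marked as subject or object; the two N"s are told apart only by the node that immediately dominates them.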

C. Some Terminology

The grammatical relations of subject and object are akin to other grammatical relations that are particular to X-bar theory. Let us take a representation as in (20), which is generable from the schema in (15):

(20)     X”
        /  \
       Z    X’
           /  \
         Y”    X’
              /  \
            Y”    X’
                 /  \
               X’    Y”
              /  \
             X    Y”

In phrase-markers, it is convenient to use the notions sister and daughter, defined in terms of immediate domination. These notions are defined as in (21) and (22):

(21) A is a sister of B if A and B are both immediately dominated by the same node.

(22) A is a daughter of B if B immediately dominates A.

We can now define the following grammatical relations:

(23) a. A is a specifier if A is a daughter of X”.

b. A is an adjunct if A is a sister and daughter of X’.

c. A is a complement if A is a daughter, but not a sister, of X’.
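These three relations can be computed from nothing more than the label of a phrase's mother and the labels of its sisters. The sketch below assumes that an adjunct is both a daughter and a sister of X', that a complement is a daughter of X' whose sister is the head X, and that ' and " are ASCII stand-ins for the bar levels; the function name classify is hypothetical.

```python
def classify(mother_label, sister_labels):
    """Classify a non-head daughter from its mother's and sisters' labels."""
    if mother_label.endswith('"'):        # daughter of X"
        return "specifier"
    if mother_label.endswith("'"):        # daughter of X'
        if mother_label in sister_labels:
            return "adjunct"              # also a sister of X'
        return "complement"               # sister of the head X instead
    return None

# "Rome's destruction of Carthage": N" --> N" N',  N' --> N P"
print(classify('N"', ["N'"]))   # Rome's
print(classify("N'", ["N"]))    # of Carthage

# "the man with Sally": the with-PP is a sister of the inner N'
print(classify("N'", ["N'"]))   # with Sally
```

On this encoding the of-phrase comes out as a complement and the with-phrase as an adjunct, which matches the difference between the two noun-phrase structures discussed earlier.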

These notions will play a role in the next lecture, when we try to integrate S into the X-bar system. For now, however, let us note the following.

One of the original motivations for the X-bar system was the attempt by Chomsky to capture the similarity in understood semantic relations between sentences and nominalizations, as in (24):

24) a. Rome destroyed Carthage.

b. Rome’s destruction of Carthage.

We could say that, in (b), the N” that realizes the agent semantic role is a specifier, but we cannot say that the agent is the specifier of the phrase in (a), because the notion of a specifier is defined in X-bar terms, and S is not (so far) an X-bar projection. We could say that the object in (a) is really a complement, and the notion of a complement is realized in sentences (which contain V’s) and noun phrases (which contain N’s).

We will return to this in the next lecture.

Lecture 5- The Helping Verb System: The Need for Transformations

In the last couple of lectures, we have examined the phrase-structure of the projections of nouns, verbs, adjectives, and prepositions. We have posited a single schema for constructing phrasal categories-namely, the one in (16), repeated here as (1):

(1) X2 --> (Z) X1

X1 --> { Y2 X1  }

       { X1 Y2  }

       { X0 Y2* }

Obviously, however, the rule for expanding S, which is given in (2), does not fit this schema:

3) S --> N” V”

For one thing, there is no element in the expansion of S that could be considered the head, since both elements are obligatory. For another, both elements are X”s in the expansion.

We will now bring S into the X-bar schema fold by analyzing it as a projection of Tense. In so doing, we will introduce another type of syntactic rule known as a transformation, which converts phrase-markers into other phrase-markers.

A. The Helping Verb System of English

So far, we have analyzed sentences that consist of a subject immediately followed by a main verb, as in (4). However, sentences may also contain what are known as helping verbs, as in (5):

4) John eats steak.

5) a. John would eat steak.

b. John has been eating steak.

c. John is eating steak.

d. John would have eaten steak.

e. John would have been eating steak.

f. John would be eating steak.

g. John has eaten steak.

Have and be, when they occur as helping verbs, are said to be markers of aspect, referring to the completedness of an action or state of affairs. Have is said to mark the perfective aspect, which generally marks completion, while be is said to mark the progressive aspect, which generally marks an ongoing state of affairs. When the perfective helping verb have appears, the next verb is marked with what is known as a perfect participle, generally written as –en, while the progressive helping verb be triggers the suffix –ing on the next verb. There can be at most one perfective have, and at most one progressive be:

6) a. *John has had eaten the steak.

b.* John is being eaten the steak.

Furthermore, perfective have, and progressive be, when they both occur in a simple sentence, must occur in that order:

7) *John is having eaten the steak.

There is one set of elements that can occur before perfective have or progressive be, and this set of elements is known as the set of modals. They include the elements would, will, can, could, shall, should, may, might, and must, and again, they are mutually exclusive.

8) John {can }eat the steak.

{ could }

{ will }

{ would }

{ shall }

{ should }

{ may }

{might }

{ must }

These, then, are the facts concerning linear order of helping verbs. If we call the class of elements in (8) modals, and abbreviate this class by the symbol M, we can account for the facts (aside from the affixes on the verbs that occur with the aspectual helping verbs, which we will put aside for now), by positing, as a first approximation, the phrase-structure rule in (9):

9) S --> N” (M) (have) (be) V”
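Since each of the three helping-verb positions in (9) is independently optional, the rule abbreviates eight expansions of S. A short sketch makes this explicit; the function name expansions is hypothetical, and the ASCII mark " stands in for the double bar.

```python
from itertools import product

def expansions():
    """All expansions of S --> N" (M) (have) (be) V"."""
    results = []
    for use_m, use_have, use_be in product([False, True], repeat=3):
        rhs = ['N"']
        if use_m:
            rhs.append("M")
        if use_have:
            rhs.append("have")
        if use_be:
            rhs.append("be")
        rhs.append('V"')
        results.append(rhs)
    return results

for rhs in expansions():
    print("S -->", " ".join(rhs))
```

Because the optional elements are always added in the same order, no expansion places be before have or either of them before M, which is exactly the ordering fact illustrated in (6) and (7).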

B. Yes-No Questions In English

Let us now consider the formation of yes-no questions in English. First, consider yes-no questions in simple sentences that have all three types of helping verbs- modals, perfective have, and progressive be.

When all three helping verbs occur, the modal will appear at the beginning of the question:

10) a. Would he have been eating?

b. *Have he would been eating?

c. *Been he would have eating?

When the modal is absent, however, and have and be occur, have will appear at the beginning of the question:

11) a. Has he been eating?

b. *Been he has eating?

When the modal and have are absent, but be occurs in the simple sentence, be introduces the question:

12) Is he eating?

There is a generalization about the helping verb that appears at the beginning of a yes-no question, if one notices the corresponding order of helping verbs in the declarative. Can you guess what it is?

You’re right. It’s (13):

13) The helping verb that appears at the beginning of a yes-no question is the helping verb that would appear immediately after the first N” in the declarative version of the sentence.

We might try revising our phrase-structure rule in (9) to account for this, as in (14):

14) S --> { M N” (have) (be) } V”

          { have N” (be)     }

          { be N”            }

The phrase-structure rule in (14), while it gives us the effect in (13), doesn’t directly capture it as a generalization. The fact that the helping verb that appears at the beginning of the yes-no question is just that helping verb which would appear after the N” in the declarative is accidental. We have three separate expansions of S which give us this result, but we could just as well have had a grammar that had (15) as the phrase-structure rule:

(15) S --> ({ M    }) N” (M) (have) (be) V”

           ({ have })

           ({ be   })

Question: What ungrammatical strings would (15) generate in forming questions?

Chomsky suggested, in Syntactic Structures (1957, Mouton), that we could get around this problem by not generating yes-no questions in English via a phrase-structure rule, and instead exploiting the fact that the appearance of helping verbs in questions is predictable from the order of helping verbs in the corresponding declarative sentences. Specifically, he posited a new type of grammatical rule known as a transformation, which is defined as a rule which changes a phrase-marker into another phrase-marker. In this case, the transformation, which is known as Subject-Helping Verb Inversion, is formulated as in (16):

(16) N” - { M    }

          { have }

          { be   }

1 - 2 -->

2 - 1

Transformations as in (16) are thought to have two parts- a structural description, which specifies the properties that the phrase-markers must meet in order to be transformed, and a structural change, which specifies how they are to be transformed. The structural description occurs before the arrow, and the structural change occurs after the arrow. In this case, the rule is stated so that the structural description must consist of two factors. The first is the N”, and the second is the M, have, or be that immediately follows it in the phrase-marker. To see how this works, let us consider the phrase-marker for (5)(e):


In this case, the subject N” is factor 1, and the modal is factor 2. The structural description of Subject-HV Inversion is met, and the phrase-marker is altered by interchanging the two factors, resulting in (18):


The elements in braces in (16) are arranged vertically, and the braces indicate that one of the elements within the braces must be chosen, and, furthermore, that this element must be immediately after the N”, which is Factor #1. Hence, the rule could not apply to move have or be in (17), because it is the Modal which is immediately after the N”. Hence, (10)(b) and (10)(c) could not be generated because of the way that Subject-HV Inversion is formulated.

On the other hand, if the modal were absent, have, if present, would satisfy the requirements of being Factor #2, as in (19):


And then have and the immediately preceding N” would invert, yielding (20):

19)            S
          /  |    |   \
      have   N”  been   V”
             |          |
           John   eating the steak

And, of course, if be is present, but not a modal or have, then be will fulfill the requirements of being Factor #2:

20)         S
         /  |  \
       Be   N”   V”
            |     |
          John  eating the steak

In short, the rule (16), as a consequence of its structural description, directly expresses the generalization that the first helping verb after the subject in the declarative phrase-marker is the helping verb that appears at the beginning of a yes-no question.
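The effect of the structural description in (16) can be sketched over a flattened phrase-marker. Representing the phrase-marker as a list of (category, word) pairs, treating V" as an unanalyzed unit, and the function name subject_hv_inversion are all simplifying assumptions made for this illustration.

```python
HELPING = {"M", "have", "be"}

def subject_hv_inversion(string):
    """Apply (16): interchange an N" and the M, have, or be that
    immediately follows it. If no such pair exists, nothing changes."""
    for i in range(len(string) - 1):
        if string[i][0] == 'N"' and string[i + 1][0] in HELPING:
            return string[:i] + [string[i + 1], string[i]] + string[i + 2:]
    return string  # structural description not met

def words(string):
    return " ".join(word for _, word in string)

with_modal = [('N"', "he"), ("M", "would"), ("have", "have"),
              ("be", "been"), ('V"', "eating")]
no_modal = [('N"', "he"), ("have", "has"), ("be", "been"), ('V"', "eating")]

print(words(subject_hv_inversion(with_modal)))
print(words(subject_hv_inversion(no_modal)))
```

Because Factor 2 must immediately follow the N", have can be fronted only when no modal intervenes, so strings such as (10)(b) and (10)(c) are never produced.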

We now have, however, another type of syntactic rule-namely, a transformation, which has the power to add, move, or delete elements (although so far, we have only seen one transformation, which moves elements.) Hence, our grammar is organized as follows:

21) Phrase-structure Rules + Lexicon --> Phrase-marker 1 --(Transformations)--> Final Phrase-marker

Of course, we have only seen one transformation so far. We will now remedy that.

C. Yes-No Questions Without Helping Verbs

How do we form yes-no questions in English without helping verbs, as in the question counterparts of (23) and (24)?

22) John visited Sally.

23) John visits Sally.

The question counterparts of (23) and (24) are (25) and (26) respectively:

24) Did John visit Sally?

25) Does John visit Sally?

It is noteworthy that the declarative forms of (25) and (26), which would be (27) and (28), respectively, are unacceptable unless emphatic:

26) *John did visit Sally.

(28)* John does visit Sally.

In short, the tense, which must be a suffix attached to the verb in the affirmative declarative, is separated from the verb in the interrogative. How can we account for this fact?

We could account for it by generating the tense away from the verb, and saying that it must attach to the verb when it is adjacent to it. First, let us revise our phrase-structure rule for introducing helping verbs from (9), repeated here, to (29):

(9) S --> N” (M) (have) (be) V”

(29) S --> N” T (M) (have) (be) V”

We can then say that the tense element moves in forming yes-no questions as well, revising (16), repeated here, to (30):

(16) N” - { M    }

          { have }

          { be   }

1 - 2 -->

2 - 1

(30) N” - T ({ M    })

            ({ have })

            ({ be   })

1 - 2 -->

2 - 1

Finally, the rule of tense-hopping would be formulated as in (31):

(31) T - { have }

         { be   }

         { V    }

1 - 2 -->

0 - 2+1

A word of explanation is in order about the “+” symbol, and what it is supposed to mean. It means that the two elements form a unit after movement. Let us illustrate with the derivation of (23), John visited Sally.

The initial phrase-marker for (23) would be (32), with the Tense generated separately:

(32)         S
          /  |  \
        N”   T    V”
        |    |    |
      John  past  V’
                 /  \
                V    N”
                |    |
              visit Sally

The phrase-marker that results from tense-hopping is (33):

(33)         S
           /   \
         N”     V”
         |      |
       John     V’
               /  \
              V    N”
             / \    |
            V   T  Sally
            |   |
          visit Past

There is a term for this type of forming a unit, as in the forming of a unit between the verb and the tense in (33). It is called adjunction, which is defined as follows:

(34) A adjoins to B iff A moves to the periphery of B (i.e., its beginning or end), moves out of B, and forms a new instance of B dominating A and B.

If we look at the structural description of (31), however, it requires adjacency between Factor #1 and Factor #2. It is this requirement that the Tense be adjacent to the element that it adjoins to which seems to be violated in the formation of the yes-no question.

There are two transformations that we are looking at here, in the formation of the yes-no question. One is Subject-Helping Verb Inversion; the other is Tense-Hopping. Their formulations are repeated here:

(30) N” - T ({ M    })

            ({ have })

            ({ be   })

1 - 2 -->

2 - 1

(31) T - { have }

         { be   }

         { V    }

1 - 2 -->

0 - 2+1

Suppose the transformations are ordered, in the sense that they apply, or can only get the chance to apply, in a fixed sequence, and that (30) is ordered before (31). In that case, the application of (30) to (32) would be (35):

(35)           S
            /  |  \
       Tense   N”   V”
         |     |    |
       Past  John   V’
                   /  \
                  V    N”
                  |    |
                visit Sally

Tense-hopping, given in (31), cannot apply to this phrase-marker. The structural description is not met, since Tense is not next to any possible Factor #2.

The tense is instead affixed to the form “do”. In other words, the form with “do” attached to the tense is a kind of default form, in the sense that the tense, being listed in the lexicon as an affix, needs a stem to affix to. Our final transformation is called “Do-support”, and is formulated as in (36):

36) Tense → do+Tense, if Tense is not affixed.

So, we have the following three transformations, which apply in this order:

37) a. Subject-HV Inversion

b. Tense-Hopping

c. Do-support



The concept of ordering of these transformations is crucial here. When we say that Subject-HV Inversion is ordered before Tense-Hopping, we do not mean that Tense-Hopping necessarily applies after Subject-HV Inversion. Rather, we mean that Tense-Hopping gets its chance to apply only after Subject-HV Inversion has applied or has had its chance to apply, and can only apply if its structural description is met. More precisely, ordering is defined as in (38):

38) Linear Ordering= The transformations in a grammar are ordered, in the sense that if A is ordered before B, B can only apply if A has applied or had its chance to apply.

When I say “had its chance to apply”, I direct your attention to Subject-HV Inversion and Tense-Hopping. Subject-HV Inversion is what is known as an “optional” transformation, in the sense that, if its structural description is met, it can apply, but it doesn’t have to. Tense-Hopping, on the other hand, is known as an “obligatory” transformation: it must apply if its structural description is met. Therefore, if Subject-HV Inversion applies, it will remove phrase-markers from the domain of phrase-markers to which Tense-Hopping can apply. In the terms of Paul Kiparsky, who discussed an analogous situation in historical linguistics (the study of linguistic change), Subject-HV Inversion will “bleed” Tense-Hopping, in the sense that it will remove representations to which Tense-Hopping could otherwise apply.
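The ordering in (37) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the analysis itself: a phrase-marker is flattened to a list of (category, form) pairs, adjunction is written as stem+tense, and the names `derive`, `HELPING` and `HOSTS` are invented for the example.

```python
# A phrase-marker is flattened to a list of (category, form) pairs (an assumption).
HELPING = {"M", "have", "be"}   # optional Factor #2 material in (30)
HOSTS = {"have", "be", "V"}     # possible Factor #2 in (31)

def subject_hv_inversion(pm):
    """(30), optional: N'' - T (M/have/be)  ->  2 - 1."""
    if len(pm) >= 2 and pm[0][0] == "N''" and pm[1][0] == "T":
        end = 3 if len(pm) > 2 and pm[2][0] in HELPING else 2
        return pm[1:end] + pm[:1] + pm[end:]
    return pm

def tense_hopping(pm):
    """(31), obligatory: T - have/be/V  ->  0 - 2+1 (T must be adjacent to its host)."""
    for i in range(len(pm) - 1):
        if pm[i][0] == "T" and pm[i + 1][0] in HOSTS:
            cat, stem = pm[i + 1]
            return pm[:i] + [(cat, stem + "+" + pm[i][1])] + pm[i + 2:]
    return pm

def do_support(pm):
    """(36): a Tense that is still unaffixed is realized as do+Tense."""
    return [("V", "do+" + form) if cat == "T" else (cat, form) for cat, form in pm]

def derive(pm, invert=False):
    """Apply the three rules in the fixed order of (37)."""
    if invert:                       # the optional rule is taken or skipped
        pm = subject_hv_inversion(pm)
    return " ".join(form for _, form in do_support(tense_hopping(pm)))

deep = [("N''", "John"), ("T", "past"), ("V", "visit"), ("N''", "Sally")]
print(derive(deep))                # John visit+past Sally
print(derive(deep, invert=True))   # do+past John visit Sally
```

When inversion applies, Tense is no longer next to the verb, so Tense-Hopping finds no Factor #2 and Do-support takes over; this is the bleeding relation described above.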


Question: Will Subject-Helping Verb Inversion totally bleed Tense-Hopping, in the sense that it will remove any chance for Tense-Hopping to apply? Consider the following in your answer, and show how they are generated (disregard the verbal suffixes -en and -ing):

a) Has he eaten the steak?

b) Had he eaten the steak?

c) Is he eating the steak?

d) Was he eating the steak?

It is worth considering the conception of transformations that we have here, and the overall organization of the grammar. We have the phrase-structure rules, which must conform to X-bar theory, with S as the designated initial symbol, generating phrase-markers that end with pre-terminal symbols. The terminal symbols are then inserted from the lexicon. We then have a phrase-marker, which may or may not correspond to a sentence of the language, depending on whether or not it must undergo any obligatory transformations.

In Lecture #6, we will see additional evidence for the rule of Tense-hopping, and for the analysis of Tense that posits a level at which the Tense is separated from the verb. When we look at negation and Verb Phrase Ellipsis in English, and at their interaction with Tense-hopping, we will see that we do not need to change the rules of Tense-hopping and Do-support that we adopted on the basis of the formation of yes-no questions.

Lecture #6- Additional Evidence for Tense- Hopping and a Revision of the Phrase-Structure Rule for S

In the last lecture, we needed to account for the fact that present and past tense in English, which usually are realized as affixes on verbs, can be separated from those verbs in some instances. We accounted for this by positing a level of representation at which the tense was not part of the verb, generated by the phrase-structure rule (29) of Lecture #5, repeated here:

29) S → N” T (M) (have) (be) V”

and formulating a transformation that turned it into a unit with the verb when it is adjacent to it, i.e. the transformation of Tense-hopping, given in (31) of Lecture #5:

(31) T - { have }

{ be }

{ V }

1 - 2 →

0 - 2+1

We also saw that, in the event that the structural description of Tense-hopping was not met, so that it could not apply, a rule that inserts do would apply, rule (36) of the last lecture:

36) Tense → do+Tense, if Tense is not affixed.

The analysis in which tense is generated separate from the verb and then affixed to it transformationally, unless some earlier transformation destroys the adjacency between Tense and V, was motivated by the fact that we could find an earlier stage of the derivation at which Tense and the verb were separated from one another, and some process applies at this earlier point to destroy the environment for Tense to attach to the V. We found such a process in our examination of Subject-Helping Verb Inversion. We will now find two other transformations that destroy the required adjacency between Tense and V, a transformation that places negatives and a transformation that elides verb phrases, and we will see that our rule of Tense-hopping, required by the interaction of the occurrence of Tense and the distribution of yes-no questions in the last lecture, carries over without modification to the analysis of Tense in these other two areas of English syntax. One set of rules that is motivated on the basis of the analysis of one area of grammar must be the set of rules that is motivated on the basis of other areas of grammar. We don’t have an analysis of Tense for yes-no questions that is different from the analysis of Tense for negation. If we require a different analysis for different areas, we assume that we must go back to the drawing board, and that one of our two analyses is wrong. A grammar is an inter-locking system.

A. Negation

There are two types of negation in English: sentential negation and constituent negation. Sentential negation is the negation of a sentence, and constituent negation is the negation of a part of the sentence. For example, consider two possible interpretations of (1):

1) John could not read the book.

One interpretation is that John is unable to read the book, and could be paraphrased as (2):

2) It is not the case that John could read the book.

In this case, we would say that, with reference to the semantics, or meaning, of the sentence, the negative is taking scope over the modal.

A second interpretation of (1) is that John is able to refrain from reading the book. These two interpretations correlate with non-semantic distinctions, such as phonological or morphological ones. For example, it is possible to contract the negative onto a helping verb, but only if the negative is an instance of sentential, rather than constituent, negation. Thus, (3) can only have the interpretation of (2), and cannot mean that John is able to refrain from reading the book:

3) John couldn’t read the book.

We will be concentrating on the distribution of sentential negation for now.

The Placement of Sentential Negation

To consider the placement of sentential negation, let us consider an affirmative sentence, i.e. one that lacks negation. First, consider a sentence with all the helping verbs present:

4) John could have been reading the book.

The negative goes perfectly well after the modal:

5) John could not have been reading the book.

It cannot occur after have:

(6)* John could have not been reading the book.

It also cannot occur after be:

(7)* John could have been not reading the book.

However, if the modal is absent, the negative will most naturally occur after have:

8) John has not been reading the book.

9) * John has been not reading the book.

If both the modal and have are absent, the negative will occur after be:

10) John is not reading the book.

It seems, then, that the generalization about where the negative will occur is the following:


11) The negative will occur after the first helping verb in the sentence, and there is only one sentential negation in a sentence.

Now, we will try to account for (11) within the grammar, putting Tense aside for the moment. Assume the phrase-structure rule in (12):

(12) S → N” (M) (have) (be) V”

It is impossible to introduce negation by the phrase-structure rules and keep to (12). Let us see why. Suppose we introduced the negative directly after the modal:

(13) S → N” (M) (Neg) (have) (be) V”

We could generate (5), but we could not generate (8) or (10). Similarly, if we generated the negative after have, we could generate (8), but we would also incorrectly generate (6), and could not generate (5) or (10). Similarly, if we generated the negative after be, we could generate (10), but we would incorrectly generate (9) and (7).

The impossible phrase-structure rules that are hypothesized in the preceding paragraph are (14) and (15):

14) S → N” (M) (have) (Neg) (be) V”

15) S → N” (M) (have) (be) (Neg) V”

If we accounted for the multiplicity of positions for the negation by introducing the negative at three different points in the phrase-marker, as in (16), we would run afoul of the generalization that there can only be one negative per simple sentence, and would incorrectly generate (17):

16) S → N” (M) (Neg) (have) (Neg) (be) (Neg) V”

17) * John could not have not been not reading books.
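The argument against placing Neg in the phrase-structure rules can be checked mechanically. The sketch below is an illustration only: it treats each rule as a flat list of symbols, writes optional symbols as one-element tuples, and enumerates every string the rule generates. The helper name `expand` and the constants are invented for the example.

```python
from itertools import product

def expand(rule):
    """Return every string a flat rule generates; optional symbols are 1-tuples."""
    choices = [([sym[0]], []) if isinstance(sym, tuple) else ([sym],) for sym in rule]
    return {tuple(s for part in pick for s in part) for pick in product(*choices)}

# Right-hand sides of rules (13) and (16).
RULE_13 = ["N''", ("M",), ("Neg",), ("have",), ("be",), "V''"]
RULE_16 = ["N''", ("M",), ("Neg",), ("have",), ("Neg",), ("be",), ("Neg",), "V''"]

# Category strings for three of the examples in the text.
EX_5 = ("N''", "M", "Neg", "have", "be", "V''")    # (5) John could not have been reading the book
EX_8 = ("N''", "have", "Neg", "be", "V''")         # (8) John has not been reading the book
EX_17 = ("N''", "M", "Neg", "have", "Neg", "be", "Neg", "V''")   # starred (17)

print(EX_5 in expand(RULE_13))    # True: (13) generates (5)
print(EX_8 in expand(RULE_13))    # False: (13) cannot generate (8)
print(EX_17 in expand(RULE_16))   # True: (16) overgenerates (17)
```

No single fixed position for Neg covers (5), (8) and (10) at once, and allowing all three positions admits the starred (17).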

Chomsky, in Syntactic Structures, proposed a solution which involved dropping the assumption that negatives are present in the initial phrase-marker. He proposed, instead, that negatives are inserted via a transformation after the first helping verb in the phrase-marker. This rule, called negative placement, was formulated as in (18):

(18) N”- {M }

{have }

{ be }

1 - 2 →

1- 2- Neg

As in the formulation of Subject-Helping Verb Inversion in the last lecture, we again make reference to the notion “first helping verb after the subject”, not in the phrase-structure rules, but in the structural description of a transformation. We will return shortly to the question of why this set of elements is mentioned in the structural description of two separate transformations, Subject-Helping Verb Inversion and Negative Placement, but we will assume the formulation of Negative Placement in (18), and, as we did with Subject-Helping Verb Inversion, consider the distribution in sentences without helping verbs:

18) a. John did not read books.

b. John does not read books.

Again, the tense does not appear as a suffix on the verb, but is separated from it, appearing instead on a form of do. We can account for this by generating Tense as an element separate from the verb, as in the phrase-structure rule (29) in Lecture #5, repeated here:

(29) S → N” T (M) (have) (be) V”

and reformulating negative placement as in (19):

(19) N”- T ({M })

({have } )

( { be } )

1 - 2 →

1 - 2 - Neg

We would then order negative placement before Tense-hopping, repeated at the beginning of this lecture. Hence, the deep structure (initial phrase-marker) of example (18)(a) would be (20):

(20) S

N” T V”

N’ Past V’

John V N”

read N’



Negative Placement will insert the negative after T, transforming (20) to (21):

(21) S

N” T Neg V”

N’ Past V’

N V N”

John read N’



It is clear, however, that Tense-hopping cannot apply, because T and V are not adjacent, the adjacency having been destroyed by the insertion of the negative element.

Do- support will then apply, as in (36) of Lecture #5.

In short, the same analysis of Tense that we needed for the distribution of yes-no questions is needed for the analysis of negation.

B. Verb-Phrase Deletion

In English, there is a process that allows verb phrases to fail to be expressed. An example is (22):

22) John reads books, and Bill does __, too.

Which means (23):

23) John reads books, and Bill reads books, too.

Interestingly enough, verb phrases can only fail to be expressed when they follow a helping verb. There are verbs in English that take verb phrases as complements, typically the verbs of temporal aspect: start, begin, continue, stop, and keep on. Verb phrases that appear after the verb begin, for example, cannot elide (pointed out by Joan Bresnan in a 1976 article, “On the Form and Functioning of Transformations”, Linguistic Inquiry, Vol. 7):

24) *First fire began pouring out of the building, and then smoke began___.

It therefore seems as though Verb Phrase Ellipsis requires what Bresnan calls a context predicate, or trigger, to be mentioned in the structural description of VP-Ellipsis:

25) {M } - V”

{ have }

{ be }

1 - 2 → 1 - 0

There is an aspect of verb phrase ellipsis that is not specifically mentioned in (25), which is that the verb phrase that is deleted must be identical to another verb phrase that is specifically mentioned. In the terminology of (3) of Lecture #4, we would say that the null element must be anaphoric to another verb phrase in the sentence, so that (22) cannot mean, for instance, (26):

26) John reads books, and Bill drinks wine, too.

Now, again notice that, in (22), the tense remains as a suffix to do, while the verb, which is part of the verb phrase, has been deleted. We can account for this by saying that the rule of Verb phrase Deletion, formulated finally as in (27), is ordered before Tense-hopping:

27) (VP-Ellipsis){T } - V”

{ M }

{Have }

{be }

1 - 2 →

1 - 0

Applying VP-ellipsis to the deep structure of the second conjunct of (22) generates the phrase-marker in (28):

28) S


N” T

N’ Pres



Tense-hopping cannot apply, in this case because there is nothing for the Tense to hop onto (i.e., no Factor #2), and so Do-support will apply, yielding (29):

29) Bill does.

In this case, again, the same rules of Tense-hopping and Do- support that we needed for Yes-No Questions and the placement of negation are needed for the distribution of tense in sentences with elided verb phrases, confirming the original analysis of Tense. This is what is meant by the grammar being an inter-locking system, with rules that are justified on the basis of one set of considerations internal to the grammar having to jibe with rules that are needed for other areas of grammar.
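The point that one pair of rules serves both constructions can be illustrated with a small sketch. It rests on simplifying assumptions that are not part of the analysis: phrase-markers are flattened to lists of (category, form) pairs, V” is approximated as the main verb plus everything that follows it, the identity condition on ellipsis is ignored, and all function names are invented for the example.

```python
HELPING = {"M", "have", "be"}
HOSTS = {"have", "be", "V"}      # Factor #2 of Tense-hopping, (31) of Lecture #5

def negative_placement(pm):
    """(19): N'' - T (M/have/be)  ->  1 - 2 - Neg."""
    if len(pm) >= 2 and pm[0][0] == "N''" and pm[1][0] == "T":
        end = 3 if len(pm) > 2 and pm[2][0] in HELPING else 2
        return pm[:end] + [("Neg", "not")] + pm[end:]
    return pm

def vp_ellipsis(pm):
    """(27): T/M/have/be - V''  ->  1 - 0, with V'' approximated as verb-to-end."""
    for i in range(1, len(pm)):
        if pm[i][0] == "V" and pm[i - 1][0] in HELPING | {"T"}:
            return pm[:i]
    return pm

def tense_hopping(pm):
    """T - have/be/V  ->  0 - 2+1, only under adjacency."""
    for i in range(len(pm) - 1):
        if pm[i][0] == "T" and pm[i + 1][0] in HOSTS:
            cat, stem = pm[i + 1]
            return pm[:i] + [(cat, stem + "+" + pm[i][1])] + pm[i + 2:]
    return pm

def do_support(pm):
    """Tense -> do+Tense, if Tense is not affixed."""
    return [("V", "do+" + form) if cat == "T" else (cat, form) for cat, form in pm]

def spell(pm):
    return " ".join(form for _, form in do_support(tense_hopping(pm)))

john = [("N''", "John"), ("T", "past"), ("V", "read"), ("N''", "books")]
bill = [("N''", "Bill"), ("T", "pres"), ("V", "read"), ("N''", "books")]
print(spell(negative_placement(john)))   # John do+past not read books
print(spell(vp_ellipsis(bill)))          # Bill do+pres
print(spell(bill))                       # Bill read+pres books
```

In both of the first two derivations an earlier rule leaves Tense without an adjacent host, and the unchanged Do-support rule supplies the stem.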

Lecture #7- S and the X-Bar System

So far, we have the following phrase-structure rule as the initial phrase-structure rule in the grammar:

(1) S → N” T (M) (have) (be) V”

All of the other phrasal categories, however, are constructed according to the X-bar schema in (2):

(2) X” → (Z) X’

X’ → { Y” X’ }

{ X’ Y” }

{ X (Y”) }

There are several differences between the phrase-structure rule for S that is given in (1) and the X-bar schema in (2), standing in the way of reformulating S as an X”:

i) There is no X’ under the S, i.e. a single obligatory category.

ii) There is a string of non-phrasal elements directly under the S as sisters, i.e. T, M, have, and be.

There is some evidence that modals, have, and be are not sisters, but rather that have and be must head their own VPs. Consider the prediction that (1) makes with respect to VP-ellipsis if we assume, as we did in Lecture #6, the following formulation of VP-ellipsis:

(3) (VP-Ellipsis){T } - V”

{ M }

{Have }

{be }

1 - 2 →

1 - 0

Assuming (1) would give us the phrase-marker in (5) for sentence (4):

4) Bill would have been reading the book.

5) S

N” T M have been V”

N’ Past will V’

N V N”

Bill reading Det N’

the N


If we apply VP-ellipsis to (5), however, we predict that all of the helping verbs would have to remain. They can all remain, as in (6):

6) Although John wouldn’t have been reading the book, Bill would have been__.

However, it is also possible to just leave the modal, or the modal and have, as in (7), and this is not predicted by (5), assuming (3):

7) a. Although John wouldn’t have been reading the book, Bill would__.

b. Although John wouldn’t have been reading the book, Bill would have__.

If we assume that the ellipses in (7) arise via deletion of V”s, we would have to assume that the structure of (4) is (8), rather than (5):

8) S

N” T M V”0

N’ Past will V’

N V V”1

Bill have V’

V V”2

been V’

V N”

reading Det N’

the N


This would allow for any of the numbered V”s to delete, assuming (3).
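The contrast between (5) and (8) can be sketched by listing what is left over when each V” in a tree is deleted. This is only an illustration: trees are written as nested tuples of the form (label, child, ...), the V’ and N’ levels are left out, the left-context requirement of (3) is not checked, and the names `leaves`, `remnants`, `FLAT_5` and `NESTED_8` are invented for the example.

```python
def leaves(t):
    """Terminal string of a tree written as (label, word) or (label, child, ...)."""
    label, *kids = t
    if isinstance(kids[0], str):
        return [kids[0]]
    return [w for k in kids for w in leaves(k)]

def remnants(t):
    """All terminal strings obtained by deleting exactly one V'' inside t."""
    label, *kids = t
    if isinstance(kids[0], str):          # preterminal: nothing to delete
        return []
    out = []
    for i, kid in enumerate(kids):
        before = [w for k in kids[:i] for w in leaves(k)]
        after = [w for k in kids[i + 1:] for w in leaves(k)]
        if kid[0] == "V''":
            out.append(before + after)    # delete this V'' as a whole
        out += [before + r + after for r in remnants(kid)]   # or a V'' inside it
    return out

# (5): helping verbs are sisters under S, so there is a single V''.
FLAT_5 = ("S", ("N''", "Bill"), ("T", "Past"), ("M", "will"),
          ("have", "have"), ("be", "been"),
          ("V''", ("V", "reading"), ("N''", "the book")))

# (8): have and be each head their own V''.
NESTED_8 = ("S", ("N''", "Bill"), ("T", "Past"), ("M", "will"),
            ("V''", ("V", "have"),
             ("V''", ("V", "been"),
              ("V''", ("V", "reading"), ("N''", "the book")))))

print([" ".join(r) for r in remnants(FLAT_5)])
# ['Bill Past will have been']
print([" ".join(r) for r in remnants(NESTED_8)])
# ['Bill Past will', 'Bill Past will have', 'Bill Past will have been']
```

Only the nested structure yields the remnants corresponding to (7)(a) and (7)(b) alongside the one corresponding to (6).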

A. Are Modals Tensed?

It has occasionally been suggested that modals are generated directly under T. This would imply that modals are not themselves tensed. A competing view about modals is that they can themselves be tensed, implying that they must be generated separately from Tense, so that the rule of Tense-Hopping should really be formulated as in (9):

(9) T- { M }

{have }

{ be }

{ V }

1 - 2 →

0- 2+1

First, it can be noted that the modals, while a closed class of items, seem to contain pairs such as those in (10):

9) will-would




This by itself is not persuasive, given the closed nature of this class (there are fewer than ten modals in the language).

More persuasive is the evidence from idioms. Idioms are sequences of words whose meaning is non-compositional in nature (i.e., the meaning of the whole idiom cannot be predicted from the meanings of the individual words). Examples of idioms are phrases such as make headway, keep tabs on, kick the bucket, keep track of. Examples are given in (10):

10) a. John made headway. (means John progressed)

b. John kept tabs on Mary. ( John kept apprised of Mary’s situation).

c. John kicked the bucket. (John died).

d. John kept track of Mary (same meaning as (b)).

Recalling the role of the lexicon as the repository of idiosyncratic information, idioms, by the very nature of their unpredictability and irregularity, must be listed in the lexicon. Hence, an idiom such as, e.g., make headway, will have a lexical entry as in (11):

11) make headway, [V make] [ N headway]

The lexicon has unpredictable information, but recall from our discussion of plurals on English nouns (i.e., you don’t want to specify the plural of book in the lexicon since it’s predictable) that you want to keep the amount of information in the lexicon to the bare minimum. In this connection, there are idioms that include the modal can, specifically the idioms can help but and can afford. The requirement that help but and afford occur with can is seen in (12):

12) a. *Did John help but notice?

b. *John afforded a new car.

Obviously, the lexical entries for these idioms will have to mention can, and will look something like (13):

13) a. can help but, [M can] [V help] [Conj but] V

b. can afford, [ M can] [V afford] N”

Interestingly enough, could can replace can in these two idioms:

14) a. Can he help but notice?

b. Could he help but notice?

15) a. John can afford a new car.

b. John could afford a new car.

If we posit could as a past tense variant of can, formed by Tense hopping, we can keep to the lexical entries in (13). If we don’t, we would have to have a disjunctive lexical entry for each of these two idioms:

(16) a. {can } help but, {[M can ] }[ V help][ Conj but] V

{could } {[Mcould ] }

b. { can } afford, { [M can ] } [V afford] N”

{ could } { [M could ] }

We would then have to answer the question of why the same set of elements appears in two separate disjunctive statements (i.e., the two lexical entries in (16)), whereas if we analyze could as a past tense variant of can, we do not have to posit a lexical entry that leads to the posing of this question.

B. The Position of The Modal

Earlier in this lecture, I proposed that the phrase-structure rule for S should be revised to (17):

(17) S → N” T (M) V”

I would now like to propose that M heads its own projection as well, perhaps as V.

First, it is time to re-consider the treatment of sentential negation in English. Earlier, we analyzed negatives as not being present in deep structures, but as being inserted after Tense and the first helping verb. The rule was formulated as (19) in Lecture #6:

(19) N”- T ({M })

({have } )

( { be } )

1 - 2 →

1 - 2 - Neg

However, claiming that negatives will not be present, but rather inserted after T, predicts that negatives will not be able to occur in clauses that apparently lack Tense. Such clauses exist, however. Infinitives are a case in point (we will return to infinitives in more detail later):

19) For John to leave early.

Sentential negation precedes the to:

20) For John not to leave early.

If we assume that negatives are only inserted after T, how do we then account for the presence of negation in clauses that apparently lack T?

Another case that makes the same point is gerunds:

21) a. John’s eating steak bothered me.

b. John’s not eating steak bothered me.

One proposal that has been made is due to Jean-Yves Pollock (“Verb Movement, Universal Grammar, and the Structure of IP”, Linguistic Inquiry, Vol. 20 (1989)). He proposed that negation headed its own projection, so that there is a constituent Neg Phrase (Neg”). If this is the case, we might adapt (17) to (22):

22) S → N” T (M) {Neg” }

{ V” }

Neg” → Neg’

Neg’ → Neg V”

The deep structure (i.e., initial phrase-marker) of (23) would then be (24):

23) John would not eat steak.

24) S

N” T M Neg”

N’ Past will Neg’

John Neg V”


V N”

eat N’



Note, however, that in the infinitive, to follows the negative. Therefore, if we assume that negation is a head that is lower than Tense in the phrase-marker, to cannot be a sister to Tense, but rather must also be lower than Tense in the phrase-marker.

With this in mind, let us consider the distribution of modals in infinitives. They are absent in infinitives, and, in fact, are the only helping verbs that do not appear in infinitives. We can account for this fact if we analyze to as occurring in the same position as modals, so that the absence of modals in infinitives is simply a consequence of the fact that, in English, we can only have one modal per simple sentence.

We might, therefore, analyze modals as heads of their own projections. Let us call them Ms. Therefore, the phrase-structure rule for S would be (25):[3]

(25) S → N” T { Neg” }

{ M” }

{ V” }

Neg” → Neg’

Neg’ → { M” }

{ V” }

M” → M’

M’ → M V”

Looking at (25), however, we see two elements that must appear in every S: N” and T. N”, being a phrasal constituent, is not a possible head, but T is. We might therefore view T as being a possible head of S, so that S would really be T”. However, we would then have to posit a T’, consisting of T and a following phrasal constituent, which would be the complement of T. In other words, the structure of, e.g., (26) would be (27):

(26)John likes pizza.

27) T”

N” T’

N’ T V”

John Pres V’

V N”

like N’



We can actually find somewhat direct evidence for the constituency of T and the following phrasal unit, if we assume that only constituents can conjoin (This argument is originally due to Ray Dougherty in his (1970) article, “Recent Studies on Language Universals”, Foundations of Language, Vol. 5).

We must find an element that we know resides in T, based on our analysis so far. One such element is the do that results from Tense-Hopping being unable to apply, as in (28):

28) John does not like pizza.

As Dougherty points out, we can conjoin such sequences as those in (29):

29) John does not like pizza and does not like steak.

Assuming a T’ allows us to conjoin T’s, so that the structure of (29) would be (30):

30) T”

N” T’

N’ T’ and T’

N T Neg” T Neg”

John does Neg’ does Neg’

Neg V” Neg V”

V’ V’

V N” V N”

like pizza like steak

Hence, we have direct evidence for the constituency of T and a following phrase. If we analyze S as the maximal projection of T, we can call this phrase a T’, and analyze the phrase following T as its complement. Hence, our phrase structure rules at the clausal level are given in (31):

(31) T” → N” T’

T’ → T { Neg” }

{ M” }

{ V” }

Neg” → Neg’

Neg’ → Neg { M” }

{ V” }

M” → M’

M’ → M V”

C. Restructuring

If we assume that negatives are generated between Tense and the main verb, and are not placed there by a transformation of negative-placement, and we assume that the helping verbs are generated lower, and to the right of, negatives, we must account for the fact that the helping verbs precede, rather than follow, sentential negatives:

31) a. John would not eat the steak.

b. *John not would eat the steak.

32) a. John has not eaten the steak.

b. *John not has eaten the steak.

33) a. John is not eating the steak.

b. *John not is eating the steak.

We might account for the ungrammaticality of (31)(b)-(33)(b) by noting that Tense-hopping is blocked by the negation, but if that were the reason for the unacceptability of the (b) examples, we would expect Do-support to be able to rescue them, contrary to fact:

34) *John did not will eat the steak.

35) *John does not have eaten the steak.

36) *John does not be eating the steak.

Rather, assuming that the negatives stay in the position in which they are generated by the phrase-structure rules, we must move the helping verbs to the left of them. One way to state this movement is as a movement of the helping verb to Tense, formulated as in (37), giving the helping verbs a feature [ +Aux] (for “Auxiliary”):

(37) Restructuring

T- (Neg) - +Aux

1 - 2 - 3 →

3+1 -2 - 0

Hence, the D-Structure of, e.g., (32)(a) would be as in (38):


38) T”

N” T’

N’ T Neg”

N pres Neg’

John Neg V”


V V”




V N”

eaten Det N’

the N


Factoring the phrase-marker according to the structural description of restructuring gives us the following factored phrase-marker:[4]

(39) T”

N” T’

N’ T Neg”

N pres Neg’

John Neg V”


V V”




V N”

eaten Det N’

the N


1 2


Finally, restructuring yields (40):

(40) T”



N” T’

N’ T Neg”

N V T Neg’


John Pres Neg V”

have V’



V N”

eaten the steak
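The effect of (37) on the terminal string can be sketched as follows. This is an illustration under stated assumptions rather than the analysis itself: the phrase-marker is flattened to (category, form) pairs, the feature [+Aux] is modeled as membership in a set of categories, the adjoined unit is written as aux+tense, and the function and variable names are invented for the example.

```python
AUX = {"M", "have", "be"}    # categories assumed to bear [+Aux]

def restructuring(pm):
    """(37): T - (Neg) - [+Aux]  ->  3+1 - 2 - 0."""
    for i, (cat, form) in enumerate(pm):
        if cat != "T":
            continue
        # Factor #2 (Neg) is optional, so Factor #3 is one or two positions after T.
        j = i + 2 if i + 1 < len(pm) and pm[i + 1][0] == "Neg" else i + 1
        if j < len(pm) and pm[j][0] in AUX:
            moved = ("T", pm[j][1] + "+" + form)      # the helping verb adjoins to T
            return pm[:i] + [moved] + pm[i + 1:j] + pm[j + 1:]
    return pm

def spell(pm):
    return " ".join(form for _, form in pm)

with_have = [("N''", "John"), ("T", "pres"), ("Neg", "not"),
             ("have", "have"), ("V", "eaten"), ("N''", "the steak")]
main_verb = [("N''", "John"), ("T", "pres"), ("Neg", "not"),
             ("V", "eat"), ("N''", "the steak")]

print(spell(restructuring(with_have)))   # John have+pres not eaten the steak
print(spell(restructuring(main_verb)))   # John pres not eat the steak
```

A main verb is not [+Aux], so it stays below the negative; Tense is then left stranded and is available for Do-support, as in John does not eat the steak.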

Positing restructuring, as in (37), enables us to solve a problem about the helping verb system that we have not talked about, but which is a problem of descriptive adequacy of the earlier account. Recall that we formulated Subject-Helping Verb Inversion in terms of (41):

(41) T ({ M } )

({have} )

({be } )

Negative Placement was also formulated in terms of the set of elements in (41).

We no longer have a rule of negative placement, but it is still the case that the same set of verbal elements that inverts in questions will appear before the negative, and the earlier account treated this as an accident. Furthermore, (41) does not form a constituent. However, if we posit the restructuring operation, we can simply reformulate Subject-Helping Verb Inversion as in (42), in which case a constituent is moving:

(42) N”- T

1 - 2 → 2 - 1

Lecture #8- NP-Movements

In this lecture, we will shift gears a bit and talk about another transformation in English syntax, but in an area distinct from the helping verb system. Specifically, we shall motivate a transformation that moves N”s into subject position, showing two constructions in which this transformation is operative.

I. Passives

First, we shall examine English verbal passives, and we will show that English verbal passives must be transformationally derived. To see this, we shall examine the consequences of not assuming that English verbal passives are transformationally derived, showing that generating English verbal passives directly via the phrase-structure rules would lead us to miss crucial generalizations.

An example of a passive is the following:

1) John was visited by werewolves.

A. Correspondence With Active Transitive Verbs

It is clear, first of all, that passives, by and large, correspond to transitive active verbs. Hence, we have pairs such as the following:

2) a. John is hated by everybody.

b. Everybody hates John.

3) a. John saw Sally.

b. Sally was seen by John.

4) a. Lee Harvey Oswald assassinated JFK.

b. JFK was assassinated by Lee Harvey Oswald.

But not pairs like these:

5) a. John laughed.

b. *John was laughed.

If actives and passives were generated independently by the phrase-structure rules, we would have an expanded lexicon, and no account of the fact that passives corresponding to active intransitives are missing.

B. Thematic Roles

The noun phrases that a verb selects, as well as the subject, are said to be the verb’s arguments, and verbs differ in the semantic relations that their arguments bear to them. For example, the verbs fear and frighten are both transitive, and yet the semantic relations of the subject and object are reversed. The subject of fear is said to be the experiencer of the emotion (fear being an emotive predicate), and the object is said to be the theme. Hence, (6) and (7) are paraphrases:

6) John fears thunder.

7) Thunder frightens John.

The semantic relations that the arguments of the predicate bear to the predicate are termed thematic relations. They include notions such as agent, theme, patient, experiencer, etc. (for a lucid account of thematic relations, see R. Jackendoff (1987), “The Status of Thematic Relations in Linguistic Theory”, Linguistic Inquiry, Vol. 18.)

The thematic relations that are exhibited in passive sentences are the same as the thematic relations in the corresponding actives, but the thematic relations in passives are simply realized in different positions. There is a simple algorithm (i.e., method) for computing the positions in which thematic relations in passives are realized:

8) The passive subject bears the thematic relation of the post-verbal NP[5] in the corresponding active, and the passive object of by bears the thematic relation of the active subject.

Clearly, it would be desirable to have the grammar of English reflect (8) in some way, rather than stating (8) as a sort of post-hoc, after the fact observation.

C. Idiom Chunks

The third regularity between passives and actives concerns the form of idioms, sequences of words which have meanings that are non-compositional in nature. Examples of idioms are: keep track of, keep tabs on, make headway. Sentences with idioms include such sentences as (9):

9) a. John kept track of Sally.

b. John kept tabs on Sally.

c. John made significant headway.

By their very nature, idioms are unpredictable. Every language has idioms, and recall that there is a specific place in the grammar to put unpredictable information: the lexicon, which we have called the repository of idiosyncratic information. One way of representing idioms is as in (10):

(10) a. track, N, +[keep____[P” of X ]]

b. tabs, N, +[ keep___ [P” on X ]]

c. headway, N, +[make ____]

Representing the idioms as in (10) reflects the fact that the sequence of words is not just an isolated list of words, but rather that the words are sequenced in a way that conforms to the general syntactic patterns of the language. In particular, NPs are generated after verbs, and keep and make in the idioms above are formally verbs, and track, tabs, and headway are nouns that are sequenced in the same way that non-idioms are sequenced.

The nouns in each of the three idioms above can appear as subjects of the verbs in the passive voice:

(11)a. Careful track was kept of Sally.

b. Close tabs were kept on Sally.

c. Significant headway was made by John.

If we generated actives and passives separately, we would have to have disjunctive subcategorization frames for these nouns, so that, e.g. (10)(c) would have to be modified to (12):

12) headway, N,{ +[ make__ ] }

{+[ ___be made] }

Clearly, disjunctive subcategorization frames are missing the relationship between actives and passives. If we allow such disjunctive subcategorization frames, we would then have to ask why we couldn’t have a disjunctive frame for headway as in (13), for instance:

13) *headway N, { + [make____ ] }

{+ [ ____ be seen ] }

Generating actives and passives separately does not predict the non-occurrence of lexical entries such as (13).

II. The Solution- Generating Passives Transformationally

Chomsky (1957), in Syntactic Structures (Mouton), after noticing the above regularities between English actives and passives, proposed to capture them by not generating verbal passives directly via the phrase-structure rules, but rather by forming passives from the phrase-markers for the corresponding active sentences. The transformation was formulated as in (14):

(14) N”- X - V - N”

1 - 2 - 3 - 4 →

4 - 2 - be+en - 3 - by+1

We immediately capture the fact that passives correspond to actives in which the active verbs take post-verbal NPs, because of the mentioning of N”s as Term #4 in the structural description of the passive transformation. Furthermore, if we assume that thematic roles are assigned in deep structures, the correspondence stated in (8) is accounted for. The passive subject is, in deep-structure, the NP that follows the verb in the corresponding active, and since thematic roles are assigned in deep structure, whatever thematic role the post-verbal NP got in deep structure will be retained when it moves. Similarly, if the passive object of by is generated as the deep-structure subject, whatever thematic role that the subject was assigned at deep structure will be retained if the subject is postposed in the passive transformation.
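To make the bookkeeping in (14) concrete, here is a minimal Python sketch of its structural change. It assumes the sentence has already been factored by hand into the four terms of the structural description; the function name passive_transformation and the plain-string representation of the terms are illustrative assumptions, and the morphological spell-out of be+en plus the verb is not modeled.

```python
def passive_transformation(terms):
    """Apply the structural change of (14): 1-2-3-4 -> 4 - 2 - be+en - 3 - by+1.

    `terms` is a 4-tuple (NP, X, V, NP) that has already been matched
    against the structural description. All names are illustrative.
    """
    subject, x, verb, post_verbal_np = terms
    return [post_verbal_np, x, "be+en", verb, "by " + subject]


# Deep structure of "John made significant headway", factored by hand.
active = ("John", "Past", "make", "significant headway")
print(passive_transformation(active))
```

Because the idiom chunk headway is term 4 of the input, it ends up in subject position without any special statement in its lexical entry, which is the point of the transformational account.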

Finally, given that the distribution of idioms is stated in the lexicon, and lexical information is only accessed at deep structure, an idiom chunk that is a postverbal noun phrase will be permitted to move to subject position.

Hence, a transformational derivation of verbal passives will capture all of the regularities described at the beginning of this section.

III. On Restricting the Scope of the Passive Transformation

The passive transformation, as formulated in Section II, has a number of components: (i) it preposes the post-verbal NP into subject position; (ii) it postposes the original subject to the position after by; (iii) it inserts be+en and by. We shall now see that the postposing of the subject, component (ii), and the insertion of be+en and by, component (iii), are best viewed as not being transformations.

With respect to agent postposing, we can see that the grammar of English needs a mechanism to give the object of by, in certain instances, the thematic role that it would have received had it appeared in subject position, without movement being the way of accounting for this dependency. Norbert Hornstein first noticed nominals such as (14) in “S and the X-Bar Convention”, , Vol. 3 (1977):

14) John’s portrait of Nixon by Warhol.

Warhol is, of course, interpreted as the agent of the verb related to the nominalization portrait, i.e. portray, and John is interpreted as the owner of the portrait. However, Warhol could not have moved from any other position within the nominal, since all of the other positions are occupied. Hornstein’s conclusion is that the agent that occurs as the complement of by must be generated in that position, and that there must be a semantic mechanism that interprets agents in two places- subject position, and the object position of by. Nominalizations such as (14) point to the necessity of such a mechanism in some cases, and it would seem natural, given its necessity, to posit it for verbal passives. This sheds new light on so-called “truncated passives”, which lack by-phrases altogether, as in (15):

15) John was murdered.

It had previously been thought that there was an underlying agent there which was deleted by a transformation called “Unspecified Agent Deletion”, an optional transformation which, if it did not apply, would yield (16):

16) John was murdered by someone.

Another way of interpreting truncated passives is to say that by-phrases are adjuncts, and that the subject thematic role is present but not linked to an argument.

Notice that I say that by takes whatever thematic role the subject would take. It has occasionally been suggested that by marks agents. This cannot be right, however, in view of passives such as (17):

17) a. A crushing defeat was endured by Germany.

b. A glancing blow was suffered by John.

The subjects of the verbs endure and suffer are not agents.

Another problem exists with the idea that passive be is always inserted via a passive transformation, in that we find passives without be:

18) I want him given a book.

A rule deleting to be is occasionally invoked, so that the deep structure of (18) would be the structure corresponding to (19):

19) I want him to be given a book.

Presumably, to be deletion would assign a common deep structure to (20)(a) and (20)(b) as well:

(20)(a) I consider him to be crazy.

b) I consider him crazy.

However, we can find evidence against the rule of to be deletion if we consider the English expletive there, which needs a verb such as be for its appearance:

21) There is a valid reason for his absence.

We can have a full infinitival counterpart to (21) within a VP headed by consider, but the copula must be retained:

22) a. I consider there to be a valid reason for his absence.

b. * I consider there a valid reason for his absence.

Assuming that the expletive there requires the copula, we must ask why, if there is a rule of to be deletion, the expletive could not be licensed at deep structure by the copula, with the copula then deleted, yielding (22)(b). Because (22)(b) is unacceptable, we can account for its unacceptability by not positing a deep structure containing be for such instances of secondary predication.

If there is no rule of to be deletion, we must conclude, then, that passives contained in such larger structures as (18) must be derived without be having been present in their formation.

Hence, the transformation involved in the formation of English verbal passives is simply a transformation that preposes the post-verbal NP, formulated as in (23):

(23) N” - X - V - N”

1 - 2 - 3 - 4 -----> 4 - 2 - 3 - 0

We will now see that (23), which we will call “NP-Preposing”, operates in a wider range of constructions than just passives. In the next section, we will see the operation of NP-Preposing in a class of superficially intransitive verbs that do not take passive morphology at all.

IV. Unaccusatives

There is a great deal of evidence from other languages that superficially intransitive verbs differ in the syntactic position of the one argument that occurs with the verb. The sole argument generally acts as a surface structure subject, but for some verbs, there is evidence that the surface subject is an underlying object, while the surface subjects of other verbs are deep-structure subjects as well. Verbs of the latter class take subjects that are agents, while verbs of the former class take subjects that are non-agents. Examples of the two types of verbs are the verbs telephone and arrive:

23) John telephoned.

24) John arrived.

Therefore, the deep structures of (23) and (24) are (25) and (26), respectively:

25) [T” [N” John] [T’ [T Past] [V” [V’ [V telephone]]]]]

26) [T” [N” e] [T’ [T Past] [V” [V’ [V arrive] [N” John]]]]]

There is no evidence for this distinction in English, but there is a great deal of evidence from other languages. Furthermore, there seems to be no basis for learning this distinction in these other languages, and the two classes in each of the languages that show overt evidence of the distinction seem to have the same set of verbs.

We will first look at the evidence from Italian, as first discussed by David Perlmutter (1978, “Impersonal Passives and the Unaccusative Hypothesis”, Proceedings of the Berkeley Linguistic Society). Perlmutter noted that Italian has two auxiliaries that are used for expressing past tense- the verbs avere (roughly ‘have’) and essere (roughly ‘be’). Transitive verbs take avere in the past tense:

(27) (L. Burzio (1986), Italian Syntax, Reidel, ex.((80)(a))

L’artiglieria ha affondato due navi nemiche.

The artillery has (A) sunk two enemy ships.

Agentive intransitives take avere:

(28) (Burzio’s (79)(b)): Giovanni ha telefonato.

Giovanni has telephoned.

Non-agentives, however, take essere as the past auxiliary, as do passives:

(29) (Burzio’s (79)(a)): Giovanni è arrivato.

Giovanni has arrived.

(30) (Burzio’s (81)(a)): Maria è stata accusata.

Maria has been accused.

The generalization that Perlmutter arrives at is the following:

31) Essere is the past tense auxiliary in Italian when the surface structure subject is not the underlying subject, and avere is the auxiliary that is used when the surface structure subject is the underlying subject.

Generalization (31) receives immediate support from the fact that essere is the past tense auxiliary for passives, as in (30), while avere is the auxiliary used for transitives, assuming that subjects of transitives do not have any possible point of origin within the verb phrase. If one looks at passives and transitives as the two types of verbs whose subject origins are transparently justified, we can extrapolate from the auxiliary choice for these two types of verbs to the two types of intransitives, with the non-agentive intransitives patterning with passives (in that both take essere) while the agentive intransitives pattern with transitives (in that both take avere).
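Generalization (31) amounts to a single decision, which the following Python sketch spells out. The function name past_auxiliary is an illustrative assumption, and the True/False values for each example are entered by hand from the analysis in the text rather than computed from a derivation.

```python
def past_auxiliary(surface_subject_is_underlying_subject):
    """Generalization (31) as a one-line decision."""
    return "avere" if surface_subject_is_underlying_subject else "essere"


# Hand-coded derivational facts for the examples in the text (assumed inputs).
examples = {
    "affondare (transitive, 27)": True,
    "telefonare (agentive intransitive, 28)": True,
    "arrivare (non-agentive intransitive, 29)": False,
    "essere accusata (passive, 30)": False,
}
for label, underlying in examples.items():
    print(label, "->", past_auxiliary(underlying))
```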

Further support for this distinction in Italian comes from the distribution of the partitive clitic ne (meaning “of them”). It can modify quantified objects (the quantified noun phrases that ne modifies must follow ne, but Italian has subject-postposing, meaning that the subject can appear in final position in the sentence), but not quantified subjects:

(32) (Burzio’s 1.7a): Giovanni ne inviterà molti.

Giovanni of-them will invite many.

(33)(Burzio’s 1.5iii) * Ne esamineranno il caso molti.

Of-them will examine the case many.

A postposed subject of a non-agentive intransitive may be modified by ne, while a postposed subject of an agentive intransitive cannot be:

34) (Burzio’s 1.5i): Ne arriveranno molti.

Of-them will arrive many.

Many of them will arrive.

(35) (Burzio’s 1.5ii): * Ne telefoneranno molti.

Of-them will telephone many.

We have the same distinction between agentive intransitives and non-agentive intransitives that we had in the discussion of auxiliary selection. Furthermore, objects can be modified by ne, as in (32). We can make sense of these facts if we say that the post-verbal noun phrases that follow non-agentive intransitives are really not post-verbal subjects, but rather are objects that have never been moved. The post-verbal noun phrases that follow agentive intransitives, however, are postposed subjects, adjoined to the verb phrase. Hence, we can say that ne can only modify objects.
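The reasoning about ne can be checked against examples (32)-(35) with a small Python sketch. The mini-lexicon VERB_CLASS and the function ne_allowed are hypothetical names introduced for illustration; the class labels simply record the analysis argued for in the text.

```python
# Hypothetical mini-lexicon; the class labels follow the analysis in the text.
VERB_CLASS = {
    "invitare": "transitive",
    "esaminare": "transitive",
    "arrivare": "unaccusative",
    "telefonare": "unergative",
}


def ne_allowed(verb, np_position):
    """Return True if ne may modify the quantified NP.

    np_position is the surface description: "object" or "postverbal_subject".
    """
    if np_position == "object":
        return True
    # A post-verbal "subject" counts as an object only with unaccusatives.
    return VERB_CLASS[verb] == "unaccusative"


print(ne_allowed("invitare", "object"))                # (32)
print(ne_allowed("esaminare", "postverbal_subject"))   # (33)
print(ne_allowed("arrivare", "postverbal_subject"))    # (34)
print(ne_allowed("telefonare", "postverbal_subject"))  # (35)
```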

An intransitive verb whose sole argument is an underlying object is called an unaccusative verb, while an intransitive verb whose sole argument is an underlying subject is called an unergative verb. Russian also gives evidence for the unaccusative-unergative contrast, based on observations of Leonard Babby (Existential Sentences in Russian (1980), Slavica Publishers). Russian is a heavily Case-marked language, and the object is usually marked in the accusative Case. However, when the main verb is negated, the object may be marked in the genitive Case. This is termed the ‘genitive of negation’. However, subjects of non-agentive intransitives may also be marked with the genitive of negation. Examples are given in (36) and (37):

36) (Babby’s (4)(b)): V-nasem-lesu-ne-rastet-gribov.

In-our-forest-neg-grow(3rd. sg.) –mushrooms (GEN pl.)

There are no mushrooms growing in our forest.

37) (Babby’s (6)(b)): Ne-ostalos’-somnenij.

Neg-remained(3rd.n. sg.)-doubts (GEN pl.).

There were no doubts that remained.

Subjects of negated transitive verbs that are nominative in the affirmative cannot take the genitive:

39) (D. Pesetsky, Paths and Categories, unpublished Doctoral dissertation, MIT (1982), ex. (15)):

a. ni odna gazeta ne pecataet takuju erundu.

Not one newspaper(fem nom sg) NEG prints (3sg) such nonsense (fem acc sg).

b. *ni odnoj gazety ne pecataet takuju erundu.

fem. gen. sg.

Also, agentive subjects of negated intransitive verbs cannot appear in the genitive.

40) (Pesetsky’s (9)):

a. v pivbarax kul’turnye ljudi ne p’jut.

In beerhalls refined people NEG drink.

(masc. nom. pl.) (3 pl).

b. * v pivbarax kul’turnyx ljudej ne p’et.

(masc. gen. pl.) (3rd sg.)

Babby’s generalization is that those subjects that can appear in the genitive of negation are in the scope of negation at D-Structure, in fact are D-Structure direct objects, and the distinction between unaccusatives and unergatives makes this distinction correctly for Russian. Passive subjects, as would be predicted, may also appear in the genitive of negation:

41) (Babby’s (24)(a)):

ne- naslos’- mesta.

NEG- be found (n. sg).- seat/place (GEN)

There was not a seat to be found.

To sum up this section, there seems to be strong evidence from Italian and Russian that some superficially intransitive verbs are subjectless but have underlying objects, while other superficially intransitive verbs are objectless but have underlying subjects, and that each class has the same members, abstracting across translation equivalents. Furthermore, if we assume that children do not get corrected for ungrammaticality, it is impossible to see how they would learn which verbs were unaccusative and which were unergative. Therefore, we assume that the distinction is universal. Getting back to our formulation of passive as simply being NP-preposing, we see that the rule of NP-preposing that is given in (23), repeated here, will also work for unaccusatives.

(23) N” - X - V - N”

1 - 2 - 3 - 4 -----> 4 - 2 - 3 - 0

We can view the term passive as a term that denotes a particular syntactic construction, with particular pragmatic properties. We see, however, that it is inappropriate to term the transformation that derives passives as “the passive transformation”, because it operates more widely than just in passives. It operates in unaccusatives as well, and we would therefore not call it a construction-specific transformation.
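The construction-neutral character of (23) can be illustrated with a short Python sketch that applies the same structural change to an unaccusative and to a passive. The function name np_preposing is an illustrative assumption, as is the treatment of passive be as part of the variable X (reflecting the conclusion above that be is not inserted by the rule); "e" stands for the empty subject position and "0" for the vacated object position.

```python
def np_preposing(terms):
    """Apply the structural change of (23): 1-2-3-4 -> 4 - 2 - 3 - 0."""
    subject, x, verb, post_verbal_np = terms
    return [post_verbal_np, x, verb, "0"]


# Unaccusative deep structure, as in (26): e Past arrive John.
print(np_preposing(("e", "Past", "arrive", "John")))

# Passive, with be assumed to be base-generated and matched by X.
print(np_preposing(("e", "Past be", "murdered", "John")))
```

Nothing in the function mentions passive morphology, which is why it would be misleading to name it after the passive construction.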

Lecture #9- Clausal Complementation

We will now concentrate on the syntax of clausal complementation, or, more generally, the mechanism by which clauses function as arguments. Examples are given in (1) and (2):

1) (a) That John visited Sally bothered me.

b) Bill claimed that John visited Sally.

2) a. For John to visit Sally would bother me.

b. Bill would prefer for John to visit Sally.

A. The Generation of Complementizers

The words that and for which introduce these embedded sentences in English are called complementizers. The complementizer that introduces finite clauses, and the complementizer for introduces infinitives. Hence, we have the following pattern of acceptability:

3) a. That John visited Sally.

b. *That John to visit Sally.

c. *For John visited Sally.

d. For John to visit Sally.

The earliest treatment of sentential complementation in modern syntactic theory was P. Rosenbaum’s A Grammar of English Sentential Complementation (MIT Press (1967)), and Rosenbaum proposed that complementizers were not present in deep structure but, rather, were inserted transformationally. The two transformations that Rosenbaum proposed were along the lines of (4):

4) a. T” -----> that T”

b. T” -----> for-to T”

to - N” - T

1 - 2 - 3 ----->

0 - 2 - 1

However, Joan Bresnan (1970) (“On Complementizers: Toward A Syntactic Theory of Complement Types”, Foundations of Language Vol. 5) argued that complementizers should be present in deep structure. Her arguments basically showed that complementizers are on a par with prepositions. Just as different meanings are signalled by different prepositions, choice of complementizer can affect meaning, as in the pair in (5):

(5)(a) John would hate it that Fred is more popular than him.

b) John would hate it for Fred to be more popular than him.

Sentence (5)(a), in which the clausal complement is introduced by the that-complementizer, presupposes the truth of the complement- in other words, the utterer of (5)(a) is committed to the belief that Fred is more popular than John. Presupposition is defined as follows:

(6) A presupposes B if and only if B is true whenever A or the negation of A is true.
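Definition (6) can be checked mechanically over a small set of situations. The following Python sketch is only an illustration under stated assumptions: sentences are modeled as functions from a toy "world" to True, False, or None (None meaning that neither the sentence nor its negation is true), and the world encoding and all function names are hypothetical.

```python
from itertools import product


def presupposes(a, b, worlds):
    """(6): B is true in every world where A or the negation of A is true."""
    return all(b(w) for w in worlds if a(w) is not None)


# Toy worlds: (Fred is more popular than John, John's attitude holds).
worlds = list(product([True, False], repeat=2))


def fred_more_popular(w):
    return w[0]


def john_hates_that(w):
    # Factive: neither true nor false (None) unless the complement is true.
    return w[1] if w[0] else None


def john_claims_that(w):
    # Non-factive: has a truth value whatever the status of the complement.
    return w[1]


print(presupposes(john_hates_that, fred_more_popular, worlds))   # factive
print(presupposes(john_claims_that, fred_more_popular, worlds))  # non-factive
```

On this encoding the factive predicate passes the test in (6) and the non-factive one fails it, because there are worlds in which the claim sentence has a truth value while the complement is false.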

In other words, just as the speaker must believe that Fred is more popular than John in (5)(a), the utterer of (7) is likewise committed to that belief:

(7) John would not hate it that Fred is more popular than him.

Verbs such as hate are termed factive (the term is due originally to Paul and Carol Kiparsky in an important paper, “Fact”, which appeared in M. Bierwisch and K. Heidolph (1970), Recent Progress in Linguistics, Mouton), in that they presuppose the truth of their complements. The important point is that while that-complements may be interpreted factively, for-to complements can never be, as noted by Kiparsky & Kiparsky (1970).

A second point about the choice of complementizer is that it is lexically restricted, a point made by Bresnan. While verbs such as hate can take either that-complements or for-to complements, verbs such as claim can only take that-complements, and verbs such as wait can only take for-to complements:

8) a. John claimed that Fred was more popular than him.

b. *John claimed for Fred to be more popular than him.

9) a. *John waited that Fred was more popular than him.

b. John waited for Fred to be more popular than him.

Therefore, if there were a complementizer-placement transformation, as proposed by Rosenbaum, there would actually have to be two complementizer-placement transformations: one to insert the that-complementizer, and the other to insert the for-to complementizer. These transformations would have to be lexically restricted, so that claim would trigger the that-placement transformation, wait would trigger the for-to placement transformation, and hate would trigger either one, with a rule of interpretation interpreting the that-complement as factive.

We have been making a division thus far between the lexicon, which contains unpredictable information that is peculiar to particular lexical items, and the syntax, which is more regular. Syntactic rules are thought to apply maximally generally, but the marking of a large number of lexical items as to which of a family of transformations apply to them undercuts this division between lexical (idiosyncratic) and grammatical (systematic). Therefore, Bresnan proposes to base-generate complementizers (i.e., generate them directly via the phrase-structure rules). She originally proposed the phrase-structure rule in (10):

10) S’ -----> Comp S

and the selection by particular predicates is now a simple matter of selection, rather than features that trigger particular transformations. Hence, the lexical entries for hate, claim, and wait would be as in (11):

11) a. hate, V, +[___ [S’ [Comp {that, for}]]]

b. claim, V, +[___ [S’ [Comp that]]]

c. wait, V, +[___ [S’ [Comp for]]]
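The entries in (11) reduce complementizer choice to a lookup. A minimal Python sketch, assuming a dictionary COMP_SELECTION and a function licensed (both hypothetical names), reproduces the judgments in (8) and (9):

```python
# Selection frames from (11), encoded as verb -> set of permitted complementizers.
COMP_SELECTION = {
    "hate": {"that", "for"},
    "claim": {"that"},
    "wait": {"for"},
}


def licensed(verb, complementizer):
    """True if the verb's lexical entry selects this complementizer."""
    return complementizer in COMP_SELECTION.get(verb, set())


print(licensed("claim", "that"))  # (8)(a)
print(licensed("claim", "for"))   # (8)(b)
print(licensed("wait", "that"))   # (9)(a)
print(licensed("wait", "for"))    # (9)(b)
```

No verb needs to be marked for which transformation it triggers; the idiosyncrasy lives entirely in the lexicon.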

Updating (10) into current X-bar terms, we would say that Comp is the head of this clausal projection, so that (10) would be replaced by (12):

12) a. C” -----> C’

b. C’ -----> C T”

Bresnan (1974) (“The Position of Certain Clause-Particles in Phrase-Structure”, Linguistic Inquiry Vol. 5) later provides direct evidence for the constituency in which the rest of the sentence forms a constituent that is sister to the complementizer. There is a construction known as the Right-Node-Raising construction, in which, in a conjoined phrase, if the rightmost elements of the conjuncts are identical, the final rightmost element is set off intonationally by a pause, and the previous rightmost elements are deleted. An example is (13):

13) Mary wrote, and John performed, a beautiful Peruvian love song.

which is presumably related to (14):

14) Mary wrote a beautiful Peruvian love song, and John performed a beautiful Peruvian love song.

The structure of (13) is plausibly (15):

15) [T” [T” [N” Mary] [T’ [T Past] [V” [V’ [V wrote]]]]] and [T” [N” John] [T’ [T Past] [V” [V’ [V performed]]]]] [N” a beautiful Peruvian love song]]

The assumption is that only constituents can appear in the position after the pause in the Right-Node-Raising construction. With this in mind, Bresnan notes that the sequence after the complementizer can appear in this position:

(16) I’m wondering whether, but I’m not sure that, your hypothesis is correct.

Hence, we have evidence for the constituency in which the complementizer is set off from the rest of the clause.

B. For-Infinitives

Let us now consider the position of the infinitive marker to. As we noted in Lecture #7, it follows the sentential negation, and is incompatible with the presence of a modal. The incompatibility of to with the modal could follow from the fact that only one modal is permitted per clause if we analyzed the to itself as a modal.

We could therefore assume that the infinitive takes a null T, which selects to in the modal position. Hence, the structure of (17) would be (18):

(17) For John to leave.

(18) [C” [C’ [C for] [T” [N” John] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V leave]]]]]]]]]

We can account for the dependencies between that-complementizers and finiteness, and for-complementizers and non-finiteness, via the mechanism of selection, if we assume that heads select for the heads of their complements (as I had argued in “Heads and Projections”, in M. Baltin & A. Kroch, eds., Alternative Conceptions of Phrase-Structure (1989), University of Chicago Press). Hence, the following lexical entries would suffice:

(19) that, C, +[___ [T {Pres, Past}]]

(20) for, C, +[___ [T 0]]

(21) 0, T, +[___ [M to]]
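Head-to-head selection as in (19)-(21) can be sketched as a check on the sequence of heads in a clause, read from the top down. In the Python sketch below, the table SELECTION and the function well_formed are hypothetical names, and representing a clause as a flat list of (category, form) pairs is a simplifying assumption.

```python
# Entries (19)-(21): each head names the category and forms it selects.
SELECTION = {
    ("C", "that"): ("T", {"Pres", "Past"}),
    ("C", "for"): ("T", {"0"}),
    ("T", "0"): ("M", {"to"}),
}


def well_formed(heads):
    """Check every head against the head of its complement.

    `heads` lists (category, form) pairs from the top of the clause down.
    Heads without an entry in SELECTION impose no requirement here.
    """
    for head, complement_head in zip(heads, heads[1:]):
        requirement = SELECTION.get(head)
        if requirement is None:
            continue
        category, forms = requirement
        if complement_head[0] != category or complement_head[1] not in forms:
            return False
    return True


print(well_formed([("C", "that"), ("T", "Past")]))            # (3)(a)
print(well_formed([("C", "that"), ("T", "0"), ("M", "to")]))  # (3)(b)
print(well_formed([("C", "for"), ("T", "Past")]))             # (3)(c)
print(well_formed([("C", "for"), ("T", "0"), ("M", "to")]))   # (3)(d)
```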

C. Clauses in NP Positions

It is clear that clauses can appear in subject position. To see this, consider an alternative way of expressing the previous sentence:

21) That clauses can appear in subject position is clear.

It is also clear that clauses can appear in object position:

22) John proved that Bill liked Sally.

It is also clear that clauses in object position can passivize:

23) That Bill liked Sally was believed by everybody.

Rosenbaum proposed to account for the ability of clauses to appear in subject and object position, as well as the ability of clauses to passivize, by positing a phrase-structure rule as in (24):

24) N” -----> C”

However, this violates X-bar theory, in that N” is not a projection (ultimately) of N0. J. Emonds (1976) ( A Transformational Approach to English Syntax, Academic Press) modified Rosenbaum’s analysis by proposing that these clauses were actually complements to a null N0 head, so that we have the phrase-structure rule as in (25):

(25) N’ -----> N C”

Hence, the structure of, e.g., (21) would be as in (26):

(26) [C” [C’ C [T” [N” [N’ [N 0] [C” [C’ [C that] [T” clauses can appear in subject position]]]]] [T’ [T Pres] [V” [V’ [V be] [A” [A’ [A clear]]]]]]]]]

The need for a phrase-structure rule like (25) is transparently justified by the fact that a handful of nouns such as fact and claim take clausal complements overtly:

26) a. the fact that clauses can appear in subject position.

b. the claim that clauses can appear in subject position.

Positing such a phrase-structure rule will automatically account for the fact that these clauses can passivize.

D. Extraposition

However, there is one fact about clausal complementation that we have not yet accounted for, and that is the generalization in (27):

(27) For every sentence in which a clause appears in subject position, there is a variant of the sentence in which the clause appears at the end of the sentence, and the expletive it appears in subject position.

For example, we have the following pairs:

(28)a. For Fred to leave would bother me.

b. It would bother me for Fred to leave.

29) a. That Fred is crazy is obvious.

b. It is obvious that Fred is crazy.

30) a. That Fred has blood on his hands proves nothing.

b. It proves nothing that Fred has blood on his hands.

The exceptionless nature of this generalization strongly suggests that the grammar of English should be formulated in such a way as to express it. There have been two main approaches to capturing this generalization transformationally: extraposition and intraposition. The extraposition approach moves the C” rightward and inserts the expletive it. The intraposition approach takes the variant in which the C” is in clause-final position as basic, and moves the C” leftward into the subject position. Extraposition was originally proposed by Rosenbaum, and Intraposition was proposed by J. Emonds (1970) in his MIT Doctoral dissertation, Root, Structure-Preserving, and Local Transformations.

Extraposition can be formulated as follows:

(31) [N” it - C”] - X - V”

1 - 2 - 3 - 4 ----->

1 - 0 - 3 - 4+2

Hence, the D-structure of (28)(a) would be (32):

(32) [C” [C’ C [T” [N” [N’ [N 0] [C” [C’ [C for] [T” [N” Fred] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V leave]]]]]]]]]]] [T’ [T Past] [M” [M’ [M will] [V” [V’ [V bother] [N” me]]]]]]]]]



Extraposition would then adjoin the C” to the matrix V”, yielding (33):

(33) [C” [C’ C [T” [N” It] [T’ [T Past] [M” [M’ [M will] [V” [V” [V’ [V bother] [N” me]]] [C” for Fred to leave]]]]]]]]

Intraposition works in reverse: the underlying structure of (28)(a) and (b) would, under the intraposition analysis, be (34):

34) [C” [C’ C [T” [N” It] [T’ [T Past] [M” [M’ [M will] [V” [V’ [V bother] [N” me] [C” for Fred to leave]]]]]]]]]

Intraposition would be formulated as in (35):

(35) it - X - C”

1 - 2 - 3 ----->

3 - 2 - 0

After applying intraposition to (34), the surface structure would be as in (36):

(36) [C” [C’ C [T” [N” [N’ [C” [C’ [C for] [T” [N” Fred] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V leave]]]]]]]]]]] [T’ [T Past] [M” [M’ [M will] [V” [V’ [V bother] [N” me]]]]]]]]]
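The two rules, (31) and (35), are mirror images of one another, which a string-level Python sketch makes visible. This is a simplification under stated assumptions: the terms are factored by hand, the surface form would stands in for Past plus will, the function names are illustrative, and the adjunction structure created by extraposition is flattened into a string.

```python
def extraposition(terms):
    """(31): [NP it - CP] - X - VP  ->  1 - 0 - 3 - 4+2."""
    it, clause, x, verb_phrase = terms
    return [it, x, verb_phrase + " " + clause]


def intraposition(terms):
    """(35): it - X - CP  ->  3 - 2 - 0."""
    it, x, clause = terms
    return [clause, x]


# Extraposition starts from the clause in subject position, as in (32).
print(" ".join(extraposition(("it", "for Fred to leave", "would", "bother me"))))

# Intraposition starts from the clause in final position, as in (34).
print(" ".join(intraposition(("it", "would bother me", "for Fred to leave"))))
```

Since either rule derives both members of the pair in (28), the choice between them has to rest on other evidence, such as the distribution of clausal arguments inside the verb phrase.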



The intraposition account generates the clausal argument within the V”, and it is tied to the independent motivation for positions within the V” for clausal arguments. For example, the intraposition analysis of (28)(b), in which the clausal argument is generated within the VP, depends upon the phrase-structure rule in (37):

(37) V’ -----> V (N”) (C”)

and the analysis requires, for its plausibility, that there be independent instances of this pattern in which the subject is something other than the expletive it. We can find such independent instances of the pattern V N” C”. For example, we have the verbs convince, tell, and persuade:

38) a. John convinced Sally that she should leave.

b. John told Sally that she should leave.

c. John persuaded Sally that she should leave.

However, we have no instances of the pattern in (39), a verb followed by two clausal arguments:

(39) *[V’ V C” C”]

Verbs with sentential subjects and complements exist, however (these are known as bisentential verbs):

39) a. That John has blood on his hands proves that he’s the murderer.

b. That John has blood on his hands convinces me that he’s the murderer.

c. That John has blood on his hands suggests that he’s the murderer.

d. That John has blood on his hands indicates that he’s the murderer.

e. That John has blood on his hands means that he’s the murderer.

If clausal arguments are generated within the VP, as they are under the intraposition analysis, we would need to generate the configuration in (39), but this configuration would only be employed for verbs in which one of the arguments ended up in subject position. We would then have to answer the question of why no verbs existed which allowed both clausal arguments to remain inside the V’, i.e. why there are no verbs of the form in (40):

(40) * John glorped that Fred has blood on his hands that he’s the murderer.

If we adopt the extraposition analysis, which allows for clauses to be generated in subject position and moved rightward, we do not have this problem. We would generate the clausal subjects in (39) in subject position, and only generate one clause inside the V’, making (39)(a-e) parallel to (41)(a-e), or (42)(a-e), in which the clausal subject or object is replaced by a constituent that is clearly an N”:

(41) (a) This proves that he’s the murderer.

(b) This convinces me that he’s the murderer.

c) This suggests that he’s the murderer.

d) This indicates that he’s the murderer.

e) This means that he’s the murderer.

42) (a)That John has blood on his hands proves nothing.

b) That John has blood on his hands convinces me of nothing.

c) That John has blood on his hands suggests his guilt.

d) That John has blood on his hands indicates his guilt.

e) That John has blood on his hands means nothing.

I believe that a further argument can be made for allowing clauses to be generated in subject position, and this argument deals with the possibility of formulating a set of linking principles, that is, principles that link thematic roles and syntactic positions. Recall that, in Lecture #8, when we discussed unaccusatives, we noted that the same set of verbs (i.e. translation equivalents of each other) were unaccusative and unergative, so that agentive intransitives were unergative, and non-agentive intransitives were unaccusative. The cross-linguistic predictability of the membership of the two classes of verbs indicated strongly that Universal Grammar has some linking principles that require this. A problem with the formulation of such linking principles, however, is that some psychological predicates seem to exist which are paired in such a way that the two members of the pair take the same set of arguments, and the same set of thematic relations of the arguments, but the thematic relations of the arguments of each verb are realized in the opposite positions from the other verb. The verbs fear and frighten show this:

43) a. John fears Sally.

b. Sally frightens John.

Each of these verbs takes an argument that is called an experiencer, the experiencer of the emotion, as well as what could, for convenience, be called the theme, the object of the emotion. However, the experiencer is the subject of fear but the object of frighten, and the theme is the object of fear but the subject of frighten. If these two sentences are synonymous, and hence have the same array of thematic relations for their arguments, how could we say that there is a universal set of linking principles that allows us to predict the syntactic position of an argument from its thematic relation?

Grimshaw (1990) (Argument Structure, MIT Press) provides a solution. She claims that the synonymy of the pair in (43) is only apparent. In particular, she notes that there is a grammatical difference between verbs in which the theme is the subject and the experiencer is the object, and verbs in which the experiencer is the subject and the theme is the object. Verbs of the former class can appear in the progressive, while verbs of the latter class cannot:

44) a. *John is fearing Sally.

b. Sally is frightening John.

She ties this difference in progressivizability to the claim that object-experiencer verbs are accomplishment verbs (recall the discussion in terms of Zeno Vendler’s classification in the early lectures), while subject-experiencer verbs are states. She then posits a lexical representation for accomplishment verbs in which they are decomposed into two parts: (i) an activity of causing, which results in (ii) a state. Hence, the lexical representation of the meaning of, e.g., frighten would be cause to fear, as in (45):

(45) frighten, V, [[CAUSE] [ENTITYi] [STATE [FEAR] [ENTITYj]]]

The linking principle, then, would link the causer to the subject position.
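A rough Python sketch shows how the decomposition in (45) lets a single linking rule place the arguments of fear and frighten differently. The dictionary encoding and the function subject_of are hypothetical; in particular, linking the experiencer of a plain state to subject position is an assumption added here to cover fear, since the text states only the causer-to-subject principle.

```python
# Illustrative lexical representations for (43)(a) and (43)(b).
FEAR = {"pred": "FEAR", "experiencer": "John", "theme": "Sally"}
FRIGHTEN = {"pred": "CAUSE", "causer": "Sally", "state": FEAR}


def subject_of(entry):
    """Link one argument to subject position."""
    if entry["pred"] == "CAUSE":
        # Principle from the text: the causer is linked to subject position.
        return entry["causer"]
    # Assumption for illustration: a plain state links its experiencer.
    return entry["experiencer"]


print(subject_of(FEAR))      # John fears Sally.
print(subject_of(FRIGHTEN))  # Sally frightens John.
```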

Getting back to the current concern, which is the underlying position of apparently clausal subjects, there is an extremely noteworthy fact about bisentential verbs. Every bisentential verb allows for the expression of an experiencer in non-subject position (to me, or me in the case of convince, in the examples below):

(46) a. That John has blood on his hands proves to me that he’s the murderer.

b. That John has blood on his hands convinces me that he’s the murderer.

c. That John has blood on his hands suggests to me that he’s the murderer.

d. That John has blood on his hands indicates to me that he’s the murderer.

e. That John has blood on his hands means to me that he’s the murderer.

I would then suggest that each bisentential verb has, as at least part of its meaning, something like “cause to believe”, and the sentential subject would be the cause of the experiencer’s belief. By the linking principle suggested by Grimshaw, then, the clausal subject would be linked to the subject position.

In the next lecture, we will see more syntactic evidence for the extraposition analysis over the intraposition analysis, but the focus of the next lecture will be on infinitival complementation, and the generation of infinitives with no overt subjects.

Lecture #10- Infinitival Complementation

So far, we have seen evidence that the designated initial symbol in the grammar is not S, or even T”, but C”, so that the designated initial symbol in the grammar is a maximal projection of C (for complementizer). Embedded CPs can be either finite (introduced by a that-complementizer) or non-finite (introduced by a for-complementizer).

A. Understood Subjects

We will now look at infinitives that are not introduced by a for-complementizer, and do not even seem to contain subjects. An example is the following:

1) To leave would be inconvenient.

It is clear that, in some sense, there is an “understood subject”, and this can be brought out if we add a benefactive phrase to the main clause:

2) To leave would be inconvenient for Fred.

We can understand (2) to mean either that it would be inconvenient for Fred if he himself were to leave, or it would be inconvenient for Fred if somebody else were to leave. The fact that we understand a subject, however, does not mean that the subject is present in the syntactic representation, i.e. the phrase-marker. We assume that the grammar of natural language is, like the grammars of logical languages, organized in such a way that the syntactic component generates representations that are then interpreted by the semantics. Therefore, the fact that a subject is understood does not mean that it is present syntactically.

Let us then look for some syntactic evidence that infinitives have subjects.

There are a number of theories of grammar that claim that infinitives do not have syntactic subjects, but rather, subjects that are, as it were, “plugged in”, or supplied, by the semantics. The subject is missing, in all of these approaches, because there is no structural position for it. For example, we could generate subjectless infinitives as M”s, with the phrase-structure rules that we have used in (3):

3) M” → M’

M’ → M V”

And we could generate them in the same way that we decided to generate clauses that function as N”s, i.e. as in (4):

4) N’ → N ( { M” } )

( { C” } )

Hence, the D-structure of (1) would be as in (5):

5) [C” C [T” [N” [N’ [N 0] [M” [M’ [M to] [V” [V’ [V leave]]]]]]]
             [T’ [T Past] [M” [M’ [M will] [V” [V’ [V be] [A” [A’ [A inconvenient]]]]]]]]]]


Notice that we have to complicate the phrase-structure rule for generating clausal arguments, by having the curly brackets in (4) to generate either M”s or C”s. However, the disjunctive phrase-structure rule in (4) fails to capture the fact that, for individual predicates that take clausal arguments, every predicate that allows for a full for-infinitive, which is a C”, would also have to allow for a subjectless infinitive, which is an M”. To see this, notice that each of the for-infinitives below is substitutable for a subjectless infinitive:

6) a. I would prefer for John to leave.

b. I would prefer to leave.

7) a. I was hoping for John to leave.

b. I was hoping to leave.

8) a. I was waiting for John to leave.

b. I was waiting to leave.

9) a. I would hate for John to leave.

b. I would hate to leave.

We could, of course, have subcategorization frames for prefer, hope, wait, and hate as in the following:

10) prefer, V, +[___ [N” [N 0] { [C” [C for] ] } ]]

{ [M” [M to] ] }

In evaluating the claim that subjectless infinitives are simply M”s, one might note that disjunctive subcategorization frames are needed in any event. A case in point is the subcategorization frame for become, which takes either an N” or an A”.

11) a. become, V, +[___ { N” } ]

{ A” }

b. He became { a lawyer }.

{ quite angry }

However, there is a crucial difference between a disjunctive subcategorization frame for one verb, and a disjunctive subcategorization frame for every predicate in the language that takes a given category A, such that every predicate that takes the category A will also take the category B. For example, we know that a disjunctive subcategorization frame is the appropriate mechanism for expressing the combinatory possibilities of the verb become because there are other environments in which only one of the categories with which become combines can occur, such as the complement of the verb grow, which subcategorizes only for an A”, but not an N”:

12) He grew { * a lawyer }.

{ quite angry}

However, the disjunctive subcategorization frames that would be required for the analysis of subjectless infinitives as M”s would be required for every predicate in the language that takes a for-infinitive. Moreover, there are for-infinitives that can occur as adjuncts; they are termed purpose clauses (R. Faraci (1974), Aspects of the Grammar of Infinitives and For-Phrases, unpublished doctoral dissertation, MIT):

13) I bought it for Sally to play with__.

A subjectless infinitive can also occur as a purpose clause:

14) I bought it to play with__.

Clearly, by our definition of a grammatical category as a class of elements that are mutually substitutable in a sufficiently wide range of environments, subjectless infinitives and for-infinitives are members of the same grammatical class. Since the presence of the complementizer for suggests that the latter is a C”, we would seem to be required to analyze the subjectless infinitive as a C”.

However, C”s are expanded by the phrase-structure rules in (15):

15) a. C” → C’

b. C’ → C T”

c. T” → N” T’

d. T’ → T { M” }

{ V” }

Notice that by using the phrase-structure rules in (15), there is a subject position, and we must then ask why this subject position for the infinitive is not overtly realized. We must also ask why the complementizer position is not overtly realized.

In evaluating the claim of the “plug-in” theory of understood subjects of infinitives, in which they are not syntactically present but instead supplied in the semantics, and subjectless infinitives are generated as M”s, a further complication arises with respect to the substitutability of for-infinitives and subjectless infinitives. A for-infinitive cannot appear as the complement of a verb if its subject is understood as identical to the main clause subject. English has a form that expresses the identity of a noun phrase with another noun phrase in the sentence, and this form is called the reflexive pronoun ( we will be talking more about reflexive pronouns shortly):

16) John likes himself.

The identity of John and himself, termed referential identity because both terms pick out the same individual, is usually expressed by superscripting an index to the term that is the same as the index that is superscripted to the term with which it is co-referential, and this device is termed co-indexing. An example is (17):

17) John^i likes himself^i.

We cannot use a for-infinitive, however, when the subject of the infinitive is co-indexed with the main clause subject. We must use the subjectless infinitive:

(18)a. * He would prefer for himself to win.

b. He would prefer to win.

19) a. *He would hate for himself to lose.

b. He would hate to lose.

20) a. * He was hoping for himself to win.

b. He was hoping to win.

21) a. * He was waiting for himself to leave.

b. He was waiting to leave.

If we adopt the plug-in view of understood subjects, and all that it entails, we would still need a mechanism to prevent the generation of for-infinitives with subjects that are co-referential with main clause subjects. In other words, given that we can generate sentences with full for-infinitive complements, as in (22-25), what prevents the (a) examples in (18-21)?

22) He would prefer for John to win.

23) He would hate for John to lose.

24) He was hoping for John to win.

25) He was waiting for John to leave.

Interestingly enough, there is a dialect of English, spoken in the Ozarks, in which the for-complementizer shows up without an expressed subject of the infinitive, when the understood subject of the infinitive is understood as being co-referential with the main clause subject:

26) % He was hoping for to win. (“%” means “is acceptable in this dialect”).

We might then account for the difference between Ozark English and Standard English by positing a rule that obligatorily deletes a reflexive pronoun in the subject position of an infinitive:

27) [C for] - [N” +refl]

1 - 2 → 1 - 0

and, subsequent to rule (27), which we will call “Reflexive Deletion”, a rule that obligatorily deletes a for-complementizer next to the infinitive marker to:

28) for - to

1 - 2 → 0 - 2

Rule (28) (For-Deletion) would be obligatory in Standard English.
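The ordering of the two rules can be sketched in Python over a flat string of words. This is a rough illustration, not a claim about how the rules operate on phrase-markers: the function names and the `dialect` parameter are hypothetical, every occurrence of for is assumed to be the complementizer, and For-Deletion is assumed simply not to apply in the Ozark dialect.

```python
REFLEXIVES = {"myself", "yourself", "himself", "herself", "itself",
              "ourselves", "yourselves", "themselves"}

def reflexive_deletion(tokens):
    """Rule (27): delete a reflexive immediately after the complementizer 'for'."""
    return [tok for i, tok in enumerate(tokens)
            if not (tok in REFLEXIVES and i > 0 and tokens[i - 1] == "for")]

def for_deletion(tokens):
    """Rule (28): delete 'for' when it immediately precedes 'to'."""
    return [tok for i, tok in enumerate(tokens)
            if not (tok == "for" and i + 1 < len(tokens) and tokens[i + 1] == "to")]

def derive(sentence, dialect="standard"):
    """Apply Reflexive Deletion, then (in Standard English only) For-Deletion."""
    tokens = reflexive_deletion(sentence.split())
    if dialect == "standard":
        tokens = for_deletion(tokens)
    return " ".join(tokens)

print(derive("He would prefer for himself to win"))
print(derive("He was hoping for himself to win", dialect="ozark"))
print(derive("He would prefer for John to win"))
```

The first call yields the string of (18)(b), the second yields the Ozark form in (26), and the third leaves (22) untouched because neither structural description is met.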

By positing the rules of Reflexive Deletion and For-Deletion, and making them obligatory, we can account for the fact that subjectless infinitives can occur wherever for-infinitives can occur. The D-structure of, e.g., (18)(b) is simply (29):

29) [C” C [T” [N” He] [T’ [T past] [M” [M’ [M will] [V” [V’ [V prefer] [N” [N’ [N 0]
      [C” [C’ [C for] [T” [N” himself] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V win]]]]]]]]]]]]]]]]]]



Reflexive Deletion then applies, yielding (30):

30) [C” C [T” [N” He] [T’ [T past] [M” [M’ [M will] [V” [V’ [V prefer] [N” [N’ [N 0]
      [C” [C’ [C for] [T” [T’ [T 0] [M” [M’ [M to] [V” [V’ [V win]]]]]]]]]]]]]]]]]]



Finally, For-Deletion applies, yielding (31):





31) [C” C [T” [N” He] [T’ [T past] [M” [M’ [M will] [V” [V’ [V prefer] [N” [N’ [N 0]
      [C” [C’ [T” [T’ [T 0] [M” [M’ [M to] [V” [V’ [V win]]]]]]]]]]]]]]]]]]



B. Verbs of Obligatory Control

The generalization that a subjectless infinitive can appear wherever a for-infinitive can appear is exceptionless, but the converse is not true: there are environments in which a subjectless infinitive can appear but a for-infinitive cannot. Examples are the complements of the verbs try and attempt:

32) a. *He tried for Fred to leave.

b. He tried to leave.

33) a. * He attempted for Fred to leave.

b. He attempted to leave.

We can account for this by positing a lexical feature for the relevant predicates which stipulates that the subjects of their complements must be identical to their own subjects. These verbs are known as “verbs of obligatory control”, with control being defined as the phenomenon whereby an element must be anaphoric to some other element in the phrase-marker.


1. How does the existence of passive infinitives, as in (i), choose between the plug-in theories of subjectless infinitives and the analysis of subjectless infinitives as arising through reflexive deletion?

i) John wants to be visited by werewolves.

2. Under the analysis that posits reflexive deletion, show the derivation of (i). What would be the ordering of N”-preposing and reflexive deletion?

C. Subject-to-Subject Raising

Of the predicates that take obligatorily subjectless infinitives, it can be shown that the process by which the infinitive comes to lack its subject is not always reflexive deletion. It is also possible for the subject to have moved out of the infinitive.

To see this, consider verbs such as try and attempt, on the one hand, and verbs such as seem and appear, on the other:

34) a. John tried to be happy.

b. * The car tried to be heading toward us.

c. *There tried to be a good reason for that.

d. *Headway tried to have been made.

Vs. seem:

35) a. John seemed to be happy.

b. The car seemed to be heading toward us.

c. There seemed to be a good reason for that.

d. Headway seemed to have been made.

Apparently, the verb try imposes restrictions on its subject (specifically, the subject must be animate, and hence capable of being an agent), while the verb seem imposes no such restrictions on its subject. When seem takes an infinitive, the subject of seem gets no restrictions from seem itself. Any noun phrase can be the subject of seem provided that it can be the subject of the infinitive predicate.

Hence, the expletive there requires, in simple finite clauses, the verb be for its appearance:

36) There is a good reason for that.

37) * There became a good reason for that.

And, although (35)(c) is acceptable, (38) is not:

(38) * There seemed to become a good reason for that.

We note, further, that the verb seem can also take a finite complement. When it takes a finite complement, however, its subject must be the expletive it:

38) It seems that there is a good reason for that.

Furthermore, consider idiom chunks. It will be recalled from our discussion of passives that idioms had to be listed as such in the lexicon. Specifically, the optimal lexical entries for the idioms keep track of, keep tabs on, and make headway were given as (10) in Lecture #8, repeated here:

(10) a. track, N, +[keep____[P” of X ]]

b. tabs, N, +[ keep___ [P” on X ]]

c. headway, N, +[make ____]

Now consider the fact that these idiom chunks can appear as subjects of the verb seem, when seem takes an infinitival complement. We have already seen this in (35)(d). Parallel to (35)(d) is (39):

39) a. Careful track seemed to have been kept of his progress.

b. Careful tabs seemed to have been kept on Monica.

If the infinitive complement of seem were to come to lack its subject by the rule of reflexive deletion, we would have to generate the antecedent of the reflexive as the subject of seem. Hence, the derivation of, e.g., (39a) would have to take (40) as the D-Structure:

40) [C” C [T” [N” Careful track] [T’ [T Past] [V” [V’ [V seem]
      [C” [C’ [C for] [T” [N” e] [T’ [T 0] [M” [M’ [M to]
        [V” [V’ [V have] [V” [V’ [V been] [V” [V’ [V kept] [N” careful track] [P” of his progress]]]]]]]]]]]]]]]]]]


We would then need a mechanism to convert the second occurrence of the idiom chunk careful track to a reflexive, after which it would undergo N”-preposing to the empty subject position of the infinitive, where it would undergo reflexive deletion, and the for would undergo for-deletion.

However, we would be violating our lexical requirements on the occurrence of these idiom chunks by generating them as subjects of seem.

It would also be desirable to relate the use of seem with the finite complement and expletive subject to the use with the infinitive. Let us now try to do this.

Suppose we give seem the lexical entry in (41):

41) seem, V, +[___C”]

We must make one stipulation. Given that the overt complementizer for never shows up in this type of infinitive construction, we actually have no evidence that the infinitive is introduced by for here. We do have evidence, as we have just seen, that the subject of seem, when seem takes an infinitive, is, for all intents and purposes, the subject of the infinitive. Furthermore, we have seen that seem, when it takes a finite complement, lacks a subject in the semantic sense. Let us then generate seem without a semantic subject in both instances, so that the D-structure of, e.g., (35)(a) would be (42):

42) [C” C [T” [N” e] [T’ [T Past] [V” [V’ [V seem]
      [C” [C’ C [T” [N” John] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A happy]]]]]]]]]]]]]]]]



The symbol “e” simply means “empty”, i.e. an unexpanded node in the phrase-marker. We then apply N”-preposing, the same transformation that was employed in the derivation of passive and unaccusative constructions, to move the subject of the infinitive into the subject position of seem. Recall that the formulation of N”-preposing was given in Lecture #8, (23), repeated here:

(23) N” - X - V - N”

1 - 2 - 3 - 4 → 4 - 2 - 3 - 0

In order to have the phrase-marker in (42) meet the structural description of N”-preposing, we must disregard the intervening complementizer. Let us therefore, for the moment, assume that null elements are not factored as being present when inspecting phrase-markers for compatibility with the structural descriptions of transformations.
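The way N”-preposing applies to a factored phrase-marker can be sketched in Python. This is a simplified illustration under assumptions of our own: the phrase-marker is flattened into a list of (category, form) pairs, "e" marks an unexpanded node, "0" marks a null element that is skipped when checking adjacency to V, and when there are several empty N” positions the rightmost (most deeply embedded) one is filled first. The names `np_preposing` and `spell_out` are hypothetical.

```python
NULL = "0"    # null or deleted element, ignored when factoring
EMPTY = "e"   # unexpanded node

def np_preposing(terms):
    """Apply rule (23) once: N'' - X - V - N''  ->  4 - 2 - 3 - 0."""
    terms = list(terms)
    empties = [i for i, (cat, form) in enumerate(terms)
               if cat == 'N"' and form == EMPTY]
    for target in reversed(empties):          # innermost empty N'' first
        for v in range(target + 1, len(terms)):
            if terms[v][0] != "V":
                continue
            j = v + 1
            while j < len(terms) and terms[j][1] == NULL:
                j += 1                        # skip null elements after V
            if j < len(terms) and terms[j][0] == 'N"' and terms[j][1] != EMPTY:
                terms[target] = terms[j]      # term 4 moves into term 1
                terms[j] = ('N"', NULL)       # and leaves 0 behind
                return terms
    return terms

def spell_out(terms):
    """Terminal string with empty and null elements omitted."""
    return " ".join(form for _, form in terms if form not in (NULL, EMPTY))

# Flattened version of the D-structure in (42).
d42 = [('N"', "e"), ("T", "Past"), ("V", "seem"), ("C", "0"),
       ('N"', "John"), ("T", "0"), ("M", "to"), ("V", "be"), ("A", "happy")]
print(spell_out(np_preposing(d42)))
```

Because the null complementizer is skipped, John counts as immediately following seem, so the structural description is met and John moves into the empty matrix subject position.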

Therefore, N”-preposing will apply, yielding (43):

43) [C” C [T” [N” John] [T’ [T Past] [V” [V’ [V seem]
      [C” [C’ C [T” [N” 0] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A happy]]]]]]]]]]]]]]]]



The analysis of the subject position of seem in the infinitive as coming to be occupied by the employment of N”-preposing now makes the derivation of (35)(d), repeated here, straightforward.

(35)(d) Headway seemed to have been made.

It simply involves two applications of N”-preposing to the D-structure in (44):

44) [C” C [T” [N” e] [T’ [T past] [V” [V’ [V seem]
      [C” [C’ [C 0] [T” [N” e] [T’ [T 0] [M” [M’ [M to]
        [V” [V’ [V have] [V” [V’ [V be] [V” [V’ [V make+en] [N” headway]]]]]]]]]]]]]]]]]]

There is a significant aspect to this account of passives, unaccusatives, and subject-to-subject raising. The terms passive, unaccusative, and subject-to-subject raising do not play a role in the grammar at all; there is a single transformation, N”-preposing, which plays a role in the generation of all three constructions. The grammar, then, can be said not to pay attention to particular constructions, and the names of these three constructions have only an expository use.

Furthermore, in our discussion of grammatical relations versus grammatical categories in Lecture #4, we noted that phrase-markers do not explicitly represent grammatical relations. We are now in a position to see why that decision has been made.

Note that the rule of N”-preposing does not just move objects; it also moves subjects of infinitive complements. In a certain sense, transformations are structure-dependent rather than function-dependent (terms due to Joan Bresnan (1976), “On the Form and Functioning of Transformations”, Linguistic Inquiry, Vol. 7). They simply move grammatical categories in the right structural positions. The claim, and it is a strong one, is that there are no rules that, say, move subjects, or objects, but only rules that move such elements as N”s, P”s, etc.

Object Infinitival Complementation

We will now see that the mechanisms that we have motivated for the syntax of infinitives on the basis of sentences that consist, on the surface, of a contentful subject and an infinitive in complement position, will work without further ado for sentences that superficially consist of a subject , and, following the verb, a N” followed by an infinitive. Examples are given in (1) and (2):

(1) John {persuaded } Sally to be polite.

{ordered }

{ convinced}

2) John { believed } Sally to be polite.

{ proved }

{ expected }

Although the strings in (1) and (2) are all identical save the choice of main verb, when one looks further, one sees a significant difference in the range of N” plus infinitival sequences that are permitted to follow each of the verbs in (1), as opposed to the class of verbs in (2). Specifically, the N” that follows a verb of the first class (which we will refer to for now as the persuade-class for ease of exposition) must be animate. In particular, it must be interpreted as the agent of the following infinitive in some sense:

3) John {persuaded }* { the rock to be on the table }.

{ordered } { there to be a valid reason for his absence }.

{convinced } { Fred to be six feet tall }.

No such restriction exists for the verbs of the second class (called for expository convenience the believe-class); any N”-infinitive sequence is possible, provided that the N” can be interpreted as the subject of the infinitive. Hence, the star is removed for all of the examples in (3) if the verb is of the believe-class:

4) John { believed } { the rock to be on the table }

{ proved } { there to be a valid reason for his absence }.

{expected } { Fred to be six feet tall }.

We see, then, that while the N” that follows a verb of either class must be interpreted as the subject of the infinitive that follows, a verb of the persuade-class imposes thematic restrictions on the N” as well, while a verb of the believe-class does not. Can we deduce anything about the structure of sentences containing such verbs from these co-occurrence facts?

With respect to verbs of the persuade-class, we can deduce that the post-verbal N” is not syntactically the subject of the following infinitive, but rather is in the structural position of the object. We can deduce this from the following constraint on locality of theta-marking:

5) Principle of Locality of Theta-Marking:

If α theta-marks (i.e. assigns a theta-role to) β, then α and β must be sisters.

We can see evidence of (5) by examining sentences containing clausal complements that are overtly marked by complementizers. Turning first to verbs that take complements that are introduced by the complementizer for, we see that the matrix verb never restricts the content of the infinitive in any way, let alone restricting the subject position of the infinitive:

6) John would { prefer } for { the rock to be on the table }.

{ hate } { there to be a valid reason for his absence }.

{ love } { Fred to be six feet tall}.

{ hope }

{ wait }

When we turn our attention to that-complements, we see that that-complements are similarly unrestricted by the predicates that select them:

7) John { claimed } that { the rock was on the table }.

{ knew } { there was a valid reason for his absence }.

{ said } { Fred was six feet tall }.

{ indicated}

We see, then, that for cases in which we know that an N” is not a sister to the head of the phrase in which the N” resides ( since the N” is preceded by a complementizer), the N” is never assigned a theta-role by the head. Therefore, because the N” that follows a verb of the persuade-class is assigned a theta-role by that head, it must be a sister to the verb. In short, it must be the object of the verb, rather than the subject of the infinitive. Hence, the structure of, e.g., (1)(a) must be (8) rather than (9):

8) [C” C [T” [N” John] [T’ [T Past] [V” [V’ [V persuade] [N” Sally]
     [C” [C’ [C for] [T” [N” herself] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A polite]]]]]]]]]]]]]]]]



9) [C” C [T” [N” John] [T’ [T past] [V” [V’ [V persuade]
     [C” [C’ [C 0] [T” [N” Sally] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A polite]]]]]]]]]]]]]]]]



We can then use the rules of reflexive deletion and for-deletion to derive the structure for (1)(a).
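The reasoning that selects (8) over (9) can be sketched in Python as a sisterhood check. This is an illustration only: trees are encoded as nested tuples of the form (label, daughters...), a leaf is (label, word), the two V’ fragments below are abbreviated versions of (8) and (9), and the helper names `is_leaf` and `are_sisters` are hypothetical.

```python
def is_leaf(node):
    """A leaf pairs a category label with a word."""
    return len(node) == 2 and isinstance(node[1], str)

def are_sisters(tree, head, phrase):
    """True if some node immediately dominates both the head and the phrase."""
    if is_leaf(tree):
        return False
    daughters = tree[1:]
    words = {d[1] for d in daughters if is_leaf(d)}
    if head in words and phrase in words:
        return True
    return any(are_sisters(d, head, phrase) for d in daughters)

# Abbreviated V' of (8): Sally is the object of persuade.
v_bar_8 = ("V'", ("V", "persuade"), ('N"', "Sally"),
           ('C"', ("C", "for"),
            ('T"', ('N"', "herself"), ('M"', "to be polite"))))

# Abbreviated V' of (9): Sally is the subject of the infinitive.
v_bar_9 = ("V'", ("V", "persuade"),
           ('C"', ("C", "0"),
            ('T"', ('N"', "Sally"), ('M"', "to be polite"))))

print(are_sisters(v_bar_8, "persuade", "Sally"))
print(are_sisters(v_bar_9, "persuade", "Sally"))
```

Only in the first structure are persuade and Sally sisters, so only there does the Principle of Locality of Theta-Marking allow persuade to theta-mark Sally.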

For verbs of the believe-class, however, the matrix verb does not assign a theta-role to the post-verbal N”. Note that the Principle of Locality of Theta-Marking only states a necessary condition for theta-marking; in order to be theta-marked, the N” must be a sister to the element that theta-marks it. We might ask, however, whether sisterhood is a sufficient condition for theta-marking, in the sense that if an element is a sister to a lexical head, the head would assign a theta-role to the element. If we could establish that sisterhood entails theta-marking, we would then be in a position to establish the structures of sentences containing believe-type verbs: the post-verbal N” would have to be the subject of the following infinitive, rather than the object of believe. Therefore, the structure of, e.g., (2)(a) would have to be (10):

10) [C” C [T” [N” John] [T’ [T past] [V” [V’ [V believe]
      [C” [C’ [C 0] [T” [N” Sally] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A polite]]]]]]]]]]]]]]]]



There is some evidence that sisterhood entails theta-marking, as pointed out by Chomsky (Lectures on Government and Binding (1981), Foris Press). Specifically, expletives appear in subject position, but not in object position. Therefore, we have no intransitive verbs that take an expletive object, such as the hypothetical laugh’ ( having the thematic structure of laugh, but taking an expletive object):

11) a. * John laughed’ there.

b. * John laughed’ it.

We might therefore propose that sisterhood, the environment for subcategorization, entails theta-marking, requiring the structure in (10).

Proposals have been made in the literature, however, notably by Paul Postal (On Raising (1974), MIT Press), that, while the post-verbal N” may originate as the subject of the infinitive complement of a believe-type verb, it becomes the object by a transformation that is called Subject-to-Object Raising, which would be formulated as follows:

(12) N” - [C” - N” - X ]

1 - 2 - 3 - 4 →

3 - 2 - 0 - 4

We can concretize the analysis by positing an empty N” position, as in (13):




13) [C” C [T” [N” John] [T’ [T past] [V” [V’ [V believe] [N” e]
      [C” [C’ [C 0] [T” [N” Sally] [T’ [T 0] [M” [M’ [M to] [V” [V’ [V be] [A” [A’ [A polite]]]]]]]]]]]]]]]]



We must ask what the evidence is for subject-to-object raising, which would alter the structure but not the terminal string (as does restructuring of the helping verbs into T).

The best argument for subject-to-object raising concerns the placement of adverbs that must modify the main clause, as in (14):

14) I believe John with all my heart to be guilty.

The adverb obviously refers to the speaker’s belief. Now, let us consider a verb that takes an infinitive complement with the for-complementizer, as in (15):

15) I would prefer for John to be the winner.

Because the complementizer for is present, we assume that the N” that immediately follows is within the infinitive. Notice, however, that an adverb which modifies the main clause cannot intervene between the post-verbal N” and the infinitive marker when for is retained:

16) * I would prefer for John with all my heart to be the winner.

When the for is deleted, however, a main clause adverb can occur there more naturally.

17) I would prefer John with all my heart to be the winner.

It would seem, therefore, that an adverb must occur in the clause that it modifies. Therefore, the N” must be in the matrix clause, according to Postal’s argument.

One argument for subject-to-object raising that does not go through relies on the assumption that a reflexive must be in the same clause as its antecedent, known as a clause-mate condition. Evidence for the clause-mate condition can be seen in the ungrammaticality of sentences containing reflexives when this condition is not met:

18) a. * John thinks that nobody likes himself.

b. *John would prefer for himself to win.

Reciprocals in English seem to be subject to the same distributional constraints as reflexives:

19) a. * They think that nobody likes each other.

b. * They would prefer for John to see each other.

However, reciprocals are clearly not subject to a clause-mate condition:

20) They would prefer for each other to win.

We will return to the distribution of reciprocals and reflexives and their antecedents. It is an extremely important topic in current syntactic theory, and we will account for the ungrammaticality of such examples as (18) and (19) in a different way.

Another argument for subject-to-object raising is based on the interpretation of logical words such as every and not (called “logical operators”). Consider the interpretation of a sentence such as (21):

21) Every boy did not read the book.

Many people say that (21) is ambiguous, and can have either the interpretation in (22) or (23):

22) Not every boy read the book.

23) No boy read the book.
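The difference between the readings in (22) and (23) can be checked against a small model. The following Python sketch is only an illustration; the list `boys`, the table `read`, and the two function names are hypothetical.

```python
# Hypothetical model: which boys read the book.
boys = ["Al", "Bo", "Cy"]
read = {"Al": True, "Bo": False, "Cy": True}

def negation_wide(boys, read):
    """Reading (22): it is not the case that every boy read the book."""
    return not all(read[b] for b in boys)

def universal_wide(boys, read):
    """Reading (23): every boy is such that he did not read the book."""
    return all(not read[b] for b in boys)

print(negation_wide(boys, read))
print(universal_wide(boys, read))
```

In a model where some but not all of the boys read the book, reading (22) comes out true and reading (23) comes out false, which shows that the two readings of (21) are truth-conditionally distinct.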

The two interpretations are said to correspond to a difference in the scope (roughly, the logical jurisdiction) of the two logical operators every and not. In the interpretation corresponding to (22), the negative is said to take wide scope relative to every, and every (called a “universal quantifier”) is said to take narrow scope. In (23), the universal quantifier is said to take wide scope relative to the negation, and the negative is said to take narrow scope. The assumption is that there is a mapping procedure between these expressions in natural language and a logical language which provides the basis of their semantic interpretation . The logical language is called Logical Form. The scope of negation in Logical Form corresponds to the clause in which it is contained. With this in mind, consider the interpretation of a sentence containing a persuade-type verb, in which the object is a universal quantifier and negation is contained within the infinitive, as in (24):

24) I persuaded every boy not to leave.

As predicted by the assumption that the scope of negation corresponds to the clause in which it resides at surface structure, together with the assumption that the N” that precedes the infinitive complement of a persuade-type verb is outside of the infinitive, the negative takes narrow scope with respect to the universal quantifier, and so the interpretation of (24) must be (25):

25) I persuaded no boy to leave.

Now, crucially, the interpretation of (26) is also unambiguous. The negative takes narrow scope with respect to the universal quantifier:

26) I believe every student not to like that class.

That is, (26) can only mean (27), and not (28):

27) I believe that no student likes that class.

28) I believe that not every student likes that class.

The prediction, then, would be that a universally quantified N” that precedes a negated infinitive when both follow a believe-type verb should always take wide scope over the negative. However, we also predict that when the N” + infinitive sequence follows the complementizer for, the ambiguity should reappear, and it seems that it does:

29) I would prefer for every student not to have to leave.

Lecture #11- Wh-Movement

In discussing transformations thus far, we have concentrated on one variety of phrasal movement: N”-preposing. This operation moves an N” into an argument position, typically the subject position, but, as we saw at the conclusion of Lecture #10, into the object position at times. We will now discuss another transformation that moves phrasal elements, but which has some properties that distinguish it from N”-preposing.

To begin with, let us consider the process of question formation in English. So far, in our discussion of English questions, we have concentrated on what are called “Yes-No Questions”, in that they simply admit of yes or no answers, such as (1):

1) Did John eat the steak?

However, there is another variety of question, known as a constituent question, which does not admit of a yes or no answer, but which asks for a specification of some element in the sentence, such as (2):

2) What did John eat?

The element that is being questioned, in this case the object of eat, is expressed as what, and appears at the beginning of the sentence. The normal position of the object is empty- there is nothing after the verb. The questioned constituent takes a form that is known as a wh-form, so-called because all of the words that are used for questioning parts of a sentence in English have the letters w and h in them, as in (3):

3) Who did John kill?

4) How angry did John become?

5) Where did John put the book?

The wh-form in (3) (the wh-forms are all underlined) corresponds to an animate N”, as in (6):

6) John killed Fred.

The wh-form in (4) corresponds to an adjective phrase, as in (7):

7) John became quite angry.

And the wh-form in (5) corresponds to a prepositional phrase, as in (8):

8) John put the book on the table.

We therefore have the following informal description of the formation of English constituent questions. A wh-phrase appears at the beginning of the sentence (in main clause questions) which corresponds to a particular constituent type (N”, A”, or P”), and there is a gap that corresponds to that constituent type elsewhere in the sentence. That the wh-phrase must correspond to a gap can be seen by the unacceptability of the following, which results from placing a full constituent of the appropriate type in the position of the gap in (3-5):

9) *Who did John kill Fred?

10) * How angry did John become quite sad?

11) *Where did John put the book on the table?

One can also see the effect of a matching between the wh-phrase and the gap in that the wh-phrase must correspond to a gap that is appropriate. Hence, the following shows the effect of this mismatch:

12) * How angry did John kill?

Hence, there is a dependency between a wh-phrase at the front of a question and a gap elsewhere in the sentence, such that the wh-phrase must, in a sense, “agree” with the gap, in the sense of having the same characteristics as a non-wh-phrase that could have appeared in the position of the gap. We can account for this dependency by generating the wh-phrase in the position of the gap, where it would obey all the co-occurrence restrictions appropriate to elements that are generated in that position, and then moving it to the front of the sentence (we will be more specific in a moment). For example, we have discussed the generation of idioms, and have utilized the mechanism of lexical specification of idioms as an argument for movement, in connection with N”-preposing, as in (13) (a) and (b):

13) (a) John made significant headway.

b) Significant headway was made.

Parallel to the argument for N” movement in the passive construction (and subject-to-subject raising) on the basis of idiom chunks, we find evidence for movement of questioned elements in constituent questions:

14) How much headway did John make___?

By the (by now) familiar argument from idiom chunks, the wh-phrase in (14) must have been moved to that position.

Notice that the wh-phrase that moves can move from a potentially indefinitely embedded position, so that the wh-phrase in (15) must have moved from a sentential complement within a sentential complement, i.e., from the lowest of three clauses:

15) How much headway did Joe say that Bill thought that John made__?

The Landing Site of Moved Wh Phrases

For the projection of complementizers, we posited the phrase-structure rules in (16):

16) C” --> C’

C’ --> C T”

The phrase-structure rule expanding C” violates the X-bar schema, which predicts that every X” has a specifier. We might therefore analyze moved wh-phrases as moving to the Spec of C”, filling this otherwise missing position. One way of formulating the movement process would be to posit a feature [+wh] on the X” to be moved, and then to formulate the movement as in (17):

17) X”  -  C  -  W  -  [X” +wh]
    1   -  2  -  3  -  4          -->
    4   -  2  -  3  -  0

Hence, the D-structure of (2) would be (18):

18) C”
      N”
      C’
        C
        T”
          N”  John
          T’
            T  Past
            V”
              V’
                V
                [ N” ]



A Refinement of Term #2 in the Structural Description of Wh-Movement

It seems that (17) is too general in one crucial respect: in the way that it is formulated, a wh-phrase can move to the Spec of any C”. Hence, consider (19):

19) John believes that Fred saw who.

Nothing would prevent the wh-phrase from moving to a Spec within the embedded C”, generating either (20) if the complementizer is retained, or (21) if it is deleted:

20) *John believes who that Fred saw?

21) * John believes who Fred saw?

Of course, the only possible wh-movement that could apply to the structure corresponding to (19) would yield (22):

22) Who does John believe that Fred saw?

(22) is often called a direct question, in that the wh-phrase occurs at the beginning of the entire sentence. English and every other language also have what are called indirect questions (also called embedded questions), in which the question is actually an argument that is selected by a predicate. For example, although (21) is unacceptable, (23), in which the verb believe is replaced by the verb wonder, is acceptable. In fact, wonder requires a wh-phrase to introduce its complement, as seen by the unacceptability of (24):

23) John wonders who Fred saw.

24) * John wonders that Fred saw Sally.

Interestingly enough, wonder also allows a complement to occur with a wh-phrase that does not correspond to a gap in the clause, namely, the word whether:

25) John wonders whether Fred saw Sally.

Notice that the interpretation of a complement introduced by whether is that of a yes-no question. Hence, what John is wondering about can be given (26) as its content:

26) Did Fred see Sally?

Notice also that just as whether can occur with the phrase or not, as in (27)(a) and (b), or not can occur in a direct yes-no question, as in (28):

27) (a) I wonder whether or not John saw Sally.

(b) I wonder whether John saw Sally or not.

28) Did John see Sally or not?

There are verbs other than wonder that take such wh-complements: the verbs inquire, ask, tell, and know, for example:

29) a. John inquired as to who Fred saw.

b. John told me who Fred saw.

c. John asked me who Fred saw.

d. John knew who Fred saw.

They can all take complements that are introduced by whether as well:

30) a. John inquired as to whether Fred saw Sally.

b. John told me whether Fred saw Sally.

c. John asked me whether Fred saw Sally.

d. John knew whether Fred saw Sally.

Notice also that the complements are interpreted as having the content of questions.

We might account for the selection of such complements as questions by positing a feature on the Comp that is notated as +wh, and say that a +wh complementizer is an alternative to that, accounting for the impossibility of the occurrence of that in embedded questions. The lexical entry for a predicate that selects for an indirect question, such as wonder, will then be as in (31):

31) wonder, V, +[___ [C +wh] ]

Notice that the verb know allows both interrogative complements (question complements) and that-complements, as in (32). Hence, it would have the lexical entry in (33):

32) John knows that Fred saw Sally.

33) know, V, +[___ { [C +wh] } ]
                   { [C that] }

We might then reformulate wh-movement to require that the C to whose Spec the wh-phrase moves must be a +wh C. Hence, we can account for the ungrammaticality of (20) and (21), because believe selects a CP headed by a that-complementizer.

To account for whether, we might propose that it is a marker of a yes-no question that is generated in [ Spec, C” ] when the question is a yes-no question. We might propose, then, that whether is deleted in the Spec of a direct question, accounting for the interpretation of whether in embedded contexts, but its absence in main clause contexts.

In short, (17) should be re-formulated as (34):

(34) X”  -  [C +wh]  -  W  -  [X” +wh]
     1   -  2        -  3  -  4          -->
     4   -  2        -  3  -  0

Thus, wh-movement to the Spec of a CP that does not contain a +wh complementizer will be ruled out because the structural description of wh-movement will not be met.
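The requirement that the landing site belong to a +wh C can be made concrete with a small sketch. The Python below is only an illustration under our own assumptions: the names Factorization and wh_move, and the use of dictionaries for feature bundles, are hypothetical conveniences and not part of the theory itself.

```python
from typing import NamedTuple, Optional

class Factorization(NamedTuple):
    """The four terms of the structural description: Spec - C - W - target."""
    spec: Optional[str]      # term 1: [Spec, C''], None when the position is empty
    comp: dict               # term 2: the complementizer and its features
    middle: list             # term 3: W, the intervening material
    target: Optional[dict]   # term 4: the X'' to be moved, None once it has moved

def wh_move(f: Factorization) -> Optional[Factorization]:
    """Return the output 4 - 2 - 3 - 0, or None if the description is not met."""
    if f.spec is not None:                  # movement only targets an empty position
        return None
    if not f.comp.get("wh"):                # term 2 must be a +wh complementizer
        return None
    if f.target is None or not f.target.get("wh"):   # term 4 must be +wh
        return None
    return Factorization(f.target["form"], f.comp, f.middle, None)

# Complement of wonder: the C is +wh, so the description is met.
under_wonder = Factorization(None, {"form": "", "wh": True},
                             ["Fred", "saw"], {"form": "who", "wh": True})
# Complement of believe: the C is that, so the description is not met.
under_believe = Factorization(None, {"form": "that", "wh": False},
                              ["Fred", "saw"], {"form": "who", "wh": True})
print(wh_move(under_wonder))
print(wh_move(under_believe))
```

On this sketch, the complement of wonder in (23) undergoes movement, while the complement of believe in (20) is left untouched, because its C is that rather than +wh.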

English, however, like most (but not all) other natural languages, allows for more than one constituent to be questioned. When a multiple question occurs, such as (35), however, only one element will undergo wh-movement:

(35) Who gave what to whom?

We can see the reason for this if we consider the derivation of (35). The D-structure will be (36):

36) C”
      N”
      C’
        C
        T”
          N”  who
          T’
            T  past
            V”
              V’
                V  give
                N”  what
                P”
                  P’
                    P  to
                    N”  whom

Let us assume that the subject wh-phrase moves by wh-movement into the Spec of the matrix C”, yielding (37). This is known as a string-vacuous movement (like restructuring of have or be into T, discussed in Lecture #5), in that it changes the structure without changing the terminal string of the phrase-marker.

37) C”
      N”  Who
      C’
        C
        T”
          N”
          T’
            T  Past
            V”
              V’
                V  give
                N”  what
                P”
                  P’
                    P  to
                    N”  whom

In the case of multiple wh’s, only one can move to [Spec, C”] for the simple reason that there is only one [Spec, C”], and when it is occupied by one wh-phrase, movement of another wh-phrase to that position would cause the first wh-phrase to be irrecoverably deleted, violating Recoverability of Deletion. Recall that movement only takes place to empty positions, as we saw in the case of restructuring and N”-preposing. Hence, these transformations are obligatory, but only to the extent that their application does not violate recoverability.

C. What the feature +wh selects

A striking discrepancy exists between declarative complements and interrogative complements. We have seen that, among the set of verbs that select for declarative complements, some only select finite complements, such as say, and some only select infinitive complements, such as wait:

(38)a. John said that Sally was crazy.

b. * John said for Sally to be crazy.

(39)a. * John waited that Sally left.

b. John waited for Sally to leave.

However, whenever a verb selects an interrogative complement, the complement can always be either finite or non-finite.

40) a. I inquired as to whether or not to leave.

b. I inquired as to whether or not I should leave.

41) a. I asked him whether or not to leave.

b. I asked him whether or not I should leave.

42) a. He knew what to do.

b. He knew what he should do.

We can account for this by assuming, as we have, that when A selects for B, A is selecting for the head of B, and the head of B imposes its own selectional restrictions. Hence, selection is a head-to-head phenomenon. With this in mind, let us assume that the complementizer that selects for finite T, for selects for non-finite T, and +wh simply selects for T, and doesn’t care about whether or not T is finite. Hence, it would allow either finite or non-finite complements. Because selection is simply for the head of a sister, it would be impossible for a verb that selected a +wh complement to require that the complement be finite or non-finite, because the verb would be separated from the complement’s Tense by the intervening Complementizer.
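The head-to-head view of selection can be sketched as two small lookup tables, one for what a verb selects and one for what a complementizer selects. This is a minimal illustration, not a claim about how the lexicon is actually organized; the table names, the function licensed, and the labels "finite" and "nonfinite" are our own hypothetical choices, and the entries simply encode the examples discussed above.

```python
# Which complementizer heads each verb selects (cf. (31), (33), (38), (39)).
VERB_SELECTS_C = {
    "say": {"that"},
    "wait": {"for"},
    "wonder": {"+wh"},
    "know": {"+wh", "that"},
}

# Which values of T each complementizer selects.
C_SELECTS_T = {
    "that": {"finite"},
    "for": {"nonfinite"},
    "+wh": {"finite", "nonfinite"},   # +wh does not care about finiteness
}

def licensed(verb: str, comp: str, tense: str) -> bool:
    """A complement is licensed if each head selects the head of its sister.

    The verb only ever inspects the complementizer; it never sees T directly.
    """
    return comp in VERB_SELECTS_C[verb] and tense in C_SELECTS_T[comp]

print(licensed("say", "that", "finite"))      # (38a)
print(licensed("say", "for", "nonfinite"))    # (38b)
print(licensed("know", "+wh", "nonfinite"))   # (42a)
print(licensed("know", "+wh", "finite"))      # (42b)
```

Because the verb's table mentions only complementizers, there is no way to write an entry for a verb that takes a +wh complement but insists on a finite one, which is the gap in the paradigm noted above.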

Non-Interrogative Wh’s

We must note that there are other instances of wh-movement that do not have an interrogative interpretation. One case in point is the relative clause construction in English:

43) I’m looking for a person whom I can trust.

Interestingly enough, relative clauses can also occur as infinitives:

44) I’m looking for something on which to put this.

Following Luigi Rizzi (“Residual Verb Second and the Wh-Criterion”, in Adriana Belletti and Luigi Rizzi, eds., Parameters and Functional Heads: Essays in Comparative Syntax, Oxford University Press (1996)), we might give questions the feature +Q in addition to the feature +wh, and relative clauses the feature +Rel in addition to the feature +wh.

Lecture #12- Relative Clauses and Noun-Complement Constructions

The rule of wh-movement, discussed in Lecture #11, does not operate simply to form questions. It operates in other constructions as well. There is an interesting parallel between wh-movement and N”-preposing that shows a shift in the view of transformational grammarians toward the nature of transformations. At one point, early in the development of transformational grammar, at the time of Chomsky’s Syntactic Structures (1957, Mouton), it was thought that transformations were extremely specific, and tied to specific constructions, so that there was, e.g., a passive transformation, a subject-to-subject raising transformation, etc.

Around the 1970’s, there was a shift in thinking about the nature of transformations, such that transformations were no longer thought to be tied to specific constructions, but were stated more generally, as operations that were responsible for the generation of a wide range of constructions. A clear statement of this could be found in a 1976 paper by Fiengo & Lasnik (“Some Issues In The Theory of Transformations”, Linguistic Inquiry, Vol. 7). So, for example, the rule of N”-preposing that we have discussed operates in the derivation of the passive construction, as well as the unaccusative construction and the subject-raising constructions.

Similarly, there is not thought to be a specific transformation of question-formation, responsible for the generation of constituent questions, but rather a transformation of wh-movement, that generates constituent questions as well as other constructions in which wh-movement plays a role. One of these other constructions is the relative clause construction, exemplified in (1):

1) The man who I saw.

There are two types of relative clauses in English, known as restrictive relative clauses and non-restrictive relative clauses. The relative clause in (1) is known as a restrictive relative clause, and an example of a non-restrictive relative clause is given in (2):

2) John, who I like,

In written English, non-restrictive relative clauses are set off by commas, and in spoken English, by pauses (the intonation with pauses around the relative clauses is actually known as “comma intonation”). Semantically, the two types of relative clauses are quite different. The restrictive relative clause serves to restrict the reference of the head noun, so that in (1), the speaker is specifying more closely which man is being referred to. Non-restrictive relative clauses do not restrict the reference of the head noun, but simply provide, as a sort of side-comment, a description of some property that the head noun possesses (they are also known as “appositive relative clauses”). We will now focus on the structure of restrictive relative clauses, but we must first distinguish restrictive relative clauses from another construction in which a clause occurs within a N” that contains a lexical head noun, known as the noun-complement construction.

The Noun-Complement Construction

Every common noun in English, and indeed all natural languages, can occur with a restrictive relative clause, but certain nouns can also take a C” complement with somewhat different characteristics. These nouns include the nouns theory, story, claim, statement, rumor, belief, knowledge, and realization, among others. For example, the following noun phrases are all well-formed:

3) The { theory    } that John was the murderer
       { rumor     }
       { knowledge }
       { belief    }
       { claim     }

Notice that the clause within the C” in (3) does not contain a gap, and the C” is introduced by that rather than a wh-phrase. Notice that this type of clause within an N” is lexically restricted by the head noun in its occurrence, in that not all nouns allow this type of clause within the N”, as can be seen by the impossibility of (4):

(4) * The { pencil } that John was the murderer.
          { book   }
          { letter }

Hence, it would seem that the nouns exemplified in (3) subcategorize for the clause, and, by local subcategorization, the noun and the clause must be sisters. Hence, the structure of the N” must be as in (5):

5) N”
     Det  the
     N’
       N  theory
       C”
         C’
           C  that
           T”
             N”  John
             T’
               T  Past
               V”
                 V’
                   V  be
                   N”
                     D  the
                     N’
                       N  murderer


On the other hand, restrictive relative clauses can always occur within a N” that is headed by a common noun. In this sense, the licensing of relative clauses, which must occur within N”s, is similar to the licensing of temporals, which occur within a clause. Every simple sentence allows some kind of temporal, and the specific type is restricted by the semantic class within which the particular verb is situated, but the type of temporal that can occur within a clause is not restricted by the individual verb. Hence, temporal phrases that denote duration cannot occur within sentences headed by stative verbs:

(6) * John { knows       } French while Sally visited Fred.
           { understands }

However, this is a matter of the semantic class of stative verbs, not an individual lexical choice. A temporal can always occur in a simple sentence, although the particular temporal is restricted by the semantic context in which it occurs. Restrictive relative clauses show a similar freedom of occurrence, suggesting that the phrase-structures of temporals and restrictive relative clauses should be similar.

Interestingly enough, when a noun-complement occurs with a relative clause, the order within the N” is most naturally noun complement followed by relative clause, as in the following:

6) The theory that John is the murderer that Bill was propounding.

Furthermore, there is no upper bound on the number of restrictive relative clauses that can modify a N”:

7) The book which John wrote which you wanted to read which was on the table....

The fact that restrictive relative clauses (i) follow noun-complements in the N”, and (ii) are unbounded in number, suggests that relative clauses should be adjoined to some projection of N. There are two possibilities: (i) adjunction to N’; and (ii) adjunction to N”. The two possibilities are shown in (8):

(8) a. N”
         Det
         N’
           N’
             N
             (C”)
           C”

    b. N”
         N”
           Det
           N’
             N
             (C”)
         C”

In fact, it is possible to choose between (8)(a) and (8)(b) if we analyze numerals as determiners, as argued by Jean-Roger Vergnaud (1974, French Relative Clauses, unpublished Doctoral dissertation, MIT). Consider a relative clause such as the following:

8) Five men and three women who were similar.

We are interested in the interpretation in which the men in question are similar to the women in question. A predicate such as similar is known as a symmetric predicate (G. Lakoff & S. Peters (1969), “Phrasal Conjunction and Symmetric Predicates in English”, in D. Reibel & S. Schane, eds., Modern Studies in English, Holt, Rinehart, & Winston). Symmetric predicates require plural or conjoined subjects; hence it is impossible to say John is similar (unless it is interpreted elliptically). Hence, the conjunction must be interpreted as being base-generated. With this in mind, the structure of this noun phrase must be (9):

9) N”
     N”
       N”
         Det  five
         N’
           N  men
       and
       N”
         Det  three
         N’
           N  women
     C”
       N”  who
       C’
         C
         T”
           T’
             T  Past
             V”
               V’
                 V  be
                 A”
                   A’
                     A  similar



If relative clauses are adjoined to N’, as in (8)(a), there is no source for the second numeral, which is analyzed as a determiner by hypothesis. Hence, we have direct evidence for the adjunction to N” for relative clauses.
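Since adjunction to N” creates a new N”, the operation can reapply to its own output, which is what allows an unbounded number of relative clauses. The following is a minimal sketch of that recursion; the function name adjoin_relatives and the ASCII bracket notation (with N'' and C'' standing for N” and C”) are our own assumptions for illustration.

```python
def adjoin_relatives(np: str, clauses: list) -> str:
    """Adjoin each relative clause to the N'' built so far.

    Every step wraps the current N'' in a new N'' whose right daughter is
    the next C'', so the result is left-branching.
    """
    for clause in clauses:
        np = f"[N'' {np} [C'' {clause}]]"
    return np

print(adjoin_relatives("[N'' the book]",
                       ["which John wrote", "which you wanted to read"]))
```

Each additional clause in the list adds one more layer of N”, with no upper bound imposed by the rule itself.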

B. Relative Clauses That Are Not Introduced By a Wh-Phrase

There are restrictive relative clauses that are not introduced by a wh-phrase, an example of which is given as the title to this section. Notice, however, that all relative clauses contain a gap. Assuming that the that which introduces these relative clauses is a complementizer, the subject of the relative clause is missing in the title above. Other instances of that-relatives which contain a gap in a position other than the subject position are given in (10):

10) a. The book that I read

b. The person that John was speaking to

The process that forms the gap in that-relatives has all of the characteristics of wh-movement, with the exception that the wh-form does not occur. Interestingly enough, earlier stages of English, and some Scandinavian languages, such as Swedish, allow both the wh-phrase and the overt complementizer to occur, as in (11):

11) *The book which that he read.

One way of describing the inability of wh-phrases and overt complementizers to co-occur in Modern English would be to posit a filter on certain S-Structures (this was proposed by Chomsky & Lasnik (1977), “Filters and Control”, Linguistic Inquiry, Vol. 8). In other words, there would simply be a constraint on the sequence wh-phrase followed by an overt complementizer (this output condition is known as “The Doubly Filled Comp Filter”). In order to rescue wh-constructions from the Doubly Filled Comp Filter, either the wh-phrase or the complementizer must delete. English must allow complementizers to delete in any event, as in (12):

12) a. I believe that he left.

b. I believe he left.

We must allow wh-phrases to delete as well, when they occur in [Spec, C”]. This does not seem particularly problematic, since the content of the wh-phrase is really given by the head of the relative clause (i.e., the N” to which the relative clause is adjoined), and hence recoverability would not be violated. On the other hand, deletion of a question wh-word would violate recoverability, since it has independent semantic content.

To sum up, then, we would derive a that-relative by deleting the wh-phrase in [Spec, C”], or a wh-relative by deleting the complementizer that. Examples are given in (13):

(13) a. The book *(wh-phrase) that he read.

b. The book which *(that) he read.

We also have the option of deleting both the wh-phrase and the complementizer:

(14) The book he read.
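The interaction of free deletion with the Doubly Filled Comp Filter can be sketched by enumerating every keep-or-delete choice and discarding the one output the filter rules out. This is only an illustration under our own assumptions: the function name relative_clause_forms and the treatment of the relative clause as a flat string are hypothetical simplifications.

```python
from itertools import product

def relative_clause_forms(head: str, wh: str, comp: str, clause: str) -> list:
    """Enumerate the keep/delete choices for the wh-phrase and the complementizer.

    The Doubly Filled Comp Filter discards the one output in which both
    the wh-phrase and the overt complementizer survive.
    """
    results = []
    for keep_wh, keep_comp in product([True, False], repeat=2):
        if keep_wh and keep_comp:
            continue  # filtered: wh-phrase followed by an overt complementizer
        words = [head]
        if keep_wh:
            words.append(wh)
        if keep_comp:
            words.append(comp)
        words.append(clause)
        results.append(" ".join(words))
    return results

for form in relative_clause_forms("the book", "which", "that", "he read"):
    print(form)
```

The three surviving outputs correspond to the wh-relative, the that-relative, and the bare relative in (14), while the doubly filled form in (11) never surfaces.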

Lecture #13- Island Constraints

Most of the research in syntactic theory over the last 30 years or so has shown exactly how restricted grammars in fact are, compared to what it is logically possible for them to be. We have seen this so far in the material that we have covered, but some of the constraints on grammars may have what is known as a functional basis, in the sense that the constraints may exist not for logical reasons, but for reasons external to language. One such constraint is recoverability, which requires that no transformation apply in such a way as to render material with semantic content irrecoverable. It is clear that natural language would be much less efficient without the recoverability constraint. In this section, we will be concerned with other constraints whose external basis is much less clear, but which are nevertheless well-supported and pervasive.

J.R. Ross, in his 1967 MIT Doctoral dissertation (Constraints on Variables In Syntax), noted a paradox in that there are transformations that are seemingly unbounded, but yet cannot apply out of certain configurations. The configurations out of which these transformations cannot apply were termed islands. For example, wh-movement, discussed in the past two lectures, apparently applies over an in principle unbounded stretch of the phrase-marker, as in (1):

1) Who does John think that Fred said that Mary believed that Bill liked?

However, there are certain configurations out of which wh-movement cannot occur. If a clause is contained within an N” that has a lexical head noun, wh-movement cannot apply out of that clause. For example, wh-movement cannot occur out of a noun-complement into a main clause, as in (2), or out of a relative clause into a main clause, as in (3):

2) * Who does John believe the claim that Mary likes?

3) *Who did John visit the man who saw?

So far, we have been looking at only one transformation that applies over an apparently unbounded distance. Another such transformation is the movement rule of topicalization, which can be seen to be operative in (4) and (5):

4) John I really like.

5) John I can’t believe that anybody likes.

It is clear that topicalization is a different transformation than wh-movement. For one thing, topicalization moves the N” that is topicalized to a position in the phrase-marker that is distinct from [Spec, C”]. Wh-phrases never follow a complementizer, but topicalized phrases do, as pointed out in Baltin (1982) (“A Landing Site Theory of Movement Rules”, Linguistic Inquiry, Vol. 13, No. 1):

6) John said that this book, he really likes.

As I had also pointed out, topicalized elements can also follow fronted wh-phrases, as in (7):

7) He’s a man to whom liberty, we could never grant.

I will adopt the analysis of topicalization in Baltin (1982), in which topicalized elements adjoin to T”. With this in mind, notice that topicalized elements show the same restriction as the one exemplified by wh-movement in (2) and (3):

8) *This book I can’t believe the claim that anybody likes.

9) *This book I saw the man who read.

Therefore, Ross argued that the relevant restriction should not be stated as a condition on particular transformations, but rather as a separate constraint on all transformations. It was stated as follows:

10) Complex NP Constraint

No transformation can move an element out of a C” that is contained within an N” that has a lexical head noun to a position out of that N”.

We can see how the Complex NP Constraint operates to block, e.g. (2). The underlying structure of (2) would be (11):

11) C”
      N”
      C’
        C
        T”
          N”  John
          T’
            T  Pres
            V”
              V’
                V  believe
                N”
                  Det  the
                  N’
                    N  claim
                    C”
                      C’
                        C  that
                        T”
                          N”  Mary
                          T’
                            T  Pres
                            V”
                              V’
                                V  like
                                N”  who

The N” headed by claim, the object of believe, counts as a complex NP by Ross’s definition, and hence extraction out of it is impossible.
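The way the Complex NP Constraint inspects a derivation can be sketched as a check on the nodes that a moved element crosses on its way up the tree. The sketch below is a simplification under our own assumptions: the function name violates_cnpc is hypothetical, only maximal projections are listed, and N'' and C'' are ASCII stand-ins for N” and C”.

```python
def violates_cnpc(path: list) -> bool:
    """Check a movement path against the Complex NP Constraint.

    path lists the maximal projections crossed by the movement, from the gap
    upward. Each entry is a (category, lexical_head_noun) pair, where the
    noun is None if the node has no lexical head noun.
    """
    crossed_clause = False
    for category, head_noun in path:
        if category == "C''":
            crossed_clause = True
        elif category == "N''" and head_noun is not None and crossed_clause:
            return True   # moved out of a C'' inside a lexically headed N''
    return False

# (2) *Who does John believe the claim that Mary likes?
path_2 = [("V''", None), ("T''", None), ("C''", None),
          ("N''", "claim"), ("V''", None), ("T''", None)]
# Who does John think that Bill liked? (no N'' above the embedded C'')
path_ok = [("V''", None), ("T''", None), ("C''", None),
           ("V''", None), ("T''", None)]
print(violates_cnpc(path_2))
print(violates_cnpc(path_ok))
```

Note that the check is stated over paths rather than over wh-movement specifically, so the same function applies unchanged to topicalization, in keeping with Ross's point that the constraint holds of all transformations.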

There are other island constraints, and we shall now go through them.

A. The Coordinate Structure Constraint

Consider a coordination such as (12):

12) John gave a book to Bill and Mary gave a magazine to Fred.

It is impossible to wh-move an N” within just one of the conjuncts, as in (13):

13) * I wonder what John gave to Bill and Mary gave a magazine to Fred.

It is possible, on the other hand, to extract from both conjuncts, as in (14):

14) I wonder what John gave to Bill and Mary gave to Fred.

Extraction from all of the conjuncts simultaneously is called Across-The-Board extraction (see Edwin Williams (1978) “Across-The-Board Rule Application”, Linguistic Inquiry, Vol. 9 for a clear account of this phenomenon). It would operate as follows in (14). The underlying structure of (14) would be (15):

15) C”
      C
      T”
        N”  I
        T’
          T  Pres
          V”
            V’
              V  wonder
              C”
                N”
                C’
                  C
                  T”
                    T”
                      N”  John
                      T’
                        T  Past
                        V”
                          V’
                            V  give
                            N”  what
                            P”  to Bill
                    and
                    T”
                      N”  Mary
                      T’
                        T  Past
                        V”
                          V’
                            V  give
                            N”  what
                            P”  to Fred

and both wh-phrases will move into [Spec, C”] simultaneously, with recoverability not being violated because they are identical.

With this in mind, the Coordinate Structure Constraint is stated in (16):

16) Coordinate Structure Constraint

No element can be extracted from just one conjunct of a coordinate structure.
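The contrast between (13) and (14) can be sketched as a check over the conjuncts of a coordinate structure. This is a minimal illustration; the function name extraction_allowed is hypothetical, and treating extraction from some but not all conjuncts as the violation is our own generalization of (16) to coordinations with more than two conjuncts.

```python
def extraction_allowed(extracted_per_conjunct: list) -> bool:
    """Check an extraction pattern against the Coordinate Structure Constraint.

    extracted_per_conjunct holds, for each conjunct, the phrase extracted
    from it, or None if nothing is extracted from that conjunct.
    """
    extracted = [x for x in extracted_per_conjunct if x is not None]
    if not extracted:
        return True    # nothing is extracted at all
    if len(extracted) < len(extracted_per_conjunct):
        return False   # extraction from only some of the conjuncts
    # Across-the-board extraction: the moved phrases must be identical,
    # so that collapsing them into one [Spec, C''] is recoverable.
    return len(set(extracted)) == 1

print(extraction_allowed(["what", None]))     # (13)
print(extraction_allowed(["what", "what"]))   # (14)
```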

B. The Subject Condition

The Subject Condition may be formulated as follows:

17) Subject Condition

No element may be extracted from within a subject.

To see an example of the subject condition, consider the fact that wh-extraction can at times operate out of N”s in object position, but not in subject position, as can be seen in the contrast in (18):

18) a. Who did you see a picture of?

b. * Who was a picture of seen?

C. The Right Roof Constraint

This constraint reflects an asymmetry between leftward movement rules, such as wh-movement, and rightward movement rules, such as extraposition. As we have seen, leftward movement rules can move elements leftward out of the clauses in which they originate. Rightward movement rules can never move elements out of the clauses in which they originate. To see this, consider extraposition of sentential complements, discussed earlier:

19) a. That John has blood on his hands proves nothing.

b. It proves nothing that John has blood on his hands.

Now, we notice that we can have an underlying sentential subject within a sentential subject, as in (20):

20) That it is obvious that John has blood on his hands proves nothing.

In (20), the underlying subject of obvious has been extraposed to the end of the matrix sentential subject, but it cannot extrapose to the end of the matrix clause:

21) * That it is obvious proves nothing that John has blood on his hands.

The underlying structure of (20) is (22):

22) C”0
      C
      T”
        N”
          N’
            N
            C”1
              C  That
              T”
                N”
                  N’
                    N
                    C”2
                      C’
                        C  That
                        T”  John has blood on his hands
                T’
                  T  Pres
                  V”
                    V  be
                    A”
                      A’
                        A  obvious
        T’
          T  Pres
          V”
            V’
              V  prove
              N”  nothing

The Right Roof Constraint as stated by Ross is as follows:

23) Right Roof Constraint

No element can be moved rightward out of a clause in which it originates.
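The asymmetry that the Right Roof Constraint encodes is simple enough to state as a one-line check. The sketch below is only illustrative; the function name violates_right_roof and the clause labels C0, C1, C2 (standing for C”0, C”1, C”2) are our own assumptions.

```python
def violates_right_roof(direction: str, origin_clause: str, landing_clause: str) -> bool:
    """True if a rightward movement lands outside the clause it originates in.

    Leftward movement is never ruled out by this constraint.
    """
    return direction == "right" and landing_clause != origin_clause

# (20): the complement of obvious originates in C1 and lands at the end of C1.
print(violates_right_roof("right", "C1", "C1"))
# (21): the same complement lands at the end of the matrix clause C0.
print(violates_right_roof("right", "C1", "C0"))
# Leftward wh-movement out of an embedded clause is unaffected.
print(violates_right_roof("left", "C2", "C0"))
```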

We will examine some implications of the Right Roof Constraint. One implication that we can see is that it enables us to choose between the analysis of restrictive relative clauses in which they are adjoined to a phrasal projection of N, and one in which they are not. In the last lecture, we discussed the analysis of restrictive relative clauses such as Lecture #12’s (7), repeated here as (24):

24) The book which John wrote which you wanted to read which was on the table....

Such relative clauses were analyzed as being left-branching, with the structure given in (25):

25) N”
      N”
        N”
          N”  The book
          C”  which John wrote
        C”  which you wanted to read
      C”  which was on the table

Another alternative, suggested by Stefan Benus, is based on the fact that relative clauses can extrapose, as in (26):

26) A man arrived who came from Boston.

Given the existence of a process which moves restrictive relative clauses to the ends of the clauses that contain them, we can argue that sequences of restrictive relative clauses within the N” are really such that each relative clause is contained within the relative clause that appears to its left, so that a more abstract structure for, e.g., (27) would be (28):

(27) Someone who likes Mary who Fred likes arrived.

(28) C”0
       C
       T”
         N”
           N’
             N  Someone
             C”1
               N”
                 N’
                   N  who
                   C”2
                     N”  who
                     C’
                       C
                       T”
                         N”  Fred
                         T’
                           T  Pres
                           V”
                             V’
               C’
                 C
                 T”
                   T  Pres
                   V”
                     V’
                       V  likes
                       N”  Mary
         T’
           T  Pres
           V”
             V’
               V  arrived



The structure for (27) would be derived by extraposing C”2 to the end of C”1. In this view, sequences of relative clauses are derived by positing structures in which the later relative clauses are contained within the earlier ones.

However, (27) has (29) as a variant:

29) Someone who likes Mary arrived who Fred likes.

If (28) were the correct structure for (27), (29) would have to be derived by extraposing C”2 to the end of C”0. However, this application of extraposition would violate the Right Roof Constraint, otherwise well-motivated. Hence, if we assume the Right Roof Constraint, we must allow the second relative clause to be dominated by the matrix clause, rather than the first relative clause. In short, we must assume the possibility of “stacked” relative clauses, as in (25), rather than assuming that the only source for sequences of restrictive relative clauses is one in which the second relative clause is embedded within the first one.


[1] A question mark before a sentence indicates that the sentence is of dubious acceptability, while an asterisk indicates ungrammaticality. The point of this section, however, is that acceptability is a pre-theoretic notion, having to do with how we feel about certain strings, while grammaticality is a post-theoretic notion, having to do with whether or not a certain string is generated by the grammar. We cannot decide whether or not a certain string is generated by the grammar until we have constructed the grammar, however, and it is in this sense that grammaticality is post-theoretic, since the account that we are constructing, a grammar, is a theory of what it means to know a language.

[2] We will discuss S in the next lecture.

[3] At this point, note that we have the same disjunction, {V”}, in two separate phrase-structure rules. There is a way to eliminate this, but we will not go into it at this time.

[4] We will return to the question of whether anything is left behind when a head moves out of its maximal projection.

[5] There is a reason that I am using the term post-verbal NP rather than object, in that I will be trying to show in Lecture #10 that there are post-verbal NPs that are not objects, but which participate in the passive construction.



