Lexically Specified Derivational Control in Combinatory Categorial Grammar

Jason Baldridge

Abstract

1. Introduction

The central thesis of this dissertation is that an explanatory theory of natural language grammar can be based on a categorial grammar formalism which allows cross-linguistic variation only in the lexicon and has computationally attractive properties. To back up this thesis, I present Multi-Modal Combinatory Categorial Grammar, a refinement of the Combinatory Categorial Grammar (CCG) framework (Steedman 2000), and apply it to several phenomena from typologically diverse languages. There are three primary goals of this work: first, to boost the predictive power and explanatory force of the CCG theory by enhancing its sensitivity to the resources it manipulates and consumes; second, to provide new accounts for linguistic phenomena, improved adaptations of existing analyses, and cross-linguistic comparisons; and finally, to demonstrate some of the advantages of the resulting formulation of CCG for computational implementations. In this abstract, I outline and discuss the theses behind these goals and provide an overview of the dissertation.

2. Theses Proposed

The primary linguistic focus of this dissertation is a detailed examination of two core types of behavior in natural language grammar: syntactic extraction asymmetries and scrambling behavior. The former is characterized by situations in which particular arguments in a sentence are unsuitable targets for extraction in certain contexts; that is, it is not possible to use these arguments in forming questions, relative clauses, topicalized sentences and the like. For example, the well-known subject/object asymmetry of English appears in embedded clauses such as the following:

(1) *Brazil is the team_i that John knew that t_i would beat Germany.
(2) Germany is the team_i that John knew that Brazil would beat t_i.
We see in (1) that extraction of the subject from the embedded clause to form the relative clause is ungrammatical, whereas the object is accessible for extraction, as shown in (2). Unlike the situation with many island violations, there is nothing semantically incoherent about a relative clause such as that in (1), and the grammar apparently disallows it for entirely syntactic reasons. Perhaps the majority of languages exhibit greater flexibility in word order than English, some to a greater extent than others. For example, languages like Czech, Modern Greek, Russian, Turkish, Korean, and Tagalog all permit the same propositional content to be conveyed with multiple word orders in which the arguments of verbs can permute with respect to one another. Word order freedom is even greater in a language such as Warlpiri, which even permits parts of a single noun phrase to permute with other elements in a sentence. Paying attention to these two core phenomena, we thus observe a basic tension in natural language grammar: sometimes it blocks perfectly sensible meanings from being expressed in certain ways which at first glance appear to be arbitrary, and sometimes it allows a single meaning -- modulo information structure -- to be expressed in multiple ways. As we will see with Tagalog, these restrictions and freedoms can co-exist in the grammar for a single language. This leads naturally to the question of how we can define a theory of grammar which is able to rule out examples such as (1) whilst having the flexibility to permit multiple word orders in other contexts. The following thesis addresses this question. Thesis 1. A resource-sensitive approach which distinguishes multiple modes of grammatical composition is necessary to adequately characterize both the restrictiveness and the freedom exhibited by natural language grammars. 
The task of almost any formal system is to apply some group of operations to collections of structured objects, or resources, in order to determine some global properties about each collection. Resource-sensitivity is a notion that governs the manner in which a system's operations can utilize its resources: how often they may be used, how they can be assembled together to create larger structures, and how they can be reconfigured into other equivalent structures. Resource-sensitivity is nicely exemplified in Linear Logic (Girard, 1987), which counters the lack of discrimination inherent in many logical systems, such as classical propositional logic, in which a single proposition may be used multiple times or wasted in proving a result. Linear Logic gets a grip on resource consumption by employing an implicational connective whose associated rule consumes the resource that is needed to prove the consequent. These logical concerns have direct parallels in natural language grammar. Clearly, the multiplicity of linguistic material is important, since linguistic elements must generally be used once and only once during an analysis. Thus, we cannot ignore or waste linguistic material (3,4), nor can we indiscriminately duplicate it (5).

(3) *The coach smiled the ball. (≠ The coach smiled.)
(4) *The fans the coach cheered. (≠ The fans cheered.)
(5) *Ronaldo passed the ball to. (≠ Ronaldo passed the ball to himself.)

Linear Logic does permit a single resource to be used in multiple proof steps through the rule of Contraction and resources to be wasted via the rule of Weakening. However, there is an important difference in that these rules are restricted to apply only to resources of the appropriate types and are thus not globally available, unlike the case in Classical and Intuitionistic Logics. 
Resources themselves are designed so as to invoke only a subset of the available rules, making it possible to capture the distinction between resources such as money and love: money gets used up when it is given but love can be spread around infinitely. Linear Logic thus shifts to a perspective in which logics with different behaviors can co-exist and operate over the same set of premises without stepping on each other's feet. The issue of resource consumption is a fundamental basis of resource-sensitivity, and most formal systems for natural language grammar do indeed respect the dictum that resources may not be indiscriminately wasted or used multiple times. However, a resource-sensitive system must also pay attention to the arrangement of its resources --- how they are ordered linearly and hierarchically and the means by which they may have been combined. It is hardly surprising that linear order matters for natural language since it is the only aspect of syntax to which we have direct access. Formal grammar systems thus typically respect the importance of order and thereby ensure that sentences with the same lexical material but different word orders do not necessarily have the same analytical properties. Indeed, most syntacticians would be rather suspect of any system that could not differentiate the strings "Brazil defeated Germany" and "defeated Brazil Germany" in English. To continue with the theme of viewing the properties of natural language grammar through a logical lens, we can consider building a logical system that uses directional implications and treats lexical items as proof terms, as is done in the Categorial Type Logic (CTL) tradition of categorial grammar (Morrill, 1994; Moortgat, 1996; Oehrle, to appear). 
The task of the grammar is then to find a proof that some set of axioms (in the form of items retrieved from the lexicon based on a given sentence or string) can be arranged in a manner that gives rise to the correct order and has the appropriate resultant properties (syntactic category, semantics, etc.). CTL provides sensitivity to much more than linear order --- it also permits the definition of multiple modes of grammatical composition which each have their own associated connectives. Operations which restructure the premises are keyed to particular modes so that they are not globally applicable. This is similar to the restricted use of Contraction and Weakening in Linear Logic, except that CTL allows a far wider range of rules to be defined. This allows one to use different kinds of implicational operators, each exhibiting its own unique behavior. Some might permit associative or permutative restructuring of the premises, while others might have more limited capabilities. The premises themselves are constructed using these keyed connectives, endowing the system with what Oehrle (to appear) calls self-contained inferential control. This means that instead of acting as absolute and global choices, parametric options regarding the way in which a set of premises can be restructured are selectively invoked via the appropriate type-declarations in the premises. It is precisely this aspect of resource-sensitivity which is least obvious and which is the crux of Thesis 1. It allows us to get a precise handle on the ability of different parts of the syntax to have access to rules which induce associativity and permutativity, and this plays a major role in this dissertation's account of why syntactic extraction asymmetries arise. Having an explicit resource-management regime is also crucial for defining a system that is liberal enough to permit word order variation without needing ad hoc constraints to ensure that it does not fall into word order collapse. 
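The proof-search view described above can be illustrated in its simplest, application-only form. The following is a minimal sketch, not the dissertation's system: the class names, the toy lexicon, and the `derives` function are all assumptions introduced for illustration. It shows how directional implications make the analysis sensitive both to linear order and to the requirement that every lexical resource be used exactly once.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Slash:
    result: "Cat"
    direction: str  # "/" seeks its argument to the right, "\\" to the left
    arg: "Cat"


Cat = Union[Atom, Slash]

S, NP = Atom("s"), Atom("np")

# Hypothetical toy lexicon, assumed only for this sketch.
LEXICON = {
    "Brazil": {NP},
    "Germany": {NP},
    "defeated": {Slash(Slash(S, "\\", NP), "/", NP)},  # (s\np)/np
    "smiled": {Slash(S, "\\", NP)},                    # s\np
}


def combine(left, right):
    """Forward and backward application over two adjacent categories."""
    out = set()
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        out.add(left.result)
    if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
        out.add(right.result)
    return out


def derives(words, goal=S):
    """CKY-style check that the whole string reduces to the goal category."""
    n = len(words)
    if n == 0:
        return False
    chart = {(i, i + 1): set(LEXICON[w]) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):
                for l in chart[(i, j)]:
                    for r in chart[(j, k)]:
                        cell |= combine(l, r)
            chart[(i, k)] = cell
    return goal in chart[(0, n)]
```

Under these assumptions, `derives("Brazil defeated Germany".split())` succeeds, while `derives("defeated Brazil Germany".split())` fails on order and `derives("Brazil smiled Germany".split())` fails because one noun phrase would be left unconsumed.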
To exploit sensitivity to structural arrangement in CCG, I use the category constructors of CTL and redefine the rules of CCG so that they respect the modes of grammatical composition licensed by the categories they attempt to combine. In this manner, CCG gains the ability to utilize lexically specified derivational control, the implications of which are explored extensively in this dissertation. Thesis 1 regards general architectural considerations that must be supplemented with investigations into specific patterns of natural language. One phenomenon which I consider in this dissertation is the kind of word order variation known as scrambling, which can occur in both local and long distance contexts. Locally scrambled arguments are those which are dependents of a single head that can permute with respect to one another, whilst an argument which has scrambled long distance is found not in the domain of its own head, but in that of another head. For example, the following Turkish sentences, in which the subject and object arguments can permute, convey the same basic propositional content:

(6) Ayse kitabi okuyor
    Ayse book read
    "Ayse reads the book."
(7) Kitabi Ayse okuyor
    book Ayse read
    "Ayse reads the book."

With long distance scrambling, we find an argument of a lower clause appearing higher up, as in the following example:

(8) Esra'nin_i Fatma [t_i gittiğini] biliyor.
    Esra Fatma left know
    "As for Esra, Fatma knows that she left."

The question thus arises as to whether local and long distance scrambling should be accounted for with the same or different kinds of grammatical mechanisms. The position I take is that these are fundamentally different processes, as stated by the following thesis.

Thesis 2. Local scrambling behavior results when heads subcategorize for their arguments in a manner which does not specify an explicit linear order of combination. 
Long-distance scrambling arises instead as a reflex of the interaction between lexical subcategorization and the rules made available by the resource-sensitive system.

The generative power of grammatical formalisms is of interest in many traditions. The grammar of natural languages is recognized to require at least mildly context-sensitive power (Huybregts, 1984; Shieber, 1985), and it has been argued that long distance scrambling requires more power than this out of the competence grammar (Rambow, 1994; Hoffman, 1995). The multi-modal formulation of CCG provided in this dissertation remains mildly context-sensitive (like CCG) and is nonetheless able to handle long distance scrambling to the level that appears to correspond with the amount of scrambling which native speakers tolerate.

Thesis 3. Mildly context-sensitive generative power is sufficient for handling long distance scrambling.

By using a system with limited generative power, many linguistic predictions come for free since the system simply cannot perform a wide range of potential operations. Nonetheless, we should not be absolutely stuck with a mildly context-sensitive formalism if we do eventually need more power. The multi-modal formulation provides the means to increase the power of the system in a highly controlled fashion such that more powerful operations are used only by grammars that need them, only when they need them, and without precipitating a collapse in word order. Having said this, I strongly contend that we should resort to more powerful formulations only with great skepticism in the face of overwhelming evidence for their necessity. Categorial grammar in general is an extremely lexicalist tradition, but it is nonetheless standardly assumed in most categorial formalisms that variation between the grammars of different languages can occur not only in the lexicon, but also in the rules of combination. 
The multi-modal approach I adopt in this dissertation facilitates the creation of an invariant rule component that permits me to take a fully lexicalist position. This leads to the following thesis: Thesis 4. It is possible and desirable to define a framework in which all variation between grammars is specified in the lexicon. While I do not wish to claim that a parametric view on grammatical rules is inherently flawed, this thesis acts as a handcuff that leads to interesting observations about how a given lexicon will exploit a universal set of rules and simplifies the task of the grammar developer over formulations that assume that rules have a parametric nature. This dissertation demonstrates that a great deal of mileage can be obtained from a relatively small, invariant rule component that is sensitive to the grammatical resources that it combines. One of the most important effects of this perspective is that it permits a simple characterization of how syntactic extraction asymmetries arise cross-linguistically, as summarized in the following thesis: Thesis 5. Syntactic extraction asymmetries emerge in grammars which enforce limits on local and/or long distance scrambling by utilizing lexical categories that are inaccessible to syntactic operations which induce associativity and/or permutativity. The strategy of removing all variation from the rule-base places increased demands on the lexicon. It is thus important that generalizations can be expressed so that redundant information can be shared between categories. I therefore adopt the approach put forth by Villavicencio (2002) for permitting the categorial lexicon to be structured via an inheritance hierarchy of typed default feature structures (Pollard and Sag, 1987). Even without an invariant rule component, such a view of the lexicon is needed in CCG. 
Parsing in CCG is generally construed as the application of a finite set of combinatory rules to the categories licensed by the input and created from previous applications of the rules. Lexical ambiguity is a major factor in reducing the speed of parsing. CCG has traditionally permitted its rules to be restricted in their applicability to only apply or not apply to certain categories. When using such rules in parsing, computational overhead is incurred as the input categories must be checked for compatibility with the restrictions.

Thesis 6. The multi-modal formulation can be exploited to improve implementations of CCG.

Multi-Modal CCG helps in two ways. First, it is possible to use one category in situations where otherwise several categories would be required (e.g. in languages with scrambling). Second, by providing modally aware formulations of the combinatory rules and disallowing restrictions on those rules, specialized implementations of the rules can be created which scan the input categories and fail much more quickly than is possible with restrictable CCG rules.

3. Overview of Dissertation

In Chapter 2, "Formal Foundations", I begin by describing categorial grammars. Categorial grammars provide a type-driven perspective on natural language grammar that maintains a tight connection between syntactic and semantic composition. They are precisely defined, permit flexible surface constituency, are semantically transparent, and are at the center of a growing body of linguistic work. This chapter introduces the basic concepts behind categorial approaches, such as syntactic categories, semantic interpretation and rules of category combination, and it then gives greater detail for formalisms and traditions that the core categorial perspective has given rise to. 
Specifically, we consider Combinatory Categorial Grammar (CCG) (Steedman, 2000), Multiset Combinatory Categorial Grammar (Multiset-CCG) (Hoffman, 1995), and Categorial Type Logic (CTL) (Morrill, 1994; Moortgat, 1996; Oehrle, to appear), all of which play a major role in the approach developed in this dissertation. CCG provides the most important backdrop, whilst Multiset-CCG and CTL point toward ways of relaxing and fine-tuning, respectively, grammatical composition in categorial grammar. The generative power of the various frameworks is then discussed with respect to the linguistic significance they attach to restricted generative capacity. The chapter finishes with a brief look at the dependency grammar tradition of Functional Generative Description (FGD) (Sgall et al., 1986). Throughout the primarily linguistic parts of this dissertation, I make use of FGD's dependency relations for different kinds of arguments as a descriptive device to obviate the need to explicitly show logical forms for categorial derivations whilst demonstrating that the correct dependencies are obtained by the linguistic analysis. I then turn to issues regarding the creation of a linguistic theory within a categorial approach in Chapter 3, "Substantive Universals". I outline an initial approach to a theory of lexical categories based on distinctions made in the Government & Binding (Chomsky, 1981) and Minimalist traditions (Chomsky, 1994) and the typed feature structures of Head-driven Phrase Structure Grammar (Pollard and Sag, 1994). I also consider approaches for providing structure to the lexicon and expressing relationships between the objects stored within it. The final section of this chapter explicates the principles which guide the form of combinatory rules in CCG and discusses how restrictions can be placed on any given rule in standard CCG analyses. 
In Chapter 4, "Linguistic Motivation", I explicate the linguistic data which motivates the formal developments made in the dissertation. I begin with the English subject/object asymmetry in extraction from embedded clauses and discuss the CCG explanation of the asymmetry due to Steedman (1996). After this, I turn to the striking extraction asymmetries found in the Austronesian languages Tagalog and Toba Batak and describe some of the proposals that have been put forth to explain their distribution. Then, I discuss local and long distance scrambling in Turkish and different manners of handling such variability in categorial approaches, especially that of Multiset-CCG (Hoffman, 1995). Finally, I show that there is a need for limitations on permutativity even in languages with a great deal of word order freedom like Turkish. Chapter 5, "Modal Control in CCG", explicates how CCG's resource-sensitivity can be boosted by incorporating the multi-modal perspective on grammatical composition familiar from CTL. I show how this provides fine-grained lexical control over the use of CCG's combinatory rules and permits me to dispense with restrictions on those rules. As such, I can claim a universal rule component for CCG and bring back the use of rules which were previously excluded from some grammars. Several aspects of English syntax are dealt with under this formulation and it is shown that many improvements can be made over the prior CCG analyses by using resource-sensitive rules. I then demonstrate that Steedman's analysis of Dutch (Steedman, 2002) can be significantly improved by recasting it in Multi-Modal CCG. Following that, the next section develops the argument that the CCG rule set should be universally available by showing that certain combinatory rules are interconnected and cannot be arbitrarily activated or inactivated. Finally, I show that the multi-modal formulation of CCG has the same generative power as the original formalism. 
In Chapter 6, "A Restricted Approach for Argument Scrambling", the definition of Multi-Modal CCG is completed by including multisets in the category constructors and rules, based in part on developments by Hoffman (1995) for Multiset-CCG. To motivate the use of multisets in categories, I demonstrate their use for local argument scrambling in Turkish. The need for resource-sensitivity in an approach which uses multisets is demonstrated with respect to limits on permutativity for some constructions in Turkish and for English phrasal verbs and adverb placement. Thereafter, I show how the system deals with long distance argument scrambling without suffering from some of the overgeneration that the less discriminating and more powerful Multiset-CCG produces. Finally, I show that Multi-Modal CCG as defined in this chapter is mildly context-sensitive, like CCG. Having thus motivated and developed Multi-Modal CCG, Chapter 7, "Syntactic Extraction Asymmetries in Tagalog and Toba Batak", demonstrates how the modal control available in the grammar combined with the proposed categories conspire to explain the observed asymmetries. It is also shown how Multi-Modal CCG permits a simple account of local scrambling in these languages -- not only without confounding the account of asymmetries, but at times even supporting it, in contrast with some previous approaches. The analysis given in Chapter 7 provides the most extensive account of Tagalog's asymmetries to date. In combination with the analysis of the English subject/object asymmetry by Steedman (1996) and a further analysis in Chapter 7 of asymmetries in Toba Batak, I explicate a cross-linguistic characterization of the appearance of asymmetries. 
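The use of multisets for local scrambling, as outlined for Chapter 6, can be pictured with a small sketch. Everything here is an assumption made for illustration: the tuple encoding of categories, the atomic labels `np_nom` and `np_acc`, the toy category for the verb, and the restriction to verb-final clauses with leftward arguments. It is not the category definition used in the dissertation. The point is only that a head whose arguments sit in a multiset can consume them in any order, while still consuming each exactly once.

```python
from collections import Counter

# Hypothetical toy lexicon. The verb is (result, multiset of leftward arguments),
# which leaves the relative order of its two arguments unspecified.
LEXICON = {
    "Ayse": "np_nom",
    "kitabi": "np_acc",
    "okuyor": ("s", Counter({"np_nom": 1, "np_acc": 1})),
}


def apply_backward(arg, functor):
    """Consume one matching argument from the functor's multiset, if present."""
    result, args = functor
    if args[arg] == 0:
        return None
    # Counter subtraction drops entries whose count reaches zero.
    return (result, args - Counter([arg]))


def derives_verb_final(words):
    """Check that a verb-final clause reduces to s with no arguments left over."""
    *dependents, verb = words
    cat = LEXICON[verb]
    for w in reversed(dependents):  # combine with the nearest argument first
        cat = apply_backward(LEXICON[w], cat)
        if cat is None:
            return False
    result, remaining = cat
    return result == "s" and not remaining
```

With this toy lexicon, both `derives_verb_final("Ayse kitabi okuyor".split())` and `derives_verb_final("kitabi Ayse okuyor".split())` succeed, whereas a clause that omits or repeats an argument does not reduce to `s`. The sketch deliberately leaves out modalities on the arguments, which the surrounding text identifies as the means of limiting this freedom.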
Chapter 8, "Implementation of Multi-Modal CCG", begins by reviewing previous work in creating grammars and parsers for CCG, followed by a discussion of how I have adapted the Grok system (Hockenmaier et al., 2001; Bierner, 2001) to support the data structures and properties of Multi-Modal CCG. Grammars have been implemented based on the linguistic analyses given in this dissertation for English, Dutch, Turkish, and Tagalog, and I highlight some of the properties of these grammars and discuss how the process of developing them not only ensured the correctness of the analyses but also led to interesting linguistic observations in some cases.

In summary, this dissertation provides:

* Suggestions for and discussions of substantive universals from the categorial perspective.
* Multi-Modal CCG, a formalism which has a strict resource-management regime and permits variation only in the lexicon.
* Linguistic application of this formalism to English, Dutch, Turkish, Tagalog, and Toba Batak, including a cross-linguistic analysis of the appearance of syntactic extraction asymmetries in English, Tagalog, and Toba Batak.
* Computational implementation of the developed formalism.

4. Overview of Results

The dissertation focuses on a number of aspects of the CCG framework that until now have not been addressed or have proven problematic. Though I have dubbed the framework put forth in this dissertation Multi-Modal Combinatory Categorial Grammar, it should be noted that this is not intended to put it in opposition to CCG as it has been defined and utilized in previous work. Rather, this should be viewed as an evolution of the overall CCG approach, based on formal devices made available in CTL and Multiset-CCG. Multi-Modal CCG merges the combinatory basis of CCG with the principled resource-management regime of CTL. 
The restricted generative capacity of the combinatory basis leads to cross-linguistic predictions about the space of natural language grammars, and resource-management through modalities allows tighter control within that restricted basis (Thesis 1). Whereas CTL provides a very general formal theory for the modal approach to resource-management, Multi-Modal CCG makes commitments about the actual modes and rules which are utilized in all grammars. This dissertation shows that these commitments lead to cross-linguistic predictions and correlations that are stronger than those available under CCG. The key aspect of Multi-Modal CCG that makes this result possible is its universal rule component, which leaves all variation to the lexicon (Thesis 4). I have proposed a small set of modalities and defined how they are used in conjunction with the CCG rules, showing how these particular modalities and rule formulations positively impact the analysis of linguistic phenomena in English, Dutch, Turkish, Tagalog, and Toba Batak. The modalities I have proposed might be subject to revision in the future, but the linguistic analyses provided in this dissertation have demonstrated that we must use some set of modalities to control the applicability of combinatory rules. One possible expansion of the use of modalities is to associate arrays of modalities with every slash as a way of cleanly separating modes controlling associativity and permutativity from those of other dimensions, such as headedness, the active versus inert distinction, or the lexically connected versus derivationally connected distinction. If we wish to encode all of these distinctions but are forced to do so with a single modality on each slash, the relationships between the various modalities will become considerably more complex than an approach which factors their behaviors into separate dimensions. 
By implementing a multi-modal perspective in CCG, I hope to bring the logical and rule-based categorial traditions closer together. Indeed, the analyses provided in this dissertation should transfer fairly straightforwardly to the CTL setting, using the CTL rules that I have defined. This may require either using basic feature structures in CTL or modifying the categories so that features are declared as unary modalities. Also, categories defined with multisets would need to be expanded into all of their rigid instantiations. Nonetheless, the core properties of the analyses arise from the categories and the modes of grammatical composition they license, and therefore the two systems will demonstrate the same behaviors for the same reasons on this more fundamental level. Based on the relationship between Multi-Modal CCG and the CTL rules I have defined to ground those rules, one can furthermore envision a general technique for translating CTL grammars into rule-based instantiations, essentially by running the proof system to create a set of theorems which define the rule base. Such translations would lack the full logical power and generality of CTL, but they could prove a useful avenue for more efficient parsing with CTL grammars. Another major formal contribution of this work has been to improve CCG's ability to deal with flexible word orders by incorporating the multiset definition of categories into CCG. I have done so in a way that does not increase the generative power of the system, thus providing a kind of de-extension of Hoffman's Multiset-CCG. The ability of Multi-Modal CCG to handle limited, but not unbounded, levels of long distance scrambling provides further support to the arguments of Joshi et al. 
(1996) and Kulick (2000) that limits on scrambling can and should be considered as a property of linguistic competence, in contrast with the viewpoint put forth by Rambow (1994) and Hoffman (1995), who claim that such limitations lie in the domain of performance (Thesis 3). It was also shown that the flexibility provided by categories with multisets can be used in languages like English and Toba Batak without engendering too much freedom in word order. Using very restricted modalities on the arguments in the multisets greatly limits their long distance combinatory potential while providing the necessary local permutativity. Multi-Modal CCG thus provides a single system that can be utilized for configurational languages such as English and flexible ones like Turkish, catering to the individual needs of each with a small set of universal modalities. The resource-sensitivity of Multi-Modal CCG makes it possible to make the rules inviolable and leave all variation to the lexicon (Thesis 4). This reduction in the parametricity of the rule-base is a principled one. Under the restrictional setting previously assumed in CCG, it was possible to restrict parts of the rules to very specific categories and/or block certain categories from serving as input to the rule. This led to a situation in which the same restriction often needed to be stated for more than one rule --- with no explanation as to why this should be the case. By pushing the control into the lexicon through the use of categories that can combine only through particular modes of grammatical composition, we see that the categories can be appropriately blocked from all the relevant combinatory rules. And not only have the linguistic predictions become stronger --- it is also considerably easier for the working categorial grammarian to ensure that a grammar is suitably controlled by using resource-sensitive categories than it is by using globally effective rule restrictions. 
The flip-side of the invariant rule component is that it becomes possible to exploit rules which would otherwise need to be absolutely banned from some grammars, such as forward crossed composition in the grammar of English. Despite the universal availability of the rules, the system's resource-sensitivity allows any given grammar to select only a subset of the rules through the use of modes that are restricted to certain rule groups. For example, a Multi-Modal CCG grammar that uses only the most restrictive modality will exhibit the same behavior as the AB calculus. Even though it seems unlikely that any natural language grammar would be quite so strict, it does allow different phenomena to be handled with different levels of formal power in a principled manner. Furthermore, if it ever is determined necessary to augment the rule-base with more powerful rules, we can do so in a controlled manner that will preserve previous analyses. For example, we can use more powerful rules such as those proposed by Hoffman in a controlled manner by defining a super-permuting modality that licenses these rules. In this manner, Multi-Modal CCG can grow in power without losing the discrimination provided by the present system. The Multi-Modal CCG formulation itself is completely compatible with a parametric view of the rule base, therefore leaving ample theoretical room to switch to a less extreme lexicalist position than that stated in Thesis 4. However, while I recognize that it may be necessary to allow languages to differ with respect to which rules they utilize, the rules themselves must remain unchanged from their given form and cannot be restricted or tailored for a given language in any way. This accords with the approach standardly assumed in CTL, in which a particular grammar can utilize different structural rules but cannot place arbitrary restrictions on them. 
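The way an invariant rule set can still be selectively available may be easier to see in a small sketch. The mode labels below (`STAR`, `DIAMOND`, `CROSS`, `DOT`), the sets saying which modes license which rule group, and the choice to let the result keep the right-hand functor's slash and mode are all assumptions of this illustration, not definitions taken from the text above. Only forward rules are shown.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical mode inventory, assumed for this sketch.
STAR = "*"      # most restrictive: application only
DIAMOND = "<>"  # additionally licenses harmonic (order-preserving) composition
CROSS = "x"     # additionally licenses crossed (permuting) composition
DOT = "."       # licenses every rule

HARMONIC_OK = {DIAMOND, DOT}
CROSSED_OK = {CROSS, DOT}


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Slash:
    result: "Cat"
    direction: str  # "/" or "\\"
    mode: str
    arg: "Cat"


Cat = Union[Atom, Slash]


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """X/Y  Y  =>  X, available to every mode."""
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        return left.result
    return None


def forward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """X/Y  Y|Z  =>  X|Z, but only if both slashes license the rule."""
    if not (isinstance(left, Slash) and isinstance(right, Slash)):
        return None
    if left.direction != "/" or left.arg != right.result:
        return None
    # Same direction on both slashes is harmonic; differing directions is crossed.
    licensed = HARMONIC_OK if right.direction == "/" else CROSSED_OK
    if left.mode in licensed and right.mode in licensed:
        return Slash(left.result, right.direction, right.mode, right.arg)
    return None  # the rule exists universally but these categories block it


X, Y, Z = Atom("x"), Atom("y"), Atom("z")
```

In this sketch the rules are never switched off for a grammar. A lexicon whose slashes all carry `STAR` simply never satisfies the licensing check in `forward_compose`, so only application succeeds, which mirrors the AB-calculus behavior mentioned above. For instance, `forward_compose(Slash(X, "/", STAR, Y), Slash(Y, "/", STAR, Z))` returns `None`, while the same call with `DIAMOND` on both slashes returns `Slash(X, "/", DIAMOND, Z)`.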
Because a significant amount of variation has been pushed from the rule component into the lexicon, the need for a theory of the CCG lexicon becomes more pressing than ever. The inheritance-based approach of Villavicencio (2002) appears to be an excellent starting point for encoding lexical redundancy in a systematic and well-motivated fashion. I have shown that the structure of Multi-Modal CCG categories can be utilized to extend this approach to parametrically diverse languages, though some challenges remain for Toba Batak. I have also defined a preliminary hierarchy for atomic categories that contains some of the major distinctions which are needed in the linguistic analyses offered in this dissertation. Even though it is incomplete and merits a much more detailed specification such as that which is typically made available in HPSG, it provides a more principled approach to atomic categories than has thus far been advanced and the distinctions were crucial for the analysis of every language investigated in this dissertation. This dissertation has also covered extensive linguistic ground over phenomena occurring in English, Dutch, Turkish, Tagalog, and Toba Batak. The formal developments were shown to provide significant improvements for existing CCG analyses of English and Dutch. Whereas the rule sets previously declared for the two languages had significant differences in terms of language-specific restrictions, omission of some rules in one and not the other (e.g. forward crossed composition in English), and changes to the categories of the rules themselves (e.g. forward harmonic composition in Dutch), the Multi-Modal CCG analyses utilized precisely the same rules for both grammars. Multi-Modal CCG was also shown to adequately handle local and long distance scrambling in Turkish without overgenerating in the way that the Multiset-CCG analysis of Hoffman (1995) did. 
In particular, Multi-Modal CCG provided inherent limits due to its restricted generative capacity and limits enforced straightforwardly by slashes with restrictive modalities. Hoffman instead required complex rule restrictions to avoid overgeneration. Due to the formal devices made available by Multi-Modal CCG, the categorial analysis of syntactic extraction asymmetries in Tagalog given in Chapter 7 greatly enhances the coverage and simplicity of the one provided in Baldridge (1998). Where restrictions and ad hoc mechanisms were previously needed to control Tagalog's long distance asymmetries, the Multi-Modal CCG analysis relies on cross-linguistically motivated ways of limiting associativity. The result is the most extensive coverage of Tagalog asymmetries to date, while using far less generative power to achieve that task than other accounts, especially that of Nakamura (1998). Previous linguistic analyses of Tagalog have struggled to reconcile the local scrambling behavior of Tagalog with its extraction asymmetries because they have relied on specific phrase-structural positions to single out a particular argument for extraction. In the Multi-Modal CCG analysis provided in this dissertation, on the other hand, local permutativity and extractability are not intertwined --- local permutativity is permitted because the arguments of Tagalog's verbal categories are contained in multisets, and the asymmetries arise because of differences in the modes of grammatical composition specified for those arguments (Thesis 2). Another unique aspect of the Tagalog analysis is that it crucially depends on assuming that the syntactic types of word classes can differ across languages. For example, in English and many other languages nouns have the category n, while in Tagalog nouns have the category s/n. 
This assumption is supported by the distributional evidence of basic Tagalog sentences, and it forces new assumptions about the categories of other word classes, such as case markers, that make the correct predictions about a wide range of constructions. It would be of great interest to examine other languages which lack a copular verb to see if similar patterns emerge. Finally, a basic categorial analysis was given for Toba Batak which demonstrates the interesting interaction of verbal voice, word order, adverb placement, and extraction asymmetries in the language. The analysis of Toba Batak rounds out the overall cross-linguistic characterization of syntactic extraction asymmetries provided in this dissertation. Extraction asymmetries arise essentially because the functor categories of the grammar of any given language combine with their arguments through particular modes of grammatical composition that mediate the possibilities for associative and permutative operations to be utilized in the grammar (Thesis 5). This affects what the possible constituents and word orders are, and this in turn makes different arguments accessible or inaccessible for extraction. In English and Toba Batak, it was shown that important categories, such as those for complementizers and verbs, license only associative operations and hence limit the possible constituents which can be created. In Tagalog we find the reverse --- asymmetries arise because generally only one argument of a verb can license associative operations, whilst all of them allow permutative operations. I am unaware of any extraction asymmetries in Turkish, but it would hardly be surprising if there were none --- given the language's propensity to allow not only local scrambling, but also long distance scrambling, the categories of the Turkish grammar must license both associative and permutative operations.
Though Hoffman (1995) reports that there are some islands to extraction in Turkish, she attributes them to semantic incoherence rather than failure of the syntax. The fact that languages like Tagalog and Toba Batak have been dealt with is particularly significant from the categorial perspective. Categorial grammars have suffered from the perception by some that, although they may provide interesting accounts of phenomena in European languages such as English and Dutch, they are not generally suitable for wider linguistic application to more parametrically diverse languages in other language families. This dissertation thus adds to the growing body of work that demonstrates that such a perception is ill-founded. Finally, the invariant nature of Multi-Modal CCG rules both simplifies the task of grammar development and makes it possible to write more efficient implementations of combinatory rules than is possible for a parametric, restrictable rule component. The category structures used by Multi-Modal CCG are also important for defining lexical inheritance hierarchies for diverse language types. The effective use of these properties has been demonstrated in Grok, a practical framework for developing and using Multi-Modal CCG grammars (Thesis 6). Grok was crucial for checking the validity of the analyses, and the process of implementing the grammars even led to some interesting linguistic observations. A meta-goal of this dissertation has been to demonstrate that it is necessary to recognize the individual character and elegance of many different approaches to natural language grammar and learn from them. I hope to have shown the advantages of incorporating techniques, devices, distinctions, and perspectives from a variety of approaches, which are so often concerned with orthogonal issues. Their solutions can at times be utilized complementarily, and we should be constantly on the lookout for cross-fertilization of this nature.
The many specific formal, linguistic, and computational points made throughout this dissertation come together to provide strong justification for the central thesis set out in the introduction --- that an explanatory theory of natural language grammar can be based on a categorial grammar formalism which allows cross-linguistic variation only in the lexicon and has computationally attractive properties. CCG's notion of universal grammar just got more universal, and we now await a fuller and more cross-linguistically articulated theory of the lexicon.