Grammatical constraints on code-switching

The behaviour of bilingual and multilingual speakers in a wide variety of speech communities and a broad range of social contexts has been the subject of research since the 1970s. Specific attention has been paid in the literature on bilingualism/multilingualism to the phenomenon of code-switching, one of the results of which has been the proposal of and subsequent debate surrounding a number of different grammatical approaches to it.

This essay will attempt to examine and discuss some of the main grammatical approaches to code-switching, and go on to look at the arguments advanced to support (and undermine) these.

As Poplack (1980) mentions, authors of the early literature – when not focusing on the sociolinguistic and discourse elements relating to code-switching – concluded that code-switching was a phenomenon that occurred at random. Subsequent research has shown that there are code-switching patterns and that switching is, in fact, subject to grammatical rules; the debate now is centred on what, exactly, those rules are.

The various theories put forward by scholars in this field of research seek to elaborate universally-applicable rules that account for all instances of code-switching in all language pairs. As will be seen in this essay, and as is claimed by Gardner-Chloros and Edwards (2004) and Alvarez-Cáccamo (1998), none of these theories achieves its aim.

It is worth bearing in mind that, broadly speaking, there are two main “types” of code-switching: intersentential and intrasentential. The latter is arguably of greater interest to researchers as “it is only there that the two grammars are in contact” (Myers-Scotton and Jake 1995).

There are several main grammatical approaches to code-switching which fall into a number of broad categories, each of which will be discussed in turn.

Gardner-Chloros and Edwards (2004: 3-4) argue that any given grammatical approach to code-switching depends on the sense of the word “grammar”. They claim that at least five senses of the term can be identified and, of those five senses, grammatical approaches to code-switching have focused (explicitly or otherwise) on the following two:

Formal grammar; and
Chomskyan/Universalist grammar

Poplack’s study of code-switching amongst a sample of bilingual Puerto-Ricans in New York City (1980) is an empirical test of two simple constraints that, she claims, are universally applicable: the Equivalence Constraint and the Free Morpheme Constraint.

The Equivalence Constraint dictates that intrasentential switches will only be made by any bilingual speaker (regardless of the speaker’s proficiency in his or her L2) “at points in discourse where juxtaposition of L1 and L2 elements does not violate a syntactic rule of either language, i.e. at points around which the surface structures of the two languages map onto each other”. So a bilingual speaker implicitly obeys the syntactic rules imposed by the respective grammars (which, in this model, are deemed to share rules that apply to the use of particular lexical items or language constituents) and will only make a switch from one code to the other at points where that switch will not violate the rules of either grammar. Indeed, the title of Poplack’s paper is a case in point:

(1) Sometimes I’ll start a sentence in Spanish y termino en español (“and finish in Spanish”)

Here, the switch is made at a point in the sentence where the Spanish coordinate clause “y termino en español” does not violate the grammatical rules of English (which are deemed to set the framework for the sentence): the verb “terminar” is correctly inflected (“termino” – first person singular, present indicative), as the English verb “to finish” would be (i.e. “I finish”) had the clause been uttered in English. Nor does the clause violate any grammatical rules of Spanish: it would be equally well formed were the entire sentence uttered in Spanish.

The Free Morpheme Constraint states that an intrasentential switch may be made by any bilingual speaker “provided [a] constituent is not a bound morpheme”. Thus a sentence such as:

(2) And what a tertulia it was, Dios mío!

(And what a gathering it was, my God!)

is acceptable under the Free Morpheme Constraint (note that idiomatic expressions such as Dios mío above are “considered to behave like bound morphemes in that they show a strong tendency to be uttered monolingually”), unlike a sentence such as:

(3) *Estaba type-ando su ensayo

(She was type-ing her essay)
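The check that the Free Morpheme Constraint performs on examples (2) and (3) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Morpheme` class, the language labels and the segmentation of each word are hypothetical choices made for the sketch, not part of Poplack’s formulation, and idiomatic expressions such as Dios mío are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str
    lang: str    # illustrative labels: "es" (Spanish) or "en" (English)
    bound: bool  # True for affixes such as Spanish "-ando"


def violates_free_morpheme_constraint(words):
    """Return True if a switch falls next to a bound morpheme inside a word.

    `words` is a list of words, each word a list of Morpheme objects.
    Words are segmented only as far as the example requires (an assumption).
    """
    for word in words:
        for left, right in zip(word, word[1:]):
            if left.lang != right.lang and (left.bound or right.bound):
                return True
    return False


# Example (3): *Estaba type-ando su ensayo
example_3 = [
    [Morpheme("estaba", "es", False)],
    [Morpheme("type", "en", False), Morpheme("-ando", "es", True)],
    [Morpheme("su", "es", False)],
    [Morpheme("ensayo", "es", False)],
]

# Example (2), main clause only: And what a tertulia it was
example_2 = [
    [Morpheme("and", "en", False)],
    [Morpheme("what", "en", False)],
    [Morpheme("a", "en", False)],
    [Morpheme("tertulia", "es", False)],
    [Morpheme("it", "en", False)],
    [Morpheme("was", "en", False)],
]

print(violates_free_morpheme_constraint(example_3))
print(violates_free_morpheme_constraint(example_2))
```

The sketch flags (3) because the English stem “type” is followed by the bound Spanish suffix “-ando”, whereas in (2) every switch falls between whole words.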

Subsequent discussion and research have shown that Poplack’s Constraints theory is not universally applicable to all language pairs or all instances of code-switching. It would appear that the Constraints model sits perfectly with Poplack’s own data set drawn from her sample Puerto-Rican speech community, and may be appropriate for language pairs which share particular grammatical, syntactic or lexical features, such that these facilitate switches that indeed do not violate any grammatical rules of either of the languages in contact. Nevertheless, Poplack has continued to defend and refine the model, arguing that instances of code-switching that violate either or both of the constraints are not code-switches at all, but rather what are termed by Poplack “nonce borrowings” (a term first coined by Weinreich (1953)). These, it is argued, are tantamount to single-word code-switches: words from the L2 are used in an L1-dominant utterance but have yet to become an established part thereof. Poplack argues that the Free Morpheme Constraint is “a consequence of the nonce borrowing hypothesis (Sankoff et al, 1990)”. However, further research has yet to substantiate fully the claim that the Constraints model applies universally to all language pairs and all instances of code-switching.

Other constraints models have also been put forward, for instance by Pfaff (1979) in her study of Spanish-English code-switching and borrowing. She argues that there are four main types of constraints on code-switching: functional, structural, semantic and discourse-related. Further constraints have also been formulated by Woolford (1983) in her generative model of code-switching (again based on data from Spanish-English bilinguals).

Such constraints models can be contrasted with the far more elaborate Matrix Language Frame model developed and advocated by Myers-Scotton and her collaborators (1993 and subsequently refined: 1995, 2000), in which sociolinguistics and psycholinguistics are combined within a grammatical approach to code-switching.

The notion of a base or matrix language was not new when the MLF model was initially published by Myers-Scotton. Work by Klavans (1985), Joshi (1985) and others had already posited a “frame” or “matrix” into which elements of the other language could be embedded.

The broad lines of the MLF are as follows. Myers-Scotton makes the case for code-switching to involve a base or matrix language (ML), into which pockets of an embedded language (EL) are inserted. The ML, then, is the “unmarked” language choice that provides the grammatical structure for the utterance or discourse, with “islands” of EL inserted at grammatically acceptable points of that utterance. She distinguishes between different types of morphemes and the role they play in code-switching: the ML supplies the system morphemes (closed-class items) in the sentence, while the EL supplies a proportion of the content morphemes (open-class items). There is also a psycholinguistic dimension to the MLF model, in that the ML is deemed to be more “activated” than the EL; it therefore lends itself more readily to providing the frame for code-switching between a bilingual speaker’s two (or more) languages.

In seeking to define “matrix language” Myers-Scotton argues that the decision made on the part of bilingual speakers to make intrasentential switches is “based on social, psychological and structural factors”. It is these factors that essentially form the basis of a definition of the ML. There are two structural criteria involved:

§ The ML is the language that projects the morphosyntactic frame for the CP that shows intrasentential CS. This is operationalised by two principles: the morpheme order principle, which states that the “surface morpheme order (reflecting surface syntactic relations) will be that of the ML”; and the system morpheme principle, which states that “all system morphemes that have grammatical relations external to their head constituent (i.e. participate in the sentence’s thematic role grid) will come from the ML”.

§ The ML generally supplies the greater number of morphemes in intrasentential code-switching.

The sociolinguistic aspects of the MLF model underpin the psycholinguistic ones: as stated above, the ML is the “unmarked” or expected language choice for the exchange between code-switching speakers. It is pointed out, however, that this is not always the case, for instance when the speakers do not share the same first language. It is also argued that the ML can change over the course of an exchange, in relation to situational changes for example.

The distinction between content and system morphemes is central to the MLF model, in that it helps to identify the ML and EL. Under the MLF model, content morphemes come mainly from the EL, with system morphemes coming principally from the ML to form the frame in which code-switching can occur.

There are, however, difficulties in using morphemes to identify the ML, particularly when a speaker’s bilingualism is quite balanced. The quantitative criterion states that the majority of the morphemes in a code-switched utterance will come from the ML. However, this raises the issue of sample size – which Myers-Scotton herself concedes is difficult to determine – and comes up against instances of code-switching by balanced bilingual speakers who use both of their languages more or less equally; the number of morphemes from each language will, therefore, be more or less equal, thus undermining the applicability of the quantitative criterion posited in the MLF model in identifying the ML.
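The difficulty with the quantitative criterion can be made concrete with a minimal Python sketch. The function name `matrix_language_by_count` and the two morpheme-label sequences are hypothetical: they are invented to illustrate the counting logic and are not drawn from any corpus or from Myers-Scotton’s own procedure.

```python
from collections import Counter


def matrix_language_by_count(morpheme_langs):
    """Return the language supplying the most morphemes, or None on a tie.

    `morpheme_langs` holds one language label per morpheme in the sample.
    """
    ranked = Counter(morpheme_langs).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no language dominates: the criterion is undecided
    return ranked[0][0]


# Invented sample from an asymmetric bilingual: one language dominates.
asymmetric_sample = ["sw", "sw", "en", "sw", "sw", "en", "sw"]

# Invented sample from a balanced bilingual: morphemes split evenly.
balanced_sample = ["es", "en", "es", "en", "en", "es"]

print(matrix_language_by_count(asymmetric_sample))
print(matrix_language_by_count(balanced_sample))
```

On the asymmetric sample the count singles out one language, while on the balanced sample it returns no answer at all. The result also depends on where the sample begins and ends, which is the sample-size problem noted above.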

As with the Constraints model, subsequent research and commentary have led to the MLF model being refined into its current form, the 4-M Model. In this theory, further distinctions are drawn between categories of system morpheme. Attempts are also made to resolve issues in the original MLF model, such as double morphology.

An interesting aspect of the MLF model is that it does not adopt the “sentence” as an appropriate unit for the grammatical analysis of code-switching. Myers-Scotton instead uses the CP (complement phrase) as an analytical unit, which she defines as

a syntactic structure expressing the predicate-argument structure of a clause, plus the additional syntactic structures needed to encode discourse-relevant structure and the logical form of that clause. Because CP explicitly assumes that the unit of structure includes COMP (complementizer) position, it is a more precise term than either clause or sentence.

For all of its innovation and complexity – which sets it in stark contrast with the simplicity of the Constraints model discussed above – the MLF model does not account for all instances of code-switching in all language pairs, fitting only with certain language pairs, and particularly with Myers-Scotton’s data set drawn from East African languages and dialects; as well as “cases of very asymmetric bilingualism” where the speakers’ proficiency in one or other of the languages in contact is weaker.

So neither the Constraints model nor the MLF model gives a complete grammatical description of code-switching; instead, they each describe a particular form or class of code-switching into which particular language pairs or forms of bilingualism fit. A more complete view is therefore required.

Muysken (2000) proposes a typology of code-mixing (a term that he favours over “code-switching”, which he reserves for referring to instances of rapid interchange between languages in the same discourse) that attempts to encompass both of the models discussed above, with an additional component that he terms “congruent lexicalization”. He argues that there are three main types of CS:

Alternation: this is a form of code-switching in which bilingual speakers alternate between their two (or more) languages. Poplack’s Constraints model is an account of alternational code-mixing.
Insertion: in this form of CS, speakers insert chunks of switched constituents from the L2 into discourse framed in L1. Muysken argues that the MLF model is an illustration of insertional code-mixing.
Congruent lexicalization: this is code-mixing between language pairs that share close morphological and phonological ties. An example of one such language pair (and the corresponding code-switching) is provided by Clyne’s study of Dutch-English code-switching in Australia (1987).

Muysken argues that different language pairs will fit into one or other of those types. So, rather than proposing a “one size fits all” grammatical approach in which all instances of code-mixing/code-switching fit into a single immutable model or theory, he acknowledges that code-mixing/code-switching between different language pairs will display different characteristics.

It is interesting to note that Muysken is also a proponent of the Chomskyan Government model of code-switching. In a paper co-authored with Di Sciullo and Singh (1986), it is argued that the government constraint, whereby there can be no switch in codes between a governor constituent and its corresponding governed item, will serve to predict which switches will and will not be acceptable, regardless of the languages in contact in a bilingual person’s lexicon. The model, however, does not account for or predict all instances of code-switching; indeed, bilingual speakers will code-switch at any point in any given utterance, Government or no. Even when the scope of the model is restricted to lexical government by non-function words (Muysken 1990), it remains an overstatement. It must also be borne in mind that this model will change as many times as Chomsky’s theory of Universal Grammar goes through its various transformations; in its current incarnation of the Minimalist Program, the notion of Government has been cast aside altogether owing to definitional difficulties.

Another take on the generativist approach to code-switching is the “null theory” of code-switching. A number of such theories have been put forward (Mahootian (1993), Chan (1999), MacSwan (1999, 2000), Woolford (1983)). The basic premise of the “null theory” approach – whether it is couched in terms of Tree Adjoining Grammar (Joshi 1985) or the Minimalist Program/Principles and Parameters – is that code-switching can be described in terms of grammatical principles relevant to monolingual grammars, without postulating additional devices or constraints that are specific to code-switching itself.

This is an attractive argument, but far from compelling. Generativist models are highly abstract, to the point where they are too far removed from the realities of bilingual speech. The underlying premise of Chomsky’s notion of the monolingual “ideal speaker” is not helpful here, as it leads to generalisations about bilingual speakers that are simply not accurate, as they are not a reflection of how bilinguals combine their languages in speech. Additionally, the “ungrammatical” nature of speech weakens any grammatical model of code-switching (see below).

There are a number of reasons why none of these models (perhaps with the exception of Muysken’s proposed typology of code-mixing) can account for all instances of CS.

1. Variability: As Gardner-Chloros and Edwards rightly point out, this variability is found between communities, within a single community, right down to the speech of individuals and even within the speech of a single individual within the same conversation (2004: 4). This may be the end result of – and, at the very least, related to – the idiolectal competence of individual speakers.

2. Nature of bilingual speech: Bilingual speakers are known to employ all kinds of devices and “tricks” to avoid being constricted by the dictates of grammatical rules. Speakers use pauses, interruptions and other means to neutralize any grammatical awkwardness resulting from switching at a particular point in the sentence. These devices serve a functional purpose in allowing speakers to make full use of both of their languages, and legitimising combinations from languages that are typologically different (e.g. word order).

3. Abstract nature of the notion of “grammar” and “sentence”: These are abstractions used by linguists to conceptualise language behaviour, in this instance amongst bilingual speakers. The issue here is whether such abstractions are relevant to the analysis of CS as seen in bilingual speech. The concept of the “sentence” may not be appropriate to the analysis of code-switching in any event: speakers rarely utter fully-rounded, grammatical sentences in everyday discourse and code-switch at will with seemingly little concern for the grammaticality of the (intersentential or intrasentential) switches that they make so effortlessly. Furthermore, from a grammatical analysis perspective, Gardner-Chloros and Edwards argue that even if the sentence were to be accepted as the “upper limit of grammar” and a meaningful unit in the context of code-switching, this would mean that grammatical approaches would only seek to explain intrasentential switches whilst omitting intersentential switches and conversational “moves” (2004: 5).

The fundamental question at issue is whether or not a grammatical approach to code-switching is even appropriate. Given the variability of code-switching and the nature of speech in general – and bilingual speech more specifically – it seems particularly difficult to formulate any kind of universally applicable principle or constraint that accurately predicts how, where and when a bilingual speaker will switch codes, let alone whether that switch will be “grammatical”. Variability lies at the very heart of code-switching; it is a reflection of a human ability to handle and manipulate language in any way that serves the speaker’s purpose in any given situation and with any given interlocutor(s).

Another salient point that emerges is whether code-switching is even an observable fact. Gardner-Chloros (1995) argues that CS is an “analyst construct”, a product of linguists’ conceptualisations of language contact and language mixing and, as such, not separable from borrowing, interference or pidginisation (1995: 86), be it in ideological or practical terms. She also argues that the abstract concept currently accepted in bilingualism research is “fuzzy” and should in fact be used as a much broader term for a range of interlingual phenomena in which strict alternation between two discrete systems is the exception rather than the rule (1995: 68). If that is indeed the case, is it possible to begin to formulate a “grammar of code-switching” when there is still uncertainty as to what code-switching actually is?

The arguments put forward by Alvarez-Cáccamo (1998) are also related to the points raised by Gardner-Chloros. In tracing the development of code-switching as a field of bilingualism research and of applied linguistics as a whole, he distinguishes between linguistic varieties and communicative codes, arguing that code-switching pertains to the former category and, as such, suggests that “code-switching” is perhaps a misnomer. He proposes that the concept of CS in its current form be both narrowed to exclude unrelated phenomena that have come under the banner of “code-switching”, and broadened to include those elements that have been excluded (including aspects of monolingual speech). It is difficult to see how an all-encompassing approach to code-switching can be put forward until the phenomenon of code-switching has been properly identified (and presumably labelled):

“In order to argue convincingly for or against the existence of “code-switching constraints” and “code-switching grammars” (…) research should first convincingly prove that (a) speakers who code-switch possess two (or more) identifiable systems or languages, each with its identifiable grammatical rules and lexicon; and (b) “code-switched” speech results from the predictable interaction between lexical elements and grammatical rules from these languages.” (Alvarez-Cáccamo (1998: 36))

However, the issue here again lies in the conceptualisation of bilingual speech. Abstractions used by linguists in examining language phenomena such as code-switching remove the “human” element reflected in the discourse strategies employed by bilingual speakers (discussed above).

A further aspect of code-switching, while not strictly grammatical, is discussed by Bentahila and Davies (1995): the variables related to language contact situations, and how those change depending on developments in the contact situations. In a study of different generations of Moroccan Arabic-French bilinguals, they examine the relationship between patterns of code-switching and patterns of language contact and the influence of extraneous factors on those patterns. They point out that code-switching is affected by the nature of the contact between a particular pair of languages: duration of contact, for instance, and the impact of governmental language planning policies. They found that while all the bilingual speakers in their sample speech community used the same languages, their use of those same languages depended on their proficiency in both, which in turn depended on their age and the effects of governmental language planning and nationalist policies pursued in the post-colonial continuum. It could be argued that evolving patterns of code-switching contribute to the variability of code-switching practices amongst bilingual speakers and, therefore, constitute another (indirect) reason why grammatical approaches to code-switching so often fall short.

In summary, then, a number of grammatical models of intrasentential code-switching have been proposed, each claiming to predict where in the sentence a bilingual person will switch languages and that such switches will be made in such a way as not to violate any of the grammatical rules of either of the languages in contact. It is contended that, rather than achieving that aim, each model is specific to the data set on which it is based, and can only really apply to similar language pairs. The models therefore only describe an aspect of a phenomenon that is far more complex than they would suggest. Furthermore, the applicability of the various models also depends on the “kind” of bilingual concerned and their proficiency in their respective language pairs: the Constraints model appears to be more relevant to more balanced bilinguals, for instance, while the MLF model seems to be more appropriate to more asymmetric bilinguals. It must be remembered that the models are not in stasis but rather continually refined and amended in relation to developments in their particular theoretical backdrop: the Government model of code-switching, for instance, is based on a theory of Universal Grammar that is itself evolving over time. Muysken’s typology of bilingual speech (2000), which draws on the leading models of code-switching/code-mixing and seeks to account for all instances of code-switching by taking into account the various aspects involved therein, appears to be the most rounded of the grammatical approaches to the phenomenon, in that it encompasses the disparate aspects that have formed the focus of individual models. There is also the issue of whether code-switching is a phenomenon in its own right and, if not, what linguistic phenomena the concept of code-switching can be deemed to cover. Has the concept become an umbrella term used to describe a number of different linguistic devices employed by bilingual speakers? Or are these elements that are indistinguishable from a wider phenomenon?

To conclude, it would appear that research into and grammatical approaches to code-switching have lost sight of the fact that code-switching is an abstraction used by linguists to conceptualise an aspect of the behaviour of bilingual speakers. After all, “languages do not do things; people do things, languages are abstractions from what people do”. Such a conceptualisation has led to researchers attempting to fit bilingual speech behaviour to a particular model rather than the other way around, discounting aspects such as variability, bilingual discourse strategies and the fact that code-switching is a creative, innovative process designed, it would appear, almost to avoid grammatical constraints altogether. Abstract grammatical models cannot reflect the realities of language contact and use. Not only that, but code-switching is also a gauge of language change and shift; this being the case, it is plausible that a grammatical shift would ensue, thus undermining a given model. Factors such as those mentioned by Bentahila and Davies (1995) must also have some kind of impact on grammatical models when these are based on a language contact situation which is shifting and evolving. A step back towards the realities of bilingual communication and speech acts, combined with an acceptance of the variability that they necessarily entail – as reflected in the typology proposed by Muysken (2000) – would constitute a more appropriate starting point for any grammatical approach to code-switching that sets out to be all things to all bilingual speakers.
