In linguistics, word order typology is the study of the order of the syntactic constituents of a language, and how different languages can employ different orders. Correlations between orders found in different syntactic sub-domains are also of interest. The primary word orders that are of interest are the constituent order of a clause – the relative order of subject, object, and verb; the order of modifiers (adjectives, numerals, demonstratives, possessives, and adjuncts) in a noun phrase; and the order of adverbials.
Some languages use relatively restrictive word order, often relying on the order of constituents to convey important grammatical information. Others—often those that convey grammatical information through inflection—allow more flexibility, which can be used to encode pragmatic information such as topicalisation or focus. Most languages, however, have a preferred word order, and other word orders, if used, are considered "marked".
Most nominative–accusative languages—which have a major word class of nouns and clauses that include subject and object—define constituent word order in terms of the finite verb (V) and its arguments, the subject (S), and object (O).
There are six theoretically possible basic word orders for the transitive sentence. The overwhelming majority of the world's languages are either subject–verb–object (SVO) or subject–object–verb (SOV), with a much smaller but still significant portion using verb–subject–object (VSO) word order. The remaining three arrangements are exceptionally rare, with verb–object–subject (VOS) being slightly more common than object–verb–subject (OVS), and object–subject–verb (OSV) being the rarest by a significant margin.
|Word order||English equivalent||Proportion of languages||Example languages|
|SOV||"She him loves."||45%||Sanskrit, Hindi, Ancient Greek, Latin, Japanese, Korean|
|SVO||"She loves him."||42%||Chinese, English, French, Hausa, Italian, Malay, Russian, Spanish|
|VSO||"Loves she him."||9%||Biblical Hebrew, Arabic, Irish, Filipino, Tuareg-Berber, Welsh|
|VOS||"Loves him she."||3%||Malagasy, Baure|
|OVS||"Him loves she."||1%||Apalaí, Hixkaryana|
|OSV||"Him she loves."||0%||Warao, (certain dialects of) Korean|
The table above lists all six possible orders of subject, verb, and object, from most common to rarest (the examples use "she" as the subject, "loves" as the verb, and "him" as the object).
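The six orders are simply the permutations of the three constituents. A minimal Python sketch, assuming the "she / loves / him" glosses used in the table; the names `GLOSS` and `basic_orders` are illustrative only, and the result is not sorted by frequency:

```python
from itertools import permutations

# Gloss for each constituent, taken from the English examples in the table.
GLOSS = {"S": "she", "V": "loves", "O": "him"}

def basic_orders():
    """Map each order label (e.g. 'SOV') to its glossed example clause."""
    orders = {}
    for perm in permutations("SVO"):
        label = "".join(perm)
        orders[label] = " ".join(GLOSS[role] for role in perm)
    return orders

orders = basic_orders()
for label, clause in orders.items():
    print(label, "->", clause)
```

Three constituents give 3! = 6 labels, which is why the table has exactly six rows.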
Sometimes patterns are more complex: German, Dutch, Afrikaans and Frisian have SOV order in subordinate clauses but V2 word order in main clauses, where SVO is the most common pattern. By the guidelines above, their unmarked word order is therefore SVO.
Many synthetic languages such as Latin, Greek, Persian, Romanian, Assyrian, Russian, Turkish, Korean, Japanese, Finnish, and Basque have no strict word order; rather, the sentence structure is highly flexible and reflects the pragmatics of the utterance.
Topic-prominent languages organize sentences to emphasize their topic–comment structure. Nonetheless, there is often a preferred order; in Latin and Turkish, SOV is the most frequent outside of poetry, and in Finnish SVO is both the most frequent and obligatory when case marking fails to disambiguate argument roles.
Just as languages may have different word orders in different contexts, so may they have both fixed and free word orders. For example, Russian has a relatively fixed SVO word order in transitive clauses, but a much freer SV / VS order in intransitive clauses. Cases like this can be addressed by encoding transitive and intransitive clauses separately, with the symbol 'S' being restricted to the argument of an intransitive clause, and 'A' for the actor/agent of a transitive clause. ('O' for object may be replaced with 'P' for 'patient' as well.) Thus, Russian is fixed AVO but flexible SV/VS.
In such an approach, the description of word order extends more easily to languages that do not meet the criteria in the preceding section. For example, Mayan languages have been described with the rather uncommon VOS word order. However, they are ergative–absolutive languages, and the more specific word order is intransitive VS, transitive VOA, where S and O arguments both trigger the same type of agreement on the verb. Indeed, many languages that some thought had a VOS word order turn out to be ergative like Mayan.
The table below displays the word orders surveyed by Dryer. The 2005 study surveyed 1228 languages, and the updated 2013 study investigated 1377 languages. Percentages were not reported in his studies.
|Word Order||Number (2005)||Percentage (2005)||Number (2013)||Percentage (2013)|
Hammarström (2016) calculated the constituent orders of 5252 languages in two ways. His first method, counting languages directly, yielded results similar to Dryer's studies, indicating that both SOV and SVO are common patterns among human languages. However, when the counts were stratified by language family, the distribution became skewed, and SOV turned out to be the most common order.
|Word Order||No. of Languages||Percentage||No. of Families||Percentage|
(NODOM means "no dominant word order".)
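The difference between counting languages and counting families can be illustrated with a small Python sketch. The sample below is invented toy data, not survey data, and counting each family once under its most frequent order is only one possible reading of "stratified by family"; the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy sample: (language, family, dominant order). Not real data.
SAMPLE = [
    ("lang_a", "family_1", "SVO"),
    ("lang_b", "family_1", "SVO"),
    ("lang_c", "family_1", "SVO"),
    ("lang_d", "family_1", "SOV"),
    ("lang_e", "family_2", "SOV"),
    ("lang_f", "family_3", "SOV"),
]

def count_by_language(sample):
    """Count every language once under its dominant order."""
    return Counter(order for _, _, order in sample)

def count_by_family(sample):
    """Count every family once, under the order most frequent within it."""
    per_family = defaultdict(Counter)
    for _, family, order in sample:
        per_family[family][order] += 1
    return Counter(c.most_common(1)[0][0] for c in per_family.values())

by_language = count_by_language(SAMPLE)
by_family = count_by_family(SAMPLE)
print(dict(by_language))
print(dict(by_family))
```

In this toy sample the two orders tie when languages are counted, but SOV leads when families are counted, because one large family contributes most of the SVO languages.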
A fixed or prototypical word order is one of many ways to ease the processing of sentence semantics and reduce ambiguity. One method of making the speech stream less open to ambiguity (complete removal of ambiguity is probably impossible) is a fixed order of arguments and other sentence constituents. This works because speech is inherently linear. Another method is to label the constituents in some way, for example with case marking, agreement, or another marker. Fixed word order reduces expressiveness, but added marking increases the information load in the speech stream; for these reasons strict word order seldom occurs together with strict morphological marking, one counter-example being Persian.
Discourse patterns show that previously given information (topic) tends to precede new information (comment). Furthermore, acting participants (especially humans) are more likely to be talked about (to be topic) than things simply undergoing actions (like oranges being eaten). If acting participants are often topical, and topic tends to be expressed early in the sentence, it follows that acting participants tend to be expressed early in the sentence. This tendency can then grammaticalize to a privileged position in the sentence, the subject.
The functions of word order described above can be seen to affect the frequencies of the various word order patterns: the vast majority of languages have an order in which S precedes O and V. Whether V precedes O or O precedes V, however, has been shown to be a very telling difference with wide consequences for phrasal word orders.
Knowledge of word order, on the other hand, can be applied to identify the thematic relations of the NPs in a clause of an unfamiliar language. If we can identify the verb in a clause, and we know that the language is strict accusative SVO, then we know that "Grob smock Blug" probably means that Grob is the smocker and Blug the entity smocked. However, since very strict word order is rare in practice, such applications of word order studies are rarely effective.
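The inference in the "Grob smock Blug" example can be sketched in Python. This assumes a three-word transitive clause and a strictly fixed order, which the text notes is rare in practice; the function name `assign_roles` and the role labels "agent" and "patient" are illustrative choices, not established terminology for this procedure.

```python
def assign_roles(tokens, order="SVO"):
    """Map a three-word transitive clause to roles, assuming a strictly fixed order."""
    if len(tokens) != 3 or sorted(order) != ["O", "S", "V"]:
        raise ValueError("expected three tokens and an order such as 'SVO'")
    names = {"S": "agent", "V": "verb", "O": "patient"}
    # Pair each slot of the order with the token found in that position.
    return {names[slot]: token for slot, token in zip(order, tokens)}

roles = assign_roles("Grob smock Blug".split())
print(roles)
```

Passing a different `order`, for example `"SOV"`, reassigns the same three words to different roles, which is why the word order of the language must be known first.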
A paper by Murray Gell-Mann and Merritt Ruhlen, building on work in comparative linguistics, asserts that the distribution of word order types in the world's languages was originally SOV. The paper compares a survey of 2135 languages with a "presumed phylogenetic tree" of languages, concluding that changes in word order tend to follow particular pathways, and the transmission of word order is to a great extent vertical (i.e. following the phylogenetic tree of ancestry) as opposed to horizontal (areal, i.e. by diffusion). According to this analysis, the most recent ancestor of currently known languages was spoken recently enough to trace the whole evolutionary path of word order in most cases.
There is speculation on how the Celtic languages developed VSO word order. An Afro-Asiatic substratum has been hypothesized, but current scholarship considers this claim untenable, not least because Afro-Asiatic and Celtic were not in contact in the relevant period.
The order of constituents in a phrase can vary as much as the order of constituents in a clause. Normally, the noun phrase and the adpositional phrase are investigated. Within the noun phrase, one investigates whether the following modifiers occur before or after the head noun.
Within the adpositional phrase, one investigates whether the language makes use of prepositions (in London), postpositions (London in), or both (normally with different adpositions at both sides).
There are several common correlations between sentence-level word order and phrase-level constituent order. For example, SOV languages generally put modifiers before heads and use postpositions. VSO languages tend to place modifiers after their heads, and use prepositions. For SVO languages, either order is common.
For example, French (SVO) uses prepositions (dans la voiture, à gauche), and places adjectives after the noun (une voiture spacieuse). However, a small class of adjectives generally goes before the head (une grande voiture). On the other hand, in English (also SVO) adjectives almost always go before nouns (a big car), and adverbs can go either way, although the initial position is more common (greatly improved). (English has a very small number of adjectives that go after their heads, such as extraordinaire, which kept its position when borrowed from French.)
Some languages have no fixed word order and often use a significant amount of morphological marking to disambiguate the roles of the arguments. However, some languages use a fixed word order even if they provide a degree of marking that would support free word order. Also, some languages with free word order, such as some varieties of Datooga, combine free word order with a lack of morphological distinction between arguments.
Typologically, highly animate actors are more likely to be topical than low-animate undergoers, a trend that would come through even in languages with free word order. That gives a statistical bias for SO order (or OS order in ergative systems, although ergative systems do not usually extend to the highest levels of animacy and usually give way to some form of nominative system, at least in the pronominal system).
Most languages with a high degree of morphological marking have rather flexible word orders, such as Turkish, Tamil, Latin, Portuguese, Ancient and Modern Greek, Romanian, Hungarian, Lithuanian, Serbo-Croatian, Russian (in intransitive clauses), and Finnish. In some of those languages, a canonical order can still be identified, but that is not possible in others. When the word order is free, different choices of word order can be used to help identify the theme and the rheme.
The word order in Hungarian sentences is changed according to the speaker’s communicative intentions. Hungarian word order is not free in the sense that it must reflect the information structure of the sentence, distinguishing the emphatic part that carries new information (rheme) from the rest of the sentence that carries little or no new information (theme).
The position of focus in a Hungarian sentence is immediately before the verb, that is, nothing can separate the emphatic part of the sentence from the verb.
For "Kate ate a piece of cake", the possibilities are:
The only freedom in Hungarian word order is that the order of parts outside the focus position and the verb may be freely changed without any change to the communicative focus of the sentence, as seen in sentences 2 and 3 as well as in sentences 6 and 7 above. These pairs of sentences have the same information structure, expressing the same communicative intention of the speaker, because the part immediately preceding the verb is left unchanged.
Note that the emphasis can be on the action (verb) itself, as seen in sentences 1, 6 and 7, or it can be on parts other than the action (verb), as seen in sentences 2, 3, 4 and 5. If the emphasis is not on the verb, and the verb has a co-verb (in the above example 'meg'), then the co-verb is separated from the verb, and always follows the verb. Also note that the accusative suffix -t marks the direct object: 'torta' (cake) + '-t' -> 'tortát'.
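The focus rule described above can be sketched as a small Python function. Only 'meg' and 'tortát' come from the text; the forms "Kati", "evett" and "egy szelet" are illustrative assumptions, as are the function name `build_clause` and the choice to attach the co-verb directly in front of the verb when the verb itself carries the emphasis.

```python
def build_clause(verb, coverb="", focus=None, before=(), after=()):
    """Linearize a clause following the focus rule described above.

    focus=None means the emphasis is on the verb, so the co-verb stays
    attached in front of it (an assumption of this sketch). Otherwise the
    focused phrase sits immediately before the verb and the co-verb is
    placed after the verb.
    """
    if focus is None:
        core = [coverb + verb]
    else:
        core = [focus, verb] + ([coverb] if coverb else [])
    # Phrases in `before` and `after` lie outside the focus-verb core.
    return " ".join([*before, *core, *after])

verb_focus = build_clause("evett", "meg", before=["Kati"], after=["egy szelet tortát"])
subject_focus = build_clause("evett", "meg", focus="Kati", after=["egy szelet tortát"])
print(verb_focus)
print(subject_focus)
```

Moving a phrase between `before` and `after` changes the surface order without changing which phrase is in focus, which mirrors the one freedom the text describes.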
In Latin, the endings of nouns, verbs, adjectives, and pronouns allow for extremely flexible order in most situations. Latin lacks articles.
The Subject, Verb, and Object can come in any order in a Latin sentence, although most often (especially in subordinate clauses) the verb comes last. Pragmatic factors, such as topic and focus, play a large part in determining the order. Thus the following sentences each answer a different question:
Latin prose often follows the word order "Subject, Direct Object, Indirect Object, Adverb, Verb", but this is more of a guideline than a rule. Adjectives in most cases go before the noun they modify, but some categories, such as those that determine or specify (e.g. Via Appia "Appian Way"), usually follow the noun. In Classical Latin poetry, lyricists followed word order very loosely to achieve a desired scansion.
Due to the presence of grammatical cases (nominative, genitive, dative, accusative, ablative, and in some cases or dialects vocative and locative) applied to nouns, pronouns and adjectives, the Albanian language permits a large number of positional combinations of words. In spoken language, a word order differing from the most common S-V-O helps the speaker put emphasis on a word, thus partially changing the message delivered. Here is an example:
In the examples above, "(mua)" can be omitted, causing a perceptible change in emphasis of varying intensity. "Më" is always followed by the verb. Thus, a sentence consisting of a subject, a verb and two objects (a direct and an indirect one) can be expressed in six different ways without "mua", and in twenty-four different ways with "mua", adding up to thirty possible combinations.
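The count of thirty can be checked with a short Python sketch. It assumes that "më" and the verb move together as a single unit (since "më" is always followed by the verb), so the clause has three movable units without "mua" and four with it; the labels `"S"`, `"më+V"` and `"O"` are placeholders rather than Albanian words.

```python
from itertools import permutations

# Movable units; "më+V" treats the clitic and the verb as one block.
base_units = ["S", "më+V", "O"]

without_mua = list(permutations(base_units))            # 3! orderings
with_mua = list(permutations(base_units + ["mua"]))     # 4! orderings
total = len(without_mua) + len(with_mua)

print(len(without_mua), len(with_mua), total)
```

The totals are 3! = 6 and 4! = 24, which sum to the thirty combinations mentioned in the text.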
The word order of many Indo-European languages can change depending on what specific implications a speaker wishes to make. These are generally aided by the use of appropriate inflectional suffixes. Consider these examples from Bengali and Sinhalese:
In many languages, changes in word order occur due to topicalization or in questions. However, most languages are generally assumed to have a basic word order, called the unmarked word order; other, marked word orders can then be used to emphasize a sentence element, to indicate modality (such as an interrogative modality), or for other purposes.
For example, English is SVO (subject-verb-object), as in "I don't know that", but OSV is also possible: "That I don't know." This process is called topic-fronting (or topicalization) and is common. In English, OSV is a marked word order because it emphasises the object, and is often accompanied by a change in intonation.
An example of OSV being used for emphasis:
Non-standard word orders are also found in poetry in English, particularly in archaic or romantic phrasing, as in the wedding phrase "With this ring, I thee wed" (SOV) or "Thee I love" (OSV), as well as in many other languages.
Differences in word order complicate translation and language education – in addition to changing the individual words, the order must also be changed. This can be simplified by first translating the individual words, then reordering the sentence, as in interlinear gloss, or by reordering the words prior to translation.