Two Disadvantages of Using Corpora for Language Research
What are corpora used for nowadays, and what are the advantages and disadvantages of using them for language research? Corpora are now widely used to study languages precisely and on the basis of authentic data. They are used to observe how language use varies across different types of speakers, and, through thousands of archived texts, they help us understand how languages change over time. Corpora are also used to detect new meanings of words, which helps dictionaries improve and update their definitions. So the two main advantages of using corpora are that they give authentic evidence of how a language changes over time and that they are a great tool for studying language use and variation across different types of speakers. But there are also two disadvantages. Only 10% of the corpus is based on spoken language, so it offers little information about speech. The second disadvantage is that a corpus will never tell you what is grammatically or syntactically right or wrong.
Using corpora has two main advantages. The first is that they give us access to authentic evidence of language change over time and help us detect new meanings of words. The corpus draws on texts from different eras, written by different people from different social classes. Some texts are quite old, while others were written in the last few years. This enables researchers to track changes in language over time. It also helps them find new meanings of words which already exist. In addition, it gives access to new words, created in the last few years, which we were not aware of, and it helps us understand which words and meanings are most frequently used among speakers. Best of all, these new words and new meanings can enter the dictionaries, so using a corpus makes it possible to keep dictionaries up to date.
The second advantage is that corpora allow us to study how language varies across different variables, such as the age, sex, social class and education of the speaker. Researchers can easily find out whether a word is used more frequently by male or female speakers, by young or old speakers, or by poor or rich speakers, just by clicking a button. For example, the word ‘cute’ is used more by female speakers than by male speakers. This kind of information also helps dictionaries find better definitions of words, and it helps language researchers understand different language variables.
In our everyday life we interact with other people using spoken language. Unfortunately, only 10% of the corpus is based on spoken language, and that is one of its disadvantages. The other 90% consists of written texts, which makes the corpus less reliable as a source of information about how people actually speak. Most of the time a written text is based on Standard English, while in speech people can switch between different varieties, and researchers have little access to this. Most of the language used every day is spoken, but since it is difficult to collect, we know less about it.
The second disadvantage is that a corpus will never tell you whether a sentence is grammatically or syntactically acceptable. A text can be written by anyone, and once it enters the corpus we have no way of telling whether it contains mistakes. Even though there are not many mistakes, and most words and sentences are correct, the corpus itself gives us no means of identifying the mistakes that are there.
In general, I believe that using corpora is nevertheless an effective way of researching language. The advantages outweigh the disadvantages. The access we are given to authentic language and the information we get from a corpus really help us research a language.
Phrasal Verbs:
According to the online Oxford English Dictionary (OED.com), phrasal verbs are very common and consist of a verb with an adverb, a verb with a preposition, or a verb with both an adverb and a preposition. A phrasal verb is a complete semantic unit, which means that the combination works as a single item with its own meaning. Here are a few examples of common phrasal verbs in English:
Get out, go out, go on, get over, walk quickly, get over with, move on, move out, get down on, am up to, check in, etc.
Phrasal verbs (PVs) are divided into groups according to their meaning and their grammatical or syntactic form. Some phrasal verbs have a literal meaning and can be interpreted just as they are. For example, the PVs get out, drive through, move quickly and get in have a literal meaning. Get out means get out and move quickly means move quickly. These are literal PVs. There is also another group of phrasal verbs, called idiomatic PVs, whose meaning differs from the literal meaning of their parts. For instance, the PVs get by, take off, move out and even get in are idiomatic. Get by means to survive a difficult situation, and take off describes an aeroplane leaving the ground. We can also see that some PVs, like ‘get in’, have both a literal and an idiomatic meaning. It means to ‘get in’ a vehicle or a house, but it also means to arrive.
PVs can also be divided into transitive and intransitive. A transitive PV takes an object, as in the sentence ‘I hung up the phone’, where an object follows the adverb. An intransitive PV has no object after it, as in the sentence ‘look out, this thing is dangerous’.
Finally, we have separable and inseparable PVs. Separable PVs are those which can be separated by another word or words, for example ‘take the trash out’, ‘turn the TV on’ and ‘add this up’. Inseparable PVs are those which cannot be separated by other words, for example ‘look after the children’, ‘get down to business’ and ‘came across this old book’. The verb cannot be separated from its adverb. The two behave like one united form, one word which cannot be split because otherwise it would make no sense.
The query I would use in order to retrieve instances of PVs is this:
*_{V} (*_AVP){1,2}
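To make clearer what a pattern of this shape looks for, here is a rough sketch in Python. It is not the corpus tool's own implementation. It assumes that each token comes with a part-of-speech tag, that verb tags begin with "V", and that "AVP" marks an adverb particle. The function name and the toy sentences are made up for illustration.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is an assumption


def find_pv_candidates(tokens: List[Token]) -> List[Tuple[str, ...]]:
    """Return each verb that is directly followed by one or two AVP-tagged words."""
    matches = []
    i = 0
    while i < len(tokens):
        word, tag = tokens[i]
        if tag.startswith("V"):  # assumed: every verb tag starts with "V"
            particles = []
            j = i + 1
            # collect at most two adjacent adverb particles, like {1,2} in the query
            while j < len(tokens) and len(particles) < 2 and tokens[j][1] == "AVP":
                particles.append(tokens[j][0])
                j += 1
            if particles:
                matches.append((word, *particles))
                i = j  # continue after the matched sequence
                continue
        i += 1
    return matches


# Hypothetical tagged sentences, written by hand for this sketch
adjacent = [("I", "PNP"), ("hung", "VVD"), ("up", "AVP"), ("the", "AT0"), ("phone", "NN1")]
separated = [("take", "VVB"), ("the", "AT0"), ("trash", "NN1"), ("out", "AVP")]

print(find_pv_candidates(adjacent))   # [('hung', 'up')]
print(find_pv_candidates(separated))  # []
```

Under these assumptions the pattern only finds a particle that comes straight after the verb, so a separated PV such as ‘take the trash out’ would not be counted.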
We write *_{V} (*_AVP){1,2} in the query box and, after choosing ‘spoken texts’, press the start button. Then we choose the ‘distribution’ category, which shows many numbers and categories. For now, I am going to focus on the genre of the spoken texts. We choose ‘overall: genre’ from the categories and press the ‘show distribution’ button. The spoken genre in which PVs are used most is conversation. In 4,233,962 words there are 36,555 hits, which is 8,633.76 per million words. The next category is demonstration. In 32,062 words there are 273 hits, which means 8,514.75 hits per million words. PVs are used most in conversations, demonstrations and interviews, but the rates of hits in all categories are quite close to one another. PVs are common in English and are used whatever the situation.
Another distribution is that of the speaker’s sex. Women tend to use PVs more often than men do. For every million words there are 8,018.67 hits for women, while for men there are only 6,844.93. There does not seem to be a clear pattern in this distribution, although some difference might be expected, because women tend to be more indirect in the way they express themselves. But when I looked at the distribution by the age of the speaker, I was surprised to see that children aged 3-14 use more PVs. They mostly use short sentences, but they also use more PVs. They mostly use easy verbs like go, get and come, and combine them with adverbs which are easy for them to understand, like on, up and out. Speakers aged 15-24 use more complex verbs and adverbs to make a PV. I cannot really see any pattern across these categories. The age group that uses PVs least is 35 to 44.
Finally, I focused on the speakers’ social class. I had expected to find that speakers from the higher social classes would use more PVs in their conversations. Instead, I found that middle class speakers use more PVs in their everyday conversations, whereas higher class speakers, like managers and professionals, use fewer. I also noticed that lower class speakers use more PVs than high and mid-high class speakers. So, basically, low-mid and low class speakers use more PVs than mid-high and higher class speakers. The education distribution may help to explain this. Less educated speakers use more PVs in their speech. This could be linked to the age distribution, in which children, who are not yet educated, use more PVs, and the same holds for lower class speakers, most of whom are more likely not to have reached a higher educational level. All of these findings are associated with education. The lower the educational level, the more PVs are used in the speakers’ language.
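The per-million figures can be checked with a short calculation. The sketch below is only an illustration, and the function name per_million is made up. It divides the number of hits by the number of words in the category and multiplies by one million, using the two pairs of numbers quoted for conversation and demonstration.

```python
def per_million(hits: int, words: int) -> float:
    """Normalise a raw hit count to a rate per one million words."""
    if words <= 0:
        raise ValueError("the word count must be positive")
    return hits / words * 1_000_000


# (hits, words) as reported for the two spoken genres discussed in the essay
genres = {
    "conversation": (36_555, 4_233_962),
    "demonstration": (273, 32_062),
}

for genre, (hits, words) in genres.items():
    print(f"{genre}: {per_million(hits, words):.2f} hits per million words")
# conversation: 8633.76 hits per million words
# demonstration: 8514.75 hits per million words
```

Normalising in this way is what makes the two genres comparable, because the conversation category is more than a hundred times larger than the demonstration category.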