English was declared the sole official language of Namibia upon independence in 1990, although the country has never been under British rule. Even in 2011, only 3.4% of Namibians reported speaking English as their primary home language (cf. NSA 2012). English does, however, play a major role in official and inter-ethnic communication in Namibia and is favored especially by young Namibians born and raised after independence (cf. Stell 2014a; 2014b; 2016). Research on Namibian English is still in its infancy, and existing studies have so far been based on questionnaire data, sociolinguistic interviews, and data from experimental research designs (cf. Buschfeld & Kautzsch 2014; Kautzsch & Schröder 2016; Schröder & Schneider fc.). The present investigation adds to this field by providing a data source of naturally occurring language that is particularly relevant for studying sociophonetic aspects of this emerging variety of English.
This project aims to compile a multi-layered corpus of YouTube video data from Namibian YouTubers, complemented by written data from the comment sections and further social media accounts of the respective YouTubers, as well as qualitative interview data and metadata consisting of basic sociolinguistic variables. This collection of computer-mediated communication (CMC) data can, on the one hand, be analyzed using acoustic phonetic methods, but its digital ethnographic nature also allows for a qualitative analysis driven by key notions of third-wave sociolinguistics (cf. Eckert 2012). Such a corpus therefore combines three fields of study, CMC, sociolinguistics, and phonetics, in a novel way. Third-wave sociolinguistic studies are usually based on ethnographies conducted in small communities, and the data collected in such settings are most often not suitable for studying phonetic and phonological aspects of language production. Researchers working in sociophonetics have usually supplemented ethnographic data with audio-recorded sociolinguistic interviews (cf. e.g. Drager 2009). By incorporating CMC data into a sociolinguistic study, the complete dependence on interview data for sociophonetic investigation can be circumvented, and naturally occurring language can be used as an additional data source. Moreover, by using language data from multiple contexts, both naturally occurring and interview-based, sociolinguistic ethnography can be extended to digital forms of communication. This is especially relevant for the study of Namibian English, as outer- and expanding-circle varieties of English are generally under-researched compared to inner-circle varieties such as British or American English. Little is known about how young Namibians use language in digital contexts, and the combination of CMC research with current sociolinguistic approaches will help to shed light on this question.
I will present the pilot corpus, which consists of 300 minutes of YouTube videos by 15 self-identified Namibian content creators, including orthographic transcriptions of the language used in the videos as well as the comment sections of the respective videos (as of 31st July 2018). I will provide two case studies to test the usability of this database. The first is a sociophonetic case study, which analyzes the audio layer and investigates whether the NURSE-WORK vowel split described in recent work based on sociolinguistic interviews with Namibians (cf. Kautzsch, Schröder & Zähres 2017) is also found in the CMC data. This is significant because Namibian English has traditionally been aligned with South African Englishes but may now be establishing itself as an independent variety that diverges from them. The NURSE-WORK split will be analyzed with acoustic phonetic methods, using standard programs and procedures from that field, in particular Praat (Boersma & Weenink 2018) and R. The second case study will also make use of the other layers of the corpus by investigating features identified as typical of Namibian English (NamE) in Kautzsch (in prep.), a study using the online newspaper corpus CNamOn. I will compare the use of bare infinitive constructions containing go and between CNamOn and the various layers of the present corpus. The results confirm previously observed features in naturally occurring data as well as the ongoing nativization of English in Namibia.