Unicode for Arabists
Andreas Hallberg's notes on Arabic linguistics.
In this post I describe my process for writing academic papers, more or less from start to finish, with examples from a chapter I recently finished for the forthcoming Routledge Handbook of Prescriptivism. This post is intended primarily for students as a hands-on example of how one might go about writing a paper or a thesis, of the steps involved, and of the amount of work one might expect to have to put into it. The post also demonstrates a way of thinking about editing as a core part of paper writing.
Here you will find a video of a lecture on my research in Arabic linguistics, which I gave in the lecture series حلقة عربية organized by the Department of Arabic at Philipps Universität Marburg in Germany. The lecture is titled علامات الإعراب في العربية الفصحى الشفهية (Case endings in spoken Standard Arabic). In it I present the main findings from my doctoral thesis Case Endings in Spoken Standard Arabic from 2016 and discuss some methodological issues.
The two videos below are generated from eye-movement data I recently collected in a short test run for a research application. They show how the gaze moves across the screen as a native Arabic speaker reads text without diacritics (Text 1) and with diacritics (Text 2), that is, so-called unvowelled and vowelled text.
I am terrible at proofreading my own texts, as many of my colleagues can attest. I was therefore happy to discover that macOS has quite a good text-to-speech feature, and that it can be accessed from the command line. This means that I could quite easily integrate it into my Vim workflow, a prerequisite for my using it at all. This has turned out to be incredibly useful, and it is something I now use every day for all texts I write. When the text is read back to me by the friendly lady inside my computer, I can easily hear if something is spelled incorrectly, since she, unlike me, doesn’t know how to gloss over misspelled or repeated words. Below I describe how I use the speech synthesis in Vim and thereafter provide the code I have in my .vimrc to get this functionality.
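As a rough illustration of the underlying idea (this is not the Vim code from the post), here is a minimal Python sketch that hands text to the macOS `say` command. The function names `build_say_command` and `speak` are made up for this example, and the sketch assumes `say` is on the path, which is only the case on macOS.

```python
import shutil
import subprocess


def build_say_command(text, voice=None, rate=None):
    """Build the argument list for the macOS `say` command."""
    cmd = ["say"]
    if voice is not None:
        cmd += ["-v", voice]       # -v selects the voice
    if rate is not None:
        cmd += ["-r", str(rate)]   # -r sets the speech rate in words per minute
    cmd.append(text)
    return cmd


def speak(text, voice=None, rate=None):
    """Read text aloud; return False on systems where `say` is unavailable."""
    if shutil.which("say") is None:
        return False
    subprocess.run(build_say_command(text, voice, rate), check=True)
    return True
```

Calling `speak("a paragraph to proofread", rate=180)` would read the text aloud on a Mac. Keeping the command construction separate from the call makes it easy to inspect what will be run.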
Below you will find links to two lectures on the Arabic script that I recorded in the fall of 2020 for the course Introduktion till arabiska at Göteborgs universitet, a module in Arabiska, Grundkurs I. The lectures were recorded when teaching was moved to distance learning because of the coronavirus pandemic, and I took the opportunity to go through in detail how the Arabic script works.
A study on the role of case-marking diacritics in reading Arabic, which I carried out together with Diederick Niehorster, was recently published in the journal Reading and Writing. You can read the full study (written in English) on the journal's website. Here you will find the study's abstract, translated from the English text, followed by a simplified description for non-specialists. For more details, please read the original article.
Below you will find links to recorded lectures on basic Arabic grammar that I gave in the spring of 2020 at Göteborgs universitet as part of Standardarabiska I, a module in Arabiska, Grundkurs I. The lectures were recorded when teaching was moved to distance learning because of the coronavirus pandemic. They may be useful as an aid for self-study of Arabic or for anyone with a general interest. All examples are given in Arabic script, and you need to be reasonably familiar with it to follow the lectures in detail.
This fall I have been teaching a course in Syrian Arabic, and in preparation for this I read Cowell’s excellent A Reference Grammar of Syrian Arabic (1964), more or less cover to cover. This grammar is nothing short of fantastic. It is well organized, fairly easy to read, and, above all, comprehensive. Every little nook and cranny of the language seems to be explored and explained, and everything is illustrated with authentic data. (Being from 1964, it does, however, contain some examples with words that are no longer in use.) I have a pretty good command of Syrian Arabic, but I have not studied it formally, and when reading this grammar I had quite a few aha moments, when quirky bits of Syrian Arabic grammar that I had found strange or confusing fell into place. There were also a lot of things I knew intuitively but had never consciously formulated or seen formally described. This post is a description of some things that I found particularly interesting, namely: derived verb forms with the infixes w and r; special forms of numerals for specific nouns; the “-āt of batch”, as I like to call it; the bi-/fī- complementary distribution; variants of demonstrative pronouns; and the three yeses.
In his highly influential book Mustawayāt al-ʿarabiyya al-muʿāṣira fī miṣr [The levels of contemporary Arabic in Egypt] (1973), Saʿīd Badawī presented his theory on the relationship between Standard Arabic (fuṣḥā) and vernacular Arabic (ʿāmmiyya). The model has become a staple of modern Arabic linguistics, duly covered in textbooks, monographs, and review articles of Arabic sociolinguistics and related fields. At the core of Mustawayāt is a figure that illustrates his model. It is also reproduced in his later publications (Badawi 1985, 1986). While the main gist of the figure is easy enough to grasp, there are some aspects of it that I for a long time could not quite understand. I later realized that the figure is in fact poorly designed. In this post I describe the problems with the figure and present a suggestion for how it can be redesigned to better convey Badawī’s theory.
I am currently teaching a course in Syrian Arabic using material from the Al-Kitaab textbook, 3rd edition, by Brustad, Al-Batal, Mahmoud, and al-Tunsi. The course runs parallel with a course covering the Standard Arabic part of the same material. The dialectal material in this book is excellent. It is carefully designed to rely on vocabulary and grammar from the chapter at hand while still having an authentic feel to it and being slightly challenging. The material is, however, only available as videos. In my view, having these dialogues as text facilitates classroom discussion of them. I have therefore transcribed most of the Syrian Arabic material, namely the longer dialect dialogue found in each chapter. The transcripts can be found here.
This is a quick introduction to using citations in LaTeX with the biblatex package. It is intended for people who are new to LaTeX and provides the bare minimum to get you up and running with automatically generated citations and bibliography. I assume that you have some basic understanding of LaTeX and the command-line interface, and that you have a standard distribution of LaTeX installed.
When I read books and articles related to my research, I usually take notes of what is most interesting to me, with each note in a separate text file. To each file I add one or more keywords from a set list. A note can contain any combination of keywords. The idea with these keywords is to allow me to sort or filter the notes by topic. A while back it struck me that these keywords form a network of connected nodes and that this network could probably somehow be visualized. Some ideas completely obsess you and won’t leave you alone. The only way to get them out of your head is to let them materialize in the real world.
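To make the network idea concrete, here is a minimal Python sketch, assuming that two keywords are connected whenever they occur in the same note and that the strength of the connection is the number of notes they share. The function name, the note names, and the keywords are all hypothetical and not taken from my actual notes.

```python
from collections import Counter
from itertools import combinations


def keyword_network(notes):
    """Count how often each pair of keywords occurs in the same note.

    `notes` maps a note name to its list of keywords.
    """
    edges = Counter()
    for keywords in notes.values():
        # Sort and deduplicate so that (a, b) and (b, a) count as one edge.
        for pair in combinations(sorted(set(keywords)), 2):
            edges[pair] += 1
    return edges


# Hypothetical notes and keywords, for illustration only.
notes = {
    "note1.txt": ["diglossia", "ideology"],
    "note2.txt": ["diglossia", "reading", "ideology"],
    "note3.txt": ["reading"],
}
network = keyword_network(notes)
```

The resulting edge counts could then be passed to any graph-drawing tool, with keywords as nodes and the counts as edge weights.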
This is a follow-up to a previous post in which I presented a way of representing the ten Arabic verb forms. In the document below I have done the same thing, i.e. a table with the ten verb forms in present, past, etc. and with the roots represented by empty squares, but this time for hollow and defective verbs. This is essentially two separate documents, one for hollow and one for defective verbs, conveniently packaged in one pdf.
I am currently reading up on eye movements in reading. A good way to assimilate new information is to try and come up with ways to visualize it. The document below (LaTeX source) is the result of one such attempt. It is a summary of some basic facts about eye movements in reading, focusing on the structure of fixations, saccades, and perceptual span.
I’ve recently been toying with different ways of visually representing the ten Standard Arabic verb forms. The traditional way of using the root fʿl (فعل) as a pattern is nice enough, but it does not provide clear visual cues to differentiate between the pattern and the root. What you see is whole words, not roots and patterns. This is something I have tried to amend in the document below, in which the three root consonants are represented by empty squares. With the table in this format it is possible to see the structure of the verb form at a glance without having to untangle it from root consonants. Or at least this is the intention.
Minimal pairs are a good way to highlight and practice unfamiliar sounds in a foreign language. Often in lists of Arabic minimal pairs in teaching materials the authors feel they have to reach towards the very bottom of the Classical Arabic vocabulary bowl, listing words that students may never come across in real life. In the document below I have only included words that are in actual use in Modern Standard Arabic, that are fairly frequent, and that students sooner or later will have to learn.
The Arabic textbook Alif Baa (Georgetown University Press, 2014), the introductory book in the popular Al-Kitaab series, lacks a good overview of the Arabic alphabet and how letters connect. (The table on pages 11–12 only shows the isolated forms.) I therefore devised an overview of the alphabet of the type found for example in Schulz et al. (2000, a.k.a. “the red book”) in which all four forms of each letter are shown. This fits neatly on one page, so I populated the second with letters not traditionally part of the alphabet but that nevertheless need to be learned by students (أ إ ؤ ئ ء آ ة and ي), as well as the system of vowel markers. I also crammed in some (I hope useful) information in sidenotes. The intention is for students to have this document at hand as a reference sheet for the entirety of their introductory course.
I often write documents, such as exams and lecture notes, that contain both Latin and Arabic script, often on the same line of text. This can be challenging due to the complications of mixing LTR (left-to-right) and RTL (right-to-left) scripts. This seems like an easy problem to solve for software developers, and it is, only not in software with graphical WYSIWYG interfaces, such as Word or OpenOffice. (I’m sure everyone who has tried writing mixed-direction text in such software shares my frustration with it, and I will therefore refrain from rants.) Since my shift to exclusively producing and editing text in plain text formats (.txt, .mkd, .tex, etc.) with the editor Vim, writing texts with mixed directionality has become a lot easier. This post is an attempt to explain how.
In the different dialects of Arabic there are highly developed systems of polite phrases to be uttered in various situations. Many of these phrases have one specific appropriate response. For native speakers this is simply a part of language and it isn’t given much thought. When, for example, someone says naʿīman to you after you have had a shower, you automatically reply aḷḷa yinʿam ʿalēk. For a non-native speaker like myself, recognizing and learning these phrases can be challenging. I hope this post may be of some help for others in the same situation.
Another uncalled-for graph. I recently helped a student get word counts for suras (chapters) in the Quran by running a script on the Quran in plain text downloaded from tanzil.net. I then had this nice little data file and felt I had to do something with it. The result is the document below, a graphical representation of the number of words and ayas (verses) in each sura in the Quran. It is intended to be printed on A3 paper. For A4 paper you will need a printer with high resolution, since the text will be very small. (The LaTeX source code can be found here.)
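The script itself is not reproduced here, but the counting step can be sketched in a few lines of Python. This sketch assumes the plain text is in a `sura|aya|text` format with one aya per line and comment lines starting with `#`, and that a word is any whitespace-separated token. The function name and the sample lines are placeholders, not actual Quran text.

```python
from collections import defaultdict


def count_words_and_ayas(lines):
    """Return {sura: (n_words, n_ayas)} from 'sura|aya|text' lines."""
    words = defaultdict(int)
    ayas = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        sura, _aya, text = line.split("|", 2)
        words[int(sura)] += len(text.split())  # whitespace-separated tokens
        ayas[int(sura)] += 1
    return {sura: (words[sura], ayas[sura]) for sura in sorted(words)}


# Placeholder lines for illustration, not actual Quran text.
sample = [
    "1|1|w1 w2 w3 w4",
    "1|2|w1 w2",
    "2|1|w1",
    "# a trailing comment line",
]
counts = count_words_and_ayas(sample)
```

With a real file, the lines could be read with `open(path, encoding="utf-8")` and passed to the same function.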
This post describes how to make stretchable pseudo-kashidas to lengthen words (كلمة طويـــــــلة) and how to automatically insert these at letter connections in order to justify Arabic text, that is, to make it have even right and left margins. The problem, the solution, and the result are first presented in a non-technical way. Thereafter the implementation of the stretchable kashida in LaTeX is described.
The document below is a graphical representation of the Standard Arabic phonetic inventory that I made for an introductory course in Arabic. On the second page of the document, the Swedish phonetic inventory is added in red text. It is used by first projecting and discussing the first page, which shows only the Arabic consonant sounds. One then moves to the next page, where the Swedish consonant sounds are added. This makes it possible to point out differences between the two systems, such as that Arabic uses more sounds further back and down in the vocal tract (further to the right in the table), that Arabic makes greater use of voicing distinctions, etc.
I use the Alt-Latin keyboard layout to quickly type Arabic transcription. Alt-Latin is an extension of the US QWERTY layout that uses combinations of the ALT and SHIFT keys to add diacritics such as in ḥ and ā, as well as the characters ʿ and ʾ. A nice graphical presentation of this layout and instructions on how to install it on Windows and Mac can be found here. The image below is from that site.
The following is a list of Arabic names of places and institutions in use among the Arabic-speaking population of Malmö. The list only includes names that are not direct translations of the Swedish words but are innovations among Arabic speakers.
The quality of plots and graphs in academic publications is often poor. Typically, a plot is generated in some program and is simply pasted into the final document with fonts and graphical characteristics that do not match the rest of the document. Here I describe a fairly easy and straightforward way of writing simple plots in TikZ in LaTeX documents. It allows for fine control of the output and an easy way to make use of global document settings also inside the plots. This description is meant only to give the basic framework of how to write plots this way, a proof of concept if you will. There are many ways to improve it and streamline the code. It is the method I used to generate the plots in my thesis, such as the one on page 210:
On December 4, 2014, AlJazeera aired the documentary Lisān aḍ-ḍād yajmaʿunā [The language of ḍād unites us] as part of the Taḥt al-mijhar series of documentaries, produced in-house by AlJazeera. The program is interesting in that it so clearly illustrates Arabic language ideology in action, being in effect an inventory of mainstream concerns about the current state of Arabic. This post gives a fairly thorough account of the views aired in the documentary.
Sometimes, when reading about the classical Arabic grammarians, I find it difficult to visualize the actual time span between different authors. Years are of course always mentioned in the literature, but it is often still hard to get a feel for how big a chunk of time there is between a series of events. To get a better sense of the stages of development of Arabic grammar I have made a graphical Timeline of Arab grammarians and their major works. It is based on The Arabic Linguistic Tradition by Bohas et al. (2006), and lists all grammarians mentioned in their book, and uses their division into periods. A pdf with the timeline can be viewed/downloaded here.
UPDATE 2022-03-11: I have decided to make this repository private and no longer publicly available. It did not, as far as I am aware, prove useful for anyone else, and having the repository private relieves me of having to be careful in the wording of critical comments. If you would like access to these notes, please let me know.
The following are interesting and/or amusing examples of spontaneous Swedish-Arabic code-switching that I have noted in my surroundings. The list is expanded continuously as new data comes in.
This post describes a way of showing quoted Arabic text in the margin in non-Arabic environments and how this can be achieved in LaTeX. For sentence-length quotes this has several advantages over the traditional method of having the transcribed Arabic inserted in the running text. It circumvents problems with directionality, avoids aesthetic clashes between the Latin and the Arabic script, lets the arabophone reader read it more easily, and lets the non-arabophone reader more easily skip it. It also makes for an interesting and nice-looking page.
I am currently involved in an eye-tracking study investigating some aspects of reading in Arabic. Eye-tracking is a research method in which a person’s eye movements are recorded while he or she performs some sort of task. By analyzing the eye movements it is possible to see exactly where that person is looking at a certain point in time, and based on this data one can draw conclusions about mental activities during that task.
In this post I describe a method for extending the standard US keyboard layout to type Swedish characters in Vim. This is done with minimal changes to the layout and without moving characters around.
In this post I describe typographical problems introduced by the characters ʿ and ʾ used in transcription of Arabic. I present a (largely) font-independent method of rendering them in LaTeX with the XeTeX engine. The code used for this is explained in some detail and is given in its entirety at the end of this post.