The Arabic script is a writing system used for writing Arabic and several other languages of Asia and Africa, such as Persian (Farsi/Dari), Uyghur, Kurdish, Punjabi, Sindhi, Balti, Balochi, Pashto, Lurish, Urdu, Kashmiri, Rohingya, Somali and Mandinka, among others. Until the 16th century, it was also used to write some texts in Spanish. Additionally, prior to the language reform in 1928, it was the writing system of Turkish. It is the second-most widely used writing system in the world by the number of countries using it and the third by the number of users, after the Latin and Chinese scripts.
The Arabic script is written from right to left in a cursive style, in which most of the letters are written in slightly different forms according to whether they stand alone or are joined to a following or preceding letter. The basic letter form remains unchanged. In most cases, the letters transcribe consonants or consonants and a few vowels, so most Arabic alphabets are abjads. It does not have capital letters.
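The relationship between a letter and its positional shapes can be illustrated with Unicode data. The following is a minimal Python sketch, not a rendering implementation: it assumes the compatibility "presentation form" characters for the letter bāʼ (U+FE8F to U+FE92) as stand-ins for the isolated, final, initial and medial shapes, and the names `BEH`, `FORMS` and `base_letter` are chosen here for illustration only. In ordinary text only the base letter is stored, and the font selects the joined shape.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the base letter as stored in ordinary text

# Compatibility presentation forms of BEH, one per joining position.
FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}

def base_letter(ch: str) -> str:
    """Map a presentation form back to its base letter via NFKC normalization."""
    return unicodedata.normalize("NFKC", ch)

for position, glyph in FORMS.items():
    # Each positional shape normalizes to the same underlying letter.
    print(position, unicodedata.name(glyph), "->", unicodedata.name(base_letter(glyph)))

# The bidirectional class "AL" marks the letter as right-to-left Arabic.
print(unicodedata.bidirectional(BEH))
```

All four shapes normalize to the same code point, which mirrors the statement that the basic letter form remains unchanged.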
The script was first used to write texts in Arabic, most notably the Quran, the holy book of Islam. With the religion's spread, it came to be used as the primary script for many language families, leading to the addition of new letters and other symbols, with some versions, such as Kurdish, Uyghur and old Bosnian being abugidas or true alphabets. It is also the basis for the tradition of Arabic calligraphy.
The Arabic alphabet derives from the Nabataean alphabet or (a less widely held view) directly from the Syriac alphabet, both of which derive from the Aramaic alphabet, which in turn descended from the Phoenician alphabet. The Phoenician alphabet gave rise to, among others, the Arabic, Hebrew and Greek alphabets (and therefore the Cyrillic and Latin alphabets).
In the 6th and 5th centuries BCE, northern Arab tribes emigrated and founded a kingdom centred around Petra, Jordan. These people (now named Nabataeans from the name of one of the tribes, Nabatu) spoke Nabataean Arabic, a dialect of the Arabic language. In the 2nd or 1st centuries BCE, the first known records of the Nabataean alphabet were written in the Aramaic language (which was the language of communication and trade), but included some Arabic language features: the Nabataeans did not write the language which they spoke. They wrote in a form of the Aramaic alphabet, which continued to evolve; it separated into two forms: one intended for inscriptions (known as "monumental Nabataean") and the other, more cursive and hurriedly written and with joined letters, for writing on papyrus. This cursive form influenced the monumental form more and more and gradually changed into the Arabic alphabet.
The Arabic script has been adapted for use in a wide variety of languages besides Arabic, including Persian, Malay and Urdu, which are not Semitic. Such adaptations may feature altered or new characters to represent phonemes that do not appear in Arabic phonology. For example, Arabic lacks a voiceless bilabial plosive (the [p] sound), so many languages add their own letter to represent [p] in the script, though the specific letter used varies from language to language. These modifications tend to fall into groups: Indian and Turkic languages written in the Arabic script tend to use the Persian modified letters, whereas the languages of Indonesia tend to imitate those of Jawi. Scholars refer to the modified version of the Arabic script originally devised for Persian as the Perso-Arabic script.
In the cases of Bosnian, Kurdish, Kashmiri and Uyghur writing systems, vowels are mandatory. The Arabic script can therefore be used in both abugida and abjad forms, although it is often strongly, if erroneously, connected to the latter due to it being originally used only for Arabic.
Use of the Arabic script in West African languages, especially in the Sahel, developed with the spread of Islam. To a certain degree the style and usage tend to follow those of the Maghreb (for instance the position of the dots in the letters fāʼ and qāf). Additional diacritics have come into use to facilitate the writing of sounds not represented in the Arabic language. The term ʻAjamī, which comes from the Arabic root for "foreign", has been applied to Arabic-based orthographies of African languages.
Table of writing styles
Table of alphabets
Today Iran, Afghanistan, Pakistan, India, and China are the main non-Arabic speaking states using the Arabic alphabet to write one or more official national languages, including Azerbaijani, Baluchi, Brahui, Persian, Pashto, Central Kurdish, Urdu, Sindhi, Kashmiri, Punjabi and Uyghur.
An Arabic alphabet is currently used for the following languages:
Middle East and Central Asia
Garshuni (or Karshuni) originated in the 7th century, when Arabic became the dominant spoken language in the Fertile Crescent, but Arabic script was not yet fully developed or widely read, and so the Syriac alphabet was used. There is evidence that writing Arabic in this other set of letters (known as Garshuni) influenced the style of modern Arabic script. After this initial period, Garshuni writing has continued to the present day among some Syriac Christian communities in the Arabic-speaking regions of the Levant and Mesopotamia.
Kazakh in Kazakhstan, China, Iran and Afghanistan
Kurdish in Northern Iraq and Northwest Iran. (In Turkey and Syria the Latin script is used for Kurdish)
Kyrgyz by its 150,000 speakers in the Xinjiang Uyghur Autonomous Region in northwestern China, Pakistan, Kyrgyzstan and Afghanistan
Turkmen in Turkmenistan, Afghanistan and Iran
Uzbek in Uzbekistan and Afghanistan
Official Persian in Iran and its dialects, like Dari in Afghanistan and Tajiki in Tajikistan
Baluchi in Iran, in Pakistan's Balochistan region, Afghanistan and Oman. An academy for the protection of the Baluchi language was established in Iran in 2009
Southwestern Iranian languages such as the Lori dialects and the Bakhtiari language
Pashto in Afghanistan, Pakistan and Tajikistan
Uyghur changed to Latin script in 1969 and back to a simplified, fully voweled Arabic script in 1983
Azerbaijani language in Iran
Talysh language in Iran
The Chinese language is written by some Hui in the Arabic-derived Xiao'erjing alphabet (see also Sini (script))
The Turkic Salar language is written by some Salar in the Arabic alphabet
Balochi in Pakistan and Iran
Dari in Afghanistan
Kashmiri in India and Pakistan (also written in Sharada and Devanagari although Kashmiri is more commonly written in Perso-Arabic Script)
Pashto in Afghanistan and Pakistan
Khowar in Northern Pakistan, also uses the Latin script
Punjabi (Shahmukhi) in Pakistan, also written in the Brahmic script known as Gurmukhi in India
Saraiki, written with a modified Arabic script that has 45 letters
Sindhi; on August 29, 1857, a British commissioner in Sindh ordered a change to the Arabic script used for it. Sindhi is also written in Devanagari in India
Ladakhi (India), although it is more commonly written using the Tibetan script
Balti (a Sino-Tibetan language), also rarely written in the Tibetan script
Brahui language in Pakistan and Afghanistan
Burushaski or Burusho language, a language isolated to Pakistan
Urdu in Pakistan (and historically several other Hindustani languages). Urdu is one of several official languages in the states of Jammu and Kashmir, Delhi, Uttar Pradesh, Bihar, Jharkhand, West Bengal and Telangana.
Dogri, spoken by about five million people in India and Pakistan, chiefly in the Jammu region of Jammu and Kashmir and in Himachal Pradesh, but also in northern Punjab, although Dogri is more commonly written in Devanagari
Arwi language (a mixture of Arabic and Tamil) uses the Arabic script together with the addition of 13 letters. It is mainly used in Sri Lanka and the South Indian state of Tamil Nadu for religious purposes. Arwi language is the language of Tamil Muslims
Arabi Malayalam is Malayalam written in the Arabic script. The script has particular letters to represent the peculiar sounds of Malayalam. This script is mainly used in madrasas of the South Indian state of Kerala and of Lakshadweep.
Rohingya language (Ruáingga) is a language spoken by the Rohingya people of Rakhine State, formerly known as Arakan (Rakhine), Burma (Myanmar). It is similar to the Chittagonian language of neighboring Bangladesh and is sometimes written using the Roman script or an Arabic-derived script known as Hanifi
Malay in the Arabic script known as Jawi. In some cases it can be seen on the signboards of shops and market stalls. In Brunei in particular, Jawi is used for writing and reading in Islamic religious education programs in primary schools, secondary schools, colleges and higher educational institutions such as universities. In addition, some television programming uses Jawi, such as announcements, advertisements, news, social programs or Islamic programs
co-official in Brunei
Malaysia but co-official in Kelantan and Kedah, Islamic states in Malaysia
Indonesia, where the Jawi script is used alongside Latin in the provinces of Aceh, Riau, Riau Islands and Jambi. The Javanese, Madurese and Sundanese also use another Arabic variant, Pegon, in Islamic writings and in the pesantren community.
Predominantly Muslim areas of the Philippines (especially Tausug language)
Ida'an language (also Idahan) a Malayo-Polynesian language spoken by the Ida'an people of Sabah, Malaysia
Cham language in Cambodia, alongside the Western Cham script.
Maghrebi Arabic uses a modified Arabic script, with additional letters, in order to support /g/ (ڨ/ڭ), /v/ (ڥ) and /p/ (پ) along with the older /f/ (ڢ) and /q/ (ڧ)
Berber languages have often been written in an adaptation of the Arabic alphabet. The use of the Arabic alphabet, as well as the competing Latin and Tifinagh scripts, has political connotations
Tuareg language, (sometimes called Tamasheq) which is also a Berber language
Coptic language of Egyptian Copts, as Coptic text written in Arabic letters
Bedawi or Beja, mainly in northeastern Sudan
Wadaad writing, used in Somalia
Dongolawi language or Andaandi language of Nubia, in the Nile Valley of northern Sudan
Nobiin language, the largest Nubian language (previously known by the geographic terms Mahas and Fadicca/Fiadicca) is not yet standardized, being written variously in both Latinized and Arabic scripts; also, there have been recent efforts to revive the Old Nubian alphabet.
Fur language of Darfur, Sudan
Comorian, in the Comoros, currently side by side with the Latin alphabet (neither is official)
Swahili was originally written in the Arabic alphabet; Swahili orthography is now based on the Latin alphabet, which was introduced by Christian missionaries and colonial administrators
Zarma language of the Songhay family. It is the language of the southwestern lobe of the West African nation of Niger, and it is the second leading language of Niger, after Hausa, which is spoken in south central Niger
Tadaksahak is a Songhay language spoken by the pastoralist Idaksahak of the Ménaka area of Mali
Hausa language uses an adaptation of the Arabic script known as Ajami, for many purposes, especially religious, but including newspapers, mass mobilization posters and public information
Dyula language is a Mandé language spoken in Burkina Faso, Côte d'Ivoire and Mali.
Jola-Fonyi language of the Casamance region of Senegal
Balanta language, a Bak language of West Africa spoken by the Balanta people, and the Balanta-Ganja dialect in Senegal
Mandinka, widely but unofficially (known as Ajami); another non-Latin script used is the N'Ko script
Fula, especially the Pular of Guinea (known as Ajami)
Wolof (at zaouia schools), known as Wolofal.
Arabic script outside Africa
In writings of African American slaves
Writings by Omar Ibn Said (1770–1864) of Senegal
The Bilali Document also known as Bilali Muhammad Document is a handwritten, Arabic manuscript on West African Islamic law. It was written by Bilali Mohammet in the 19th century. The document is currently housed in the library at the University of Georgia
Letter written by Ayuba Suleiman Diallo (1701–1773)
Arabic Text From 1768
Letter written by Abdulrahman Ibrahim Ibn Sori (1762–1829)
In the 20th century, the Arabic script was generally replaced by the Latin alphabet in the Balkans, parts of Sub-Saharan Africa, and Southeast Asia, while in the Soviet Union, after a brief period of Latinisation, use of Cyrillic was mandated. Turkey changed to the Latin alphabet in 1928 as part of an internal Westernizing revolution. After the collapse of the Soviet Union in 1991, many of the Turkic languages of the ex-USSR attempted to follow Turkey's lead and convert to a Turkish-style Latin alphabet. However, renewed use of the Arabic alphabet has occurred to a limited extent in Tajikistan, whose language's close resemblance to Persian allows direct use of publications from Afghanistan and Iran.
Afrikaans (as it was first written among the "Cape Malays", see Arabic Afrikaans)
Berber in North Africa, particularly Shilha in Morocco (still being considered, along with Tifinagh and Latin, for Central Atlas Tamazight)
French by the Arabs and Berbers in Algeria and other parts of North Africa during the French colonial period
Harari, by the Harari people of the Harari Region in Ethiopia. Now uses the Geʻez and Latin alphabets
For the West African languages—Hausa, Fula, Mandinka, Wolof and some more—the Latin alphabet has officially replaced Arabic transcriptions for use in literacy and education
Kinyarwanda in Rwanda
Kirundi in Burundi
Malagasy in Madagascar (script known as Sorabe)
Shona in Zimbabwe
Somali (see wadaad Arabic) has mostly used the Latin alphabet since 1972
Songhay in West Africa, particularly in Timbuktu
Swahili (has used the Latin alphabet since the 19th century)
Yoruba in West Africa (this was probably limited, but still notable)
Albanian called Elifbaja shqip
Aljamiado (Mozarabic, Berber, Aragonese, Portuguese, Ladino, and Spanish, during and residually after the Muslim rule in the Iberian peninsula)
Belarusian (among ethnic Tatars; see Belarusian Arabic alphabet)
Bosnian (only for literary purposes; currently written in the Latin alphabet; Text example: مۉلٖىمۉ سه تهبٖى بۉژه = Molimo se tebi, Bože (We pray to you, O God); see Arebica)
Greek in certain areas of Greece and Anatolia, in particular Cappadocian Greek written in Perso-Arabic
Polish (among ethnic Lipka Tatars)
Central Asia and Caucasus
Adyghe language, also known as West Circassian, is an official language of the Republic of Adygea in the Russian Federation. It used the Arabic alphabet before 1927
Avar as well as other languages of Daghestan: Nogai, Kumyk, Lezgian, Lak and Dargwa
Azeri in Azerbaijan (now written in the Latin alphabet and Cyrillic script in Azerbaijan)
Bashkir (officially for some years from the October Revolution of 1917 until 1928, changed to Latin, now uses the Cyrillic script)
Chaghatay across Central Asia
Chechen (sporadically from the adoption of Islam; officially from 1917 until 1928)
Circassian and some other members of the Abkhaz–Adyghe family in the western Caucasus and, sporadically, in countries of the Middle East such as Syria
Karachay-Balkar in the central Caucasus
Kazakh in Kazakhstan (until the 1930s, changed to Latin, currently using Cyrillic, phasing in Latin)
Kyrgyz in Kyrgyzstan (until the 1930s, changed to Latin, now uses the Cyrillic script)
Mandarin Chinese and Dungan, among the Hui people (script known as Xiao'erjing)
Tat in South-Eastern Caucasus
Tatar before 1928 (changed to Latin Yañalif), reformed in the 1880s (İske imlâ), 1918 (Yaña imlâ – with the omission of some letters)
Turkmen in Turkmenistan (changed to Latin in 1929, then to the Cyrillic script, then back to Latin in 1991)
Uzbek in Uzbekistan (changed to Latin, then to the Cyrillic script, then back to Latin in 1991)
Some Northeast Caucasian languages of the Muslim peoples of the USSR between 1918 and 1928 (many also earlier), including Chechen, Lak, etc. After 1928, their script became Latin, then later Cyrillic
South and Southeast Asia
Acehnese in Sumatra, Indonesia
Banjarese in Kalimantan, Indonesia
Bengali in Bengal; Arabic scripts have been used historically in places such as Chittagong and West Bengal. See Dobhashi for further information.
Maguindanaon in the Philippines
Malay in Malaysia, Singapore and Indonesia, although Malay speakers in Brunei and Southern Thailand still use the script on a daily basis
Minangkabau in Sumatra, Indonesia
Pegon script of Javanese, Madurese and Sundanese in Indonesia, used only in Islamic schools and institutions
Tausug in the Philippines
Maranao in the Philippines
Hebrew was written in Arabic letters in a number of places in the past
Northern Kurdish in Turkey and Syria was written in Arabic script until 1932, when a modified Kurdish Latin alphabet was introduced by Jaladat Ali Badirkhan in Syria
Turkish in the Ottoman Empire was written in Arabic script until Mustafa Kemal Atatürk declared the change to Latin script in 1928. This form of Turkish is now known as Ottoman Turkish and is held by many to be a different language, due to its much higher percentage of Persian and Arabic loanwords (Ottoman Turkish alphabet)
As of Unicode 14.0, the following ranges encode Arabic characters:
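The block ranges can be put to practical use when classifying characters. The sketch below is a minimal Python illustration under stated assumptions: the table lists the Arabic-related blocks and their inclusive code point ranges as understood for Unicode 14.0, and the names `ARABIC_BLOCKS` and `arabic_block` are hypothetical helpers introduced here, not part of any library. The Unicode block charts remain the authoritative source for the ranges.

```python
# Arabic-related Unicode blocks as (name, first code point, last code point).
# Assumed to reflect Unicode 14.0; verify against the Unicode block charts.
ARABIC_BLOCKS = [
    ("Arabic", 0x0600, 0x06FF),
    ("Arabic Supplement", 0x0750, 0x077F),
    ("Arabic Extended-B", 0x0870, 0x089F),
    ("Arabic Extended-A", 0x08A0, 0x08FF),
    ("Arabic Presentation Forms-A", 0xFB50, 0xFDFF),
    ("Arabic Presentation Forms-B", 0xFE70, 0xFEFF),
    ("Rumi Numeral Symbols", 0x10E60, 0x10E7F),
    ("Indic Siyaq Numbers", 0x1EC70, 0x1ECBF),
    ("Ottoman Siyaq Numbers", 0x1ED00, 0x1ED4F),
    ("Arabic Mathematical Alphabetic Symbols", 0x1EE00, 0x1EEFF),
]

def arabic_block(ch: str):
    """Return the name of the Arabic-related block containing ch, or None."""
    cp = ord(ch)
    for name, first, last in ARABIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None

# Example: classify each character of a short mixed-script string.
for ch in "ب\u0750A":
    print(f"U+{ord(ch):04X}", arabic_block(ch))
```

A linear scan is sufficient for a table this small; a sorted table with binary search would be the usual choice for a full block list.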
Most languages that use alphabets based on the Arabic alphabet use the same base shapes. Most additional letters in these languages are built by adding diacritics to (or removing them from) existing Arabic letters. Some stylistic variants in Arabic have distinct meanings in other languages. For example, variant forms of kāf (ك ک ڪ) are used in some languages and sometimes have specific usages. In Urdu and some neighbouring languages the letter hā has diverged into two forms, ھ (dō-čašmī hē) and ہ ہـ ـہـ ـہ (gōl hē), while a variant form of ي (yā) referred to as baṛī yē (ے) is used at the end of some words.
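Because these variants carry distinct meanings in some languages, Unicode encodes them as separate characters rather than as font styles. The following Python sketch lists the code points and standard character names for the kāf, hā and yā variants mentioned above; the grouping in `VARIANTS` and the helper `describe` are illustrative assumptions, not an established classification.

```python
import unicodedata

# Variant letters grouped by the Arabic letter they relate to.
VARIANTS = {
    "kāf": ["\u0643", "\u06A9", "\u06AA"],
    "hā": ["\u0647", "\u06BE", "\u06C1"],
    "yā": ["\u064A", "\u06D2"],
}

def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode character name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for letter, chars in VARIANTS.items():
    for ch in chars:
        print(letter, describe(ch))

# Variants are distinct code points, so plain string comparison treats
# Arabic kāf and the Persian/Urdu form as different letters.
print("\u0643" == "\u06A9")
```

One practical consequence is that search and sorting across languages need an explicit mapping between such variants, since they do not compare equal by default.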
Table of Letter Components