ArabTeX
a System for Typesetting Arabic
User Manual Version 3.00 [1] [2]

Klaus Lagally

November 22, 1993

[1] Report Nr. 1993/11, Universität Stuttgart, Fakultät Informatik, Breitwiesenstraße 20-22, 70565 Stuttgart, Germany
[2] This Report supersedes Report Nr. 1992/06

Overview

ArabTeX is a package extending the capabilities of TeX/LaTeX to generate the Arabic writing from an ASCII transliteration for texts in several languages using the Arabic script. It consists of a TeX macro package and an Arabic font in several sizes, presently only available in the Naskhi style. ArabTeX will run with Plain TeX and also with LaTeX. It is compatible with NFSS, NFSS2 and the EDMAC package; other additions to TeX have not been tried.

ArabTeX is primarily intended for generating the Arabic writing, but the standard scientific transliteration can also be easily produced. For languages other than Arabic that are customarily written in the Arabic script some limited support is available.

ArabTeX defines its own input notation which is both machine-readable and human-readable, and suited for electronic transmission and e-mail communication. However, texts in some of the Arabic standard encodings can also be processed.

ArabTeX is copyrighted, but free use for scientific, experimental and other strictly private, noncommercial purposes is granted. Offprints of publications using ArabTeX are welcome. Using ArabTeX otherwise requires a license agreement.

There is no warranty of any kind, either expressed or implied. The entire risk as to the quality and performance rests with the user.

Please send error reports, suggestions and inquiries to the author:

Prof.
Klaus Lagally
Institut für Informatik
Universität Stuttgart
Breitwiesenstraße 20-22
70565 Stuttgart
GERMANY
lagally@informatik.uni-stuttgart.de

Copyright © 1992, 1993, Klaus Lagally

Contents

1 Activating ArabTeX
2 Input to ArabTeX
  2.1 Arabic text elements
  2.2 Commands in an Arabic context
3 Language selection
4 Font selection
5 Input coding conventions
  5.1 Standard Arabic and Persian characters
  5.2 Quoting
  5.3 Ligatures
  5.4 Vowelization
  5.5 Verbatim input
  5.6 Alternate input codings
6 Transliteration
  6.1 ZDMG transliteration style
  6.2 Encyclopedia of Islam style
7 Support for other languages besides Arabic
  7.1 Persian (Farsi, Dari), also Ottoman, Kurdish
  7.2 Urdu
  7.3 Pashto (Afghanic)
  7.4 Maghribi
  7.5 Other languages
8 Miscellaneous features
  8.1 Automatic stretching
  8.2 Dots on yā'
  8.3 Additional codings
  8.4 Progress report
  8.5 Verbatim copy of the input
  8.6 Using ArabTeX with EDMAC
9 Acknowledgments
10 References
A Obtaining ArabTeX
B Installing ArabTeX
C Release history
D Sample ArabTeX input
E Sample ArabTeX output
F Coding examples for Arabic
G Coding examples for Persian
H Alternate input encodings
  H.1 ASMO 449 = ISO 9036
  H.2 ASMO 449E = ISO 8859-6
I Miscellaneous utilities
  I.1 twoblks.sty
  I.2 abjad.sty
  I.3 MLS2ARAB
Index

List of Tables

5.1 Standard codings for Arabic and Persian.
5.2 Additional codings generally available.
5.3 Verbatim codings for the carrier of hamza
7.1 Additional codings for Urdu.
7.2 Additional codings for Pashto.
8.1 Additional codings for special purposes.
H.1 ASMO 449 code table
H.2 ISO 8859-6 code table

Chapter 1 Activating ArabTeX

With Plain TeX, load the ArabTeX macros by \input arabtex.tex. With LaTeX, include the option "arabtex" in the document header. In both cases some additional files will be loaded automatically.

ArabTeX defines several user commands as indicated below. There is also a large number of (hidden) internal commands which could lead to storage (hash table [1]) overflow in a small TeX implementation. All internal commands contain an "at" sign (@) in their names and thus should not interfere with any user-defined commands (but possibly could with other TeX extensions we do not know about).
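
The two activation routes described above might look as follows. This is only a minimal sketch: the sample word <kitAb> and the choice of \setarab (introduced in the following chapters) are illustrative assumptions, and the LaTeX variant assumes the \documentstyle option mechanism of LaTeX 2.09.

```latex
% Plain TeX: load the ArabTeX macros explicitly.
\input arabtex.tex
\setarab                 % select a processing mode before any Arabic text
A short Arabic quotation: <kitAb>.
\bye
```

```latex
% LaTeX 2.09: request ArabTeX as a document style option.
\documentstyle[arabtex]{article}
\begin{document}
\setarab                 % select a processing mode before any Arabic text
A short Arabic quotation: <kitAb>.
\end{document}
```

In both cases the additional files are expected to be picked up automatically, so no further \input lines should be needed.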

With Plain TeX, the Arabic font by default is only available at the normal 14 point size, which ought to cooperate well with the "cm" fonts at 10 points. A bold variant is also provided. For other sizes, the user has to change the \magnification or define additional font identifiers. To change the default, inspect the file "arabtex.tex" and redefine the \pnash and/or \pnashbf command accordingly. With LaTeX, the usual size changing commands will also operate on the Arabic font.

[1] A TeX hash table size of 3000 to 3500 is recommended.

Chapter 2 Input to ArabTeX

After activating ArabTeX, select one of the Arabic writing styles, e.g., \setarab (see Section 3). Your modified TeX/LaTeX system will recognize the following items:

- normal TeX/LaTeX text and commands,
- short Arabic quotations bracketed by < and >. These must normally fit onto one line of output, except if explicitly broken up by \\ or \| commands (see below). A quotation may also be started with \< except inside a LaTeX {tabbing} environment.
- longer Arabic texts which are bracketed by \begin{arabtext} and \end{arabtext} (even when using Plain TeX!), called Arabic Environments in the sequel. An Arabic Environment consists of one or more paragraphs separated by blank lines or \par commands.

Arabic quotations and Arabic environments are called Arabic contexts in the sequel.

2.1 Arabic text elements

Every Arabic paragraph and every Arabic quotation is a sequence of the following kinds of Arabic items, separated by blank spaces or newlines:

- isolated punctuation marks, interpreted as the corresponding Arabic punctuation mark;
- "numbers", i.e. character sequences starting with a digit. A "number" will be processed using the normal writing sequence from left to right even if it contains letters and/or special characters; however, if the final character is a punctuation mark, it will be split off and processed separately.
- "Arabic quotes" coded as two left quotes or two right quotes each; they may also be written directly adjacent to a word.
- "words", i.e. character sequences starting with a letter or a special (nondigit) character followed by a letter. A final punctuation mark will be split off and processed separately. The (coded) characters of a word will in the output be arranged from right to left.
- a sequence of words, numbers, and special characters enclosed in curly braces { and }. This introduces a new level of TeX grouping; otherwise the constituents are processed normally. This feature may be nested.

Output from all items will be arranged from right to left; lines will be broken as necessary.

Inside an Arabic Environment, or in an Arabic quotation, you may also have:

- ArabTeX commands with or without parameters. These will be executed immediately.
- Some, but not all, TeX/LaTeX commands (see below). These will be executed immediately.
- Short mathematical insertions, bracketed by single $ signs. They must fit on one output line and are processed as usual. TeX display mode within an Arabic environment is not provided; if it is required, the user has to leave the Arabic environment temporarily.
- short non-Arabic ("Roman") quotations, containing text and possibly also TeX/LaTeX commands, bracketed by < and >. These must fit on one output line and introduce a new level of grouping, so if they contain any TeX/LaTeX assignments the effects of these will be local by default. This feature is not available within an Arabic quotation. The alternate notation \< is also not provided.

2.2 Commands in an Arabic context

A control sequence inside an Arabic context must be separated from the preceding text item by at least one blank space, newline, or another control sequence, and may be of the following kinds:

- ArabTeX option changing commands. These may also be used outside an Arabic Context, and usually follow the TeX grouping rules.
- \\ for a line break; the last line will be padded on the left with spaces.
- \| for a line break; the last line will be aligned. If it comes out very badly spaced, automatic stretching might help (see Section 8).
- \indent or \par (or a blank line) for a new paragraph, \noindent for a new paragraph without indentation (not inside Arabic quotations).
- \emphasize Arabic item will put a bar over the Arabic item.
- \emphasize{group of Arabic items} will put a bar over the indicated group of Arabic items.
- \setnash, \setnashbf, \setnastaliq font selection commands, see Section 4.
- size changing LaTeX commands like \large etc., only if LaTeX is used!
- the following commands: \footnote (observe that the syntax for Plain TeX and LaTeX is different!), \marginpar (also with Plain TeX, analogous to the LaTeX usage).
- the TeX/LaTeX commands \smallskip, \medskip, \bigskip, \input, \hfill, \ (for a space), \space with their usual meaning.
- \nospace will place the adjacent items in the output in contact, without any intervening space.
- \hspace{width} will introduce the indicated amount of spacing in the output.
- \mbox{text} puts the text into a box that will not be split across a line break.
- \spreadbox{width}{text} spreads out the text to the indicated width. This may be useful e.g. when typesetting poetry. \spreadbox{width}{text\hfill} will inhibit the spreading, \spreadbox{width}{\hfill text\hfill} will center the text inside the box. \spreadbox{width}{\hfill} or \spreadbox{width}{ } just introduces the indicated amount of horizontal space, as will \hspace{width}. If two boxing commands follow each other without any intervening blank space in the input, there will also be no resulting space between the boxes in the output.
- \centerline{text} will start a new line whose contents are centered (not inside Arabic quotations).
- \spreadline{text} will start a new line whose contents are spread out over the whole width of the page (not inside Arabic quotations). It is approximately equivalent to \spreadbox{\hsize}{text}.
- User-defined commands whose expansion produces legal ArabTeX input may be called by \docommand{command and parameters}. The command is expanded exactly once [1], and the result is processed by ArabTeX again. Any side effects of the expansion will be local.
- Parameter assignments inside an Arabic context may be performed by \doassign{parameter}{value}. The effect is normally local except if the form \doassign{\global parameter}{value} is used.
- Any non-recognized command will generate an error message and will be echoed verbatim in the output. Even though ArabTeX tries hard to get into synchronization again, additional spurious errors may occur.
- Inside an Arabic Context no further LaTeX or ArabTeX environment may be nested (with the possible future exception of list environments; these are not yet implemented).

For a list of all available commands, consult the Index to this report. As a reminder, a list of all commands that are valid inside Arabic text will appear in the log file.

[1] This is no strong restriction as the expansion may contain \docommand calls again.

Chapter 3 Language selection

The processing of input text to be written in the Arabic script is somewhat language dependent. Thus before the first Arabic quotation or Arabic environment you have to indicate the desired processing mode by one of the commands \setarab, \setfarsi, \seturdu, \setpashto, \setmaghribi, or \setverb (no special processing; see however Section 5.5). The processing mode may be changed at any time, even inside an Arabic environment or an Arabic quotation.

After selecting a language, the symbols < and > serve to bracket short insertions in the chosen language.
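
The interplay of language selection, Arabic quotations and Arabic Environments can be sketched roughly as follows. The sample words are illustrative codings chosen for this example, not taken from the manual, and the fragment assumes ArabTeX has already been loaded.

```latex
\setarab                       % Arabic conventions from here on
Roman text with a short Arabic quotation <kitAb> in the running line.

\begin{arabtext}
kitAb qalam.

kitAb qalam.
\end{arabtext}

\setfarsi                      % switch to the Persian conventions
Roman text with a short Persian quotation <kitAb>.
```

The blank line inside the environment separates two Arabic paragraphs, and each final period is split off and set as punctuation, as described in Section 2.1.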
Whereas this is usually convenient, observe that they can thus no longer be used for other purposes, except in mathematical mode where they retain their normal meaning as relational operators. To temporarily return them to their normal mode of operation, deselect the language by \setnone. Arabic insertions may also be started by \< [1].

For further details on supported languages, see Section 7.

[1] Note for advanced TeX users: All language selecting commands except \setnone set the character < active. If Arabic insertions are not needed, or are always started with \<, the user may reuse the command < for other purposes, or deactivate it by \catcode `\<=12 to return it to its normal meaning.

Chapter 4 Font selection

For space economy, only the Naskh font is available by default. With LaTeX, additional fonts can be loaded by the document style options "nashbf" (for bold-face) and/or "nastaliq" (when available). Users of Plain TeX are considered specialists and have to define and load suitable fonts at the required sizes themselves.

The following font selection commands are available:

- \setnash (default) selects the Naskh font.
- \setnashbf selects a bold-face version of Naskh.
- \setnastaliq selects the Nasta`liq font.

If a font is not available or has not been loaded, the corresponding command will select the default font. With LaTeX, the size changing commands will also operate on the additional fonts.

Chapter 5 Input coding conventions

The ASCII input notation for Arabic text has been modelled closely after the transliteration standards ISO/R 233 and DIN 31 635. As these standards do not guarantee unique re-transliteration and are also not 7-bit ASCII compatible, some modifications were necessary.
These follow the general rules:

- whenever the transliteration uses a single letter, code that letter;
- whenever the transliteration uses a letter with a diacritical mark, put the punctuation character most closely resembling the diacritical mark before the letter (and not behind it as in some other coding proposals, as otherwise the readability of the input would suffer);
- use capital letters for writing variants.

5.1 Standard Arabic and Persian characters

The standard codings for Arabic and Persian are given in Table 5.1 and Table 5.2.

Coding  Transliteration  Name
a       a                'alif
b       b                bā'
p       p                pā'
t       t                tā'
_t      ṯ                ṯā'
^g      ǧ                ǧīm
.h      ḥ                ḥā'
_h      ḫ                ḫā'
d       d                dāl
_d      ḏ                ḏāl
r       r                rā'
z       z                zāy
s       s                sīn
^s      š                šīn
.s      ṣ                ṣād
.d      ḍ                ḍād
.t      ṭ                ṭā'
.z      ẓ                ẓā'
`       ʿ                ʿayn
.g      ġ                ġayn
f       f                fā'
q       q                qāf
v       v                vā'
k       k                kāf
g       g                gāf
l       l                lām
m       m                mīm
n       n                nūn
h       h                hā'
w       w                wāw
y       y                yā'
_A      ā                'alif maqṣūra
T       t                tā' marbūṭa

Table 5.1: Standard codings for Arabic and Persian.

- For long vowels, use the capital letters <A>, <I>, <U> or , , .
- To get the defective writing of long vowels, use <_a>, <_i>, <_u>.
- 'Alif maqṣūra is <_A> or .
- The short vowels fatḥa, kasra, ḍamma are coded <a>, <i>, <u> and need not normally be written except in the following cases:
  - at the beginning of a word where they generate 'alif,
  - adjacent to hamza where they will influence its carrier,
  - when the transliteration is required,
  - in the \fullvocalize mode.
- Tanwīn is coded <aN>, <iN>, or <uN>. A silent 'alif, if required, is supplied automatically; it may also be explicitly written: . Likewise, a silent wāw may be written as in <`amruNU>.
- hamza is denoted by a single right quote <'>. After selecting a language by \setarab etc., the hamza carrier will be determined from the context according to the rules for writing Arabic words; if that is not wanted, "quote" the hamza (see Section 5.2 below).
  In the \setverb mode, the hamza carrier is determined by the following letter; see Section 5.5.
- madda on 'alif is generated by a right quote (hamza) before <A>: <'A>. It may also be written <~A>; likewise, <~I> and <~U> will produce madda on yā' and on wāw, as required in some older writing conventions.

Coding  Transliteration  Name
c       c                ḥā' with hamza
^c      č                ǧīm with three dots (below)
,c      ?c               ḫā' with three dots (above)
^z      ž                zāy with three dots (above)
~n      ñ                kāf with three dots (Ottoman)
~l      l̃                lām with a bow accent (Kurdish)
.r      ṙ                rā' with two bows (Kurdish)

Table 5.2: Additional codings generally available.

- The coding <`> for ʿayn is a single left quote; beware of confusing it with hamza!
- The "invisible consonant" <|> may be inserted in order to break unwanted ligatures and to influence the hamza writing. It will not show in the Arabic output or in the transliteration. At the beginning of a word it will suppress a following short vowel; otherwise it acts like a consonant.
- The sequence <||> will insert a small space, as does <"|> (see Section 5.2 below). The adjacent characters will not be connected.
- Šadda is indicated by doubling the appropriate letter coding.
- The definite article is separated from the following word by a hyphen. It may be written in the assimilated form (if it exists): , or always as ; in that case a subsequent "sun letter" must be doubled: , to receive a šadda, and to prevent a sukūn on the lām. The transliteration in both cases is identical.
- Hyphens <-> are used for tying words together, or for indicating a connecting vowel in Arabic, or an iẓāfet connection in Persian. They may be used freely, and generally do not change the writing, but will show up in the transliteration. Additionally, at the beginning and the end of an otherwise isolated word they enforce the use of the connecting form of the adjacent letter (if it exists), like e.g.
  in the date <1400 h->.
- A double hyphen <--> between two otherwise joining letters will break any ligature and will insert a horizontal stroke (tatwīl, kašīda) without appearing in the transliteration. It may be used repeatedly. See also Section 8: automatic stretching. For special applications, it can also be coded ; and <|B> will behave like an ordinary consonant and may carry vowel indicators, tanwīn, sukūn, and, in the combination <|BB>: šadda.

5.2 Quoting

In \novocalize mode (see Section 5.4), a double quote <"> will modify the meaning of the following character as follows:

- if a short vowel follows, the appropriate diacritical mark fatḥa, kasra, ḍamma will be put on the preceding character.
  - If <N> follows the short vowel, the appropriate form of tanwīn will be generated instead.
  - At the beginning of a word, 'alif is assumed as the first character.
- if the following character is a single right quote, a hamza mark will be put on the preceding character even if in conflict with the hamza rules. At the beginning of a word, an isolated hamza will be generated.
- if the following character is the "invisible consonant" <|>, the connection between the adjacent letters will be broken and a small space inserted. This can also be denoted <||> instead of <"|>. At the beginning of a word, 'alif with waṣla will be generated.
- otherwise: a sukūn will be put on the preceding character. The following character will be processed again.

The double quote will not show up in the transliteration.

In \vocalize mode (see Section 5.4), quoting will turn a short vowel off; likewise, in \fullvocalize mode, quoting will also turn a sukūn off. Put differently: quoting will toggle the generation of short vowel indicators and sukūn on and off.

5.3 Ligatures

There is no way to explicitly enforce ligatures as a large number of them are generated automatically.
The results will not always look satisfactory, so we recommend inspecting the output after the first run. Any unwanted ligature can be suppressed by interposing the invisible character <|> between the two letters otherwise combined into a ligature.

After \ligsfalse, fewer ligatures will be produced in the middle of a word; for some texts this looks better. You can return to the normal strategy by \ligstrue.

5.4 Vowelization

There are three modes of rendering short vowels:

- \fullvocalize:
  - Every short vowel written will generate the corresponding diacritical mark fatḥa, kasra, ḍamma, except if quoted.
  - If <N> follows a short vowel, the corresponding form of tanwīn is generated instead.
  - Defective writing: The coding <_a> will produce a Qur'an 'alif accent (also called dagger 'alif) instead of an explicit 'alif character which would be coded or . Likewise, <_i> will produce a small 'alif below the preceding consonant in place of (), and <_u> will produce an inverted ḍamma in place of ().
  - If a long vowel follows a consonant, the corresponding short vowel is implied. The long vowel itself carries no diacritical mark.
  - If no vowel is given after a consonant, sukūn will be generated except if a double quote precedes the next consonant. The lām of the definite article receives no sukūn if a double "sun letter" follows.
  - 'alif at the beginning of a word carries waṣla instead of the vowel indicator if the preceding word ended with a vowel.
- \vocalize: As above, but sukūn and waṣla will not be generated except if explicitly indicated by "quoting".
- \novocalize: No diacritics will be generated except if explicitly asked for by "quoting".

In all modes, a double consonant will generate šadda, and <'A> always generates madda on 'alif.

After <aN> the silent 'alif character is generated if necessary. The silent 'alif may also be explicitly indicated by , or coded literally as in \novocalize mode.
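
As a rough illustration of the three modes and of quoting, the same word can be set under each of them. The word <maktab> is an illustrative choice for this sketch, and the comments state what the rules above would be expected to produce rather than verified output.

```latex
\setarab
\novocalize   <maktab>    % no diacritics at all
\vocalize     <maktab>    % short vowel marks, but no sukun
\fullvocalize <maktab>    % short vowel marks, plus sukun where no vowel follows
\novocalize   <m"aktab>   % quoting: only the first short vowel is marked
```

Since the mode commands are option changing commands, they may be given outside an Arabic context and remain in force until changed again.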
If a silent 'alif maqṣūra is wanted instead, write , , <_A> or .

The tanwīn fatḥa is normally put on the last consonant of the word, even if a silent 'alif follows. If it is instead supposed to go onto the 'alif as in some modern Arabic conventions, or in Persian, this behaviour can be achieved by the option \newtanwin. The option \oldtanwin will restore the classical behaviour.

A silent 'alif after wāw is indicated by or (with a capital !).

5.5 Verbatim input

After disabling language specific processing by \setverb or \setnone, ArabTeX will not use any context information to determine the carrier of hamza. Instead the user has to supply this information by the next character typed after <'>. Generally this character will be used as the carrier; for examples and some exceptions see Table 5.3. A short vowel indicator may follow. To ease automatic conversion, an initial 'alif may also be coded .

Coding  Meaning
'a      hamza on 'alif
'i      hamza below 'alif
'w      hamza on wāw
'y      hamza on a tooth
'h      hamza on hā'
'B      hamza on the line
'|      isolated hamza
'A      madda on 'alif

Table 5.3: Verbatim codings for the carrier of hamza

5.6 Alternate input codings

The ArabTeX input notation has been very carefully designed for flexibility, readability, and ease of use for linguists confined to standard 7-bit ASCII equipment for processing and transmitting data. However, it does not make much sense to recode existing machine-readable text files coded according to other standards. Thus, some alternate reading modules have been written (as there are more than 10 different codings in current use, this is an open-ended activity), and a general code switching procedure has been provided.

An alternate reading module, e.g. asmo449.sty for the ASMO 449 code, is installed by adding its name (asmo449) as a LaTeX style option, or by \input asmo449.sty. Afterwards, a code name (in this case asmo449) is defined.
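
A small driver file of the kind this section has in mind might look like the sketch below. It uses the \setcode command described in this section; the file name oldtext is hypothetical, and the \documentstyle line assumes LaTeX 2.09 style options.

```latex
\documentstyle[arabtex,asmo449]{article}  % asmo449 installs the reading module
\begin{document}
\setarab
\setcode{asmo449}        % Arabic text is now read as ASMO 449
\begin{arabtext}
\input{oldtext}          % hypothetical file holding ASMO 449 coded text
\end{arabtext}
\setcode{arabtex}        % back to the standard ArabTeX notation
\end{document}
```

Keeping all options and any non-Arabic text in the ASCII driver leaves the existing text file untouched.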
Input coding is switched by the command \setcode{code name} that changes the coding for Arabic text globally, or by the environment \begin{setcode}{code name} ... \end{setcode} which follows the normal TeX grouping rules. Coding may be switched several times in the same document, provided the appropriate reading modules are installed; \setcode{arabtex} reverts to the standard ArabTeX notation.

Please observe that only Arabic text is affected by \setcode{code name}; text outside of Arabic contexts, and control sequence names, are still assumed to be in 7-bit ASCII. As existing text files presumably do not contain any control sequences or non-Arabic text anyway, we suggest using a small ASCII TeX/LaTeX driver file setting all relevant options and containing any non-Arabic text, and calling the Arabic text files by \input{file name} from within an Arabic environment.

For details on available additional reading modules, see Appendix H.

Chapter 6 Transliteration

6.1 ZDMG transliteration style

In addition to the Arabic writing, the standard scientific transliteration may also be obtained from a fully vowelized input text. This mode is activated by \transtrue and may be switched off again by \transfalse. If only the transliteration is wanted, you can deactivate the Arabic writing by \arabfalse; it can be reactivated by \arabtrue. If both modes are active their output will be interleaved line by line.

The transliteration mode assumes that the input text is in the Arabic or Persian language and has been coded according to the rules given above. For words from other languages the transliteration might be in error.

For Arabic text, the following special cases are handled:

- after the definite article, a double consonant will be assimilated;
- an initial vowel will be replaced by an apostrophe whenever the preceding word ended with a vowel (in this case a waṣla appears in the Arabic writing). If that is not wanted, start with hamza.
- a silent 'alif or 'alif maqṣūra after (tanwīn) and is omitted in the transliteration. The same happens after wāw if it is written as a capital .
- To correctly reproduce some historical writings, a silent long vowel after <_a> is omitted in the transliteration.

For examples, see the Appendix.

For economy of space, the transliteration module is not loaded by default. If you want to use it, add the style option "atrans" with LaTeX; with Plain TeX, say \input atrans.sty after loading ArabTeX.

6.2 Encyclopedia of Islam style

For special purposes, the standard transliteration output may be modified by including the LaTeX option "etrans", or by loading the file "etrans.sty" when working with Plain TeX. After this modification, the transliteration will follow the style of the Encyclopedia of Islam.

Chapter 7 Support for other languages besides Arabic

ArabTeX is primarily intended for typesetting texts in classical and modern Arabic, but it also provides some support for several other languages that are customarily written in the Arabic alphabet.

In order to switch to the conventions for one of these languages, say \setfarsi, \seturdu, \setpashto, \setmaghribi; \setverb will switch off any language specific processing. \setarab can be used to switch back to the Arabic conventions. After selecting the language, < and > serve as delimiters for quotations; \setnone will, like \setverb, deselect any language, and will also return < and > to their normal TeX meaning.

This part of ArabTeX relies heavily on contributions from the user community; we want to especially mention Ivan Derzhanski who completely reimplemented the routines for processing Persian. As we extensively modified these contributions while integrating the system, we are solely responsible for any remaining, or newly introduced, errors.

7.1 Persian (Farsi, Dari), also Ottoman, Kurdish

- All characters needed for writing Farsi are available by default.
  The short vowels and are mapped to and , the long vowels and to and without a vowel indicator. denotes final silent hā'. This hā' receives no sukūn even in fully vowelized mode.
- For fatḥa or kasra followed by a final silent hā' you can also write <,a> or <,e> in place of and .
- The iẓāfet connection may always be written <-i> or <-e> (with hyphen); then the correct spelling will be determined from the context. Likewise the yā'-i-waḥdat can always be written <-I> or <-E>.
- The present tense forms of the copula are coded <-am>, <-I>, <-ast>, <-Im>, <-Id>, <-and>. In the output they are written as separate words after a little space.
- The final yā' carries no dots.

Farsi uses the Nasta`liq font if available, otherwise Naskh. For further details see Appendix G.

7.2 Urdu

- For Urdu, additional codings are available, see Table 7.1. Some of the given codings also occur in Pashto but with a different meaning, see Section 7.3.
- The short vowels and are mapped to and . , <,a> and <,e> are used as in Persian.
- Even in fully vowelized mode, an aspirated consonant before receives no sukūn since the two are technically a single letter.
- Urdu uses the Nasta`liq font if available, otherwise Naskh.

Coding  Transliteration  Name
h       h                always denotes the "two-eyed" hā'
,h      h                the "wavy" hā' letter
,t      ?t               tā' with a small ṭā' accent
,d      ?d               dāl with a small ṭā' accent
,r      ?r               rā' with a small ṭā' accent
.n      ṇ                nūn without a dot
E       ē                ē, yā' barī' in the final position
ae      ae               the diphthong ae
ao      ao               the diphthong ao
O       ō                the long vowel ō
U       ū                the long vowel ū

Table 7.1: Additional codings for Urdu.

7.3 Pashto (Afghanic)

- For Pashto, additional codings are available, see Table 7.2. Some of the given codings also occur in Urdu but with a different meaning, see Section 7.2.
- The short vowel is indicated by a zwarakay , by an inverted ḍamma. Observe also the following codings: hamza on wāw hamza on hā', if not generated by iẓāfet
- The codings , <,a> and <,e> are used as in Persian. The rules for iẓāfet and yā'-i-waḥdat apply.
- For writing some Pashto words in the Urdu style, write the command \seturdu and afterwards switch back to the Pashto conventions by \setpashto.

Coding  Transliteration  Name
,t      ?t               tā' with a small loop
,d      ?d               dāl with a small loop
,r      ?r               rā' with a small loop
.n      ṇ                nūn with a small loop
g       g                gāf with a small loop instead of a bar
,z      ?z               rā' with one dot above and one below
,s      ?s               sīn with one dot above and one below
ae      ae               the diphthong ae
Ee      ey               the diphthong ey
ee      ey               the diphthong ey
E       ē                the long vowel ē
O       ō                the long vowel ō
U       ū                the long vowel ū

Table 7.2: Additional codings for Pashto.

7.4 Maghribi

Nearly like Arabic but using a different writing convention. fā' is written with one dot below the letter, qāf with one dot above the normal letter form of fā'. The three dots of vā' are put below the letter.

7.5 Other languages

This is up to experimentation by the user. If \setarab or \setfarsi will not produce the desired result, try \setverb for verbatim mode. The vowelization and the transliteration cannot generally be expected to be correct, but might work by accident.

In case some character variants not yet provided are needed, feel free to ask the author for help. There is no simple way for the user to modify the script.

Chapter 8 Miscellaneous features

8.1 Automatic stretching

For special purposes, e.g. for headlines and for Arabic paragraphs containing long mathematical or non-Arabic insertions, the connection between adjacent Arabic letters may be made "elastic", if they form no ligature. Thus a kašīda is inserted whose length will be adjusted automatically to uniformly fill the output line.
This feature very easily leads to storage overflow during processing and should be used only when necessary. It is switched on with \spreadtrue and switched off again with \spreadfalse. Inside an Arabic environment, it is also switched off automatically at the end of every paragraph.

8.2 Dots on yā'

Whether yā' in the final position carries dots or not is controlled by the chosen language convention. You can override this, after selecting the language, by \yahdots and \yahnodots.

8.3 Additional codings

To reproduce exotic, erroneous or archaic texts exactly as they are written, some additional codings are available, see Table 8.1.

Coding   Description
.k       kāf in the final position without a mark
^d       dāl with a dot below
.f       fā' without a dot
.b       bā' without a dot
.n       nūn without a dot (not available in Pashto mode)
Y        'alif maqṣūra; yā' without dots in all positions

Table 8.1: Additional codings for special purposes.

If further variants are needed, write to the author and indicate:
- the required shape,
- the assumed transliteration,
- a suggestion for the input coding,
- some information on the intended use.

We are willing to consider any suggestion. Adding a new character might be easy, or it might be impossible. ArabTEX is flexible, but there are some technical limitations.

8.4 Progress report

As ArabTEX is slow, it produces some terminal output while running to indicate that it is still alive. If that is not wanted, e.g. on a very fast system or while running a batch job, say \quiet or \tracingarab = 0 (outside an Arabic environment; otherwise say \doassign {\tracingarab }{0}). The other values of \tracingarab select the level of detail:
- 1 reports Arabic paragraphs only,
- 2 also reports Arabic lines and insertions,
- 3 or more also reports individual Arabic items.
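The switches from Sections 8.1, 8.2 and 8.4 can be sketched together in one small LaTEX file. This is only an illustration under assumptions: that the switches may be combined in this order, and that \yahnodots is the wanted override. The Arabic text is borrowed from the sample input in Appendix D.

```latex
\documentstyle[12pt,arabtex]{article}
\begin{document}
\setarab                 % choose the language conventions first
\yahnodots               % then override the dots on final ya' (Section 8.2)
\tracingarab = 1         % report Arabic paragraphs only (Section 8.4)

\spreadtrue              % stretch the headline (Section 8.1)
\centerline {<^gu.hA wa-.himAruhu>}
\spreadfalse             % outside an Arabic environment: switch off explicitly

\begin{arabtext}
\doassign {\tracingarab }{0}   % inside an Arabic environment use \doassign
'at_A .sadIquN 'il_A ^gu.hA ya.tlubu minhu .himArahu.
\end{arabtext}
\end{document}
```

Stretching is kept to the single headline here because, as noted in Section 8.1, the feature easily leads to storage overflow.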
8.5 Verbatim copy of the input

For test purposes, the Arabic input can be reproduced verbatim, in addition to the normal output, after \showtrue; \showfalse switches this feature off again. Commands are usually not shown. The output will generally not look pleasant; this feature is provided only for tracking down errors, or for demonstrating the operation of ArabTEX as in the appendix.

8.6 Using ArabTEX with EDMAC

ArabTEX will cooperate with EDMAC, a Plain TEX macro package for critical editions, written by John Lavagnino and Dominik Wujastyk. If EDMAC is already present when ArabTEX is loaded, the EDMAC commands will, after suitable modifications, be available inside an Arabic environment. Their arguments are considered Roman text but may contain Arabic quotations. For further details, see the EDMAC documentation.

Chapter 9 Acknowledgments

The development of ArabTEX would not have been possible without the assistance of many people, and it is impossible to acknowledge every individual contribution. Besides our local team, i.e. Udo Merkel and Heribert Schlebbe, helpful advice came, among others, from Chahriar Assad, Benno van Dalen, Ivan Derzhanski, Wolfdietrich Fischer, Ahmed El-Hadi, Yannis Haralambous, Abdelsalam Heddaya, Nicholas Heer, Iqbal Khan, Tom Koornwinder, Eberhard Krüger, Asif Lakehsar, Jan Lodder, Richard Lorch, Pierre MacKay, Eberhard Mattes, Fathy Neamat-Allah, Bernd Raichle, Ulrich Rebstock, Mohamed Saba, Waheed Samy, Annemarie Schimmel, Nariman Shehab, Dominik Wujastyk, and Michio Yano. We also have to thank all users who sent error reports, comments, and suggestions.

Chapter 10 References

B. Alavi, M. Lorenz: Lehrbuch der persischen Sprache. 5. Auflage 1988. VEB Verlag Enzyklopädie, Leipzig.
A. A. Ambros: Einführung in die moderne arabische Schriftsprache. 1. Auflage 1969. Max Hueber Verlag, München.
ASMO 449: 7-bit coded Arabic character set for information interchange.
Arabic Standards and Measurements Organization, 1982.
J. D. Becker: Arabic Word Processing. Comm. ACM 30/7, 600-610 (1987).
T. Borg: Arabisch für Ausländer. Ein Lehrbuch für modernes Hocharabisch. 2. Auflage 1979. Verlag Borg GmbH, Hamburg.
J. A. Boyle: Grammar of Modern Persian. Wiesbaden: Otto Harrassowitz, 1966.
B. Comrie (ed.): The World's Major Languages. Croom Helm, London 1987.
DIN 31 635: Umschrift des Arabischen Alphabets. Deutsches Institut für Normung e.V., 1982.
J. Lavagnino and D. Wujastyk: An Overview of EDMAC: A plain TEX format for critical editions. TUGboat 11/4, 623-643 (1990).
L. P. Elwell-Sutton: Elementary Persian Grammar. Cambridge University Press, 1963.
C. Faulmann: Das Buch der Schrift, enthaltend die Schriften und Alphabete aller Zeiten und aller Völker des gesammten (sic!) Erdkreises. K. K. Hof- und Staatsdruckerei, Wien 1878.
W.D. Fischer: Grammatik des Klassischen Arabisch. 2. Auflage 1987. Verlag Otto Harrassowitz, Wiesbaden.
A. Grohmann: Arabische Paläographie (Teil I und II). Österreichische Akademie der Wissenschaften, Philosophisch-historische Klasse, Denkschriften 94, 1. Wien 1967.
E. Harder, A. Schimmel: Arabische Sprachlehre. 15. Auflage 1983. Julius Groos Verlag, Heidelberg.
Hāšim Muḥammad al-Ḫaṭṭāṭ: Qawā`id al-Ḫaṭṭi al-`Arabī. Maktaba an-Nahḍa, Baghdad; Dār al-Qalam, Beirut, 1400/1980.
ISO/R 233 - 1961: International System for the Transliteration of Arabic Characters. International Standards Institution, 1961.
ISO 8859-6: Information processing, 8-bit single-byte coded graphic character sets, Part 6: Latin/Arabic alphabet. International Organization for Standardization, 1987.
ISO 9036: Information processing, Arabic 7-bit coded character set for information interchange. International Organization for Standardization, 1987.
D. E. Knuth: The METAFONTbook. Addison Wesley Publishing Comp., Reading, Mass., 1986.
D. E. Knuth: The TEXbook. Sixth printing. Addison Wesley Publishing Comp., Reading, Mass., 1986.
D. E. Knuth and P. MacKay: Mixing right-to-left texts with left-to-right texts. TUGboat 8/1, 14-25 (1987).
Ann K. S. Lambton: Persian Grammar. Cambridge University Press, 1953.
L. Lamport: LaTEX, A Document Preparation System. Addison Wesley Publishing Comp., Reading, Mass., 1986.
M. Lorenz: Lehrbuch des Pashto (Afghanisch). 2. Auflage 1982. VEB Verlag Enzyklopädie, Leipzig.
P. A. MacKay: Typesetting Problem Scripts. BYTE 11/2, 201-216 (1986).
H. Ritter: Über einige Regeln, die beim Drucken mit arabischen Typen zu beachten sind. ZDMG 100/2, 577-580 (1951).
Friedrich Rückert: Grammatik, Poetik und Rhetorik der Perser. Wiesbaden: Otto Harrassowitz, 1966.
C. Salemann, V. Shukovski: Persische Grammatik. 4. Auflage 1947. Verlag Otto Harrassowitz, Leipzig.
A. Schimmel: Islamic Calligraphy. E.J.Brill, Leiden, Netherlands 1970.
H.J. Vermeer, W. Akhtar, A. Akhtar: Urdu-Lautlehre und Urdu-Schrift. 3. Auflage 1985. Julius Groos Verlag, Heidelberg.

Appendix A Obtaining ArabTEX

The ArabTEX system is available from the author's institution (by anonymous FTP from ftp.informatik.uni-stuttgart.de (129.69.211.2), in the directory pub/arabtex) and from many other common servers, e.g. the CTAN network (Aston, Niord, Stuttgart). The files may be transferred individually or as a package: arabtex.zip for PC systems, arabtex.tar.Z for Unix systems; we recommend getting and inspecting the README file first. Successful operation on the Apple Macintosh in conjunction with OzTEX has also been reported.

At the time of this writing, version 3.00 is current. The Nasta`liq font is still under development; Naskh will be substituted automatically. Version 2 is downward compatible; the old version 1 is obsolete and should no longer be used.

ArabTEX is copyrighted, but free use for scientific, experimental and other strictly private, noncommercial purposes is granted.
Offprints of any publications using ArabTEX are welcome. Using ArabTEX otherwise requires a license agreement.

Appendix B Installing ArabTEX

The installation procedure is strongly system dependent, and we recommend securing the assistance of a local TEXpert. You have to install the "nash14" font with its "*.pk" and "*.tfm" files on the font search path of your TEX system, and the "*.sty" files and "arabtex.tex" on the source search path (usually TEXINPUT) of your system. You may also have to rename the "*.pk" files according to local conventions, and as a last resort you can try to recreate the fonts from the "*.mf" METAFONT sources. Additional fonts, whenever available, are installed analogously.

ArabTEX has been found to cooperate well with TEX versions 3.xxx, LaTEX versions 2.09 of 1991 or later, NFSS and NFSS2 (not required), and previewers that can handle fonts of more than 128 characters. TEX-XET or TEX--XET are not required, and their additional features are presently not exploited.

The TEX "hash size" should be at least 3000 to 3500, especially when ArabTEX is used in conjunction with LaTEX and when the transliteration module is used. A BIG TEX may be necessary with NFSS2 because of the latter's high demand on string storage. Space and time requirements are not negligible and have increased during development; however, ArabTEX currently still runs, albeit slowly, even on a standard PC XT configuration.

Appendix C Release history

There was a Version 1, which is no longer supported. Version 2 was not fully compatible with Version 1; however, moving to the new version usually caused few problems. Apart from some extensions, most changes were introduced in order to conform better to the transliteration standards and to have fewer compatibility problems with TEX and LaTEX. Further versions are expected to be upward compatible unless serious problems turn up.
The main differences between versions 1 and 2 are:
- The font size has increased, so the document layout may change. The old font "nash10" can no longer be used as the character locations have been assigned differently.
- Some Arabic characters are now coded differently: `ayn is denoted by a left quote, and several codings, among them <^z>, <^t>, and <.n>, have been assigned new meanings in order to conform better to the standard transliteration.
- There are many more ligatures than before. This normally need not concern the user.
- \vocalize will no longer generate sukūn and waṣla unless explicitly indicated by quoting. See \fullvocalize.
- Arabic environments are now always bracketed by the new control sequences \begin{arabtext} and \end{arabtext}, even if only the transliteration is wanted.

We strongly recommend converting any remaining version 1 input files to the new notation. To assist in this migration, the LaTEX option "oldarabtex" and/or the command \oldarabtex will switch to a mode in which virtually all places where the old conventions are used will either produce a TEX error message or be flagged in the output.

The changes introduced since the release of Version 2.00 up to now (Version 3.00) fall into one of two categories: error corrections and upward compatible extensions. Details are not given here, but are documented in the text file CHANGES that is part of the distribution package of ArabTEX. Version 3 is upward compatible with version 2. All supported features are documented in this manual.

Appendix D Sample ArabTEX input

    \documentstyle[12pt,arabtex]{article}
    \begin{document}
    \setarab     % choose the language conventions
    \vocalize    % diacritics for short vowels on
    \transtrue   % additionally switch on the transliteration
    \arabtrue    % print arabic text ... is on anyway
    \spreadtrue  % spread out caption
    \centerline {<^gu.hA wa-.himAruhu>}
    \begin{arabtext}
    'at_A .sadIquN 'il_A ^gu.hA ya.tlubu minhu .himArahu li-yarkabahu
    fI safraTiN qa.sIraTiN wa-qAla lahu:
    sawfa 'u`Iduhu 'ilayka fI al-masA'i, wa-'adfa`u laka 'u^graTaN. \\
    fa-qAla ^gu.hA: 'anA 'AsifuN ^giddaN 'annI lA 'asta.tI`u 'an
    'u.haqqiqa laka ra.gbataka, fa-al.himAru laysa hunA al-yawma. \\
    wa-qabla 'an yutimmu ^gu.hA kalAmahu bada'a al-.himAru yanhaqu
    fI i.s.tablihi. \\
    fa-qAla lahu .sadIquhu: 'innI 'asma`u .himAraka yA ^gu.hA yanhaqu. \\
    fa-qAla lahu ^gu.hA: .garIbuN 'amruka yA .sadIqI! 'a-tu.saddiqu
    al-.himAra wa-tuka_d_dibunI?
    \end{arabtext}
    \end{document}

Appendix E Sample ArabTEX output

[Each line of Arabic script in the output is followed by its transliteration; only the transliteration is reproduced here.]

ǧuḥā wa-ḥimāruhu

ʾatā ṣadīqun ʾilā ǧuḥā yaṭlubu minhu ḥimārahu li-yarkabahu fī safratin qaṣīratin wa-qāla lahu: sawfa ʾuʿīduhu ʾilayka fī 'l-masāʾi, wa-ʾadfaʿu laka ʾuǧratan.

fa-qāla ǧuḥā: ʾanā ʾāsifun ǧiddan ʾannī lā ʾastaṭīʿu ʾan ʾuḥaqqiqa laka raġbataka, fa-'lḥimāru laysa hunā 'l-yawma.

wa-qabla ʾan yutimmu ǧuḥā kalāmahu badaʾa 'l-ḥimāru yanhaqu fī 'ṣṭablihi.

fa-qāla lahu ṣadīquhu: ʾinnī ʾasmaʿu ḥimāraka yā ǧuḥā yanhaqu.
fa-qāla lahu ǧuḥā: ġarībun ʾamruka yā ṣadīqī! ʾa-tuṣaddiqu 'l-ḥimāra wa-tukaḏḏibunī?

Appendix F Coding examples for Arabic [1]

[1] Most of the examples are taken from: Wolfdietrich Fischer, Grammatik des Klassischen Arabisch, 2. Auflage, Verlag Otto Harrassowitz, Wiesbaden 1987.

Each example gives the input coding followed by the resulting transliteration; the Arabic script output is not reproduced here.

The short vowels fatḥa, kasra, ḍamma are denoted, as in the transliteration, by the small letters a, i, u:
mana`a → manaʿa, _dahaba → ḏahaba, ^sariba → šariba, qabila → qabila, `a.zuma → ʿaẓuma, `alu → ʿalu, bal → bal, ni`ma → niʿma, yaktub → yaktub.

The long vowels ā, ī, ū are denoted by capitals A, I, U or by aa, iy, uw:
qAtala → qātala, nUzi`a → nūziʿa, lUmI → lūmī, sIrI → sīrī; lawmI → lawmī, sayrI → sayrī.

'Alif maqṣūra is coded as _A or Y:
ramY → ramā, _dikrY → ḏikrā, `al_A → ʿalā, bal_A → balā.

Silent 'alif: The plural suffixes -ū, -aw of the verb are denoted UA, aW or aWA:
katabUA → katabū, yaktubUA → yaktubū, ramaWA → ramaw, yalqaW → yalqaw.

The defective notation of ā, ī, ū can be indicated by _a, _i, _u and leads to the appropriate spelling:
dAru-h_u → dāru-hū, ri^gli-h_i → riǧli-hī, however: ramA-hu → ramā-hu, yarmI-hi → yarmī-hi; _dih_i → ḏihī, h_a_dih_i → hāḏihī, tih_i → tihī, hAtih_i → hātihī, rabb_i → rabbī, .sAl_i → ṣālī; hum_u → humū; qiy_amaTuN → qiyāmatun, 'il_ahuN → ʾilāhun, sam_awATuN → samāwātun, _tal_a_tuN → ṯalāṯun, l_akin → lākin, h_a_dA → hāḏā, 'al-ll_ahu [...] ʾar-raḥmānu, _d_alika → ḏālika.

To reproduce the historical writing correctly, a silent long vowel or 'alif maqṣūra after _a receives no sukūn and is ignored in the transliteration:
.sal_aUTuN → ṣalātun, .hay_aUTuN → ḥayātun, zak_aUTuN → zakātun, mi^sk_aUTuN → miškātun, ar-rib_aU → ar-ribā, tawr_aITuN → tawrātun, ram_aYhu → ramāhu, sIm_aYhum → sīmāhum.

The short vowel u can be written as a long vowel by _U:
'_UlY → ʾulā, '_UlA'i → ʾulāʾi, '_UlU → ʾulū, '_UlAka → ʾulāka, '_UlA'ika → ʾulāʾika.

Tanwīn: The endings -un, -in, -an are written -uN, -iN, -aN or aNA. Silent 'alif in -an may be indicated by A or omitted; if necessary it is supplied from the context.
ra^guluN → raǧulun, ra^guliN → raǧulin, ra^gulaN → raǧulan, madInaTaN → madīnatan, ^gamIlaTaN → ǧamīlatan, 'i_daN → ʾiḏan, samA'aN → samāʾan.
There is a special case: ribaNU → riban; `amruNU → ʿamrun, `amriNU → ʿamrin, however: `amraN → ʿamran.

Tanwīn fatḥa is traditionally put on the last consonant even if a silent 'alif follows. Some modern conventions, and also Persian practice, require putting it on the 'alif in this case. This behaviour may be switched on by \newtanwin, and off by \oldtanwin. \newtanwin mode is the default for Persian.
ra^gulaN → raǧulan, 'i_daN → ʾiḏan.

A silent 'alif maqṣūra after tanwīn is written aNY or aN_A:
hudaNY → hudan, fataN_A → fatan; compare: al-hudY → al-hudā, 'al-fat_A → ʾal-fatā.

Tā' marbūṭa is denoted by T:
kalimaTuN → kalimatun, kalimaTiN → kalimatin, kalimaTaN → kalimatan; fatATuN → fatātun, fatATiN → fatātin, fatATaN → fatātan.

Hamza is indicated by '; the appropriate carrier is determined by the context:
'amruN → ʾamrun, 'ibiluN → ʾibilun, 'u_htuN → ʾuḫtun; ra'suN → raʾsun, 'ar'asu → ʾarʾasu, sa'ala → saʾala, qara'a → qaraʾa; bu'suN → buʾsun, 'ab'usuN → ʾabʾusun, ra'ufa → raʾufa, ru'asA'u → ruʾasāʾu; bi'ruN → biʾrun, 'as'ilaTuN → ʾasʾilatun, ka'iba → kaʾiba, qA'imuN → qāʾimun, ri'AsaTuN → riʾāsatun, su'ila → suʾila; samA'uN → samāʾun, barI'uN → barīʾun, sU'uN → sūʾun, bad'uN → badʾun, ^say'uN → šayʾun, ^say'iN → šayʾin, ^say'aN → šayʾan; sA'ala → sāʾala, mas'alaTuN → masʾalatun, saw'aTuN → sawʾatun, _ha.tI'aTuN → ḫaṭīʾatun.

Old Hamza convention: In an older writing style that is used, e.g., in some Qur'an editions, the hamza is sometimes put below its carrier or on the connecting line. This style may be switched on by \oldhamza (and off again by \newhamza):
'as'ilaTuN → ʾasʾilatun, ka'iba → kaʾiba, qA'imuN → qāʾimun, su'ila → suʾila, ^say'aN → šayʾan, _ha.tI'aTuN → ḫaṭīʾatun.

Madda in the context 'ā is generated automatically:
'AkiluN → ʾākilun, qur'AnuN → qurʾānun, ra'Ahu → raʾāhu.
To reproduce the historic writing correctly, it can also be explicitly written in other contexts:
'a.sdiq~A'uh_u → ʾaṣdiqāʾuhū; ya^g~I'u → yaǧīʾu, s~U'ila → sūʾila.

Šadda: A double consonant must be written twice, even if it is coded by more than one character:
nazzala → nazzala, ba^s^sAruN → baššārun, nawwara → nawwara, sayyiduN → sayyidun, sa''AluN → saʾʾālun, .sabiyyuN → ṣabiyyun, `aduwwuN → ʿaduwwun.
Instead of iyy, uww one can also write Iy, Uw:
.sabIyuN → ṣabīyun, `adUwuN → ʿadūwun.

Assimilation: the definite article may always be written al-; a following "sun letter" must be written twice as in the Arabic spelling. The transliteration and the use of sukūn are adjusted accordingly:
'al-ddAru → ʾad-dāru, 'al-rra^gulu → ʾar-raǧulu, 'al-ssanaTu → ʾas-sanatu, 'al-nnAru → ʾan-nāru; 'al-^gAru → ʾal-ǧāru, 'al-bAbu → ʾal-bābu; 'al-llaylaTu → ʾal-laylatu, 'al-llisAnu → ʾal-lisānu, 'al-ll_ahu [...] wa-'smuhu, fa-in.sarafa → fa-'nṣarafa. [2]

[2] In vowelized writing, it may sometimes be advisable to introduce a kašīda to prevent the vowel marks from bumping into each other.

This also works across word boundaries:
yA ibnI → yā 'bnī, h_a_dA ibnuh_u → hāḏā 'bnuhū, qAla u_hru^g → qāla 'ḫruǧ.
An auxiliary vowel at the end of the preceding word may be separated by a hyphen:
qad-i in.sarafa → qad-i 'nṣarafa, ra'aW-u al-bAba → raʾaw-u 'l-bāba, min-i ibnih_i → min-i 'bnihī.
This also works for the article preceding 'alif al-waṣl:
'al-i-ismu → ʾal-i-'smu, 'al-i-i^stirA'u → ʾal-i-'štirāʾu,
and even if the auxiliary vowel is omitted in the spelling:
ra^guluN-i ibnatuh_u ^gamIlaTuN → raǧulun-i 'bnatuhū ǧamīlatun, mu.hammaduN-i al-qura^sIyu → muḥammadun-i 'l-qurašīyu.

The particles li- and la- must be combined with the article except before lām:
lil-rra^guli → lir-raǧuli, lal-ma|^gdu → lal-maǧdu; [3] however: li-llaylaTi → li-llaylati, li-ll_ahi [...] sa-yaʾtī, li-yafra.ha → li-yafraḥa, wa-iswadda → wa-'swadda, ba`da-mA → baʿda-mā, .tAla-mA → ṭāla-mā, fI-ma → fī-ma, `alA-ma → ʿalā-ma.

[3] The ligature otherwise produced automatically looks ugly and has been broken by |.

A single hyphen at the beginning or end of a word will enforce the use of the joining form of the first or the last character, respectively, if that form exists (for special uses only):
s, -s, -s-, s-; h, -h, -h-, h-; d, -d; lA → lā, -lA → -lā; 1400 h-.

Digit sequences are written in the natural order: 1234567890.

Ligatures are generated automatically; they can be suppressed by |:
'al-'islAmu → ʾal-ʾislāmu; 'al-^gAru → ʾal-ǧāru, 'al|^gAru → ʾalǧāru; _tumma → ṯumma, _tu|mma → ṯumma; mu.hammaduN → muḥammadun, mu|.ha|mmaduN → muḥammadun.

Abbreviations and emphasis are indicated by \emphasize:
\emphasize .sl`m → ṣlʿm, \emphasize ab||^g →
abǧ.
If necessary, use grouping by curly braces:
\emphasize {`alayhi as-salAmu} → ʿalayhi 's-salāmu.

Appendix G Coding examples for Persian [1]

[1] We gratefully acknowledge the voluntary help by Ivan Derzhanski who wrote this chapter, and implemented the language-specific processing. As we extensively modified his routines during system integration, all responsibility for any remaining, or new, errors rests with us.

Each example gives the input coding followed by the resulting transliteration; the Arabic script output is not reproduced here.

The short vowels ä (ă), e (ĭ), o (ŭ) are denoted by the lowercase letters a, e or i, o or u:
bar → bar, beh → beh, bon → bon.

The long vowels a (ā), i (ī, ē), u (ū, ō) are denoted by the capital letters A, I or E, U or O. Älef mädde is automatically generated for word-initial a:
Ab → āb, bAd → bād, bId → bīd, bUd → būd.

Note that I yields a ya-ye mäʿruf (with zir), whilst E yields a ya-ye mäjhul (without zir). Similarly, U yields a waw-e mäʿruf (with piš), whilst O yields a waw-e mäjhul (without piš):
tIr → tīr, tE.g → tēġ; dUr → dūr, zOr → zōr.

The two diphthongs are written ay and aw:
pay → pay, naw → naw.

Intervocalic ḥämze is written ':
pA'Iz → pāʾīz; miyA'I → miyāʾī, mIgU'I → mīgūʾī; tawAnA'I → tawānāʾī, zanA^sU'I → zanāšūʾī.

Silent word-final waw is generated by _U or O:
t_U → tu, d_U → du; tO → tō, dO → dō.

Waw-e mäʿdul is written w; it is omitted in the transliteration and the preceding xe receives no jäzm:
_hwAb → ḫāb, _hwI^s → ḫīš, _hwod → ḫod.

Ha-ye häwwäz-e mäxfi is generated by H, or optionally by ,e, ,a or ,A. It does not receive a jäzm even in fully vocalised mode and is not joined to a following letter:
_hAneH → ḫāneh, ^c,e → čeh, naH → nah, yal_aH → yalāh, yal,A → yalāh, _hAneHhA → ḫānehhā, _hAneH-hA → ḫāneh-hā.

Short eḍafe is written -e or -i:
ketAb-e U → ketāb-e ū, rAh-e t_U → rāh-e tu, nAmeH-i man → nāmeh-i man, bInI-e An mard → bīnī-e ān mard, pA-i In zan → pā-' īn zan, bAzU-i In zan → bāzū-' īn zan.

Long eḍafe is written -_i:
dAr-_i man → dār-ī man, _hU-_i t_U → ḫū-ī tu.

Ḥämze as ya-ye wäḥdät/nesbät/xeṭab is likewise written -_i:
nAmeH-_i → nāmeh-ī, sormeH-_i → sormeh-ī, gofteH-_i → gofteh-ī.

Ye-ye wäḥdät is written -I or -E:
ketAb-I → ketāb-ī, rAh-I → rāh-ī, nAmeH-I → nāmeh-ī; dAnA-I → dānā-ī, pArU-I → pārū-ī; dAnA-I-keH → dānā-ī-keh, pArU-I-keH → pārū-ī-keh.

The present tense forms of the verb budän and the pronominal clitics are written as they are spoken:
rafteH-am → rafteh-am, rafteH-Im → rafteh-īm, rafteH-I → rafteh-ī, rafteH-Id → rafteh-īd, rafteH-ast → rafteh-ast, rafteH-and → rafteh-and; mard-Id → mard-īd, asb-etAn → asb-etān; An^gA-st → ānǧā-st, U-st → ū-st, t_U-st → tu-st; ketAb-I-st → ketāb-ī-st, nAmeH-I-st → nāmeh-ī-st.

The preposition be- can be written with or without a hyphen:
be-man → be-man, be-t_U → be-tu; be-An → be-ān, be-In → be-īn, beU → beū.

The components of compounds can be separated by || or "|:
.sA.heb||_hAneH → ṣāḥebḫāneh, ta_ht-e-"|_hwAb → taḫt-e-ḫāb; pas||andAz → pasandāz, naw||AmUz → nawāmūz, bI||_hwod → bīḫod.

Appendix H Alternate input encodings

H.1 ASMO 449 = ISO 9036

The file asmo449.sty contains a reading module for the ASMO 449 code (identical to ISO 9036). It is installed by the LaTEX option asmo449 or by \input asmo449.sty. The module is activated by \setcode {asmo449} or \setcode {iso9036}; all following Arabic text is then taken to be coded according to the ASMO 449 standard. The ArabTEX notation may be reactivated by \setcode {arabtex}.

ASMO 449 (see Table H.1) is a 7-bit code, differing from ASCII (ISO 646) mainly by replacing the letters by the Arabic letter characters and diacritical marks; the Arabic digits share their positions with the ASCII digits. The positions of special and control characters in both codes are identical.

A minimal driver file for processing, e.g., a file asmotext.dat could be structured as follows:

    \documentstyle [arabtex,asmo449]{article}
    \begin {document}
    \setcode {asmo449}
    \begin {arabtext}
    \input asmotext.dat

    % the preceding blank line is required if "asmotext.dat" did not
    % end with a blank line itself; this is strange and embarrassing
    \end {arabtext}
    \end {document}

      0    1    2    3    4                     5      6               7
00    NUL  DLE  SP   0    @                     ḏāl    kašīda          kasra
01    SOH  DC1  !    1    hamza                 rā'    fā'             šadda
02    STX  DC2  "    2    'alif + madda         zāy    qāf             sukūn
03    ETX  DC3  #    3    'alif + hamza above   sīn    kāf
04    EOT  DC4  $    4    wāw + hamza           šīn    lām
05    ENQ  NAK  %    5    'alif + hamza below   ṣād    mīm
06    ACK  SYN  &    6    yā' + hamza           ḍād    nūn
07    BEL  ETB  '    7    'alif                 ṭā'    hā'
08    BS   CAN  )    8    bā'                   ẓā'    wāw
09    HT   EM   (    9    tā' marbūṭa           `ayn   'alif maqṣūra
10    LF   SUB  *    :    tā'                   ġayn   yā'
11    VT   ESC  +    ;    ṯā'                   ]      tanwīn fatḥa    }
12    FF   IS4  ,    >    ǧīm                   \      tanwīn ḍamma    |
13    CR   IS3  -    =    ḥā'                   [      tanwīn kasra    {
14    SO   IS2  .    <    ḫā'                   ^      fatḥa           ~
15    SI   IS1  /    ?    dāl                   _      ḍamma           DEL

Table H.1: ASMO 449 code table (Arabic letters and marks are given by name; paired punctuation appears mirrored, as in the printed table).
As texts coded in ASMO 449 are always rendered verbatim, the commands \novocalize, \vocalize, \fullvocalize and the language selection commands \setarab etc. make no sense and are temporarily disabled.

Texts in ASMO 449 are usually not fully vowelized. Thus the transliteration cannot be expected to be correct. This is especially true for Egyptian texts, which commonly do not differentiate between yā' and 'alif maqṣūra.

H.2 ASMO 449E = ISO 8859-6

The file iso88596.sty contains a reading module for the ISO 8859-6 code (extended ASMO 449 = ASMO 449E). It is installed by the LaTEX option iso88596 or by \input iso88596.sty. The module is activated by \setcode {iso8859-6}; all following Arabic text is then taken to be coded according to the ISO 8859-6 standard. The ArabTEX notation may be reactivated by \setcode {arabtex}.

ISO 8859-6 (see Table H.2) is an 8-bit code closely related both to 7-bit ASCII and to ASMO 449: the lower 128 positions are identical to ASCII (ISO 646), while the upper 128 positions contain the Arabic characters of ASMO 449 in the analogous places, plus a few additional graphic and control characters.

We exploit the close relationship of these codes by reusing the ASMO 449 reading routines after suitable modification of the input. This only works correctly if the input text does not contain genuine ASCII letters, as we project the Arabic characters onto their locations in ASMO 449. Some of the code switching messages in the log file are spurious; do not worry. The notes on vowelization and transliteration of ASMO 449 apply here as well.

The driver file indicated for ASMO 449 will be usable after the obvious modifications; however, your TEX installation must be capable of processing 8-bit input. This is nowadays usually the case; otherwise you can try to find a local utility program that strips the highest-order bit off the characters in your file, and process the result via ASMO 449.
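The "obvious modifications" to the ASMO 449 driver file might look as follows. This is a sketch only: the file name isotext.dat is a hypothetical placeholder, and the option and code names are those given in this section.

```latex
\documentstyle [arabtex,iso88596]{article}
\begin {document}
\setcode {iso8859-6}   % all following Arabic text is read as ISO 8859-6
\begin {arabtext}
\input isotext.dat     % hypothetical name of the 8-bit input file

% keep the blank line above unless "isotext.dat" already ends with one
\end {arabtext}
\setcode {arabtex}     % reactivate the ArabTEX notation for later text
\end {document}
```

The final \setcode {arabtex} matters only if further Arabic text in the standard ArabTEX notation follows in the same document.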
Columns 00 to 07:

      00   01   02   03   04   05   06   07
00    NUL  DLE  SP   0    @    P    `    p
01    SOH  DC1  !    1    A    Q    a    q
02    STX  DC2  "    2    B    R    b    r
03    ETX  DC3  #    3    C    S    c    s
04    EOT  DC4  $    4    D    T    d    t
05    ENQ  NAK  %    5    E    U    e    u
06    ACK  SYN  &    6    F    V    f    v
07    BEL  ETB  '    7    G    W    g    w
08    BS   CAN  )    8    H    X    h    x
09    HT   EM   (    9    I    Y    i    y
10    LF   SUB  *    :    J    Z    j    z
11    VT   ESC  +    ;    K    ]    k    }
12    FF   IS4  ,    >    L    \    l    |
13    CR   IS3  -    =    M    [    m    {
14    SO   IS2  .    <    N    ^    n    ~
15    SI   IS1  /    ?    O    _    o    DEL

Columns 08 to 15 (columns 08 and 09 are empty):

      10     11   12                    13     14              15
00    NBSP                              ḏāl    kašīda          kasra
01                hamza                 rā'    fā'             šadda
02                'alif + madda         zāy    qāf             sukūn
03                'alif + hamza above   sīn    kāf
04    ¤           wāw + hamza           šīn    lām
05                'alif + hamza below   ṣād    mīm
06                yā' + hamza           ḍād    nūn
07                'alif                 ṭā'    hā'
08                bā'                   ẓā'    wāw
09                tā' marbūṭa           `ayn   'alif maqṣūra
10                tā'                   ġayn   yā'
11           ;    ṯā'                          tanwīn fatḥa
12    ,           ǧīm                          tanwīn ḍamma
13    SHY         ḥā'                          tanwīn kasra
14                ḫā'                          fatḥa
15           ?    dāl                          ḍamma

Table H.2: ISO 8859-6 code table (Arabic letters and marks are given by name; paired punctuation appears mirrored, as in the printed table).

Appendix I Miscellaneous utilities

The following packages are not part of ArabTEX proper and are not supported in any way, but are distributed along with ArabTEX as a possible convenience to the users. There is no warranty whatsoever.

I.1 twoblks.sty

This LaTEX option defines a command \twoblocks {#1}{#2} which places the two parameters #1 and #2, usually two paragraphs, into two boxes side by side, separated by space of length \colsep. If necessary, the resulting boxes are split across a page boundary.

This feature is useful if two versions of a text are to be compared. They may be in different languages, and one of them might be in Arabic (if enclosed in \begin {arabtext} ... \end {arabtext}). In the printed manual the following sentence is set in two such blocks, in English and in Arabic script side by side:

This sentence has been written twice: in the English language and in the Arabic language.

Otherwise this command does not depend on ArabTEX in any way, and indeed originated in a completely different context.

Beware that each of the two "blocks" should not contain much more than one, not too long, paragraph of text; otherwise TEX's main storage might overflow.
There must be no \verbatim text inside the parameters of \twoblocks, nor any \catcode changes; and all TeX groups and \if ... \fi sequences must be properly nested.

I.2 abjad.sty

This file, loaded as a LaTeX option, defines a command \abjad{#1} that is usable both inside and outside an Arabic context. It profited greatly from suggestions by Dr. Benno van Dalen (Utrecht University).

The command \abjad{#1} converts its argument, which has to be a legal representation of a number between 1 and 1999, to the Arabic 'abǧad notation used in some mediaeval manuscripts. The result of the conversion will not look perfect, and the legal 'abǧad number 0 cannot presently be generated. Improving this routine requires a font revision, which is hard and tedious; whenever this happens, the command might well become part of ArabTeX proper.

I.3 MLS2ARAB

This is a UNIX sed script, written by Prof. Nicholas Heer (University of Washington) and released for free distribution. It will (almost) convert an ASCII file of Arabic text, produced by Multi-Lingual Scholar, to the ArabTeX input notation. The conversion is not perfect, so some manual corrections might be necessary. For operating instructions, see the file itself.