ARABIC TRANSLITERATION

I developed my transliteration system before the days of XML. To make it XML-friendly, I would replace the three symbols that XML reserves for markup:

replace < with I (for hamza-under-alif)
replace > with O (for hamza-over-alif; the A is already used for bare alif)
replace & with W (for hamza-on-waw)
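
The three substitutions listed above could be applied mechanically. The following is a minimal Python sketch, not part of the original system: the function and table names are hypothetical, and the reverse mapping assumes that I, O, and W are not otherwise used as transliteration symbols (they do not appear in the table on this page).

```python
# Hypothetical helper names; only the three substitutions come from the text.
TO_XML_SAFE = str.maketrans({"<": "I", ">": "O", "&": "W"})
FROM_XML_SAFE = str.maketrans({"I": "<", "O": ">", "W": "&"})


def to_xml_safe(translit: str) -> str:
    """Swap the XML-reserved symbols for their letter replacements."""
    return translit.translate(TO_XML_SAFE)


def from_xml_safe(translit: str) -> str:
    """Restore the original symbols.

    Assumption: I, O and W carry no other meaning in the input string.
    """
    return translit.translate(FROM_XML_SAFE)
```

For example, to_xml_safe("<lY") gives "IlY", and from_xml_safe("IlY") gives back "<lY".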

Transliteration, Arabic Windows (code page 1256, hex), Unicode Value and Unicode Name
' C1 U+0621 ARABIC LETTER HAMZA
| C2 U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE
> C3 U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE
& C4 U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE
< C5 U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW
} C6 U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE
A C7 U+0627 ARABIC LETTER ALEF
b C8 U+0628 ARABIC LETTER BEH
p C9 U+0629 ARABIC LETTER TEH MARBUTA
t CA U+062A ARABIC LETTER TEH
v CB U+062B ARABIC LETTER THEH
j CC U+062C ARABIC LETTER JEEM
H CD U+062D ARABIC LETTER HAH
x CE U+062E ARABIC LETTER KHAH
d CF U+062F ARABIC LETTER DAL
* D0 U+0630 ARABIC LETTER THAL
r D1 U+0631 ARABIC LETTER REH
z D2 U+0632 ARABIC LETTER ZAIN
s D3 U+0633 ARABIC LETTER SEEN
$ D4 U+0634 ARABIC LETTER SHEEN
S D5 U+0635 ARABIC LETTER SAD
D D6 U+0636 ARABIC LETTER DAD
T D8 U+0637 ARABIC LETTER TAH
Z D9 U+0638 ARABIC LETTER ZAH
E DA U+0639 ARABIC LETTER AIN
g DB U+063A ARABIC LETTER GHAIN
_ DC U+0640 ARABIC TATWEEL
f DD U+0641 ARABIC LETTER FEH
q DE U+0642 ARABIC LETTER QAF
k DF U+0643 ARABIC LETTER KAF
l E1 U+0644 ARABIC LETTER LAM
m E3 U+0645 ARABIC LETTER MEEM
n E4 U+0646 ARABIC LETTER NOON
h E5 U+0647 ARABIC LETTER HEH
w E6 U+0648 ARABIC LETTER WAW
Y EC U+0649 ARABIC LETTER ALEF MAKSURA
y ED U+064A ARABIC LETTER YEH
F F0 U+064B ARABIC FATHATAN
N F1 U+064C ARABIC DAMMATAN
K F2 U+064D ARABIC KASRATAN
a F3 U+064E ARABIC FATHA
u F5 U+064F ARABIC DAMMA
i F6 U+0650 ARABIC KASRA
~ F8 U+0651 ARABIC SHADDA
o FA U+0652 ARABIC SUKUN
`   U+0670 ARABIC LETTER SUPERSCRIPT ALEF
{   U+0671 ARABIC LETTER ALEF WASLA
P 81 U+067E ARABIC LETTER PEH
J 8D U+0686 ARABIC LETTER TCHEH
V   U+06A4 ARABIC LETTER VEH
G 90 U+06AF ARABIC LETTER GAF
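
The table above can be turned into a lookup for converting between the transliteration and Unicode Arabic. This is a minimal Python sketch under stated assumptions: the names TRANSLIT_TO_UNICODE, to_arabic, and to_translit are hypothetical, and characters that are not in the table (spaces, digits, punctuation) are passed through unchanged, which the original page does not specify.

```python
# One entry per table row: transliteration symbol -> Unicode character.
TRANSLIT_TO_UNICODE = {
    "'": "\u0621", "|": "\u0622", ">": "\u0623", "&": "\u0624",
    "<": "\u0625", "}": "\u0626", "A": "\u0627", "b": "\u0628",
    "p": "\u0629", "t": "\u062A", "v": "\u062B", "j": "\u062C",
    "H": "\u062D", "x": "\u062E", "d": "\u062F", "*": "\u0630",
    "r": "\u0631", "z": "\u0632", "s": "\u0633", "$": "\u0634",
    "S": "\u0635", "D": "\u0636", "T": "\u0637", "Z": "\u0638",
    "E": "\u0639", "g": "\u063A", "_": "\u0640", "f": "\u0641",
    "q": "\u0642", "k": "\u0643", "l": "\u0644", "m": "\u0645",
    "n": "\u0646", "h": "\u0647", "w": "\u0648", "Y": "\u0649",
    "y": "\u064A", "F": "\u064B", "N": "\u064C", "K": "\u064D",
    "a": "\u064E", "u": "\u064F", "i": "\u0650", "~": "\u0651",
    "o": "\u0652", "`": "\u0670", "{": "\u0671", "P": "\u067E",
    "J": "\u0686", "V": "\u06A4", "G": "\u06AF",
}

# Translation tables in both directions; the mapping is one-to-one.
_TO_ARABIC = str.maketrans(TRANSLIT_TO_UNICODE)
_TO_TRANSLIT = str.maketrans({v: k for k, v in TRANSLIT_TO_UNICODE.items()})


def to_arabic(translit: str) -> str:
    """Convert transliterated text to Unicode Arabic.

    Assumption: characters outside the table are left as they are.
    """
    return translit.translate(_TO_ARABIC)


def to_translit(arabic: str) -> str:
    """Convert Unicode Arabic back to the transliteration."""
    return arabic.translate(_TO_TRANSLIT)
```

Because each symbol maps to exactly one character and back, to_translit(to_arabic(s)) returns s for any string s.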

The full Arabic character set can be viewed at the Unicode website:

Arabic: U+0600 to U+06FF (PDF format)
Arabic Presentation Forms-A: U+FB50 to U+FDFF (PDF format)
Arabic Presentation Forms-B: U+FE70 to U+FEFF (PDF format)

The TITUS page for U+0600 through U+06FF displays the actual characters in your browser (UTF-8 encoding).

You can test your web browser's Arabic Unicode support at Alan Wood’s Unicode Resources website.

The Microsoft developer website has a useful table of the Arabic Windows (1256) and ISO 8859-6 code pages and their corresponding Unicode values.
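
The Arabic Windows column of the table can also be cross-checked without the Microsoft page, using the cp1256 codec that ships with Python. The sketch below is illustrative only: SAMPLE_ROWS holds a handful of rows copied from the table, and the function names are hypothetical.

```python
# A few (Windows-1256 byte, Unicode code point) pairs copied from the table.
SAMPLE_ROWS = {
    0xC1: 0x0621,  # HAMZA
    0xC7: 0x0627,  # ALEF
    0xD8: 0x0637,  # TAH
    0xE1: 0x0644,  # LAM
    0xFA: 0x0652,  # SUKUN
    0x81: 0x067E,  # PEH
    0x90: 0x06AF,  # GAF
}


def find_mismatches(rows):
    """Return (byte, expected, actual) for rows that disagree with cp1256."""
    mismatches = []
    for byte, expected in rows.items():
        actual = ord(bytes([byte]).decode("cp1256"))
        if actual != expected:
            mismatches.append((byte, expected, actual))
    return mismatches


def has_windows_code(char):
    """Report whether a single character can be encoded in Windows-1256."""
    try:
        char.encode("cp1256")
    except UnicodeEncodeError:
        return False
    return True
```

An empty list from find_mismatches(SAMPLE_ROWS) means the sampled rows agree with the codec. has_windows_code returns False for the characters whose Arabic Windows column is blank in the table, such as U+0670.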



Copyright © 2002 QAMUS LLC