I hear and I forget. I see and I remember. I do and I understand. — Chinese Proverb

Thaana Unicode Sheet


Dhivehi ASCII to Dhivehi Unicode


Nowadays Dhivehi ASCII is almost unheard of, as Unicode has been widely adopted for quite some time, especially for content published on the Internet. However, there is still the matter of legacy support and the mountains of content created with Accent Express and Recorder, not to forget the repetitive pressing of the Left-Arrow key. Those were the good times, when typing Dhivehi was an art mastered only by an elite :).

Recently an old document came back to haunt us in a project I was collaborating on with a close friend. I couldn’t bear to watch my friend retype hundreds of lines in Unicode, so I wrote a simple Python script to convert the text file to its Unicode equivalent. The input has to be a plain text file: the script only converts the text and does not produce a fully word-processed document.

The code is available on GitHub.
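For readers who just want the gist, here is a minimal sketch of how such a conversion could work. It is not the script from the repository. The mapping below is a small illustrative subset that assumes the legacy fonts followed the phonetic Thaana keyboard layout, and the names ASCII_TO_THAANA, convert_line and convert_file are made up for this example. The reverse flag is also an assumption, for legacy text that was stored in visual (left-to-right) order.

```python
# Illustrative subset only: assumes the legacy ASCII encoding follows the
# phonetic Thaana keyboard layout. A real table must cover every character
# used by the legacy font.
ASCII_TO_THAANA = {
    "h": "\u0780",  # THAANA LETTER HAA
    "n": "\u0782",  # THAANA LETTER NOONU
    "r": "\u0783",  # THAANA LETTER RAA
    "a": "\u0787",  # THAANA LETTER ALIFU
    "v": "\u0788",  # THAANA LETTER VAAVU
    "m": "\u0789",  # THAANA LETTER MEEMU
    "d": "\u078B",  # THAANA LETTER DHAALU
    "w": "\u07A6",  # THAANA ABAFILI
    "i": "\u07A8",  # THAANA IBIFILI
    "u": "\u07AA",  # THAANA UBUFILI
    "e": "\u07AC",  # THAANA EBEFILI
    "q": "\u07B0",  # THAANA SUKUN
}

# str.translate needs a table keyed by code point; maketrans builds it.
TABLE = str.maketrans(ASCII_TO_THAANA)


def convert_line(line, reverse=False):
    """Map one line of legacy ASCII text to Thaana Unicode.

    Characters without an entry in the table (digits, spaces, punctuation)
    pass through unchanged. If reverse is True, the line is reversed first,
    on the assumption that it was stored in visual order.
    """
    text = line[::-1] if reverse else line
    return text.translate(TABLE)


def convert_file(src_path, dst_path, reverse=False):
    """Convert a plain text file line by line and write it out as UTF-8."""
    # latin-1 decodes any byte, so stray 8-bit characters do not raise.
    with open(src_path, encoding="latin-1") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(convert_line(line.rstrip("\n"), reverse) + "\n")
```

With this table, convert_line("divehi") gives the six code points for the word Dhivehi in logical order. Whether a given document also needs reverse=True depends on how it was typed, so it is worth checking a line or two by eye first.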



Thaana support for LaTeX

LaTeX is a typesetting system used to produce publication-quality documents. It is predominantly used by academics to write technical and scientific documents such as journal and conference papers. LaTeX has its roots in TeX, which was designed and developed by Donald Knuth in the late 1970s. You can read more on the history here. LaTeX itself was developed by Leslie Lamport in the early 1980s.

LaTeX is not for everyone. For most of us who are used to word processors such as MS Word or LibreOffice Writer, the “point and click on the toolbar to get everything done” approach does not apply. Everything is done using special markup or command sequences. The benefit, however, is that the output and formatting are consistent every time. The source file is basically a plain text file, and the LaTeX processor handles the processing and generation of the output (PDF, PS, etc.). More advantages and disadvantages here.

LaTeX does not support Thaana typesetting natively. LaTeX uses packages to add features to a document, similar to including a library at the top of a program, and there are packages such as bidi or babel that can be used for multilingual and right-to-left language support. But these weren’t working out very well for Thaana, so XeTeX was the preferred choice. Since I used XeTeX for the task at hand, you must be wondering why I have been talking about LaTeX all this while. The simple and honest answer is that most of us are familiar with LaTeX, and we often use the name loosely to refer to TeX-style typesetting in general.