Random Thoughts..
Tuesday, April 15, 2003
 










The need to localize



In a Sourceforge article on localization, Prof. Venkatesh Hariharan makes
a pertinent comment:



"The localisation of Linux [1] to Indian
languages can spark off a revolution that reaches down to the grassroots
levels of the country"




The problem faced in harnessing the power of ICT (information and communication
technologies) in India is that all such enabling software is in English,
a language in which a mere 10-15% of the population is proficient. The dialogs,
menus and interfaces are in English, as is the documentation. Although such a
scenario is desirable and advantageous to the extent that it establishes a
'lingua franca', in countries like India and Bangladesh the situation
leads to 'technological poverty'. To bridge the digital divide by taking
technology to the masses, one of the important efforts of the Free Software
and Open Source Software movements has been to take the GNU/Linux OS and adapt
it to local cultural nuances. Localization is a national-level collaborative
effort to "Indianize" software and make culture an integral part of the computing
experience. A locale handles cultural conventions such as the formatting
of date and time, the representation of numbers and the symbols for currency.
The glibc package currently provides both the bn_IN and bn_BD locales (for India
and Bangladesh respectively).
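As a rough illustration of what a locale carries, the sketch below asks the C library (through Python's standard `locale` module) for a few conventions of the two Bangla locales. It assumes the locales are installed under the names `bn_IN` and `bn_BD`; on a system where they are missing, the helper simply returns `None`. The function name `describe_locale` and the choice of fields are made up for this example.

```python
import locale
import time


def describe_locale(name):
    """Return a few formatting conventions for a locale, or None if it is not installed."""
    saved = locale.setlocale(locale.LC_ALL)  # remember the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        return None  # locale not available on this system
    try:
        conv = locale.localeconv()
        return {
            "currency_symbol": conv["currency_symbol"],
            "int_curr_symbol": conv["int_curr_symbol"],
            "decimal_point": conv["decimal_point"],
            # Date of the Unix epoch, formatted the way the locale prefers.
            "sample_date": time.strftime("%x", time.localtime(0)),
        }
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the previous setting


# Locale names as mentioned in the text; availability depends on the system.
for name in ("bn_IN", "bn_BD"):
    print(name, describe_locale(name))
```

The two locales share the language but can differ in details such as the currency symbol, which is why glibc keeps a separate definition for each country.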



Bangla localization



Bangla (or Bengali) is one of the most important languages in the world
in terms of the number of users. The language possesses a well-established
phonetic script, and the Unicode range for Bangla characters is U+0980
to U+09FF. However, Indic scripts are notoriously difficult to support
because of conjuncts ('yuktakshars') and non-standard spellings.
Another major difficulty in localizing GNU/Linux in Bangla is
the absence of suitable fonts (font sets). Unicode represents Bengali
text as a sequence of Bengali characters. Unlike most European scripts, just
rendering these characters one by one is not enough for Bengali (and other Indic
scripts); it is necessary to form new glyphs by combining several characters. Until
recently, there was no accepted standard that described this process completely. This
has been addressed in the extension to the TrueType font format known as
OpenType (along with some rules in Unicode for reordering the characters
before combining them). A description of the parts of the specification relevant
to Indic scripts is available through Microsoft's typography site.
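To make the "sequence of characters" point concrete, here is a small sketch using only Python's standard library. It lists the code points behind one conjunct and one syllable whose vowel sign is drawn to the left of the consonant although it is stored after it. The helper names `is_bengali` and `code_points` are made up for this example; whether the strings display correctly depends on the font and layout engine in use.

```python
import unicodedata

BENGALI_START, BENGALI_END = 0x0980, 0x09FF  # Unicode Bengali block


def is_bengali(ch):
    """True if the character lies inside the Bengali block."""
    return BENGALI_START <= ord(ch) <= BENGALI_END


def code_points(text):
    """List (code point, Unicode name) for every character stored in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# One conjunct glyph on screen, three characters in storage:
# KA + VIRAMA (hasanta) + SSA.
conjunct = "\u0995\u09CD\u09B7"

# KA followed by VOWEL SIGN I; the vowel sign is stored second
# but is drawn to the left of KA by the layout engine.
syllable = "\u0995\u09BF"

for label, text in (("conjunct", conjunct), ("syllable", syllable)):
    print(label, text, "all Bengali:", all(is_bengali(c) for c in text))
    for cp, name in code_points(text):
        print("   ", cp, name)
```

The stored order never changes; combining and reordering happen only at display time, which is the job the OpenType layout rules hand to the rendering engine.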



Under the Windows operating system, Internet Explorer uses the Uniscribe
layout engine to render OpenType features. Recent versions of Windows, e.g.
Windows 2000 or XP, ship with the required version, and installing a recent
version of Internet Explorer is usually sufficient for older versions of Windows.



On GNU/Linux, the FreeType library has implemented OpenType Layout features,
and the rapidly developing Pango project is trying to deal with other internationalization
issues. The Pango rendering engine (which handles all text rendering
in the GTK2/GNOME 2 environment) has a module for rendering Bangla, and this
combination is already in working condition on recent distributions. However,
open source browsers like Mozilla still do not support Bangla rendering, and
none of the more popular browsers use this technology yet. A completely
independent implementation is in the Unicode text editor Yudit, available
for both GNU/Linux and Windows.



Even with a layout engine that implements OpenType layout features, an OpenType
Bengali font is required before one can view Bengali text. Several efforts
have started recently with the aim of creating free Bengali OpenType fonts,
and the results are available from the Free Bangla Fonts project.



The Free Bangla Fonts Project



The Free Bangla Fonts Project is a volunteer-run, collaborative
project dedicated to creating Free, high quality, completely Unicode
compliant OpenType Bengali fonts. The project positions itself as the central
resource for getting and developing Free Bangla fonts. The initial aim of
this project is to release a full set of Bangla fonts that supports all the
major Bangla yuktakshars (conjuncts). The Akaash set of fonts (with more
than 650 glyphs) aims to be such a set. The project also plans to convert
other existing Free Bangla fonts that are not Unicode compliant into Unicode
compliant Bengali OpenType fonts. Currently the team is working on four
sets of fonts. Sayamindu Dasgupta is developing the Akaash set. This set
will have three OpenType fonts: AkaashNormal.ttf, AkaashWide.ttf and AkaashSlanted.ttf.
Dr. Anirban Mitra is developing the Ani set. It has two fonts, Ani.ttf and
Mitra.ttf. The Mitra font is a monospaced font, which is useful in certain
specialized applications. Dr. Mitra is also developing the Mukti set of fonts
(which uses the glyphs donated by Cyberscape Multimedia Limited, Mumbai).
This set has four fonts: MuktiRegular.ttf, MuktiBold.ttf, MuktiNarrow.ttf
and MuktiNarrowBold.ttf. Deepayan Sarkar is developing the Likhan set of
fonts. Taneem Ahmed has packaged all the fonts under development in the project
into an RPM file for Red Hat 8.0.



Bangla Innovative Open Source (BIOS)



BIOS is the brainchild of a band of programmers and students with the aim
of utilizing the open source aspect of GNU/Linux to take computing to the
masses. Based on a distributed, collaborative model of project management,
BIOS aims to create:



1. Bengali Open Office - a full-featured Bengali office suite as Open Source/Free
Software.

2. Bengali database - Bengali Unicode based implementation of database.

3. Bengali Speech application - Free Software speech recognition and text-to-speech
application



BIOS believes that the prevalent standard interfaces in English form a barrier
to taking computing power to all levels of society. The licensing terms
under which GNU/Linux is available make it an ideal content delivery medium
for grassroots-level social programs as well as an educational content delivery
platform. Thus BIOS aims to create a 'Bangla' interface for both the graphical
(or X) and text modes. Such an effort will make it economically feasible
to install computers at the village level, thus ushering in a knowledge-based
economy built on a community-based knowledge sharing platform.



Bangla Gnome Translation Project - Ankur



Ankur (the name was given to the Gnome Translation Project by Dr K Ghosh)
is working toward supporting the Bangla (Bengali) language on the GNU/Linux operating
system. A majority of its projects are focused on XFree86.org's X server;
however, some are platform independent and add support for other operating
systems. The Ankur project has the following primary goals:



1. Translate popular and major X server applications.

2. Provide Bangla support for some major X server applications such as
office suites, databases, development tools and desktop environments like
GNOME and KDE. The intention is to help develop and maintain open source/free
software targeted at Bangla-speaking users.

3. Create awareness among Bangla-speaking computer users.

4. Create content with the aim of educating people about GNU/Linux and
the FLOSS movement.

On 02/02/2003, the project team released bspeller-0.4.



Ankur is also involved in the Bengali Dictionary Project. Kaushik Ghose
outlines the aims of this project as:



1. Bangla dictionary

2. Webpage interface to the Bangla dictionary

3. CD version or an offline version of the web interface

4. Various converters to turn the Bangla dictionary into, say, ISCII or higher ASCII
for display in other fonts

5. Various interface programs: a) a dictionary GUI, b) a command-line version
of the GUI (which can act as a spellchecker for other programs)



'Progga' (a member of the core Ankur team) states that the project is in
need of volunteers in order to meet the deadline of August 2003 (when Gnome
v2.4 is scheduled to be released). To date, approximately 30% of the project
has been completed. The translation project is one of the first 'team-oriented'
projects of Ankur and is based on the Open Source Software development model.
The current volunteer strength is around 10, with profiles varied across
all levels. Volunteers download files after duly notifying
the others on the mailing list. Once a file is partially or fully translated,
it is posted for peer review. Files are generally reviewed once and
then committed to CVS. However, in Progga's opinion, the major
constraints on the successful completion of such a distributed project are the
lack of publicity and the low level of motivation among volunteers,
especially Bengalis. He states that, more often than not, people
who exhibit initial interest simply disappear.
While in some cases this can be attributed to the cutting-edge technology
used, in others it can be attributed to being daunted by the task at hand.




Conclusions



Localization projects must follow the bazaar model of distributed development.
While the robustness and stability of this model are well established
by various successful implementations, localization, and more specifically
the Translation project, suffers from the lack of a firmly established command
and control structure. The project, till recently, lacked a common 'word
pool' for words that need to be part of translated strings on a regular basis.
The peer review cycle, along with the existing model, needs to be modified
and restructured to ensure that the translations are consistent in
quality. There is also a need to create a localized set that encapsulates the
dialects and semantics of the common populace. However, these difficulties
are part of any such project. Given that within a span of 2 months more
than 25% of the translation has been completed, it might not be too
ambitious to say that we can expect to see this group meet their deadline.
Till then we will be wishing them all the best.
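One way a shared 'word pool' could feed into peer review is sketched below. This is only an illustration and not the project's actual tooling: the function `check_consistency`, the two glossary entries and the sample strings are all made up for this example. It flags any translated string whose English source contains a glossary term but whose translation does not use the agreed rendering.

```python
import re

# Hypothetical word pool: agreed Bangla renderings for recurring English terms.
WORD_POOL = {
    "File": "ফাইল",
    "Edit": "সম্পাদনা",
}

# Hypothetical (source string, translation) pairs, as a reviewer might receive them.
SAMPLE = [
    ("Open File", "ফাইল খুলুন"),
    ("Edit Preferences", "পছন্দ পরিবর্তন"),
]


def check_consistency(entries, word_pool):
    """Return (source, term, expected) for each translation that skips an agreed term."""
    problems = []
    for source, translation in entries:
        if not translation:
            continue  # untranslated strings are a separate concern
        for term, expected in word_pool.items():
            # Match whole words only, so "File" does not fire on "Profile".
            in_source = re.search(r"\b" + re.escape(term) + r"\b", source, re.IGNORECASE)
            if in_source and expected not in translation:
                problems.append((source, term, expected))
    return problems


for source, term, expected in check_consistency(SAMPLE, WORD_POOL):
    print(f"{source!r}: expected the agreed rendering of {term!r} ({expected})")
```

A check like this cannot judge the quality of a translation; it only points reviewers at strings that depart from the agreed vocabulary, which a human then accepts or corrects.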



Note:



1. Actually referring to the various software in a GNU/Linux
distribution.



References & Links:



Bangla Penguin Project: www.banglapenguin.org

Bangla Gnome Translation Project - Ankur: www.bengalinux.org

BIOS: banglalinux@yahoo.com

Deepayan Sarkar's page on archive of Bengali Documents on the Internet:
www.stat.wisc.edu/~deepayan/Bengali/WebPage/bengali.html

Free Banglafonts Project: http://savannah.nongnu.org/projects/freebangfont/


Kaushik Ghose: kghose@wam.umd.edu

Progga: abulfazl@juniv.edu

Sayamindu DasGupta's homepage: www.peacefulaction.org/sayamindu/

Indian Linux Users Group - Kolkata Chapter: www.ilug-cal.org

Prof. Venkatesh Hariharan is with the Indian Institute of Information
Technology, Bangalore. He can be reached at venky@iiitb.ac.in







Sankarshan Mukhopadhyay is a Free Software enthusiast and a member
of the Indian Linux Users Group - Kolkata Chapter (http://www.ilug-cal.org/).
His blog 'Random Thoughts' is at http://sankarshan.blogspot.com/. He
can be reached at sankarshanm@softhome.net









 
Microsoft & OpenSource Ports





The news



A recent post by Imran William Smith at AsiaOSC discusses the question "Why is Microsoft porting some of its products to Linux?" Given the stranglehold the Redmond giant seems to possess on the desktop application suite market, this is, at first sight, a surprising change in business policy.




The truth





In hindsight, and on deeper reasoning, it is not. For a considerable period of time a Newsforge article drew a lot of comments on the eventual migration of MS Project to Open Source, or on it being GPL-ed. The rumour made some sense. Given that MrProject is one of the reasons why desktop SOHO setups go into dual boots (try a scheduling setup using spreadsheets and anyone will understand), GPL-ing the sidetracked application would have earned it some amount of goodwill. Moreover, a broad-based developer community would have meant that the peer review process would be transparent and trackable. Although the article by Donald K Rosenberg argues about the probable fallout of such a deal (project forks and retail price falls), on the face of it such an analysis was the nebulous shape of things to come.




So what gives?




Microsoft is actively pushing its Windows Rights Management Services (WRMS), the much maligned DRM that is being looked upon with suspicion by TCPA members themselves. Anyone who has recently installed Windows Media Player v9.0 without unchecking a few default options would have been nastily surprised by the amount of poking and probing it does. WMP 9.0 (probably) forms the spearhead of MS's campaign to enforce DRM systems and the Trusted Computing paradigm. The logic is childishly simple. People will use the desktop or embedded devices for media-driven entertainment. Using a carrot and stick policy on this market will make WMP 9.0 attain the mantle of a *standard* media playing widget. With people locked into the DRM setup, it becomes easy to push it through as the de facto standard.





A cursory look at the SEC filing will reveal that the major revenue earners are being kept as they were. It is the sidelined applications, toolkits and widgets that are being played around with to generate brand consciousness and market penetration. Given that after the Napster courtroom brawl it was evident to all that the music industry is a high stakes game, it is not really surprising that Apple has joined the fray to control *media content*. The content control paradigm is a part of the push model of business honed into a fine art by MS itself. Software, digital entertainment and application services form a giant mesh of business deals. Leaving any turf undefended leads to turf wars and battle scars.




Some pointers




An explanation of DRM and TCPA can be found in one of the articles available at iLUG-Kolkata. It is time that an awareness campaign was launched about the ill-advised measure of encoding digital data in such restricted formats.







Sankarshan Mukhopadhyay is a Free Software enthusiast and a member of iLUG-Kolkata. His blog "Random Thoughts" is at http://sankarshan.blogspot.com/. He can be contacted at sankarshanm@softhome.net
