Random Thoughts..: March 21, 2004

Wednesday, March 24, 2004

Localisation and Usability issues - a short note

Introduction
------------

In a joint paper [1] presented by D Seetharam (Country Manager, Government Relations) and Ashish Gautam (Software Sales Specialist - Linux) from IBM India Ltd, the key elements for Indian language support in GNU/Linux are listed (original list snipped) as:

[a] Correct technical description of each Indian language
[b] The adoption of international standards and methodologies for technical implementation which logically prevents fragmentation for multilingual support on GNU/Linux distributions
[c] Selection of appropriate applications in local languages for users thus providing a powerful set of useful solutions
[d] The components required for technical development to support Indian languages are:
[i] locales for Indian languages
[ii] input methods for Indic scripts and languages
[iii] layout engines for Indian scripts
[iv] fonts for Indian scripts and languages

Definitions of i18n and l10n
----------------------------

Before we begin a short look into the aspects of usability, we should learn about a few terms. Owen Taylor (otaylor at redhat dot com) in his short monograph on Internationalisation in GTK+ [2] defines the terms internationalisation (i18n) and localisation (l10n) as:

the terms internationalization and localization refer to the process of making software support a range of languages, and to the process of adapting the messages and conventions of a program to those of a particular locale, respectively. These terms are often abbreviated i18n and l10n respectively, after the number of letters between the first and last letters of the word.

The locale is the set of settings for the user's country and/or language. It is usually specified by a string like en_UK [sic; the ISO 3166 code for the United Kingdom is GB, so the actual locale name is en_GB]. The first two letters identify the language (English) the second two the country (the United Kingdom). Included in the locale is information about things like the currency for the country and how numbers are formatted, but, more importantly, it describes the characters used for the language. The character set is the set of characters used to display the language. When storing characters in memory or on disk, a given character set may be stored in different ways - the way it is stored is termed the encoding.
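
As a rough illustration of the three terms (a sketch of my own, not taken from the GTK+ monograph), a locale name can be split into its language and territory parts, and the same Bangla word can be stored under two different encodings. The helper name split_locale is hypothetical; real programs should rely on the platform's locale facilities.

```python
def split_locale(name):
    """Split a POSIX-style locale name such as 'bn_IN.UTF-8' into parts.

    Hypothetical helper for illustration only.
    """
    base, _, codeset = name.partition(".")
    codeset = codeset.partition("@")[0]  # drop a modifier after the codeset
    base = base.partition("@")[0]        # drop a modifier when no codeset is given
    language, _, territory = base.partition("_")
    return language, territory, codeset


# The word "Bangla" in Bangla script: five Unicode code points.
word = "\u09ac\u09be\u0982\u09b2\u09be"

# Same characters, two encodings: the stored bytes differ, the text does not.
utf8_bytes = word.encode("utf-8")        # 15 bytes
utf16_bytes = word.encode("utf-16-le")   # 10 bytes

print(split_locale("bn_IN.UTF-8"))
print(len(word), len(utf8_bytes), len(utf16_bytes))
```

Decoding either byte string with its own encoding gives back the same five characters, which is the distinction between character set and encoding drawn in the quote.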

Usability
---------

An important aspect that the list of key elements does not touch on is factoring in 'usability' at the localisation (L10n) level. In his article [3] 'Why GNOME hackers should care about usability' [4], Seth Nickell (snickell at stanford dot edu) defines 'usable' as 'capable of being used'. He goes on to state that:

For such a simplistic definition, this encapsulates the fundamental goal of usability very well. Usable software is software that people can use; whether to write Email, play games or develop the next killer application.

Traditionally, Free/Libre [5] Open Source [6] Software has been inspired and driven by what Eric S Raymond describes as "scratching a developer's personal itch". Throughout its 'release early, release often' iterative model of development, F/LOSS has rarely treated usability as a primary goal; functionality and power have been the main drivers. Broadly speaking, the problems stem from the fact that:

[1] the developers are/were not the end-users
[2] usability experts do not get involved in the development cycle
[3] functionality provides the main incentive for development and release
[4] problems in the usability context are abstract and difficult to specify
[5] usability metrics are not put in place in advance of coding

Curiously enough, when we talk (or write) about usability, the aspect actually being discussed is usually learnability - or how fast the end-user can 'learn' to use the software component functionally and optimally.

Ranking Usability
-----------------

Usability can be ranked on the following aspects:

[1] ease of learning,
[2] efficiency of use,
[3] memorability,
[4] error frequency and severity, and
[5] subjective satisfaction

The 'intangibles' present in the ranking definition create the need for frameworks and models to capture the data and end-user feedback. The main point of the exercise is to understand whether the product is 'desirable'.

Causal agent(s)
---------------

The crux of the problem lies in the fact that the average software developer cannot 'create and ideate' for the average user.

If this [desktop and application design] were primarily a technical problem, the outcome would hardly be in doubt. But it isn't; it's a problem in ergonomic design and interface psychology, and hackers have historically been poor at it. That is, while hackers can be very good at designing interfaces for other hackers, they tend to be poor at modeling the thought processes of the other 95% of the population well enough to write interfaces that J. Random End-User and his Aunt Tillie will pay to buy. (Raymond, 1999)

This has been more due to tradition than convention. Free and Open Source software users were 'power users'. For these groups of early adopters and pioneers, the scope and utility of any software lay in the functionality offered.

Thus:

[1] The end-user issues are not addressed by the developers. Software development is driven by the personal preferences of the developers and not those of the clueless (and technically challenged) newbie. This leads to a situation where esoteric and non-standard interfaces and layouts dominate.
[2] Domain experts in usability are not part of the development team. The strength, and probably the unique proposition, of the F/LOSS model is its emphasis on 'hacker culture'.

Hence it is not surprising that anecdotal evidence suggests that few people with usability experience are involved in OSS projects; one of the "lessons learned" in the Mozilla project [7] is to "ensure that UI [user interface] designers engage the Open Source community" (Trudelle, 2002). Thus the 'end-user' perspective is lost.

Scope of Usability
------------------

Usability encompasses a wide range of disciplines, both applied and theoretical, e.g. psychology, sociology, graphic design and even theatre studies. This makes a cross-functional, multidisciplinary design team necessary, so that the required skills and competencies are available to create, initiate and sustain the usability momentum. It is more the norm than the exception that existing OSS teams lack both the skills to solve usability problems and the skills to bring in "outsiders" to help.

Possible explanations for the absence (and sometimes 'active' non-participation) of HCI and usability people in OSS projects:

[1] A shortage of 'reliable and capable' usability experts.
[2] Lack of incentives for experts to participate.
[3] Usability experts do not feel welcomed into OSS projects.
[4] Inertia: traditionally projects haven't needed usability experts.

[There is not a critical mass of usability experts involved for the incentives of peer acclaim and recruitment opportunities to operate.]

Usability Problems
------------------

Usability problems are, by the very nature of their intangible aspects, harder to specify and distribute than functionality problems (which are easier both to specify and to evaluate). This increases the tendency towards design inconsistency and hence lowers overall usability. The modularity of OSS projects contributes to the effectiveness of the approach, enabling them to side-step Brooks's Law. In effect, the 'offending' parts can be swapped out and replaced by more 'user-oriented' modules. Yet one major success criterion for usability is consistency of design. Slight variations in the interface between modules, and between different versions of modules, can irritate and confuse, marring the overall user experience.

If there is no collective advance planning even for the coding, there is no opportunity to factor in interface issues in the early design phases. F/LOSS planning is usually done by the project initiator and/or the designated project leader. Thus, unless this person happens to possess considerable interaction design skills, important aspects of usability tend to be overlooked until it is too late.

Free/Libre Open Source projects have traditionally lacked the resources to undertake high-quality (and consistent) usability initiatives. Most such projects are voluntary and thus the allocated financial capital is small (and sometimes non-existent in practice). The question of involving Subject Matter Experts (technical authors and graphic designers) thus simply does not arise. Large-scale, long-running research and development experiments coupled with Usability Laboratories are simply economically unviable for most projects.

One way in which the F/LOSS model does address the issue of usability is software internationalisation (or to be more precise, software localisation), where the language of the interface (and any culture-specific icons) is translated. This approach incorporates the 'best practices' of the modular OSS approach.

GNOME, Usability and L10n
-------------------------

Almost all of the Indic L10n efforts have initiated work with the GNOME Desktop Environment [8]. GNOME offers a mature developer toolsuite and application infrastructure. The end-user interface follows the classic Windows, Icons, Menus, Pointer (WIMP) paradigm, which provides visually familiar interface elements for those who already use a computer in a windowing environment. In the specific case of the Ankur Bangla Project [9], the choice of GNOME was due to the fact that the Pango rendering engine provided support for Bangla. For a language to obtain 'supported' status on GNOME, the development team has to ensure that at least partial translations exist. Officially, however, a translation is considered 'supported' in GNOME when over 80% of all messages are translated.
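
The 80% rule is easy to check mechanically. The following is a minimal sketch, assuming a heavily simplified gettext PO catalogue with single-line msgid/msgstr pairs; multi-line strings, plural forms and fuzzy flags are ignored, and the function names are hypothetical rather than part of any GNOME tool.

```python
def translation_coverage(po_text):
    """Return (translated, total) for a simplified PO catalogue."""
    translated = total = 0
    current_id = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            current_id = line[len("msgid "):].strip('"')
        elif line.startswith("msgstr ") and current_id is not None:
            if current_id:  # the empty msgid is the catalogue header
                total += 1
                if line[len("msgstr "):].strip('"'):
                    translated += 1
            current_id = None
    return translated, total


def is_supported(translated, total, threshold=0.80):
    """Apply the 'over 80% of all messages' rule (strictly greater than)."""
    return total > 0 and translated / total > threshold


# Placeholder catalogue: one translated message, one untranslated.
SAMPLE = '''
msgid ""
msgstr "Content-Type: text/plain; charset=UTF-8"

msgid "Open"
msgstr "translated text here"

msgid "Close"
msgstr ""
'''

done, total = translation_coverage(SAMPLE)
print(done, total, is_supported(done, total))
```

Real catalogues are better measured with the gettext tools themselves; the sketch only shows the arithmetic behind the threshold.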

A major hurdle faced by L10n teams is the lack of appropriate and apt computer jargon in the local language. With English being the more or less accepted lingua franca of the IT space, an enriched vocabulary is a prime requirement for a 'good' L10n effort. As such, L10n teams create their own terms and phrases - born out of common usage, statistical methods as well as some random sampling of commonly accepted local language term substitutes. The difficulties that follow can be grouped under three heads:

[1] Completeness - GNOME follows a six-monthly release schedule, which means that at any point of time the L10n team is attempting to provide a completely localised interface while keeping the effort in sync with the GNOME release. This is most widely experienced in the GNOME Help system. Keeping up with new releases means that the GNOME Help system is incompletely translated. Thus, although the application may be localised, the help system may still be in English.

[2] Consistency - A lack of computer jargon means that volunteers in L10n teams have found more than one translation for a particular English term. Although standardised wordlists and templates are always proposed to address this particular issue, internal QA is not always robust enough to maintain a consistent base in translation.

[3] Correctness - The specialised terminology used in the GNOME UI is often a cause for ambiguous translations. Such ambiguity (and thus incorrectness) results from a misinterpretation of the actual message string by the volunteer translator. More often than not, this leads to a situation where the translation conveys a completely different message from the one intended. One notorious example is the commonly used term 'control' which, according to WordNet [10], has eleven meanings as a noun, and eight as a verb!
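
The consistency problem in item [2] lends itself to a simple automated check. This is only a sketch, assuming the translations are available as (English term, translated term) pairs; the function name and the placeholder data are hypothetical.

```python
from collections import defaultdict


def find_inconsistent_terms(pairs):
    """Return English terms that have been given more than one translation.

    Terms are compared case-insensitively, so 'File' and 'file' count as
    the same source term.
    """
    variants = defaultdict(set)
    for source, target in pairs:
        variants[source.strip().lower()].add(target.strip())
    return {
        term: sorted(targets)
        for term, targets in variants.items()
        if len(targets) > 1
    }


# Placeholder translations; a real check would read them from the catalogues.
pairs = [
    ("File", "term-A"),
    ("file", "term-B"),
    ("Edit", "term-C"),
    ("Edit", "term-C"),
]

print(find_inconsistent_terms(pairs))
```

A report like this does not say which variant is right; it only gives the team's internal QA a list to review against the agreed wordlist.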

The most obvious way to combat incorrect translations is to test them in context. Especially when using a GUI, translation errors are much easier to spot than when looking at the texts in an editor. Context testing is not always as easy as it sounds, because the modules being translated are often still under development. Consequently, a translator can often only context test when he is capable of beta testing as well.

Accessibility features are an important aspect of navigating the UI. In a majority of the L10n efforts, inconsistency introduced at translation time leads to access-key clashes, where more than one focusable control uses the same access key. Another problem is untranslated access keys.
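
Access-key clashes can be caught before a release with a small script. The sketch below assumes the GTK+ label convention, in which an underscore marks the access key and a double underscore is a literal underscore; the function names are hypothetical and the labels are placeholders for the translated strings of a single window or menu.

```python
from collections import defaultdict


def access_key(label):
    """Return the lower-cased access key of a label, or None if it has none."""
    i = 0
    while i < len(label) - 1:
        if label[i] == "_":
            if label[i + 1] == "_":
                i += 2  # escaped underscore: skip both characters
                continue
            return label[i + 1].lower()
        i += 1
    return None


def find_access_key_clashes(labels):
    """Group labels by access key and keep only keys used more than once."""
    by_key = defaultdict(list)
    for label in labels:
        key = access_key(label)
        if key is not None:
            by_key[key].append(label)
    return {key: found for key, found in by_key.items() if len(found) > 1}


labels = ["_File", "_Find", "_Edit", "Help"]
print(find_access_key_clashes(labels))
```

The check only makes sense per window or menu, since the same access key may legitimately be reused in different contexts.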

Cultural paradigms are the driving force of all L10n efforts, and their nuances need to be considered while undertaking the initiative. Symbols, icons and pointers that come from a culturally different milieu need not be ported or transplanted in their entirety. Instead, new and more recognisable elements attuned to the local culture need to be created and used. This leads to a richer graphical interface and a more intuitive environment for the end user.

Possible Solutions
------------------

Provisioning for usability should ideally take place in advance of any coding for the software. Thus the requirements analysis and design analysis phases of the system design should incorporate joint sessions with the various stakeholders of the system. The catch is that such a scenario suits a 'close-knit' approach to software development. For the globally distributed, community-based approach followed by F/LOSS, it is surprising that the model works in spite of probably violating (or otherwise stress testing) every single known principle of Software Engineering.

In a traditional setup, projects are expected to be carefully planned, with strategy sessions initiated to find an optimal plan. In contrast, F/LOSS projects seem to hurry forward into the 'Coding Stage' in reaction to the 'personal itch'. The iterative nature of the software development model relies mainly on the 'many eyeballs' concept to review and re-structure the code alongside improvement of the original design (more often than not based on performance optimisation parameters). While writing about the experience of Mozilla with the Usability paradigm, Trudelle (2002) states (quite blandly and truthfully) that skipping much of the design stage with Mozilla resulted in design and requirements work occurring in bug reports, after the distribution of early versions.

Taking into account the known difficulty the F/LOSS model has in integrating and embracing usability concepts, it becomes imperative that at least a semblance of planning is done at the early stages. In a fully equipped laboratory, formal usability studies provide complete toolsets to help evaluate how well end-users can complete their tasks.

In case of F/LOSS products aiming at commercial success or a modicum of it, the main index of measurement should however be 'Desirability' (defined as the need/urge to possess a commercial copy of the software under test).

Desirability involves measuring intangible aspects (again) that aim to rationalise the satisfaction gained from using the product.

Traditionally two approaches are prescribed:

[1] usage of Likert scales (with the disadvantage being that the level of understanding of the user, as well as the perceptions of the practitioner are obvious points of bias creep).
[2] interview of the users on a face-to-face basis (the cons in this case being the timespan for such a project as well as the manpower required to execute it).
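
For the first approach, a minimal sketch of how responses on a five-point Likert scale might be summarised is given below. The function name, the choice of median and 'top two' share as summary figures, and the sample responses are all my own assumptions for illustration, not part of any prescribed method.

```python
from collections import Counter
from statistics import median


def summarise_likert(responses, scale=(1, 5)):
    """Summarise ordinal Likert responses; out-of-range values are dropped."""
    low, high = scale
    valid = [r for r in responses if low <= r <= high]
    if not valid:
        raise ValueError("no valid responses")
    counts = Counter(valid)
    return {
        "n": len(valid),
        "median": median(valid),
        "distribution": {p: counts.get(p, 0) for p in range(low, high + 1)},
        # Share of answers in the two most favourable categories.
        "top_two_share": (counts[high - 1] + counts[high]) / len(valid),
    }


# Made-up responses to a statement such as "The menus were easy to follow".
responses = [4, 5, 3, 4, 2, 5, 4]
print(summarise_likert(responses))
```

The median is used rather than the mean because Likert categories are ordered but not evenly spaced; neither figure removes the biases noted in item [1].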

To facilitate the design process and to ensure adequate end-user feedback, the alpha release(s) could be used as a test bed in a real-life environment. Where a 'stable' release exists with the features that encapsulate the objective of the software, field studies that are captured on film and re-evaluated provide some depth of information about the 'Desirability and Usability' of the software.

The issues to be considered are:

[1] an understanding and assessment of the (expected and desired) level(s) of cognitive behavior of the 'subjects'
[2] the basic concept that there is really no requirement to provide lengthy and detailed explanations on the requirements at hand or the objective - this creates a chimera that becomes self-fulfilling
[3] since a majority of the tests will be based on field work and field level setups as opposed to typical research lab surroundings, keeping an eye on the setup to ensure that basic conveniences are at hand
[4] providing an ease-of-use ambience to the subject(s)
[5] ensuring that the setup is as non-intrusive as possible and arranging furniture/settings such that the camera is not directly focussed on the subjects
[6] making an objective assessment of the level of concentration as well as awareness, and timing the test sessions accordingly
[7] separating the more 'active' and 'aware' from the rest for other spatial and cognitive sessions
[8] scripting a test session so as to make the interactions more suited to the audience and thus engaging
[9] providing continuous re-assurance, but refraining from being patronising (either overtly or in an implied manner)
[10] at the initial stage some lessons might be needed on handling the mouse, cursors etc
[11] encouraging questions and ensuring that they are recorded
[12] using simplified and illustrated/illustrative instructions so as to keep in mind the (often) limited computing term vocabulary
[13] rewarding subjects (providing an incentive to participate) with a token of appreciation

Conclusion
----------

Field studies that are captured on film provide historical data. A quick method of capturing qualitative (as opposed to quantitative) data is using word-association cards. Based on the concept of 'exit polls' (that at least Indian readers will be quick to associate), it involves using cards with words that give the user cues for expressing the level of desirability, and thus the effect, of the software. This leads to an identification of trends. Although the statistical significance of such a methodology can be questioned, the qualitative data obtained answers the question of product reactions.
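
Tallying the cards is simple enough to do on the spot. The sketch below assumes each participant's picks are recorded as a list of words and that the team has decided in advance which words count as positive; the function name, the word lists and the sample sessions are made up for illustration.

```python
from collections import Counter


def tally_cards(sessions, positive_words):
    """Count how often each card was picked and the share of positive picks."""
    counts = Counter(word for chosen in sessions for word in chosen)
    total = sum(counts.values())
    positive = sum(n for word, n in counts.items() if word in positive_words)
    share = positive / total if total else 0.0
    return counts.most_common(), share


# One list of chosen cards per participant (placeholder data).
sessions = [
    ["easy", "slow"],
    ["easy", "confusing"],
    ["familiar", "easy"],
]
ranking, positive_share = tally_cards(sessions, {"easy", "familiar"})
print(ranking[0], positive_share)
```

The ranking shows the trend; the positive share is a crude single figure and should not be read as a statistically significant measure.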

The drawback is that the results from such studies cannot be generalised: they are biased towards whatever information about the quality of the user experience happens to surface, and thus only suggest 'possible' design changes. However, one of the major advantages is that candid and spur-of-the-moment feedback can be obtained. The tests are also quick to administer and the data can be rapidly assessed and evaluated. Most importantly, because the format is interactive and a two-way process, the participants enjoy its engaging nature.

References:

[1] Excerpted from Linux For You Vol 1 No 1 February 2003 - courtesy MAIT
[2] http://developer.gnome.org/doc/whitepapers/gtki18n/x21.html
[3] http://usability.gnome.org
[4] http://developer.gnome.org/projects/gup/articles/why_care/
[5] www.fsf.org - Free Software Foundation
[6] www.opensource.org - Open Source Initiative
[7] http://www.iol.ie/~calum/chi2002/peter_trudelle.txt
[8] www.gnome.org
[9] www.bengalinux.org
[10] www.wordnet.org
- posted by sankarshan @ 3/24/2004 07:44:00 PM