PAN Localization Project
Science, Technology and Environment Agency of Lao PDR
Lao Fonts
Phonpasit PHISSAMAY, National Project Director
Valaxay DALALOY, National Project Coordinator
(Mr. Thonglor DUANGSAVANH);
(Mr. Vethsouvanh PHENGCHANH);
(Mr. Khamkeo KOMMADAM)
I. Definition
A font is a collection of glyphs used for the visual depiction of character
data. A font is often associated with a set of parameters (for example, size,
posture, weight, and serifness), which, when set to particular values, generate
a collection of imagable glyphs.
A font has three components: a coded font, a font character set, and a code page.
A coded font translates
your request for type (for example, text you previously entered at a computer
terminal) into characters for printing. A coded font, which associates a
specific code page with a specific font character set, consists of two parts:
a font character set and a code page.
A character must be included in the specified font character set and
listed on the specified code page before it can be printed.
A font character set
contains the characters of a single type family, typeface, and type size. In
addition, a font character set specifies character
properties and printing attributes.
Characters are the
letters, numerals, punctuation marks, or other symbols of a font.
Character properties describe how a character is positioned relative to the characters
around it. Some character properties include the following:
Each character is
assigned a character ID; for example, the character A (uppercase A) is assigned
the character ID LA020000. The purpose of a character ID is to distinguish the
character from other, similar characters. For example, the following characters
look similar; however, they are different and are assigned different character
IDs:
– Minus sign (-): character ID SA000000;
– Hyphen (-): character ID SP100000;
– Em dash (--): character ID SM900000
The printing attributes define how the font
character set will be printed. Some printing attributes include rotation of
characters, maximum ascender, and point size.
A code page maps each character of text to
the characters in a font character set. The following picture shows how a code
page maps text to the characters in a font character set. As you enter your
text at a computer terminal, each keyboard character is translated into a code point. When the text is printed, each code point is
matched to a character ID on the code page you specified. The character ID is
then matched to the image of the character in the font character set you
specified. The image in the character set is the image that is printed.

A character ID is an 8-byte character data string. A code point is an
8-bit binary number representing one of 256 potential characters (the maximum
number of characters available on a code page). Code points are usually shown
as hexadecimal representations of their binary values.
Binary: 11000001; Decimal: 193; Hexadecimal: C1
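The two-step lookup described above (code point to character ID, then character ID to glyph image) can be sketched as follows. This is a minimal illustration, not the actual data of any real code page: the mapping of code point C1 to the character ID LA020000, the code point chosen for the hyphen, and the glyph image strings are all assumptions made for the example.

```python
# Hypothetical, tiny code page: code point -> character ID.
# The pairing of C1 with LA020000 is an assumption for illustration.
CODE_PAGE = {
    0xC1: "LA020000",  # uppercase A (character ID taken from the text)
    0x60: "SP100000",  # hyphen (code point assumed)
}

# Hypothetical font character set: character ID -> glyph image.
FONT_CHARACTER_SET = {
    "LA020000": "<glyph image: A>",
    "SP100000": "<glyph image: hyphen>",
}


def render(code_points):
    """Map each code point to a character ID, then to a glyph image."""
    images = []
    for cp in code_points:
        char_id = CODE_PAGE.get(cp)
        # A character must be on the code page AND in the font character set.
        if char_id is None or char_id not in FONT_CHARACTER_SET:
            raise LookupError(f"code point {cp:02X} cannot be printed")
        images.append(FONT_CHARACTER_SET[char_id])
    return images


# The same code point written in the three notations used in the text.
cp = 0b11000001
print(cp, format(cp, "02X"), format(cp, "08b"))
print(render([cp]))
```

A code point that is missing from either table raises an error here, which mirrors the rule that a character must be both in the font character set and on the code page before it can be printed.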
II. Word in Lao:
Structure of Lao syllable:
- Level 1: The characters appearing at level 1 are diacritics. There are
five diacritics, namely:
- Level 2: Level 2 is occupied by superscript vowels only. The
seven vowels of level 2 are:
![]()
- Level 3: This is the main level of a Lao word. There is
always a character at level 3 at each position in a Lao word. All thirty-three
consonants, the twelve vowels written before or after a consonant, and the 2
special symbols are at level 3. However, some consonants and vowels also extend
into level 2 and level 4, such as:
![]()
Because of the four-level structure, the height and
length of the characters at each level are not the same. Taking the character
at level 3 as the reference, characters at level 2 and level 4 are about 50% of
the size of a level 3 character, and characters at level 1 are about 50% of the
size of a level 2 character.
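The proportions above can be worked through with a short calculation. This is only a sketch of the arithmetic: the function name and the choice of 1000 units for a level 3 character are assumptions for the example, not values from any Lao font.

```python
def level_sizes(level3_size):
    """Return the approximate character size at each syllable level.

    Level 3 is the reference; levels 2 and 4 are 50% of level 3,
    and level 1 is 50% of level 2.
    """
    level2 = level3_size * 0.5
    level4 = level3_size * 0.5
    level1 = level2 * 0.5
    return {1: level1, 2: level2, 3: level3_size, 4: level4}


# Example with an assumed level 3 size of 1000 font units.
print(level_sizes(1000))
```

With these ratios a level 1 diacritic ends up at about a quarter of the size of a level 3 character.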
Types of Lao characters:
The development of Lao character types has also been influenced by the
country's development, such as the political regime and the equipment
available. The types can be classified into 3 groups:
1. The traditional or old typewriter style: Based on the MAHASILA grammar
book (Old Lao Grammar), this was developed during the royal regime (before
1975). It is characterized by rounded glyphs with thin, uniform-width
strokes. Example:
![]()
2. The new typewriter or present-day schoolbook style: Based on the PHOUMY
VONGVICHITH grammar book (new Lao grammar), this was developed after the
establishment of the LAO PDR (after 1975). It is characterized by glyphs with
straight strokes where possible, and somewhat heavier uniform-width strokes.
Example:
![]()
3. Ornamental glyphs: Newly developed glyphs intended to make
Lao characters look more attractive. Most of the modern glyphs have been
developed in the last five years, since computers began to have a big impact on
printed materials. Most of these glyphs are used in brochures,
advertising letters, or magazines. They are characterized by calligraphic
strokes and handwriting styles. Example:
III. Lao Fonts
1. Factors for consideration:
When considering which font to use, apart from
appearance, there are four main factors to consider:
- Is word-wrapping important? For a large amount of text,
it is much more convenient if the text can be entered without having to think about
breaking each line by hand. This becomes important whenever text must be edited
or revised, to prevent minor changes from forcing every subsequent line to be
adjusted.
- Do you need both Lao and Roman characters in a single font? While this is
rarely a problem in word processing, in many other applications (such as
spreadsheets and database applications) it is often important to be able to mix
languages in a single entry, for which a common font must be used.
- Does the application program interpret numeric characters and symbols
correctly? Many Lao fonts use the standard codes for numbers and arithmetic
symbols for other characters, which leads to program errors, especially in
spreadsheet and database applications. The hyphen code, in particular, is often
recognized as a minus sign, and must be used with care.
- Do you need a wide range of styles? For headings, or for brochures, the
above factors are usually less important than being able to choose from a wide
range of font styles.
The style in which the font is drawn must be decided before drawing even the
first character, so that all characters will be balanced in shape and style. It
is important to decide on a basic width for each character with reference to
its display position, especially for the tone marks and superscript vowels,
which can be placed at many different positions in the syllable.
2. Methodologies:
The Lao shaping engine processes text in 3 stages:
1. Analyze characters for valid diacritic combinations
2. Shape (substitute) glyphs with OTLS (OpenType Library Services)
3. Position glyphs with OTLS
– Analyze Characters
The unit that the shaping engine receives for the purpose of shaping is
a string of Unicode characters, in a sequence. The contextual analysis engine
verifies valid diacritic combinations. For additional information, see Invalid Combining Marks.
The handling of the AM in the analysis phase is special. In the case
where an above mark does not exist on the preceding base consonant, the 'ccmp'
feature will be used to decompose the AM into the NIGGAHITA and AA glyphs. This
allows the NIGGAHITA glyph to be positioned correctly above the preceding base
consonant. If there is a tone mark on the base consonant already, the analysis
engine will decompose the AM and reorder the NIGGAHITA to between the base
consonant and the tone mark. This allows the NIGGAHITA glyph to be positioned
correctly above the base consonant, and the tone mark to be positioned
correctly above the NIGGAHITA. This behavior cannot be tested in VOLT, as this logic is not in VOLT.
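The AM handling described above can be sketched at the character level as follows. This is a simplified illustration and not Uniscribe's actual code: the function name is hypothetical, only the four tone marks U+0EC8 to U+0ECB are treated as tone marks, and leaving AM unchanged when another above mark already sits on the base is an assumption, since the text does not spell out that case.

```python
AM = "\u0EB3"         # LAO VOWEL SIGN AM
NIGGAHITA = "\u0ECD"  # LAO NIGGAHITA
AA = "\u0EB2"         # LAO VOWEL SIGN AA

# Assumed set of tone marks for this sketch.
TONE_MARKS = {"\u0EC8", "\u0EC9", "\u0ECA", "\u0ECB"}
# Other above marks closest to the base (ABOVE1 class in this document).
ABOVE1 = {"\u0EB1", "\u0EB4", "\u0EB5", "\u0EB6", "\u0EB7", "\u0EBB", "\u0ECD"}


def decompose_am(text):
    """Decompose AM into NIGGAHITA + AA, reordering around a tone mark."""
    out = []
    for ch in text:
        if ch != AM:
            out.append(ch)
        elif out and out[-1] in TONE_MARKS:
            # base + tone + AM  ->  base + NIGGAHITA + tone + AA
            tone = out.pop()
            out.extend([NIGGAHITA, tone, AA])
        elif out and out[-1] in ABOVE1:
            # Assumption: an above mark is already present, keep AM as is.
            out.append(ch)
        else:
            # base + AM  ->  base + NIGGAHITA + AA
            out.extend([NIGGAHITA, AA])
    return "".join(out)


# Consonant NO + tone mark MAI THO + AM.
shaped = decompose_am("\u0E99\u0EC9\u0EB3")
print([f"U+{ord(c):04X}" for c in shaped])
```

After the reordering, the NIGGAHITA sits directly after the base consonant, so positioning can place it above the base and then place the tone mark above the NIGGAHITA.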

The first step Uniscribe takes in shaping the character string is to
map all characters to their nominal form glyphs.
Next, Uniscribe calls OTLS to apply the features. All OTL processing is
divided into a set of predefined features. Each feature is applied, one
by one, to the appropriate glyphs in the syllable and OTLS processes them.
Uniscribe makes as many calls to the OTL Services as there are features. This
ensures that the features are executed in the desired order.
Uniscribe next applies features concerned with
positioning, calling functions of OTLS to position glyphs.
Positioning features:
– Invalid Combining Marks
Combining marks and signs that appear in text not in conjunction with a
valid consonant base are considered invalid. Uniscribe displays these
marks using the fallback rendering mechanism defined in the Unicode Standard
(section 5.12, 'Rendering Non-Spacing Marks' of the Unicode Standard 3.0), i.e.
positioned on a dotted circle. For the fallback mechanism to work properly, a
Lao OTL font should contain a glyph for the dotted circle (U+25CC). In case
this glyph is missing from the font, the invalid signs will be displayed on the
missing glyph shape (white box).
In addition to the 'dotted circle', another Unicode code point that is
recommended for inclusion in any Lao font is the ZWSP (zero width space;
U+200B). Lao words are not separated by spaces, so the ZWSP can be used to mark
word boundaries, since it allows word wrapping at the end of a line.
Some applications use a lexical lookup to do word wrapping without needing
ZWSP characters.
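How ZWSP characters enable word wrapping can be sketched with a simple greedy line breaker. This is an illustration under stated assumptions: the function name is hypothetical, and line width is counted in characters, whereas a real layout engine measures glyph advances and gives combining marks no width.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def wrap_at_zwsp(text, width):
    """Greedily break text into lines, breaking only at ZWSP positions."""
    lines = []
    current = ""
    for word in text.split(ZWSP):
        if current and len(current) + len(word) > width:
            # The next word does not fit: close the line at the boundary.
            lines.append(current)
            current = word
        else:
            current += word
    if current:
        lines.append(current)
    return lines


# Placeholder "words" separated by ZWSP, wrapped to 6 characters per line.
print(wrap_at_zwsp("aaa\u200Bbbb\u200Bcc", 6))
```

The ZWSP characters are dropped from the output lines here because they have no visible width; an editor would keep them in the stored text so that the same boundaries are available when the text is reflowed.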
If an invalid combination is found, the diacritic that causes the
invalid state is placed on a dotted circle to indicate to the user the invalid
combination. The shaping engine for non-OpenType fonts will cause invalid mark
combinations to overstrike. This is the problem that inserting the dotted
circle for the invalid base solves. It should also be noted that the dotted
circle is not inserted into the application's backing store; this is a run-time
insertion into the glyph array that is returned from the ScriptShape function. The invalid diacritic logic for Lao is based
on the classes listed below. There is a check to make sure more than one mark
of a class is not placed on the same base.
| Class | Description | Code points |
|---|---|---|
| ABOVE1 | Above mark closest to base | U+0EB1, U+0EB4, U+0EB5, U+0EB6, U+0EB7, U+0EBB, U+0ECD |
| ABOVE2 | Second level above mark | U+0EC8, U+0EC9, U+0ECA, U+0ECB, U+0ECC |
| BELOW1 | Below mark closest to base | U+0EBC |
| BELOW2 | Second level below mark | U+0EB8, U+0EB9 |
| Vowel:AM | The AM character | U+0EB3 |
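The class-based check can be sketched as follows, using the code points from the table. This is a simplified illustration rather than the shaping engine's implementation: the function names are hypothetical, any non-mark character is treated as a valid base (the real engine requires a valid consonant base), and the special handling of AM is left out.

```python
# Mark classes taken from the table in this section.
MARK_CLASSES = {
    "ABOVE1": {0x0EB1, 0x0EB4, 0x0EB5, 0x0EB6, 0x0EB7, 0x0EBB, 0x0ECD},
    "ABOVE2": {0x0EC8, 0x0EC9, 0x0ECA, 0x0ECB, 0x0ECC},
    "BELOW1": {0x0EBC},
    "BELOW2": {0x0EB8, 0x0EB9},
}
DOTTED_CIRCLE = "\u25CC"


def mark_class(ch):
    """Return the mark class name of ch, or None if ch is not a mark."""
    for name, code_points in MARK_CLASSES.items():
        if ord(ch) in code_points:
            return name
    return None


def insert_dotted_circles(text):
    """Return a display string with a dotted circle under invalid marks.

    The input string (the backing store) is not modified.
    """
    out = []
    seen = None  # classes already used on the current base; None = no base
    for ch in text:
        cls = mark_class(ch)
        if cls is None:
            # Assumption: every non-mark character is a valid base.
            out.append(ch)
            seen = set()
            continue
        if seen is None or cls in seen:
            # No base, or a second mark of the same class: start a new
            # cluster on a dotted circle.
            out.append(DOTTED_CIRCLE)
            seen = set()
        out.append(ch)
        seen.add(cls)
    return "".join(out)


# Consonant KO followed by two tone marks of the same class (ABOVE2).
shown = insert_dotted_circles("\u0E81\u0EC8\u0EC9")
print([f"U+{ord(c):04X}" for c in shown])
```

Because the function returns a new string, the dotted circle exists only in the output used for display, which matches the note that it is not inserted into the application's backing store.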
3. Lao Font features:
– Shape characteristics of Lao characters
The shapes of Lao characters can be classified into 6 groups:

Lao Character Glyph at Syllable Structure

– Kerning
The 'kern' feature is used to adjust the amount of space between glyphs,
generally to provide optically consistent spacing between glyphs. Although a
well-designed typeface has consistent inter-glyph spacing overall, some glyph
combinations require adjustment for improved legibility. Besides standard
adjustment in either the horizontal or vertical direction, this feature can
supply size-dependent kerning data via device tables, "cross-stream" kerning in
the Y text direction, and adjustment of glyph placement independent of the
advance adjustment. Note that this feature would not be used in mono-spaced fonts.
The font stores a set of adjustments for pairs of glyphs. These may be stored as
one or more tables matching left and right classes, and/or as individual pairs.
If both forms are used, the classes should be listed last, so as to provide a
means to replace any non-ideal values that may result from the class tables.
Additional adjustments may be provided for larger sets of glyphs (e.g.,
triplets, quadruplets, etc.) to overwrite the results of pair kerns in
particular combinations. These should precede the pairs.
Example: Kerning by pair adjustment using Microsoft VOLT
![]()
Before Kerning
![]()
After Kerning
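The lookup order for pair kerning (individual pairs first, class tables last) can be sketched as follows. All glyph names, class names, and kerning values here are hypothetical and chosen only for illustration; they are not taken from any Lao font or from VOLT.

```python
# Hypothetical kerning data in font units.
PAIR_KERNS = {("ko", "aa"): -40}                 # individual pairs
LEFT_CLASSES = {"ko": "round", "kho": "round"}   # glyph -> left class
RIGHT_CLASSES = {"aa": "hook"}                   # glyph -> right class
CLASS_KERNS = {("round", "hook"): -20}           # class pair adjustments


def kern_value(left, right):
    """Return the kerning adjustment for a glyph pair.

    Individual pairs are checked first so they can replace a
    non-ideal value that the class tables would otherwise give.
    """
    if (left, right) in PAIR_KERNS:
        return PAIR_KERNS[(left, right)]
    key = (LEFT_CLASSES.get(left), RIGHT_CLASSES.get(right))
    return CLASS_KERNS.get(key, 0)


def advances(glyphs, default_advance=600):
    """Return the advance of each glyph after kerning with its successor."""
    result = []
    for i, glyph in enumerate(glyphs):
        adjust = kern_value(glyph, glyphs[i + 1]) if i + 1 < len(glyphs) else 0
        result.append(default_advance + adjust)
    return result


print(kern_value("ko", "aa"))   # individual pair wins over the class value
print(kern_value("kho", "aa"))  # falls back to the class table
print(advances(["kho", "aa", "ko"]))
```

Checking the individual pairs before the class tables is what lets a single exceptional pair carry its own value while every other member of the class shares one adjustment.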
– Mark to base positioning
The 'mark' feature positions mark glyphs in relation to a base glyph or a
ligature glyph. This feature may be implemented as a MarkToBase or a
MarkToLigature.
Example: Positioning mark to base using Microsoft VOLT
Before:
After:
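Mark-to-base positioning works by aligning an anchor point on the mark with an anchor point on the base. The arithmetic can be sketched as follows; the function name and all coordinates are hypothetical values in font units, chosen only to illustrate the calculation.

```python
def position_mark(base_origin, base_anchor, mark_anchor):
    """Return the origin at which to draw a mark glyph.

    base_origin: (x, y) where the base glyph is drawn.
    base_anchor: anchor point on the base, relative to the base origin.
    mark_anchor: anchor point on the mark, relative to the mark origin.
    The mark is placed so that the two anchors coincide.
    """
    x = base_origin[0] + base_anchor[0] - mark_anchor[0]
    y = base_origin[1] + base_anchor[1] - mark_anchor[1]
    return (x, y)


# A base drawn at the origin with its anchor at (300, 500), and a mark
# whose own anchor is at (100, 0).
print(position_mark((0, 0), (300, 500), (100, 0)))
```

Because each base glyph carries its own anchor, the same mark glyph can be reused over consonants of different widths and heights.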
– Mark to mark positioning
The 'mkmk' feature positions mark glyphs in relation to another mark glyph. This
feature may be implemented as a MarkToMark.
Example: Positioning mark to mark using Microsoft VOLT
Before:
After:
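Mark-to-mark positioning can be pictured as a chain: the first mark attaches to the base, and each following mark attaches to an anchor on the mark below it. The sketch below illustrates that chain; the function name, the two-anchor representation of a mark, and all coordinates are assumptions made for the example, not the OpenType data layout.

```python
def stack_marks(base_anchor, marks):
    """Return the drawing origin of each mark in a stack.

    base_anchor: (x, y) anchor on the base glyph.
    marks: list of (attach_anchor, top_anchor) pairs, innermost mark first.
        attach_anchor is the point on the mark that meets the glyph below;
        top_anchor is where the next mark attaches. Both are relative to
        the mark's own origin.
    """
    positions = []
    target = base_anchor
    for attach, top in marks:
        origin = (target[0] - attach[0], target[1] - attach[1])
        positions.append(origin)
        # The next mark attaches to this mark's top anchor.
        target = (origin[0] + top[0], origin[1] + top[1])
    return positions


# A level 2 vowel mark and a level 1 tone mark above a base whose
# anchor is at (300, 500); the vowel mark is 200 units tall.
print(stack_marks((300, 500), [((0, 0), (0, 200)), ((0, 0), (0, 150))]))
```

In this example the tone mark lands 200 units above the vowel mark rather than colliding with it, which is the effect the 'mkmk' feature is meant to achieve.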
Reference:
The glyph characteristics of each Lao character:


