
3. How It Works

Ok, enough of the background. Let's see how it works in practice. The basic idea is to describe data elements as a three-piece combination: tag/length/value, sometimes referred to as TLV. You basically have:
\begin{itemlist}
\item a tag, which describes what type of data this is,
\item a length, which describes how long the value is, and
\item a value, which is basically an octet (or byte) sequence
that describes the value.
\end{itemlist}

The basic unit of information is the octet (or byte): 8 bits of information. You can see how 8 bits might be a little small to describe the length of large strings - more on that later. One thing that must be noted is that TLVs can be nested: the value part of a TLV tuple can itself contain TLVs.
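To make the tag/length/value idea concrete, here is a minimal sketch in C of reading one TLV out of a byte buffer. The names `struct tlv` and `tlv_read` are made up for illustration and do not come from Motif or LessTif, and the sketch assumes the simplest case: a one-octet tag followed by a one-octet length below 0x80.

```c
#include <stddef.h>

/* One decoded tag/length/value triple. The value pointer aliases the input. */
struct tlv {
    unsigned char tag;
    size_t length;
    const unsigned char *value;
};

/*
 * Read one TLV from buf (n bytes available), assuming a one-octet tag
 * and a one-octet length below 0x80 (an assumption of this sketch).
 * Returns the number of bytes consumed, or 0 if the buffer is too
 * short or the length octet is not that simple.
 */
size_t tlv_read(const unsigned char *buf, size_t n, struct tlv *out)
{
    if (n < 2 || buf[1] >= 0x80 || (size_t)buf[1] > n - 2)
        return 0;
    out->tag = buf[0];
    out->length = buf[1];
    out->value = buf + 2;
    return 2 + out->length;
}
```

Nesting falls out naturally: if a value is itself made of TLVs, you call `tlv_read` again on `value` with `length` as the number of bytes available.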

I'm going to skip a full description of BER and just report the basics of how they relate to XmStrings. Let's take a trivial example:


    xmstr = XmStringCreateLtoR("Hello\nWorld", XmFONTLIST_DEFAULT_TAG);

The first thing to notice is the XmFONTLIST_DEFAULT_TAG. That's a clue to M*TIF that the string passed in is represented in the current locale (I'm not even going to try to talk about NLS - look elsewhere for what locale means). The second thing to notice is that we used XmStringCreateLtoR, which means the function should be aware of separators (normally, this means ``look for newlines''). So M*TIF would parse that as ``Hello'' (locale text), ``\n'' (separator in this locale), and ``World'' (locale text).
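The splitting step can be sketched as follows. This is not how XmStringCreateLtoR is actually implemented (I haven't seen that code); it only illustrates the idea of cutting the input at newlines. The names `split_ltor`, `struct component`, `COMP_SEPARATOR` and `COMP_LOCALE_TEXT` are hypothetical, the two tag values are the ones Motif assigns to separator and locale text components, and skipping empty text runs is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the values match Motif's component identifiers. */
enum { COMP_SEPARATOR = 0x04, COMP_LOCALE_TEXT = 0x05 };

struct component {
    int tag;
    const char *text;   /* points into the input; not NUL-terminated */
    size_t length;
};

/*
 * Split s at newlines into text and separator components.
 * Writes at most max entries to out and returns how many the string needs.
 */
size_t split_ltor(const char *s, struct component *out, size_t max)
{
    size_t count = 0;

    while (*s != '\0') {
        size_t run = strcspn(s, "\n");   /* length of the next text run */
        if (run > 0) {
            if (count < max) {
                out[count].tag = COMP_LOCALE_TEXT;
                out[count].text = s;
                out[count].length = run;
            }
            count++;
            s += run;
        }
        if (*s == '\n') {                /* a newline becomes a separator */
            if (count < max) {
                out[count].tag = COMP_SEPARATOR;
                out[count].text = NULL;
                out[count].length = 0;
            }
            count++;
            s++;
        }
    }
    return count;
}
```

Calling `split_ltor("Hello\nWorld", parts, 4)` fills in three components: locale text, separator, locale text.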


Table: Component identifiers for XmStrings.

    Identifier                        Value
    XmSTRING_COMPONENT_UNKNOWN        0x00
    XmSTRING_COMPONENT_CHARSET        0x01
    XmSTRING_COMPONENT_TEXT           0x02
    XmSTRING_COMPONENT_DIRECTION      0x03
    XmSTRING_COMPONENT_SEPARATOR      0x04
    XmSTRING_COMPONENT_LOCALE_TEXT    0x05


Let's look at what Motif does tell us about encodings - each XmString component has a different identifier (see the table above). Hmm, these could be the tag part of the TLVs! Given that, the XmString that M*TIF 1.2 generates is the following (in hex and chars, with the 0x prefix removed from the hex):


   DF 80 06 10 05 05 'H' 'e' 'l' 'l' 'o' 04 00 05 05 'W' 'o' 'r' 'l' 'd'
which makes absolutely no sense when you look at it that way. Try this:


    0xDF 0x80               this is a M*TIF string (essentially)
      0x06 0x10             which contains a 16 byte XmString
        0x05 0x05           which contains 5 bytes of locale text
          ``Hello''         which has the value ``Hello''
        0x04 0x00           and a separator
          - nothing -       which has no data (never does)
        0x05 0x05           and 5 more bytes of locale text
          ``World''         which has the value ``World''


The first number (on lines that have them) is the tag; the second number is the length. You can see that this description shows how TLVs can be nested. Look at it this way; if I just describe the string above structurally, it comes out as (using parentheses as an indicator of nesting): TLV=(TLV=(TLV,TLV,TLV)).

The first tag value, 0xDF, identifies every XmString. While this value seems arbitrary at first glance, it makes some sense. The tag value can be decomposed into three separate fields as shown below.

[Figure: the three fields of the tag octet (TAG_val.eps)]
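In code, pulling the three fields out of a tag octet might look like this. It's a small sketch that assumes the layout this section uses: the class in bits 7 and 6, the ``F'' flag in bit 5, and a number in bits 4 to 0. The function names are invented for the example.

```c
/* Field extraction for a one-octet tag. The function names are hypothetical. */
unsigned tag_class(unsigned char tag)  { return (tag >> 6) & 0x03; } /* bits 7-6 */
unsigned tag_flag(unsigned char tag)   { return (tag >> 5) & 0x01; } /* bit 5, "F" */
unsigned tag_number(unsigned char tag) { return tag & 0x1F; }        /* bits 4-0 */
```

For 0xDF (binary 1101 1111) this gives a class of 3 (both class bits set, which is the private class), an F flag of 0, and a number of 31.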

The most significant bits, 7 and 6, indicate that this is a private tag class, so bits 4 to 0 are just set to an arbitrary value. The ``F'' flag (bit 5) indicates that this is a simple tag encoding and not a composed one.

Ok, after this you're scratching your head once again. Where does the next value, 0x80 (the first length), fit in? Remember how I said that 8 bits was a little small for describing lengths? Well, that's where BER kicks in. There are really three ways of describing lengths: short form, long form, and indeterminate form (BER itself calls the last one the indefinite form). As far as I know, Motif cheats horribly on this (more on this below). Here's how you describe lengths in BER:


\begin{itemlist}
\item If the length $<$\ \code{0x80}, then length is contained ...
...
[Figure: BER_len3.eps]
\end{itemlist}
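For reference, the three length forms can be sketched as a small decoder. This follows the standard BER rules as I understand them, not anything taken from Motif or LessTif: a first octet below 0x80 is the length itself (short form), a first octet of exactly 0x80 means no length is given (indefinite form), and otherwise the low seven bits say how many length octets follow, most significant first (long form). The names `ber_read_length` and `BER_INDEFINITE` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical marker for "no length given". */
#define BER_INDEFINITE ((size_t)-1)

/*
 * Decode a BER length starting at buf (n bytes available).
 * Stores the length, or BER_INDEFINITE, in *len and returns the number
 * of octets the length field occupies, or 0 on malformed or truncated input.
 */
size_t ber_read_length(const unsigned char *buf, size_t n, size_t *len)
{
    size_t count, i, value = 0;

    if (n == 0)
        return 0;
    if (buf[0] < 0x80) {            /* short form: the octet is the length */
        *len = buf[0];
        return 1;
    }
    if (buf[0] == 0x80) {           /* indefinite form: no length given */
        *len = BER_INDEFINITE;
        return 1;
    }
    count = buf[0] & 0x7F;          /* long form: count of length octets */
    if (count >= sizeof(size_t) || count > n - 1)
        return 0;                   /* too big for this sketch, or truncated */
    for (i = 0; i < count; i++)     /* most significant octet first */
        value = (value << 8) | buf[1 + i];
    *len = value;
    return 1 + count;
}
```

So the single octet 0x10 decodes to 16, the three octets 0x82 0x01 0x00 decode to 256, and a lone 0x80 comes back as `BER_INDEFINITE`.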

As I said before, M*TIF is really lazy (what else did you expect?!). The first header (0xDF 0x80) should imply that an XmString parser should look for a tag and length that are both 0. In practice, Motif strings contain only one element in the value: the XmString. I've parsed strings in M*TIF looking for the (0x00 0x00) tag/length, and run off into space. Therefore, LESSTIF stops after finding the first XmString component. In effect, a length of 0x80 in M*TIF means ``I don't know how long my value is, but my value is really a TLV, and there's only one of them''.
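The ``stop after the first XmString component'' behaviour described above can be sketched like this. It is not the LESSTIF source, just an illustration under a few assumptions: the header is exactly 0xDF 0x80, it is followed by a single component with tag 0x06, and every length fits in one octet below 0x80. The function name is made up.

```c
#include <stddef.h>

/*
 * Count the components of an external XmString: accept the 0xDF 0x80
 * header, read the single 0x06 wrapper, walk the components inside it,
 * and stop there without looking for a 0x00 0x00 terminator.
 * Returns -1 if the data does not fit that shape.
 */
int xmstring_count_components(const unsigned char *buf, size_t n)
{
    size_t pos, end;
    int count = 0;

    if (n < 4 || buf[0] != 0xDF || buf[1] != 0x80 || buf[2] != 0x06)
        return -1;
    if (buf[3] >= 0x80 || (size_t)buf[3] > n - 4)
        return -1;                   /* sketch handles short lengths only */
    pos = 4;
    end = 4 + buf[3];                /* the wrapper's length bounds the walk */
    while (pos < end) {
        if (end - pos < 2 || buf[pos + 1] >= 0x80 ||
            (size_t)buf[pos + 1] > end - pos - 2)
            return -1;
        pos += 2 + buf[pos + 1];     /* skip tag, length, and value */
        count++;
    }
    return count;
}
```

The point of the design is that the length of the 0x06 wrapper, not an end marker, decides where parsing stops. Fed the 20 bytes of the ``Hello\nWorld'' example, the function counts three components.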

Let's look at our example string again, in light of this information:

    0xDF 0x80               XmSTRING_TAG, XmSTRING_LENGTH
      0x06 0x10             XmSTRING_COMPONENT_XMSTRING, 16 bytes
        0x05 0x05           XmSTRING_COMPONENT_LOCALE_TEXT, 5 bytes
          ``Hello''         ``Hello''
        0x04 0x00           XmSTRING_COMPONENT_SEPARATOR, 0 bytes
          - nothing -
        0x05 0x05           XmSTRING_COMPONENT_LOCALE_TEXT, 5 bytes
          ``World''         ``World''


That should make more sense now. Note that the tags 6-125 are said to be reserved in M*TIF's header files; now you should understand why the value 6 is XmSTRING_COMPONENT_XMSTRING (which doesn't appear in any Motif header). The length XmSTRING_LENGTH is used within LESSTIF as a synonym for the indeterminate length value of 0x80. You'll find its definition in (LESSTIF_ROOT)/libXm/XmString.c.
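Going the other way, here is a rough sketch of an encoder that produces the byte layout shown above from a plain C string. It is an illustration only, not the M*TIF or LESSTIF implementation: the function name is invented, only locale text and separator components are emitted, and every length must fit in one octet below 0x80.

```c
#include <stddef.h>
#include <string.h>

/*
 * Encode s into out (cap bytes): header, 0x06 wrapper, then locale-text
 * (0x05) and separator (0x04) components. Only one-octet lengths are
 * handled. Returns the number of bytes written, or 0 on failure.
 */
size_t xmstring_encode(const char *s, unsigned char *out, size_t cap)
{
    size_t pos = 4;                          /* leave room for the headers */

    if (cap < 4)
        return 0;
    while (*s != '\0') {
        size_t run = strcspn(s, "\n");       /* text up to the next newline */
        if (run > 0) {
            if (run >= 0x80 || cap - pos < 2 + run)
                return 0;
            out[pos++] = 0x05;               /* XmSTRING_COMPONENT_LOCALE_TEXT */
            out[pos++] = (unsigned char)run;
            memcpy(out + pos, s, run);
            pos += run;
            s += run;
        }
        if (*s == '\n') {
            if (cap - pos < 2)
                return 0;
            out[pos++] = 0x04;               /* XmSTRING_COMPONENT_SEPARATOR */
            out[pos++] = 0x00;
            s++;
        }
    }
    if (pos - 4 >= 0x80)
        return 0;                            /* body too long for this sketch */
    out[0] = 0xDF;                           /* XmSTRING_TAG */
    out[1] = 0x80;                           /* XmSTRING_LENGTH */
    out[2] = 0x06;                           /* XmSTRING_COMPONENT_XMSTRING */
    out[3] = (unsigned char)(pos - 4);
    return pos;
}
```

Encoding ``Hello\nWorld'' this way yields the same 20 bytes as the example: DF 80 06 10, then 05 05 and ``Hello'', 04 00, and 05 05 and ``World''.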


Danny Backx
2000-12-13