This interface represents a character as an int rather than as a Java char, because a single Unicode character may be represented as a pair of Java chars (a surrogate pair). If Java chars were used, the dissector would either need special code for handling surrogates, which would greatly complicate it, or run the risk of splitting a character in two. Using an int avoids both problems.
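The difference between Java chars and whole Unicode characters can be seen with the standard String API. The sketch below is only an illustration: the class SurrogateDemo and its methods are hypothetical names and are not part of this interface.

```java
// Illustrative sketch only: SurrogateDemo is a hypothetical class,
// not part of the interface described here.
final class SurrogateDemo {

    /** Counts Unicode characters (code points) rather than Java chars. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Returns the characters of s as ints, one int per Unicode character. */
    static int[] toCodePoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) is stored as the surrogate pair D834 DD1E.
        String s = "a\uD834\uDD1Eb";
        System.out.println(s.length());          // number of Java chars: 4
        System.out.println(countCodePoints(s));  // number of Unicode characters: 3
        System.out.println(Integer.toHexString(toCodePoints(s)[1])); // 1d11e
    }
}
```

Working on the int values means that an index always refers to a whole character, so there is no position that falls between the two halves of a surrogate pair.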
The underlying implementation could be plain Java characters, encoded XML text containing character references, bytes, or something else. The methods that relate to the underlying implementation therefore count in abstract implementation units rather than in a specific unit such as bytes.
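To illustrate why a single fixed unit would not work, the following sketch computes how much space one character occupies under three possible representations. It is an assumption-laden example: UnitCostDemo and its methods are hypothetical and do not reflect how any actual implementation of this interface counts its units.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: UnitCostDemo is a hypothetical class showing that
// the "cost" of one character depends on the underlying representation.
final class UnitCostDemo {

    /** Cost in Java chars (UTF-16 code units): 1, or 2 for a surrogate pair. */
    static int costInJavaChars(int codePoint) {
        return Character.charCount(codePoint);
    }

    /** Cost in bytes if the text were stored as UTF-8 (an assumed encoding). */
    static int costInUtf8Bytes(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Cost in Java chars if stored as a decimal XML character reference. */
    static int costAsCharacterReference(int codePoint) {
        return ("&#" + codePoint + ";").length();
    }

    public static void main(String[] args) {
        int eAcute = 0x00E9; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(costInJavaChars(eAcute));          // 1
        System.out.println(costInUtf8Bytes(eAcute));          // 2
        System.out.println(costAsCharacterReference(eAcute)); // 6, for "&#233;"
    }
}
```

The same character costs one, two, or six units depending on the representation, which is why the counting methods are described in terms of an abstract unit.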
A break point is a location between two adjacent characters at which the string could be broken. A break point is represented as a zero-based integer. The first break point is immediately before the first character and has a value of 0. The break point before the character at index i is represented by i, and the break point after the character at index i is i + 1. The last break point is immediately after the last character and has a value equal to the length of the string.
At a minimum there are always two break points in a string, the first and the last.
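The break point numbering can be sketched as follows. BreakPointDemo and its methods are hypothetical helpers written for this example, and the sketch assumes that indices and the string length are counted in whole Unicode characters, in line with the int-based character representation.

```java
// Illustrative sketch only: BreakPointDemo is a hypothetical class,
// not part of the interface described here.
final class BreakPointDemo {

    /** The break point immediately before the character at the given index. */
    static int breakPointBefore(int index) {
        return index;
    }

    /** The break point immediately after the character at the given index. */
    static int breakPointAfter(int index) {
        return index + 1;
    }

    /** Splits s at a break point, counting in whole Unicode characters. */
    static String[] splitAt(String s, int breakPoint) {
        int[] cps = s.codePoints().toArray();
        if (breakPoint < 0 || breakPoint > cps.length) {
            throw new IllegalArgumentException("break point out of range: " + breakPoint);
        }
        return new String[] {
            new String(cps, 0, breakPoint),
            new String(cps, breakPoint, cps.length - breakPoint)
        };
    }

    public static void main(String[] args) {
        String[] parts = splitAt("abc", breakPointAfter(0));
        System.out.println(parts[0]); // a
        System.out.println(parts[1]); // bc
    }
}
```

Splitting at break point 0 leaves the first part empty, and splitting at the last break point (the length of the string) leaves the second part empty.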
In WML it is not valid to break in the middle of a variable reference within the body of an element.
In WBXML it is not valid to break in the middle of an extension code (which is used in WMLC for variable references). As a result of an explicit design decision, it is also currently not possible to break in the middle of a string reference.