Examples of com.volantis.mcs.dissection.string.DissectableString

Package com.volantis.mcs.dissection.string

com.volantis.mcs.dissection.string.DissectableString
This interface lets a dissector manipulate a string of characters without knowing the underlying implementation, while still taking into account the cost of the underlying representation of those characters.
This interface represents a character as an int rather than a Java char, because a single Unicode character may be represented as a pair of Java chars (a surrogate pair). If Java chars were used, the dissector would either need special code for handling surrogates, which would greatly complicate it, or risk splitting a character in two. Using an int avoids both problems.
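As an illustration of the surrogate problem, the following sketch uses only standard java.lang methods (String.codePointAt, String.codePointCount and Character.charCount). It is not part of the dissection API; the class name CodePointDemo is made up for this example. It shows a character outside the Basic Multilingual Plane occupying two Java chars but a single int code point.

```java
// Illustration only: standard library calls, not the DissectableString API.
final class CodePointDemo {

    /** Returns the Unicode code points of s, one int per character. */
    static int[] toCodePoints(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            result[n++] = codePoint;
            // Advance by 1 char for BMP characters, 2 for a surrogate pair.
            i += Character.charCount(codePoint);
        }
        return result;
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef) is stored as a surrogate pair.
        String s = "a\uD834\uDD1Eb";
        int[] codePoints = toCodePoints(s);
        System.out.println("Java chars: " + s.length());
        System.out.println("Unicode characters: " + codePoints.length);
        System.out.println("Second character: U+"
                + Integer.toHexString(codePoints[1]).toUpperCase());
    }
}
```

Indexing by code point means that index 1 always refers to the whole clef character, so a caller can never land between the two halves of the pair.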
The underlying implementation could be plain Java characters, encoded XML text containing character references, bytes, or something else. Methods that relate to the underlying implementation therefore count in abstract implementation units rather than in a specific unit such as bytes.
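To make the idea of an implementation-specific cost concrete, here is a minimal sketch of one possible cost model. It assumes the text is serialised as XML, with markup characters written as named entities and non-ASCII characters written as decimal character references. The class XmlTextCost, its methods and that escaping policy are all assumptions made for illustration. They are not taken from the Volantis sources, and the half-open [start, end) range is a choice made for this sketch, not a statement about how getRangeCost behaves.

```java
// Hypothetical cost model: characters serialised as XML text.
final class XmlTextCost {

    /** Cost, in output characters, of writing one code point. */
    static int costOf(int codePoint) {
        switch (codePoint) {
            case '<':
                return 4; // written as &lt;
            case '>':
                return 4; // written as &gt;
            case '&':
                return 5; // written as &amp;
            default:
                if (codePoint < 0x80) {
                    return 1; // plain ASCII is written unchanged
                }
                // Assumed policy: anything else becomes &#N; with N in decimal,
                // which costs 3 characters plus the digits of N.
                return 3 + Integer.toString(codePoint).length();
        }
    }

    /** Total cost of the code points in the half-open range [start, end). */
    static int rangeCost(int[] codePoints, int start, int end) {
        int total = 0;
        for (int i = start; i < end; i++) {
            total += costOf(codePoints[i]);
        }
        return total;
    }
}
```

Under this model the three characters of "a<b" cost 6 units, which is why a dissector that counted characters instead of cost units could overrun its output budget.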
Break Points

A break point is a location between two adjacent characters at which the string could be broken. A break point is represented as a zero-based integer. The first break point is immediately before the first character and has a value of 0. The break point before the character at index i is i; the break point after the character at index i is i + 1. The last break point is immediately after the last character and has a value equal to the length of the string.
At a minimum there are always two break points in a string: the first and the last.
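The numbering can be sketched with a plain java.lang.String. The helper below is hypothetical and does not use the DissectableString API. It treats a break point as an index counted in Unicode characters and converts it to a char offset with String.offsetByCodePoints before splitting.

```java
// Illustration only: break points on a plain String, counted in code points.
final class BreakPointDemo {

    /**
     * Splits s at the given break point. Break point 0 is before the first
     * character; break point n (the length in characters) is after the last.
     */
    static String[] splitAt(String s, int breakPoint) {
        // Convert the character index into an offset in Java chars.
        int offset = s.offsetByCodePoints(0, breakPoint);
        return new String[] { s.substring(0, offset), s.substring(offset) };
    }

    public static void main(String[] args) {
        String s = "abc";
        int length = s.codePointCount(0, s.length());
        // Break points run from 0 to length inclusive.
        for (int bp = 0; bp <= length; bp++) {
            String[] parts = splitAt(s, bp);
            System.out.println(bp + ": [" + parts[0] + "] [" + parts[1] + "]");
        }
    }
}
```

Break point 0 yields an empty first part and break point 3 yields an empty second part, matching the first and last break points described above.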
WML

In WML it is not valid to break in the middle of a variable reference within the body of an element.
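A rough sketch of how such a restriction could be computed is shown below. It assumes that only the parenthesised WML variable form, $(name), needs protecting, that $$ is an escaped dollar sign, and that the text contains no surrogate pairs, so char indexes and character indexes coincide. The class WmlBreakPoints and its method are hypothetical and are not taken from the Volantis sources.

```java
import java.util.Arrays;

// Hypothetical sketch: which break points avoid splitting a $(name) reference.
final class WmlBreakPoints {

    /** Returns one flag per break point; false means "inside a reference". */
    static boolean[] validBreakPoints(String text) {
        boolean[] valid = new boolean[text.length() + 1];
        Arrays.fill(valid, true);
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("$$", i)) {
                i += 2; // escaped dollar sign, not a variable reference
            } else if (text.startsWith("$(", i)) {
                int close = text.indexOf(')', i);
                if (close < 0) {
                    break; // unterminated reference: leave the rest untouched
                }
                // The reference occupies chars i..close, so the break points
                // strictly inside it are i + 1 through close.
                for (int bp = i + 1; bp <= close; bp++) {
                    valid[bp] = false;
                }
                i = close + 1;
            } else {
                i++;
            }
        }
        return valid;
    }
}
```

For the text "Hi $(name)!" the break points immediately before the $ and immediately after the ) remain usable, while everything between them is ruled out.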
WBXML

In WBXML it is not valid to break in the middle of an extension code (which is used in WMLC for variable references). Due to an explicit design decision it is not possible (at the moment) to break in the middle of a string reference.

    protected DissectableString createDissectableString(String string) 
            throws Exception {
        WBSAXString wbsaxString = strings.create(string);
        DissectableString dstring = new DissectableWBSAXString(wbsaxString);
        return dstring;
    }

    protected void checkGetRangeCost(String string, int start, 
            int[] expectedCosts) throws Exception {
        DissectableString dstring = checkCharacters(string);

        // check the cost for each character index
        int[] actualCosts = new int[expectedCosts.length];
        for (int i = 0; i < expectedCosts.length; i++) {
            actualCosts[i] = dstring.getRangeCost(start, start + i);
        }
        assertEquals(expectedCosts, actualCosts);
        // ensure the total matches if this is the entire string.
        if (start == 0) {
            assertEquals(expectedCosts[expectedCosts.length-1], 
                    dstring.getCost());
        }
    }

    protected void checkGetCharacterIndex(String string, int start, 
            int[] expectedIndexes) throws Exception {
        DissectableString dstring = checkCharacters(string);

        // check the character indexes for each cost
        int[] actualIndexes = new int[expectedIndexes.length];
        for (int i = 0; i < expectedIndexes.length; i++) {
            actualIndexes[i] = dstring.getCharacterIndex(start, i);
        }
        assertEquals(expectedIndexes, actualIndexes);
    }

    private DissectableString checkCharacters(String string) 
            throws Exception {
        DissectableString dstring = createDissectableString(string);
        char[] chars = string.toCharArray();
        
        assertEquals(string.length(), dstring.getLength());
        
        // Check the chars
        int[] expectedChars = new int[chars.length];
        int[] actualChars = new int[chars.length];
        for (int i = 0; i < string.length(); i++) {
            expectedChars[i] = chars[i];
            actualChars[i] = dstring.charAt(i);
        }
        assertEquals(expectedChars, actualChars);
        return dstring;
    }

    protected DissectableString createDissectableString(String string) 
            throws Exception {
        WBSAXString wbsaxString = strings.create(string);
        DissectableWBSAXValueBuffer buffer = new DissectableWBSAXValueBuffer();
        buffer.append(wbsaxString);
        DissectableString dstring = new DissectableWBDOMCompositeString(buffer);
        return dstring;
    }

            throws DissectionException {


        // Lazily initialise the segmenter from the text / string when the
        // client code tries to dissect the text into the first shard.
        if (segmenter == null) {
            DissectableString string = document.getDissectableString(text);
            segmenter = new StringSegmenter(stringDissector, string);
        }


        // Handle the fact that when we first get called we haven't been
        // placed in a shard at all. Seems ugly that we have to do this...

Related Classes of com.volantis.mcs.dissection.string.DissectableString

com.volantis.mcs.dissection.annotation.TextAnnotation

com.volantis.mcs.dissection.string.DissectableStringTestAbstract

com.volantis.mcs.wbdom.dissection.WBDOMCompositeStringTestCase

com.volantis.mcs.wbdom.dissection.WBDOMStringTestCase
