Examples of com.ibm.icu.text.UnicodeCompressor

com.ibm.icu.text.UnicodeCompressor

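A compression engine implementing the Standard Compression Scheme for Unicode (SCSU), as described in Unicode Technical Report #6 (unicode.org/unicode/reports/tr6).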

SCSU works by using dynamically positioned windows, each consisting of 128 consecutive Unicode characters. During compression, characters within a window are encoded in the compressed stream as the bytes 0x80 - 0xFF. SCSU provides transparency for the characters (bytes) between U+0000 - U+00FF. It approximates the storage size of traditional character sets: for example, 1 byte per character for ASCII or Latin-1 text, and 2 bytes per character for CJK ideographs.
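The window arithmetic described above can be sketched in a few lines. This is an illustration only, not ICU code: the class and method names are hypothetical, the window start U+0400 is chosen purely for the example, and the sketch assumes that in-window characters map to the 128 byte values 0x80 through 0xFF.

```java
/** Illustrative sketch of SCSU-style window arithmetic; not part of ICU. */
final class ScsuWindowSketch {
    /** Number of consecutive characters covered by one window. */
    static final int WINDOW_SIZE = 128;

    /** Returns true if ch lies in the 128-character window starting at windowStart. */
    static boolean inWindow(int ch, int windowStart) {
        return ch >= windowStart && ch < windowStart + WINDOW_SIZE;
    }

    /** Maps a character inside the window to a byte value in 0x80..0xFF. */
    static int encodeInWindow(int ch, int windowStart) {
        if (!inWindow(ch, windowStart)) {
            throw new IllegalArgumentException("character is outside the window");
        }
        return 0x80 + (ch - windowStart);
    }

    /** Maps a byte value in 0x80..0xFF back to the character it stands for. */
    static int decodeFromWindow(int b, int windowStart) {
        if (b < 0x80 || b > 0xFF) {
            throw new IllegalArgumentException("byte is not a window offset");
        }
        return windowStart + (b - 0x80);
    }

    public static void main(String[] args) {
        int windowStart = 0x0400; // hypothetical window position (Cyrillic block)
        int ch = 0x0416;          // CYRILLIC CAPITAL LETTER ZHE
        int b = encodeInWindow(ch, windowStart);
        System.out.printf("U+%04X -> 0x%02X -> U+%04X%n",
                ch, b, decodeFromWindow(b, windowStart));
    }
}
```

As long as consecutive characters stay inside the same window, each one costs a single byte, which is where the "1 byte per character" figure for small alphabets comes from.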

USAGE

The static methods on UnicodeCompressor may be used in a straightforward manner to compress simple strings:

 String s = ... ; // get string from somewhere
 byte [] compressed = UnicodeCompressor.compress(s);

The static methods have a fairly large memory footprint. For finer-grained control over memory usage, UnicodeCompressor offers more powerful APIs allowing iterative compression:

 // Compress an array "chars" of length "len" using a buffer of 512 bytes
 // to the OutputStream "out"

 UnicodeCompressor myCompressor         = new UnicodeCompressor();
 final static int  BUFSIZE              = 512;
 byte []           byteBuffer           = new byte [ BUFSIZE ];
 int               bytesWritten         = 0;
 int []            unicharsRead         = new int [1];
 int               totalCharsCompressed = 0;
 int               totalBytesWritten    = 0;

 do {
     // do the compression
     bytesWritten = myCompressor.compress(chars, totalCharsCompressed,
                                          len, unicharsRead,
                                          byteBuffer, 0, BUFSIZE);

     // do something with the current set of bytes
     out.write(byteBuffer, 0, bytesWritten);

     // update the no. of characters compressed
     totalCharsCompressed += unicharsRead[0];

     // update the no. of bytes written
     totalBytesWritten += bytesWritten;

 } while(totalCharsCompressed < len);

 myCompressor.reset(); // reuse compressor
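The calling convention behind that loop (the return value is the number of bytes written, and a one-element int array reports how many characters were consumed) can be tried without ICU on the classpath. The sketch below is an assumption-laden stand-in: `ChunkedEncodeSketch`, `encode`, and `encodeAll` are hypothetical names, and the stand-in encoder simply writes each char as two big-endian bytes rather than performing SCSU compression. Only the shape of the loop mirrors the example above.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative sketch of a chunked encode loop; the encoder is a stand-in, not SCSU. */
final class ChunkedEncodeSketch {

    /**
     * Hypothetical stand-in encoder: writes each char as two big-endian bytes
     * until either the input or the output buffer is exhausted.
     * Returns the number of bytes written; charsRead[0] receives the number
     * of chars consumed.
     */
    static int encode(char[] in, int start, int limit, int[] charsRead,
                      byte[] out, int outStart, int outLimit) {
        int inPos = start;
        int outPos = outStart;
        while (inPos < limit && outPos + 2 <= outLimit) {
            char c = in[inPos++];
            out[outPos++] = (byte) (c >>> 8);
            out[outPos++] = (byte) c;
        }
        if (charsRead != null) {
            charsRead[0] = inPos - start;
        }
        return outPos - outStart;
    }

    /** Drives the encoder with a small fixed buffer and collects all output. */
    static byte[] encodeAll(char[] chars, int len, int bufSize) {
        byte[] buffer = new byte[bufSize];
        int[] charsRead = new int[1];
        int totalChars = 0;
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        while (totalChars < len) {
            int written = encode(chars, totalChars, len, charsRead, buffer, 0, bufSize);
            if (written == 0 && charsRead[0] == 0) {
                // Guard against an endless loop when the buffer cannot hold one unit.
                throw new IllegalStateException("buffer too small to make progress");
            }
            sink.write(buffer, 0, written);
            totalChars += charsRead[0];
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        char[] chars = "compress me".toCharArray();
        byte[] result = encodeAll(chars, chars.length, 4);
        System.out.println(result.length + " bytes from " + chars.length + " chars");
    }
}
```

The progress guard is a design choice of this sketch, not something the original example includes; it turns a too-small buffer into an immediate error instead of a loop that never terminates.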

@see UnicodeDecompressor
@author Stephen F. Booth
@stable ICU 2.4

        for(int i = 0; i < fTestCases.length; i++) {
            myTest(fTestCases[i].toCharArray(), fTestCases[i].length());
        }
    }
    private void myTest(char[] chars, int len) {
        UnicodeCompressor myCompressor = new UnicodeCompressor();
        UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
        
        // variables for my compressor
        int myByteCount = 0;
        int myCharCount = 0;
        int myCompressedSize = Math.max(512, 3*len);
        byte[] myCompressed = new byte[myCompressedSize];
        int myDecompressedSize = Math.max(2, 2 * len);
        char[] myDecompressed = new char[myDecompressedSize];
        int[] unicharsRead = new int[1];
        int[] bytesRead = new int[1];
        
        myByteCount = myCompressor.compress(chars, 0, len, unicharsRead,
                myCompressed, 0, myCompressedSize);


        myCharCount = myDecompressor.decompress(myCompressed, 0, myByteCount,
                bytesRead, myDecompressed, 0, myDecompressedSize);


        for(int i = 0; i < fTestCases.length; i++) {
            myMultipassTest(fTestCases[i].toCharArray(), fTestCases[i].length());
        }
    }
    private void myMultipassTest(char [] chars, int len) throws Exception {
        UnicodeCompressor myCompressor = new UnicodeCompressor();
        UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
        
        // variables for my compressor
        
        // for looping
        int byteBufferSize = 4;//Math.max(4, len / 4);
        byte[] byteBuffer = new byte [byteBufferSize];
        // real target
        int compressedSize = Math.max(512, 3 * len);
        byte[] compressed = new byte[compressedSize];


        // for looping
        int unicharBufferSize = 2;//byteBufferSize;
        char[] unicharBuffer = new char[unicharBufferSize];
        // real target
        int decompressedSize = Math.max(2, 2 * len);
        char[] decompressed = new char[decompressedSize];


        int bytesWritten = 0;
        int unicharsWritten = 0;


        int[] unicharsRead = new int[1];
        int[] bytesRead = new int[1];
        
        int totalCharsCompressed = 0;
        int totalBytesWritten = 0;


        int totalBytesDecompressed  = 0;
        int totalCharsWritten = 0;


        // not used boolean err = false;




        // perform the compression in a loop
        do {
            
            // do the compression
            bytesWritten = myCompressor.compress(chars, totalCharsCompressed, 
                   len, unicharsRead, byteBuffer, 0, byteBufferSize);


            // copy the current set of bytes into the target buffer
            System.arraycopy(byteBuffer, 0, compressed, 
                   totalBytesWritten, bytesWritten);


Related Classes of com.ibm.icu.text.UnicodeCompressor

com.ibm.icu.dev.test.compression.ExhaustiveTest
