Examples of java.text.BreakIterator

java.text.BreakIterator

    //print each word in order
    BreakIterator boundary = BreakIterator.getWordInstance();
    boundary.setText(stringToExamine);
    printEachForward(boundary, stringToExamine);
    //print each sentence in reverse order
    boundary = BreakIterator.getSentenceInstance(Locale.US);
    boundary.setText(stringToExamine);
    printEachBackward(boundary, stringToExamine);
    printFirst(boundary, stringToExamine);
    printLast(boundary, stringToExamine);
    }
    }

Print each element in order:

    public static void printEachForward(BreakIterator boundary, String source) {
        int start = boundary.first();
        for (int end = boundary.next();
             end != BreakIterator.DONE;
             start = end, end = boundary.next()) {
            System.out.println(source.substring(start,end));
        }
    }
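
A self-contained variant of the forward loop can collect the segments instead of printing them. The sketch below is only an illustration: the class name `ForwardSegments` and the method `sentences` are made up for this example, and each segment is trimmed because a sentence segment may carry the whitespace that follows it.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ForwardSegments {
    // Collects the sentences of 'source' in order, using the same
    // first()/next() loop as printEachForward.
    static List<String> sentences(String source) {
        BreakIterator boundary = BreakIterator.getSentenceInstance(Locale.US);
        boundary.setText(source);
        List<String> result = new ArrayList<>();
        int start = boundary.first();
        for (int end = boundary.next();
             end != BreakIterator.DONE;
             start = end, end = boundary.next()) {
            // Trimming is a choice of this sketch, not part of the original loop.
            result.add(source.substring(start, end).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : sentences("Hello world. How are you? Fine.")) {
            System.out.println(s);
        }
    }
}
```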

Print each element in reverse order:

    public static void printEachBackward(BreakIterator boundary, String source) {
        int end = boundary.last();
        for (int start = boundary.previous();
             start != BreakIterator.DONE;
             end = start, start = boundary.previous()) {
            System.out.println(source.substring(start,end));
        }
    }

Print first element:

    public static void printFirst(BreakIterator boundary, String source) {
        int start = boundary.first();
        int end = boundary.next();
        System.out.println(source.substring(start,end));
    }

Print last element:

    public static void printLast(BreakIterator boundary, String source) {
        int end = boundary.last();
        int start = boundary.previous();
        System.out.println(source.substring(start,end));
    }
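
printFirst and printLast assume the text contains at least one segment. For an empty string, next() after first() (and previous() after last()) returns BreakIterator.DONE, and passing that value to substring throws. A guarded sketch might look like the following; the names `FirstLastSegments`, `firstSentence` and `lastSentence` are hypothetical, and returning an Optional is just one possible design.

```java
import java.text.BreakIterator;
import java.util.Locale;
import java.util.Optional;

final class FirstLastSegments {
    // Returns the first sentence, or an empty Optional when there is none.
    static Optional<String> firstSentence(String source) {
        BreakIterator boundary = BreakIterator.getSentenceInstance(Locale.US);
        boundary.setText(source);
        int start = boundary.first();
        int end = boundary.next();
        // next() yields DONE when no boundary follows the first one.
        if (end == BreakIterator.DONE) {
            return Optional.empty();
        }
        return Optional.of(source.substring(start, end));
    }

    // Returns the last sentence, or an empty Optional when there is none.
    static Optional<String> lastSentence(String source) {
        BreakIterator boundary = BreakIterator.getSentenceInstance(Locale.US);
        boundary.setText(source);
        int end = boundary.last();
        int start = boundary.previous();
        // previous() yields DONE when no boundary precedes the last one.
        if (start == BreakIterator.DONE) {
            return Optional.empty();
        }
        return Optional.of(source.substring(start, end));
    }
}
```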

Print the element at a specified position:

    public static void printAt(BreakIterator boundary, int pos, String source) {
        int end = boundary.following(pos);
        int start = boundary.previous();
        System.out.println(source.substring(start,end));
    }
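
printAt depends on following(pos), which returns BreakIterator.DONE when pos is the last boundary and throws IllegalArgumentException when pos lies outside the text. The sketch below adds a range check before calling it; `SegmentAt` and `wordAt` are illustrative names, and the word iterator is chosen only to make the example concrete.

```java
import java.text.BreakIterator;
import java.util.Optional;

final class SegmentAt {
    // Returns the word-iterator segment that covers character offset 'pos',
    // or an empty Optional when 'pos' is not inside the text.
    static Optional<String> wordAt(String source, int pos) {
        if (pos < 0 || pos >= source.length()) {
            return Optional.empty();
        }
        BreakIterator boundary = BreakIterator.getWordInstance();
        boundary.setText(source);
        int end = boundary.following(pos);   // first boundary after pos
        if (end == BreakIterator.DONE) {
            return Optional.empty();
        }
        int start = boundary.previous();     // boundary at or before pos
        return Optional.of(source.substring(start, end));
    }

    public static void main(String[] args) {
        System.out.println(wordAt("Hello, world!", 8).orElse("(none)"));
    }
}
```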

Find the next word:

    public static int nextWordStartAfter(int pos, String text) {
        BreakIterator wb = BreakIterator.getWordInstance();
        wb.setText(text);
        int last = wb.following(pos);
        int current = wb.next();
        while (current != BreakIterator.DONE) {
            for (int p = last; p < current; p++) {
                if (Character.isLetter(text.codePointAt(p)))
                    return last;
            }
            last = current;
            current = wb.next();
        }
        return BreakIterator.DONE;
    }

(The iterator returned by BreakIterator.getWordInstance() is unique in that the break positions it returns don't represent both the start and end of the thing being iterated over. That is, a sentence-break iterator returns breaks that each represent the end of one sentence and the beginning of the next. With the word-break iterator, the characters between two boundaries might be a word, or they might be the punctuation or whitespace between two words. The above code uses a simple heuristic to determine which boundary is the beginning of a word: If the characters between this boundary and the next boundary include at least one letter (this can be an alphabetical letter, a CJK ideograph, a Hangul syllable, a Kana character, etc.), then the text between this boundary and the next is a word; otherwise, it's the material between words.)
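
To make the distinction concrete, the following sketch walks every segment a word-break iterator produces and labels it with the same "contains at least one letter" heuristic. The class `WordSegmentLabels`, the method `classify` and the `word:`/`gap:` prefixes are invented for this illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class WordSegmentLabels {
    // Labels each segment as "word:" when it contains a letter,
    // otherwise as "gap:" (punctuation, whitespace, digits, ...).
    static List<String> classify(String text) {
        BreakIterator wb = BreakIterator.getWordInstance();
        wb.setText(text);
        List<String> labels = new ArrayList<>();
        int start = wb.first();
        for (int end = wb.next();
             end != BreakIterator.DONE;
             start = end, end = wb.next()) {
            String segment = text.substring(start, end);
            boolean hasLetter = segment.codePoints().anyMatch(Character::isLetter);
            labels.add((hasLetter ? "word:" : "gap:") + segment);
        }
        return labels;
    }

    public static void main(String[] args) {
        for (String label : classify("Hi, you")) {
            System.out.println(label);
        }
    }
}
```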

@see CharacterIterator

    {
      return;
    }

    final CharacterEntityParser entityParser = HtmlCharacterEntities.getEntityParser();
    final BreakIterator instance = BreakIterator.getLineInstance();
    instance.setText(text);

    int start = instance.first();
    int end = instance.next();

    boolean flagStart = true;
    while (end != BreakIterator.DONE)
    {
      final String readLine = text.substring(start, end);
      start = end;
      end = instance.next();

      if (flagStart)
      {
        flagStart = false;
      }
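
The excerpt above iterates over the chunks returned by a line-break iterator, each of which ends at a position where a line may legally be broken. As a rough sketch of what such chunks are good for, the code below does a greedy word wrap. It is not taken from the excerpted project: `LineWrap`, `wrap` and `flush` are hypothetical names, and width is counted in chars as a simplifying assumption.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class LineWrap {
    // Greedy wrap: 'width' is a maximum line length in chars (an assumption
    // of this sketch; real layout code would measure rendered glyph widths).
    static List<String> wrap(String text, int width) {
        BreakIterator lines = BreakIterator.getLineInstance();
        lines.setText(text);
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int start = lines.first();
        for (int end = lines.next();
             end != BreakIterator.DONE;
             start = end, end = lines.next()) {
            // Each chunk ends at a legal line-break position.
            String chunk = text.substring(start, end);
            // Trailing whitespace of the chunk does not count against the width.
            if (current.length() > 0
                    && current.length() + chunk.trim().length() > width) {
                flush(current, out);
            }
            current.append(chunk);
        }
        flush(current, out);
        return out;
    }

    // Moves the pending line into 'out' (trimmed) and resets the buffer.
    private static void flush(StringBuilder current, List<String> out) {
        String line = current.toString().trim();
        if (!line.isEmpty()) {
            out.add(line);
        }
        current.setLength(0);
    }
}
```

A chunk longer than the width is left unsplit on a line of its own; splitting inside a chunk would need a character-break iterator.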


     * Regular Expression: 
     *   Before C: [{cased==true}][{wordBoundary!=true}]*
     *   After C: !([{wordBoundary!=true}]*[{cased}])
     */
    private static boolean isFinalCased(String src, int index, Locale locale) {
        BreakIterator wordBoundary = BreakIterator.getWordInstance(locale);
        wordBoundary.setText(src);
        int ch;

        // Look for a preceding 'cased' letter
        for (int i = index; (i >= 0) && !wordBoundary.isBoundary(i);
                i -= Character.charCount(ch)) {

            ch = src.codePointBefore(i);
            if (isCased(ch)) {

                int len = src.length();
                // Check that there is no 'cased' letter after the index
                for (i = index + Character.charCount(src.codePointAt(index));
                        (i < len) && !wordBoundary.isBoundary(i);
                        i += Character.charCount(ch)) {

                    ch = src.codePointAt(i);
                    if (isCased(ch)) {
                        return false;
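
The scanning pattern in this excerpt, stepping one code point at a time until isBoundary reports a word boundary, can be isolated in a much smaller sketch. The code below only finds the start of the segment that contains a given index; it does not reproduce the 'cased' letter check, and `BoundaryScan` and `segmentStart` are names made up for the example.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class BoundaryScan {
    // Walks left from 'index' one code point at a time until a word boundary
    // is reached. 'index' must lie in [0, src.length()].
    static int segmentStart(String src, int index, Locale locale) {
        BreakIterator wordBoundary = BreakIterator.getWordInstance(locale);
        wordBoundary.setText(src);
        int i = index;
        while (i > 0 && !wordBoundary.isBoundary(i)) {
            // Step back by one whole code point (1 or 2 chars).
            i -= Character.charCount(src.codePointBefore(i));
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(segmentStart("hello world", 8, Locale.US));
    }
}
```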


     * regular expression (see the Javadoc for class Pattern). Need to avoid
     * both String.split and regular expressions, in order to compile against
     * JCL Foundation (bug 80053). Also need to do this in an NL-sensitive way.
     * The use of BreakIterator was suggested in bug 90579.
     */
    BreakIterator iter = BreakIterator.getWordInstance();
    iter.setText(text);
    int i = iter.first();
    while (i != java.text.BreakIterator.DONE && i < text.length()) {
      int j = iter.following(i);
      if (j == java.text.BreakIterator.DONE)
        j = text.length();

      /* match the word */
      if (Character.isLetterOrDigit(text.charAt(i))) {
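
The excerpt is cut off before it shows what happens to each matched word. As a hedged sketch of the same idea, splitting text into words without String.split or regular expressions, the code below follows the first()/following() loop and simply collects the matches into a list. `WordSplit` and `splitWords` are hypothetical names, and collecting into a list is an assumption of this example rather than what the original code does.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class WordSplit {
    // Returns the segments whose first char is a letter or digit.
    static List<String> splitWords(String text) {
        BreakIterator iter = BreakIterator.getWordInstance();
        iter.setText(text);
        List<String> words = new ArrayList<>();
        int i = iter.first();
        while (i != BreakIterator.DONE && i < text.length()) {
            int j = iter.following(i);
            if (j == BreakIterator.DONE) {
                j = text.length();
            }
            // Same test as the excerpt: a word starts with a letter or digit.
            if (Character.isLetterOrDigit(text.charAt(i))) {
                words.add(text.substring(i, j));
            }
            i = j;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(splitWords("Hello, world! 42"));
    }
}
```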



            switch (part) {
            case AccessibleText.CHARACTER:
                return TextComponent.this.getText().substring(index, index+1);
            case AccessibleText.WORD:  {
                    String s = TextComponent.this.getText();
                    BreakIterator words = BreakIterator.getWordInstance();
                    words.setText(s);
                    int end = words.following(index);
                    return s.substring(words.previous(), end);
                }
            case AccessibleText.SENTENCE:  {
                    String s = TextComponent.this.getText();
                    BreakIterator sentence = BreakIterator.getSentenceInstance();
                    sentence.setText(s);
                    int end = sentence.following(index);
                    return s.substring(sentence.previous(), end);
                }
            default:
                return null;
            }
        }


       return null;
    }
                return TextComponent.this.getText().substring(index+1, index+2);
            case AccessibleText.WORD:  {
                    String s = TextComponent.this.getText();
                    BreakIterator words = BreakIterator.getWordInstance();
                    words.setText(s);
                    int start = findWordLimit(index, words, NEXT, s);
                    if (start == BreakIterator.DONE || start >= s.length()) {
                        return null;
                    }
                    int end = words.following(start);
                    if (end == BreakIterator.DONE || end >= s.length()) {
                        return null;
                    }
                    return s.substring(start, end);
                }
            case AccessibleText.SENTENCE:  {
                    String s = TextComponent.this.getText();
                    BreakIterator sentence = BreakIterator.getSentenceInstance();
                    sentence.setText(s);
                    int start = sentence.following(index);
                    if (start == BreakIterator.DONE || start >= s.length()) {
                        return null;
                    }
                    int end = sentence.following(start);
                    if (end == BreakIterator.DONE || end >= s.length()) {
                        return null;
                    }
                    return s.substring(start, end);
                }
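
The SENTENCE branch above can be tried in isolation with a small standalone sketch. Note one deliberate difference: the excerpt returns null when end >= s.length(), whereas the sketch accepts an end equal to the text length so that the final sentence can be returned. `SentenceAfter` and `sentenceAfter` are illustrative names, and Optional stands in for the null return.

```java
import java.text.BreakIterator;
import java.util.Locale;
import java.util.Optional;

final class SentenceAfter {
    // Returns the sentence that starts at the first sentence boundary
    // after 'index', or an empty Optional when there is none.
    static Optional<String> sentenceAfter(String s, int index) {
        if (index < 0 || index >= s.length()) {
            return Optional.empty();
        }
        BreakIterator sentence = BreakIterator.getSentenceInstance(Locale.US);
        sentence.setText(s);
        int start = sentence.following(index);
        if (start == BreakIterator.DONE || start >= s.length()) {
            return Optional.empty();
        }
        int end = sentence.following(start);
        if (end == BreakIterator.DONE) {
            return Optional.empty();
        }
        return Optional.of(s.substring(start, end));
    }

    public static void main(String[] args) {
        System.out.println(sentenceAfter("One. Two. Three.", 2).orElse("(none)"));
    }
}
```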


        return null;
    }
                return TextComponent.this.getText().substring(index-1, index);
            case AccessibleText.WORD:  {
                    String s = TextComponent.this.getText();
                    BreakIterator words = BreakIterator.getWordInstance();
                    words.setText(s);
                    int end = findWordLimit(index, words, PREVIOUS, s);
                    if (end == BreakIterator.DONE) {
                        return null;
                    }
                    int start = words.preceding(end);
                    if (start == BreakIterator.DONE) {
                        return null;
                    }
                    return s.substring(start, end);
                }
            case AccessibleText.SENTENCE:  {
                    String s = TextComponent.this.getText();
                    BreakIterator sentence = BreakIterator.getSentenceInstance();
                    sentence.setText(s);
                    int end = sentence.following(index);
                    end = sentence.previous();
                    int start = sentence.previous();
                    if (start == BreakIterator.DONE) {
                        return null;
                    }
                    return s.substring(start, end);
                }


        text.length()
        );
      
      AttributedCharacterIterator lineParagraph = attributedText.getIterator();
      
      BreakIterator breakIterator = 
        isToTruncateAtChar() 
        ? BreakIterator.getCharacterInstance() 
        : BreakIterator.getLineInstance();
      LineBreakMeasurer lineMeasurer = 
        new LineBreakMeasurer(
          lineParagraph,
          breakIterator,
          getFontRenderContext()
          );

      if (renderNextLine(lineMeasurer, lineParagraph, null, new int[]{0}, new TabStop[]{null}, new boolean[]{false}))
      {
        int lastPos = lineMeasurer.getPosition();
        //test if the entire suffix fit
        if (lastPos == linePosition + truncateSuffx.length())
        {
          //subtract the suffix from the offset
          measuredState.textOffset -= truncateSuffx.length();
          measuredState.textSuffix = truncateSuffx;
          done = true;
        }
        else
        {
          linePosition = breakIterator.preceding(linePosition);
          if (linePosition == BreakIterator.DONE)
          {
            //if the text suffix did not fit the line, only the part of it that fits will show

            //truncate the suffix
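
The excerpt chooses between a character iterator and a line iterator depending on whether truncation may happen at any character, then backs up with preceding() until the suffix fits. A much simplified sketch of that idea follows. It counts chars instead of measuring rendered text with LineBreakMeasurer, and the names `Truncate` and `truncate` and all parameters are assumptions made for this illustration, not the excerpted project's API.

```java
import java.text.BreakIterator;

final class Truncate {
    // Shortens 'text' to at most 'maxLength' chars including 'suffix',
    // cutting at a character boundary (atChar) or a line-break boundary.
    static String truncate(String text, int maxLength, String suffix, boolean atChar) {
        if (text.length() <= maxLength) {
            return text;
        }
        int budget = maxLength - suffix.length();
        if (budget <= 0) {
            // Only part of the suffix fits.
            return suffix.substring(0, Math.max(0, maxLength));
        }
        BreakIterator it = atChar
                ? BreakIterator.getCharacterInstance()
                : BreakIterator.getLineInstance();
        it.setText(text);
        // Last boundary at or before the budget.
        int cut = it.isBoundary(budget) ? budget : it.preceding(budget);
        if (cut == BreakIterator.DONE) {
            cut = 0;
        }
        return text.substring(0, cut) + suffix;
    }

    public static void main(String[] args) {
        System.out.println(truncate("The quick brown fox", 12, "...", true));
        System.out.println(truncate("The quick brown fox", 12, "...", false));
    }
}
```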



Related Classes of java.text.BreakIterator

ae.java.awt.TextComponent$AccessibleAWTTextComponent

ariba.util.core.StringUtil

ch.swingfx.text.TextUtil

com.alibaba.antx.config.wizard.text.ConfigWizard

com.alibaba.maven.plugin.docbook.WordBreaker

com.google.gwt.benchmarks.BenchmarkReport

com.google.gwt.junit.benchmarks.BenchmarkReport

com.ibm.icu.dev.demo.impl.DemoTextBox

com.ibm.icu.dev.demo.rbnf.DemoTextField

com.ibm.icu.dev.tool.docs.ICUTaglet$ICUObsoleteTaglet

All source code is the property of its respective owners. Java is a trademark of Sun Microsystems, Inc and owned by ORACLE Inc. Contact coftware#gmail.com.