Users are strongly encouraged to read the user's guide for more information about the collation service before using this class.
To create a RuleBasedCollator for a locale, call the getInstance(Locale) factory method in the base class Collator. Collator.getInstance(Locale) creates a RuleBasedCollator object based on the collation rules defined for the argument locale. If a customized collation ordering or customized attributes are required, use the RuleBasedCollator(String) constructor with the appropriate rules. The customized RuleBasedCollator bases its ordering on the UCA, adjusting the attributes and the order of the characters named in the specified rules accordingly.
RuleBasedCollator provides correct collation orders for most locales supported in ICU. If specific data for a locale is not available, the order eventually falls back to the UCA collation order.
For information about the collation rule syntax and details about customization, please refer to the Collation customization section of the user's guide.
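As a rough illustration of how a tailored collator changes a sort, the sketch below appends one rule to a locale's rules and sorts a short list with and without it. To keep it runnable without the ICU4J jar, it is written against the JDK's java.text.Collator and java.text.RuleBasedCollator; that substitution is an assumption made for illustration only. The same method names (getInstance, getRules, compare) exist on the com.ibm.icu.text classes, but rule syntax and results may differ between the two implementations. The class name TailoredSortSketch and the helpers tailor and sortWith are hypothetical.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Sketch only: uses the JDK collation classes (an assumption), not ICU4J.
final class TailoredSortSketch {

    // Hypothetical helper: builds a collator from a locale's rules plus extra rules.
    static RuleBasedCollator tailor(Locale locale, String extraRules) {
        // The JDK returns a RuleBasedCollator for its built-in locales.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(locale);
        try {
            return new RuleBasedCollator(base.getRules() + extraRules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules: " + extraRules, e);
        }
    }

    // Hypothetical helper: returns a sorted copy of the input, leaving the input untouched.
    static List<String> sortWith(Collator collator, List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(collator); // Collator implements Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("d", "ch", "cz");

        Collator plain = Collator.getInstance(Locale.US);
        // Treat "ch" as a single unit that sorts after "C" and before "d".
        RuleBasedCollator tailored = tailor(Locale.US, "& C < ch, cH, Ch, CH");

        System.out.println("default:  " + sortWith(plain, words));
        System.out.println("tailored: " + sortWith(tailored, words));
    }
}
```

With the tailoring in place, "ch" is compared as one collation unit rather than as "c" followed by "h", so it moves behind other words starting with "c". Sorting a copy keeps the caller's list unchanged, which makes it easy to compare both orderings side by side.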
Note that there are some differences between the Collation rule syntax used in Java and ICU4J:
Modifier '!': In Java, this modifier turns on Thai/Lao vowel-consonant swapping. If this rule is in force when a Thai vowel in the range \U0E40-\U0E44 precedes a Thai consonant in the range \U0E01-\U0E2E, or a Lao vowel in the range \U0EC0-\U0EC4 precedes a Lao consonant in the range \U0E81-\U0EAE, then the vowel is placed after the consonant for collation purposes.
Without the modifier '!', Thai/Lao vowel-consonant swapping is not turned on.
ICU4J's RuleBasedCollator does not support turning off the Thai/Lao vowel-consonant swapping, since the UCA clearly states that it has to be supported to ensure a correct sorting order. If a '!' is encountered, it is ignored.
Examples

Creating Customized RuleBasedCollators:

    String simple = "& a < b < c < d";
    RuleBasedCollator simpleCollator = new RuleBasedCollator(simple);

    String norwegian = "& a , A < b , B < c , C < d , D < e , E "
                     + "< f , F < g , G < h , H < i , I < j , "
                     + "J < k , K < l , L < m , M < n , N < "
                     + "o , O < p , P < q , Q < r , R < s , S < "
                     + "t , T < u , U < v , V < w , W < x , X "
                     + "< y , Y < z , Z < \u00E5 = a\u030A "
                     + ", \u00C5 = A\u030A ; aa , AA < \u00E6 "
                     + ", \u00C6 < \u00F8 , \u00D8";
    RuleBasedCollator norwegianCollator = new RuleBasedCollator(norwegian);

Concatenating rules to combine Collators:

    // Create an en_US Collator object
    RuleBasedCollator en_USCollator = (RuleBasedCollator)
        Collator.getInstance(new Locale("en", "US", ""));
    // Create a da_DK Collator object
    RuleBasedCollator da_DKCollator = (RuleBasedCollator)
        Collator.getInstance(new Locale("da", "DK", ""));
    // Combine the two
    // First, get the collation rules from en_USCollator
    String en_USRules = en_USCollator.getRules();
    // Second, get the collation rules from da_DKCollator
    String da_DKRules = da_DKCollator.getRules();
    RuleBasedCollator newCollator =
        new RuleBasedCollator(en_USRules + da_DKRules);
    // newCollator has the combined rules

Making changes to an existing RuleBasedCollator to create a new Collator object, by appending changes to the existing rule:

    // Create a new Collator object with additional rules
    String addRules = "& C < ch, cH, Ch, CH";
    RuleBasedCollator myCollator =
        new RuleBasedCollator(en_USCollator.getRules() + addRules);
    // myCollator contains the new rules

How to change the order of non-spacing accents:

    // old rule with main accents
    String oldRules = "= \u0301 ; \u0300 ; \u0302 ; \u0308 "
                    + "; \u0327 ; \u0303 ; \u0304 ; \u0305 "
                    + "; \u0306 ; \u0307 ; \u0309 ; \u030A "
                    + "; \u030B ; \u030C ; \u030D ; \u030E "
                    + "; \u030F ; \u0310 ; \u0311 ; \u0312 "
                    + "< a , A ; ae, AE ; \u00e6 , \u00c6 "
                    + "< b , B < c, C < e, E & C < d , D";
    // change the order of accent characters
    String addOn = "& \u0300 ; \u0308 ; \u0302";
    RuleBasedCollator myCollator = new RuleBasedCollator(oldRules + addOn);

Putting in a new primary ordering before the default setting, e.g. sort English characters before or after Japanese characters in the Japanese Collator:

    // get en_US Collator rules
    RuleBasedCollator en_USCollator =
        (RuleBasedCollator) Collator.getInstance(Locale.US);
    // add a few Japanese characters to sort before English characters
    // suppose the last character before the first base letter 'a' in
    // the English collation rule is \u2212
    String jaString = "& \u2212 < \u3041, \u3042 < \u3043, "
                    + "\u3044";
    RuleBasedCollator myJapaneseCollator =
        new RuleBasedCollator(en_USCollator.getRules() + jaString);

This class is not subclassable.
@author Syn Wee Quek
@stable ICU 2.8