// Training excerpts for the flower class (label 1 in the test below).
private static final String LILIUM_WIKI = "Lilium (members of which are true lilies) is a genus of herbaceous flowering plants growing from bulbs, all with large prominent flowers. Lilies are a group of flowering plants which are important in culture and literature in much of the world. Most species are native to the temperate northern hemisphere, though their range extends into the northern subtropics. Many other plants have \"lily\" in their common name but are not related to true lilies.";
private static final String ROSE_WIKI = "A rose is a woody perennial of the genus Rosa, within the family Rosaceae. There are over 100 species. They form a group of plants that can be erect shrubs, climbing or trailing with stems that are often armed with sharp prickles. Flowers vary in size and shape and are usually large and showy, in colours ranging from white through yellows and reds. Most species are native to Asia, with smaller numbers native to Europe, North America, and northwest Africa. Species, cultivars and hybrids are all widely grown for their beauty and often are fragrant. Rose plants range in size from compact, miniature roses, to climbers that can reach 7 meters in height. Different species hybridize easily, and this has been used in the development of the wide range of garden roses.";
@Test
public void testWithSmallWiki() {
    EnglishTokenizer tokenizer = new EnglishTokenizer();
    // Two classes, judging by the labels used below: 0 = database texts, 1 = flower texts.
    KLDClassifier kldClassifier = new KLDClassifier(2);

    // Train each class on two short excerpts.
    kldClassifier.update(0, tokenizer.tokenize(NOSQL_WIKI));
    kldClassifier.update(0, tokenizer.tokenize(MYSQL_WIKI));
    kldClassifier.update(1, tokenizer.tokenize(LILIUM_WIKI));
    kldClassifier.update(1, tokenizer.tokenize(ROSE_WIKI));

    // Excerpts not used for training should land in the class with the matching topic.
    assertEquals(0, (int) kldClassifier.classify(tokenizer.tokenize(DATABASE_WIKI)));
    assertEquals(1, (int) kldClassifier.classify(tokenizer.tokenize(FLOWER_WIKI)));
}