Package us.codecraft.webmagic.pipeline

Examples of us.codecraft.webmagic.pipeline.JsonFilePageModelPipeline


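    // Excerpt from a page model class (GithubRepo, judging by the class passed to OOSpider).
    // With no pattern given, @ExtractByUrl fills the field with the URL of the crawled page.
    @ExtractByUrl
    private String url;

    public static void main(String[] args) {
        // Start at the GitHub explore page, with no delay between requests and up to 3 retries.
        // Extracted GithubRepo objects are written out as JSON files by JsonFilePageModelPipeline.
        OOSpider.create(Site.me().addStartUrl("https://github.com/explore").setSleepTime(0).setRetryTimes(3),
                new JsonFilePageModelPipeline(), GithubRepo.class)
                // Keep the URL queue on disk under /data/webmagic/cache/ and crawl with 15 threads.
                .scheduler(new FileCacheQueueScheduler("/data/webmagic/cache/")).thread(15).run();
    }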


    // Select the BlogStat div, then keep only the text matching a date-time pattern
    // of the form digits-digits-digits digits:digits (for example 2013-08-01 12:30).
    @ExtractBy("//div[@class='BlogStat']/regex('\\d+-\\d+-\\d+\\s+\\d+:\\d+')")
    private Date date;

    public static void main(String[] args) {
        // Results are saved under "/data/webmagic/" in JSON format.
        OOSpider.create(Site.me(), new JsonFilePageModelPipeline("/data/webmagic/"), OschinaBlog.class)
                .addUrl("http://my.oschina.net/flashsword/blog").run();
    }


Copyright © 2018 www.massapicom. All rights reserved.
All source code is the property of its respective owners. Java is a trademark of Sun Microsystems, Inc. and owned by Oracle Inc. Contact coftware#gmail.com.