Package com.scaleunlimited.helpful.operations

Examples of com.scaleunlimited.helpful.operations.ParseModMboxPageFunction


           
    BaseFetcher fetcher = new SimpleHttpFetcher(MAX_THREADS, userAgent);
    FetchPipe fetchPagePipe = new FetchPipe(importPipe, scorer, fetcher, NUM_REDUCERS);

    // Here's the pipe that will output UrlDatum tuples, by extracting URLs from the mod_mbox-generated page.
    Pipe mboxPagePipe = new Each(fetchPagePipe.getContentTailPipe(), new ParseModMboxPageFunction(), Fields.RESULTS);

    // Create a named pipe for the status of the mod_mbox-generated pages.
    Pipe mboxPageStatusPipe = new Pipe(MBOX_PAGE_STATUS_PIPE_NAME, fetchPagePipe.getStatusTailPipe());

    // Set up appropriate FetcherPolicy, where we increase the max content size (since mailbox files


    Pipe parsePipe = new Pipe("mod_mbox page parser", fetchedDatumProvider);
   
    // The fetchedDatumProvider will pass us a stream of FetchedDatum tuples. For each,
    // we want to parse the HTML and extract the actual mbox file URLs, which we'll
    // pass on as UrlDatum tuples.
    parsePipe = new Each(parsePipe, new ParseModMboxPageFunction());
   
    setTails(parsePipe);
  }
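
The comments above describe the parsing step only in words: take the fetched HTML of a mod_mbox-generated page and pull out the mbox file URLs. The sketch below illustrates that idea with the standard library alone. It is not the source of ParseModMboxPageFunction; the class name MboxLinkExtractor, the method extractMboxUrls, and the assumed link shape (href="YYYYMM.mbox/thread") are all hypothetical, and it returns plain strings rather than UrlDatum tuples.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the library shown above.
class MboxLinkExtractor {

    // ASSUMPTION: index pages link to each month as href="YYYYMM.mbox/thread".
    // The real page markup may differ.
    private static final Pattern MBOX_LINK =
            Pattern.compile("href=\"(\\d{6}\\.mbox)/thread\"");

    // Returns the absolute URL of every mbox file linked from the page,
    // in the order the links appear in the HTML.
    static List<String> extractMboxUrls(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        List<String> result = new ArrayList<>();
        Matcher matcher = MBOX_LINK.matcher(html);
        while (matcher.find()) {
            // Resolve the relative file name against the page URL.
            result.add(base.resolve(matcher.group(1)).toString());
        }
        return result;
    }
}
```

In a Cascading flow, logic like this would sit inside the function's operate method, emitting one output tuple per extracted URL instead of collecting them in a list.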


