Package org.apache.nutch.tools

Interface Summary
PruneIndexTool.PruneChecker This interface can be used to implement additional checking on matching documents.

Class Summary
DmozParser Utility that converts DMOZ RDF into a flat file of URLs to be injected.
FreeGenerator This tool generates fetchlists (segments to be fetched) from plain text files containing one URL per line.
FreeGenerator.FG  
PruneIndexTool This tool prunes existing Nutch indexes of unwanted content.
PruneIndexTool.PrintFieldsChecker This checker prints selected field values from each document just before it is deleted.
PruneIndexTool.StoreUrlsChecker This checker stores the URL of each document to be deleted in a text file.

Copyright © 2006 The Apache Software Foundation