Packages that use ParseResult | |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.apache.nutch.parse | |
org.apache.nutch.parse.ext | |
org.apache.nutch.parse.html | An HTML document parsing plugin. |
org.apache.nutch.parse.js | |
org.apache.nutch.parse.ms | Common API for Microsoft © documents parsing. |
org.apache.nutch.parse.msexcel | A Microsoft © Excel document parsing plugin. |
org.apache.nutch.parse.mspowerpoint | A Microsoft © PowerPoint document parsing plugin. |
org.apache.nutch.parse.msword | A Microsoft © Word document parsing plugin. |
org.apache.nutch.parse.oo | |
org.apache.nutch.parse.pdf | A PDF parsing plugin. |
org.apache.nutch.parse.rss | |
org.apache.nutch.parse.swf | |
org.apache.nutch.parse.text | A plain text parsing plugin. |
org.apache.nutch.parse.zip | |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Uses of ParseResult in org.apache.nutch.analysis.lang |
---|
Methods in org.apache.nutch.analysis.lang that return ParseResult | |
---|---|
ParseResult |
HTMLLanguageParser.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document looking at possible indications of content language. |
Methods in org.apache.nutch.analysis.lang with parameters of type ParseResult | |
---|---|
ParseResult |
HTMLLanguageParser.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document looking at possible indications of content language. |
Uses of ParseResult in org.apache.nutch.microformats.reltag |
---|
Methods in org.apache.nutch.microformats.reltag that return ParseResult | |
---|---|
ParseResult |
RelTagParser.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document looking at possible rel-tags |
Methods in org.apache.nutch.microformats.reltag with parameters of type ParseResult | |
---|---|
ParseResult |
RelTagParser.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document looking at possible rel-tags |
Uses of ParseResult in org.apache.nutch.parse |
---|
Methods in org.apache.nutch.parse that return ParseResult | |
---|---|
static ParseResult |
ParseResult.createParseResult(String url,
Parse parse)
Convenience method for obtaining ParseResult from a single
Parse output. |
ParseResult |
HtmlParseFilters.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Run all defined filters. |
ParseResult |
HtmlParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page. |
ParseResult |
ParseStatus.getEmptyParseResult(String url,
org.apache.hadoop.conf.Configuration conf)
A convenience method. |
ParseResult |
Parser.getParse(Content c)
This method parses the given content and returns a map of <key, parse> pairs. |
ParseResult |
ParseUtil.parse(Content content)
Performs a parse by iterating through a List of preferred Parsers
until a successful parse is performed and a Parse object is
returned. |
ParseResult |
ParseUtil.parseByExtensionId(String extId,
Content content)
Parses a Content object using the Parser specified
by the parameter extId, i.e., the Parser's extension ID. |
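
The summary for ParseUtil.parse above describes trying a list of preferred parsers in order until one succeeds. The following is a minimal sketch of that fallback pattern only, not the Nutch implementation: `SimpleParser`, `HTML_ONLY`, `PLAIN_TEXT`, and `parseWithFallback` are hypothetical stand-ins, and a plain `Map` of URL to extracted text is assumed in place of a real ParseResult.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class ParserFallbackSketch {

    /** Hypothetical stand-in for a parser; this is NOT the Nutch Parser interface. */
    interface SimpleParser {
        // Returns a single-entry map of url -> extracted text, or empty on failure.
        Optional<Map<String, String>> parse(String url, String content);
    }

    /** Assumed example parser: succeeds only on content that starts with "<html". */
    static final SimpleParser HTML_ONLY = (url, content) -> {
        if (!content.startsWith("<html")) {
            return Optional.empty();
        }
        // Crude tag stripping, good enough for a sketch.
        String text = content.replaceAll("<[^>]*>", "").trim();
        return Optional.of(Collections.singletonMap(url, text));
    };

    /** Assumed example parser: accepts anything and returns it as trimmed text. */
    static final SimpleParser PLAIN_TEXT = (url, content) ->
            Optional.of(Collections.singletonMap(url, content.trim()));

    /** Tries each parser in preference order and returns the first successful result. */
    static Map<String, String> parseWithFallback(
            List<SimpleParser> preferred, String url, String content) {
        for (SimpleParser parser : preferred) {
            Optional<Map<String, String>> result = parser.parse(url, content);
            if (result.isPresent()) {
                return result.get();
            }
        }
        // No parser succeeded: return an empty result instead of null.
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        List<SimpleParser> preferred = Arrays.asList(HTML_ONLY, PLAIN_TEXT);
        System.out.println(parseWithFallback(
                preferred, "http://example.com/a", "<html><body>Hi</body></html>"));
        System.out.println(parseWithFallback(
                preferred, "http://example.com/b", "plain words"));
    }
}
```

Returning an empty map when every parser fails is a choice made for this sketch; how the real ParseUtil reports failure is not stated on this page.
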
Methods in org.apache.nutch.parse with parameters of type ParseResult | |
---|---|
ParseResult |
HtmlParseFilters.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Run all defined filters. |
ParseResult |
HtmlParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page. |
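
HtmlParseFilters.filter is summarized above as running all defined filters, each of which may add metadata to a parse. As a rough sketch of that chaining idea, assuming each filter receives the previous filter's output: `MetadataFilter`, `ADD_LANGUAGE`, `LIST_KEYS`, and `runAll` are hypothetical names, and a `Map` of metadata stands in for the real ParseResult, Content, HTMLMetaTags, and DocumentFragment arguments.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FilterChainSketch {

    /** Hypothetical stand-in for a parse filter; NOT the Nutch HtmlParseFilter interface. */
    interface MetadataFilter {
        Map<String, String> filter(Map<String, String> metadata);
    }

    /** Assumed example filter: records a default language if none is present. */
    static final MetadataFilter ADD_LANGUAGE = metadata -> {
        Map<String, String> out = new LinkedHashMap<>(metadata);
        out.putIfAbsent("lang", "en");
        return out;
    };

    /** Assumed example filter: records which keys earlier filters produced. */
    static final MetadataFilter LIST_KEYS = metadata -> {
        Map<String, String> out = new LinkedHashMap<>(metadata);
        out.put("keys", String.join(",", metadata.keySet()));
        return out;
    };

    /** Runs every filter in order, feeding each one the previous filter's output. */
    static Map<String, String> runAll(
            List<MetadataFilter> filters, Map<String, String> metadata) {
        // Copy first so the caller's map is never modified.
        Map<String, String> current = new LinkedHashMap<>(metadata);
        for (MetadataFilter filter : filters) {
            current = filter.filter(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "Home");
        System.out.println(runAll(Arrays.asList(ADD_LANGUAGE, LIST_KEYS), metadata));
    }
}
```

Because each filter sees the output of the one before it, the order of the list matters: here `LIST_KEYS` only reports `lang` when it runs after `ADD_LANGUAGE`.
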
Uses of ParseResult in org.apache.nutch.parse.ext |
---|
Methods in org.apache.nutch.parse.ext that return ParseResult | |
---|---|
ParseResult |
ExtParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.html |
---|
Methods in org.apache.nutch.parse.html that return ParseResult | |
---|---|
ParseResult |
HtmlParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.js |
---|
Methods in org.apache.nutch.parse.js that return ParseResult | |
---|---|
ParseResult |
JSParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
ParseResult |
JSParseFilter.getParse(Content c)
|
Methods in org.apache.nutch.parse.js with parameters of type ParseResult | |
---|---|
ParseResult |
JSParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Uses of ParseResult in org.apache.nutch.parse.ms |
---|
Methods in org.apache.nutch.parse.ms that return ParseResult | |
---|---|
protected ParseResult |
MSBaseParser.getParse(MSExtractor extractor,
Content content)
Parses a Content with a specific Microsoft document
extractor. |
Uses of ParseResult in org.apache.nutch.parse.msexcel |
---|
Methods in org.apache.nutch.parse.msexcel that return ParseResult | |
---|---|
ParseResult |
MSExcelParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.mspowerpoint |
---|
Methods in org.apache.nutch.parse.mspowerpoint that return ParseResult | |
---|---|
ParseResult |
MSPowerPointParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.msword |
---|
Methods in org.apache.nutch.parse.msword that return ParseResult | |
---|---|
ParseResult |
MSWordParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.oo |
---|
Methods in org.apache.nutch.parse.oo that return ParseResult | |
---|---|
ParseResult |
OOParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.pdf |
---|
Methods in org.apache.nutch.parse.pdf that return ParseResult | |
---|---|
ParseResult |
PdfParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.rss |
---|
Methods in org.apache.nutch.parse.rss that return ParseResult | |
---|---|
ParseResult |
RSSParser.getParse(Content content)
Implementation method: parses the RSS content and then returns a ParseImpl. |
Uses of ParseResult in org.apache.nutch.parse.swf |
---|
Methods in org.apache.nutch.parse.swf that return ParseResult | |
---|---|
ParseResult |
SWFParser.getParse(Content content)
|
Uses of ParseResult in org.apache.nutch.parse.text |
---|
Methods in org.apache.nutch.parse.text that return ParseResult | |
---|---|
ParseResult |
TextParser.getParse(Content content)
Parses a plain text document. |
Uses of ParseResult in org.apache.nutch.parse.zip |
---|
Methods in org.apache.nutch.parse.zip that return ParseResult | |
---|---|
ParseResult |
ZipParser.getParse(Content content)
|
Uses of ParseResult in org.creativecommons.nutch |
---|
Methods in org.creativecommons.nutch that return ParseResult | |
---|---|
ParseResult |
CCParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page. |
Methods in org.creativecommons.nutch with parameters of type ParseResult | |
---|---|
ParseResult |
CCParseFilter.filter(Content content,
ParseResult parseResult,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page. |