org.apache.nutch.indexer
Class DeleteDuplicates.InputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
      extended by org.apache.nutch.indexer.DeleteDuplicates.InputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
Enclosing class:
DeleteDuplicates

public static class DeleteDuplicates.InputFormat
extends org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>


Nested Class Summary
 class DeleteDuplicates.InputFormat.DDRecordReader
           
 
Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
 
Constructor Summary
DeleteDuplicates.InputFormat()
           
 
Method Summary
 org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)
          Return a record reader for the index contained in the given split.
 org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)
          Return each index as a split.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, isSplitable, listPaths, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize, validateInput
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DeleteDuplicates.InputFormat

public DeleteDuplicates.InputFormat()

Method Detail

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws IOException
Return each index as a split.

Specified by:
getSplits in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
Overrides:
getSplits in class org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
Throws:
IOException
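
As a rough illustration of "each index as a split", the following plain-Java sketch models the idea without Hadoop on the classpath. The names IndexSplitSketch, Split, and splitsFor are hypothetical and are not part of Nutch or Hadoop. That the numSplits hint is ignored is an assumption inferred from the method description, not something this page confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types; not the Nutch or Hadoop API.
final class IndexSplitSketch {

    /** One split covers one whole index. */
    static final class Split {
        final String indexPath;

        Split(String indexPath) {
            this.indexPath = indexPath;
        }
    }

    /**
     * Returns one split per index path, in input order. The numSplits
     * hint is ignored (an assumption drawn from the method description).
     */
    static List<Split> splitsFor(List<String> indexPaths, int numSplits) {
        List<Split> splits = new ArrayList<>();
        for (String path : indexPaths) {
            splits.add(new Split(path));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> indexes = Arrays.asList("indexes/part-00000", "indexes/part-00001");
        // Asking for 10 splits still yields one split per index.
        for (Split s : splitsFor(indexes, 10)) {
            System.out.println(s.indexPath);
        }
    }
}
```

Under this model the number of map tasks follows the number of input indexes rather than the requested split count.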

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                                                                                  org.apache.hadoop.mapred.JobConf job,
                                                                                                                  org.apache.hadoop.mapred.Reporter reporter)
                                                                                                           throws IOException
Return a record reader for the index contained in the given split.

Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
Specified by:
getRecordReader in class org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,DeleteDuplicates.IndexDoc>
Throws:
IOException
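
A reader obtained from this method is consumed through the org.apache.hadoop.mapred.RecordReader contract, in which the caller passes the same key and value objects to next until it returns false. The sketch below imitates that loop with a hypothetical SimpleReader interface and StringBuilder holders so that it runs without Hadoop. It does not reproduce the behaviour of DDRecordReader, and the sample keys and values are made up, since this page does not state what each record holds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the reuse-the-holder reading loop; not Hadoop code.
final class ReaderLoopSketch {

    /** Stand-in for the next(key, value) part of a record reader. */
    interface SimpleReader {
        boolean next(StringBuilder key, StringBuilder value);
    }

    /** Reader over an in-memory list of {key, value} pairs. */
    static SimpleReader over(final List<String[]> records) {
        return new SimpleReader() {
            private int pos = 0;

            @Override
            public boolean next(StringBuilder key, StringBuilder value) {
                if (pos >= records.size()) {
                    return false; // input exhausted
                }
                String[] rec = records.get(pos++);
                // Overwrite the caller's holders instead of allocating new ones.
                key.setLength(0);
                key.append(rec[0]);
                value.setLength(0);
                value.append(rec[1]);
                return true;
            }
        };
    }

    /** Drains a reader, collecting "key=value" strings. */
    static List<String> drain(SimpleReader reader) {
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        List<String> out = new ArrayList<>();
        while (reader.next(key, value)) {
            out.add(key + "=" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"key-0", "doc 0"},
                new String[] {"key-1", "doc 1"});
        for (String line : drain(over(records))) {
            System.out.println(line);
        }
    }
}
```

Because the holders are overwritten on every call, a caller that needs to keep a record must copy it before calling next again, which is why drain builds a new string per record.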


Copyright © 2006 The Apache Software Foundation