public class TokenRegexStage extends TextStage
Modifier and Type | Field and Description |
---|---|
static String | OUT_NAME: The output folder name returned by Stage.outname() |
static String | REGEX_KEY: The key where regexes are stored |
Constructor and Description |
---|
TokenRegexStage(List<String> rstrings, String[] args) |
Modifier and Type | Method and Description |
---|---|
Class<? extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>> | mapper(): By default this method returns the IdentityMapper class. |
String | outname() |
void | setup(org.apache.hadoop.mapreduce.Job job): Add any final adjustments to the job's config |
Inherited methods: combiner, finished, lzoCompress, reducer, setCombinerClass, setMapperClass, setReducerClass, stage
public static final String REGEX_KEY
The key where regexes are stored.
public static final String OUT_NAME
The output folder name returned by Stage.outname().
public TokenRegexStage(List<String> rstrings, String[] args)
rstrings - the list of regexes to match
args - the arguments sent to the tool
public void setup(org.apache.hadoop.mapreduce.Job job)
Description copied from class Stage: Add any final adjustments to the job's config.
setup in class Stage<org.apache.hadoop.mapreduce.lib.input.TextInputFormat,org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>,org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
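The page says only that REGEX_KEY is "the key where regexes are stored" and that setup adjusts the job's config. It does not show how the regex list reaches that config. The sketch below is one plausible shape for that round trip, not the class's actual code. It uses a plain `Map<String, String>` as a stand-in for the Hadoop configuration so that it runs without Hadoop on the classpath. The class name `RegexConfigSketch`, the key value, and the newline separator are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: a Map stands in for the job configuration that the
// real setup(Job) would reach through job.getConfiguration().
class RegexConfigSketch {
    // Assumed key value; the real value of TokenRegexStage.REGEX_KEY is not documented here.
    static final String REGEX_KEY = "example.token.regexes";

    // Assumed separator. A newline is used because commas are common inside
    // regexes; a regex containing a literal newline would still break this.
    static final String SEPARATOR = "\n";

    // Validate each regex up front, then store the whole list under one key.
    static void storeRegexes(Map<String, String> conf, List<String> rstrings) {
        for (String r : rstrings) {
            Pattern.compile(r); // throws PatternSyntaxException on a bad regex
        }
        conf.put(REGEX_KEY, String.join(SEPARATOR, rstrings));
    }

    // Read the list back and compile it, as a mapper might do once at startup.
    static List<Pattern> loadRegexes(Map<String, String> conf) {
        List<Pattern> patterns = new ArrayList<>();
        String stored = conf.get(REGEX_KEY);
        if (stored == null || stored.isEmpty()) {
            return patterns;
        }
        for (String r : stored.split(SEPARATOR)) {
            patterns.add(Pattern.compile(r));
        }
        return patterns;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        storeRegexes(conf, Arrays.asList("\\d+", "[a-z]+,[a-z]+"));
        System.out.println(loadRegexes(conf).size() + " patterns loaded");
    }
}
```

Compiling the patterns once on the reading side, rather than per record, keeps the per-line cost down to matching.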
public Class<? extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>> mapper()
Description copied from class Stage: By default this method returns the IdentityMapper class. This mapper outputs the values handed to it as they are.
mapper in class Stage<org.apache.hadoop.mapreduce.lib.input.TextInputFormat,org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>,org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
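The inherited description covers only the identity case: values pass through unchanged. The mapper's type parameters (LongWritable and Text in, NullWritable and Text out) suggest that the input key is dropped and only text is emitted. The sketch below contrasts that identity behaviour with one guess at what a token-regex mapper could do. The matching rule is an assumption, since this page does not describe it: a line is kept if any whitespace-separated token fully matches any pattern. The names `MapperSketch`, `identityMap`, and `regexMap` are hypothetical, and plain Java lists stand in for Hadoop's record stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch in plain Java; no Hadoop classes are involved.
class MapperSketch {
    // Identity behaviour: every value is emitted exactly as it was handed in.
    static List<String> identityMap(List<String> values) {
        return new ArrayList<>(values);
    }

    // Assumed regex behaviour: emit a line only if at least one of its
    // whitespace-separated tokens fully matches at least one pattern.
    static List<String> regexMap(List<String> values, List<Pattern> patterns) {
        List<String> out = new ArrayList<>();
        for (String line : values) {
            if (anyTokenMatches(line, patterns)) {
                out.add(line);
            }
        }
        return out;
    }

    private static boolean anyTokenMatches(String line, List<Pattern> patterns) {
        for (String token : line.trim().split("\\s+")) {
            for (Pattern p : patterns) {
                if (p.matcher(token).matches()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("error 42 found", "all good");
        List<Pattern> patterns = Arrays.asList(Pattern.compile("\\d+"));
        System.out.println(identityMap(lines));
        System.out.println(regexMap(lines, patterns));
    }
}
```

If the real mapper matches against the whole line rather than individual tokens, `matches()` on each token would be replaced by `find()` on the line.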
public String outname()
outname in class Stage<org.apache.hadoop.mapreduce.lib.input.TextInputFormat,org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>,org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>