Class StreamingStep
- java.lang.Object
-
- com.amazonaws.services.elasticmapreduce.util.StreamingStep
-
public class StreamingStep extends Object
Class that makes it easy to define Hadoop Streaming steps.See also: Hadoop Streaming
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey); AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials); HadoopJarStepConfig config = new StreamingStep() .withInputs("s3://elasticmapreduce/samples/wordcount/input") .withOutput("s3://my-bucket/output/") .withMapper("s3://elasticmapreduce/samples/wordcount/wordSplitter.py") .withReducer("aggregate") .toHadoopJarStepConfig(); StepConfig wordCount = new StepConfig() .withName("Word Count") .withActionOnFailure("TERMINATE_JOB_FLOW") .withHadoopJarStep(config); RunJobFlowRequest request = new RunJobFlowRequest() .withName("Word Count") .withSteps(wordCount) .withLogUri("s3://log-bucket/") .withInstances(new JobFlowInstancesConfig() .withEc2KeyName("keypairt") .withHadoopVersion("0.20") .withInstanceCount(5) .withKeepJobFlowAliveWhenNoSteps(true) .withMasterInstanceType("m1.small") .withSlaveInstanceType("m1.small")); RunJobFlowResult result = emr.runJobFlow(request);
-
-
Constructor Summary
Constructors Constructor Description StreamingStep()Creates a new default StreamingStep.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Map<String,String>getHadoopConfig()Get the Hadoop config overrides (-D values).List<String>getInputs()Get list of step input paths.StringgetMapper()Get the mapper.StringgetOutput()Get output path.StringgetReducer()Get the reducervoidsetHadoopConfig(Map<String,String> hadoopConfig)Set the Hadoop config overrides (-D values).voidsetInputs(Collection<String> inputs)Set the list of step input paths.voidsetMapper(String mapper)Set the mapper.voidsetOutput(String output)Set the output path for this step.voidsetReducer(String reducer)Set the reducerHadoopJarStepConfigtoHadoopJarStepConfig()Creates the final HadoopJarStepConfig once you are done configuring the step.StreamingStepwithHadoopConfig(String key, String value)Add a Hadoop config override (-D value).StreamingStepwithInputs(String... inputs)Add more input paths to this step.StreamingStepwithMapper(String mapper)Set the mapperStreamingStepwithOutput(String output)Set the output path for this step.StreamingStepwithReducer(String reducer)Set the reducer
-
-
-
Method Detail
-
getInputs
public List<String> getInputs()
Get list of step input paths.- Returns:
- List of step inputs
-
setInputs
public void setInputs(Collection<String> inputs)
Set the list of step input paths.- Parameters:
inputs- List of step inputs.
-
withInputs
public StreamingStep withInputs(String... inputs)
Add more input paths to this step.- Parameters:
inputs- A list of inputs to this step.- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getOutput
public String getOutput()
Get output path.- Returns:
- Output path.
-
setOutput
public void setOutput(String output)
Set the output path for this step.- Parameters:
output- Output path.
-
withOutput
public StreamingStep withOutput(String output)
Set the output path for this step.- Parameters:
output- Output path- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getMapper
public String getMapper()
Get the mapper.- Returns:
- Mapper.
-
setMapper
public void setMapper(String mapper)
Set the mapper.- Parameters:
mapper- Mapper
-
withMapper
public StreamingStep withMapper(String mapper)
Set the mapper- Parameters:
mapper- Mapper- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getReducer
public String getReducer()
Get the reducer- Returns:
- Reducer
-
setReducer
public void setReducer(String reducer)
Set the reducer- Parameters:
reducer- Reducer
-
withReducer
public StreamingStep withReducer(String reducer)
Set the reducer- Parameters:
reducer- Reducer- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getHadoopConfig
public Map<String,String> getHadoopConfig()
Get the Hadoop config overrides (-D values).- Returns:
- Hadoop config.
-
setHadoopConfig
public void setHadoopConfig(Map<String,String> hadoopConfig)
Set the Hadoop config overrides (-D values).- Parameters:
hadoopConfig- Hadoop config.
-
withHadoopConfig
public StreamingStep withHadoopConfig(String key, String value)
Add a Hadoop config override (-D value).- Parameters:
key- Hadoop configuration key.value- Configuration value.- Returns:
- A reference to this updated object so that method calls can be chained together.
-
toHadoopJarStepConfig
public HadoopJarStepConfig toHadoopJarStepConfig()
Creates the final HadoopJarStepConfig once you are done configuring the step. You can use this as you would any other HadoopJarStepConfig.- Returns:
- HadoopJarStepConfig representing this streaming step.
-
-