could you provide a 4mc example for flink #36

Open
wangjian2019 opened this issue Dec 22, 2018 · 1 comment

Comments

@wangjian2019

Could you provide a 4mc example for Flink, for the case where Flink reads 4mc files from HDFS?

@carlomedas
Collaborator

I don't have a Flink example ready to go, but I can give you some pointers.
We are going to use 4mc with Flink soon in our data chain, and once we have a decent example we will also add it to the examples folder.

// 1) Create the Hadoop config and point it at your HDFS host
Configuration hadoopConfig = new Configuration();
hadoopConfig.set("fs.defaultFS", "hdfs://yourHdfsHost:8020");
hadoopConfig.set("io.compression.codecs", "...."); // make sure the 4mc codecs are listed here

// 2) Create the job and configure 4mc for your protobuf message class
Job jobConf = Job.getInstance(hadoopConfig);
FourMcEbProtoInputFormat.setInputFormatClass(YOURMSG.YourProtoMessage.class, jobConf);

// 3) Create the input from HDFS (env is your Flink ExecutionEnvironment)
DataSet<Tuple2<LongWritable, ProtobufWritable>> input =
    env.readHadoopFile(new FourMcEbProtoInputFormat(), LongWritable.class, ProtobufWritable.class,
        "hdfs://path_to_your_file.4mc", jobConf);

// 4) Add more inputs with union, or select multiple files at once

// 5) Use the data set
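
For step 4, one possible way to select several files at once is Hadoop's `FileInputFormat.setInputPaths(Job, String)`, which takes a comma-separated list of paths. The sketch below only builds that string with plain Java; the class name `FourMcInputPaths`, the helper `joinFourMcPaths`, and the example paths are made up for illustration, and it assumes your files carry the `.4mc` extension.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of 4mc or Flink.
class FourMcInputPaths {

    // Keeps only the paths ending in ".4mc" and joins them with commas,
    // the format accepted by FileInputFormat.setInputPaths(Job, String).
    static String joinFourMcPaths(List<String> paths) {
        return paths.stream()
                .filter(p -> p.endsWith(".4mc"))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Placeholder paths; replace with your own HDFS locations.
        List<String> candidates = Arrays.asList(
                "hdfs://yourHdfsHost:8020/data/part-0.4mc",
                "hdfs://yourHdfsHost:8020/data/_SUCCESS",
                "hdfs://yourHdfsHost:8020/data/part-1.4mc");
        System.out.println(joinFourMcPaths(candidates));
    }
}
```

The joined string could then be set on `jobConf` before the input is created. Whether that combines cleanly with the Hadoop input API of your Flink version is something to verify, since `readHadoopFile` also adds the path it is given.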
