I. Wrapper (bean) class:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

// Holds fields from both input files; firstName tags the source ("order" or "user").
// Named JoinBean to match the Mapper and Reducer below.
public class JoinBean implements Writable {
    private String orderId;
    private String userId;
    private String name;
    private String age;
    private String userName;
    private String firstName;

    // Hadoop creates the bean reflectively during deserialization, so a no-arg constructor is required.
    public JoinBean() {
    }

    public JoinBean(String orderId, String userId, String name, String age, String userName, String firstName) {
        this.orderId = orderId;
        this.userId = userId;
        this.name = name;
        this.age = age;
        this.userName = userName;
        this.firstName = firstName;
    }

    public String getOrderId() {
        return orderId;
    }

    public void setOrderId(String orderId) {
        this.orderId = orderId;
    }

    public String getUserId() {
        return userId;
    }

    public void setUserId(String userId) {
        this.userId = userId;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getAge() {
        return age;
    }

    public void setAge(String age) {
        this.age = age;
    }

    public String getUserName() {
        return userName;
    }

    public void setUserName(String userName) {
        this.userName = userName;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    @Override
    public String toString() {
        return "orderId='" + orderId + '\'' +
                ", userId='" + userId + '\'' +
                ", name='" + name + '\'' +
                ", age='" + age + '\'' +
                ", userName='" + userName + '\'' +
                ", firstName='" + firstName + '\'';
    }

    // The field order here must match readFields exactly.
    // writeUTF rejects null, which is why the mapper fills unused fields with the string "NULL".
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeUTF(this.orderId);
        dataOutput.writeUTF(this.userId);
        dataOutput.writeUTF(this.name);
        dataOutput.writeUTF(this.age);
        dataOutput.writeUTF(this.userName);
        dataOutput.writeUTF(this.firstName);
    }

    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.orderId = dataInput.readUTF();
        this.userId = dataInput.readUTF();
        this.name = dataInput.readUTF();
        this.age = dataInput.readUTF();
        this.userName = dataInput.readUTF();
        this.firstName = dataInput.readUTF();
    }
}
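The bean is serialized as a plain byte stream with no field names, so write and readFields have to agree on field order. Here is a minimal sketch of that round trip using only java.io, so it runs without Hadoop on the classpath. MiniBean and roundTrip are hypothetical stand-ins for the bean above, trimmed to two fields; they are not part of the original job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical two-field stand-in for the bean; same write/readFields shape as a Writable.
class MiniBean {
    String orderId;
    String userId;

    MiniBean() {
    }

    MiniBean(String orderId, String userId) {
        this.orderId = orderId;
        this.userId = userId;
    }

    // Fields are written in a fixed order ...
    void write(DataOutput out) throws IOException {
        out.writeUTF(orderId);
        out.writeUTF(userId);
    }

    // ... and must be read back in exactly the same order.
    void readFields(DataInput in) throws IOException {
        orderId = in.readUTF();
        userId = in.readUTF();
    }

    // Serialize to bytes, then deserialize into a fresh instance.
    static MiniBean roundTrip(MiniBean source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));
            MiniBean copy = new MiniBean();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "NULL" is a placeholder string, as in the mapper; a real null would make writeUTF throw.
        MiniBean copy = roundTrip(new MiniBean("NULL", "u001"));
        System.out.println(copy.orderId + "," + copy.userId);
    }
}
```

Passing a real null instead of the "NULL" placeholder makes writeUTF throw a NullPointerException, which is the reason every field gets a value before the bean is written.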
II. Mapper class:
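import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class GoodsMapper extends Mapper<LongWritable, Text, Text, JoinBean> {
    private String fileName;
    private final JoinBean joinBean = new JoinBean();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Remember which input file this split belongs to.
        FileSplit split = (FileSplit) context.getInputSplit();
        fileName = split.getPath().getName();
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        if (fileName.startsWith("order")) {
            // Order line: orderId,userId
            joinBean.setOrderId(fields[0]);
            joinBean.setUserId(fields[1]);
            joinBean.setAge("NULL");
            joinBean.setName("NULL");
            joinBean.setUserName("NULL");
            joinBean.setFirstName("order");
        } else if (fileName.startsWith("user")) {
            // User line: userId,name,age,userName
            joinBean.setOrderId("NULL");
            joinBean.setUserId(fields[0]);
            joinBean.setAge(fields[2]);
            joinBean.setName(fields[1]);
            joinBean.setUserName(fields[3]);
            joinBean.setFirstName("user");
        } else {
            // Skip lines from any other file instead of emitting a stale or empty bean.
            return;
        }
        // Use userId as the key: records from both files are wrapped in a JoinBean and written to the context.
        // The shuffle then partitions and sorts the map output automatically.
        context.write(new Text(joinBean.getUserId()), joinBean);
    }
}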
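To make the tagging and the shuffle grouping concrete, here is a plain-Java sketch with no Hadoop involved. The class ShuffleSketch, the file names order.txt and user.txt, and the sample lines are all made up for illustration; only the column layout follows the mapper above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleSketch {
    // Mirrors the mapper: choose the join key by source file and tag the record with its origin.
    static String[] tag(String fileName, String line) {
        String[] fields = line.split(",");
        if (fileName.startsWith("order")) {
            return new String[] {fields[1], "order:" + line}; // order line: orderId,userId
        }
        return new String[] {fields[0], "user:" + line};      // user line: userId,name,age,userName
    }

    // Mirrors the shuffle: group the tagged records by key, with keys in sorted order.
    static Map<String, List<String>> group(List<String[]> tagged) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] keyValue : tagged) {
            groups.computeIfAbsent(keyValue[0], k -> new ArrayList<>()).add(keyValue[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> tagged = new ArrayList<>();
        tagged.add(tag("order.txt", "o1,u2"));
        tagged.add(tag("user.txt", "u2,Ann,30,ann01"));
        tagged.add(tag("user.txt", "u1,Bob,25,bob01"));
        // Each key now maps to every record that shares that userId, whichever file it came from.
        System.out.println(group(tagged));
    }
}
```

In a real job the order of values inside one group is not guaranteed, which is why each record carries its own source tag.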
III. Reducer class:
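1. Watch the braces of the for loop over the values iterator (misplacing them causes the error: duplicate K, null)!
2. Put the BeanUtils call inside the if comparison block! It replaces field-by-field copying such as bean.setOrderId(value.getOrderId());
3. When iterating over the beans in the list, the context.write of K,V must be inside the braces of the for loop!
4. Note the property copy BeanUtils.copyProperties(joinBean1,value); the destination comes first and the source second. The copy is needed because Hadoop reuses the same value object on every iteration.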
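import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.util.ArrayList;
import java.util.List;

// Assumed to be Apache Commons BeanUtils: its copyProperties(dest, orig) throws the two exceptions caught below.
import org.apache.commons.beanutils.BeanUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class GoodsReducer extends Reducer<Text, JoinBean, JoinBean, NullWritable> {
    @Override
    protected void reduce(Text key, Iterable<JoinBean> values, Context context) throws IOException, InterruptedException {
        // User records for this key
        List<JoinBean> joinBeans = new ArrayList<>();
        // Holds the orderId for this key.
        // Note: if a userId has several order records, only the last orderId seen is kept.
        JoinBean joinBean = new JoinBean();
        // Tell order records from user records
        for (JoinBean value : values) {
            if (value.getFirstName().equals("order")) {
                joinBean.setOrderId(value.getOrderId());
                joinBean.setUserId(value.getUserId());
            }
            if (value.getFirstName().equals("user")) {
                // Copy into a fresh bean: the framework reuses the same value instance on every iteration.
                JoinBean joinBean1 = new JoinBean();
                try {
                    BeanUtils.copyProperties(joinBean1, value);
                } catch (IllegalAccessException | InvocationTargetException e) {
                    e.printStackTrace();
                }
                joinBeans.add(joinBean1);
            }
        }
        // Fill in the orderId and emit one joined record per user bean.
        for (JoinBean bean : joinBeans) {
            bean.setOrderId(joinBean.getOrderId());
            context.write(bean, NullWritable.get());
        }
    }
}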
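The reason for copying each user bean before adding it to the list is easy to reproduce without Hadoop. The sketch below fakes an Iterable that hands out one shared object, which is how the reducer's values behave; ReuseSketch, Holder and the sample names are hypothetical and only illustrate the pitfall.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ReuseSketch {
    // Hypothetical mutable value standing in for JoinBean.
    static class Holder {
        String name;
    }

    // An Iterable that returns the SAME Holder every time, refilled with the next value.
    static Iterable<Holder> reusing(List<String> names) {
        Holder shared = new Holder();
        return () -> new Iterator<Holder>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < names.size();
            }

            @Override
            public Holder next() {
                shared.name = names.get(index++);
                return shared;
            }
        };
    }

    private static List<String> namesOf(List<Holder> holders) {
        List<String> names = new ArrayList<>();
        for (Holder holder : holders) {
            names.add(holder.name);
        }
        return names;
    }

    // Wrong: keeps the shared reference, so every list entry shows the last value.
    static List<String> collectWithoutCopy(Iterable<Holder> values) {
        List<Holder> kept = new ArrayList<>();
        for (Holder value : values) {
            kept.add(value);
        }
        return namesOf(kept);
    }

    // Right: copy into a fresh object inside the loop, as the reducer does with copyProperties.
    static List<String> collectWithCopy(Iterable<Holder> values) {
        List<Holder> kept = new ArrayList<>();
        for (Holder value : values) {
            Holder copy = new Holder();
            copy.name = value.name;
            kept.add(copy);
        }
        return namesOf(kept);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob");
        System.out.println(collectWithoutCopy(reusing(names))); // prints [Bob, Bob]
        System.out.println(collectWithCopy(reusing(names)));    // prints [Ann, Bob]
    }
}
```

The first call shows the duplicated records mentioned in the notes; the second keeps each record intact.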
IV. Test (driver) class:
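1. Create the Configuration.
2. Create the Job and pass it the conf.
3. Specify which jar contains the classes used by the job.
4. Set the mapper and reducer classes the job uses.
5. Set the KV types of the mapper output.
6. Set the KV types of the reducer output.
7. Specify the path of the input data to process.
8. Specify the path where the output data is stored.
9. Submit the job to the cluster for execution.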
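
The nine steps above could be wired together roughly as follows. This is only a sketch: the class name GoodsDriver is made up, it assumes JoinBean, GoodsMapper and GoodsReducer sit in the same package, and it takes the input and output paths from the command line, since the original does not say where the data lives. It needs the Hadoop client libraries on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver; the class name and the use of args[0]/args[1] are assumptions.
public class GoodsDriver {
    public static void main(String[] args) throws Exception {
        // 1. Create the Configuration
        Configuration conf = new Configuration();
        // 2. Create the Job and pass it the conf
        Job job = Job.getInstance(conf);
        // 3. Locate the jar through a class it contains
        job.setJarByClass(GoodsDriver.class);
        // 4. Mapper and reducer classes
        job.setMapperClass(GoodsMapper.class);
        job.setReducerClass(GoodsReducer.class);
        // 5. KV types of the mapper output
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(JoinBean.class);
        // 6. KV types of the reducer output
        job.setOutputKeyClass(JoinBean.class);
        job.setOutputValueClass(NullWritable.class);
        // 7. Input path (a directory holding both the order and the user files)
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        // 8. Output path (must not exist yet)
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // 9. Submit the job and wait for it to finish
        boolean succeeded = job.waitForCompletion(true);
        System.exit(succeeded ? 0 : 1);
    }
}
```

The mapper tells the two sources apart by file name, so the input directory should contain files whose names start with "order" and "user".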