Configuring and Writing a Custom UDF in Hive

Posted by loveheping on 2018-01-12
Environment: CentOS7 + hive-1.1.0-cdh5.7.0 + IntelliJ IDEA + Maven3.3.9
1. Create the project
   Open IntelliJ IDEA
     File-->New-->Project...-->Maven, check Create from archetype-->org.apache.maven.archetypes:maven-archetype-quickstart

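2. Configuration
  2.1 Add the following:
   In the project's pom.xml, add the version properties, the hadoop-common and hive-exec dependencies, and the Cloudera repository. (hive-jdbc can be declared the same way if the project needs it; compiling this UDF only requires hive-exec and hadoop-common.)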

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

    <!-- Hadoop and Hive versions -->
    <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
    <hive.version>1.1.0-cdh5.7.0</hive.version>
  </properties>

  <dependencies>
    <!-- Hadoop dependency -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
    </dependency>

    <!-- Hive dependency -->
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>${hive.version}</version>
    </dependency>
  </dependencies>

  <!-- Cloudera repository, needed to resolve the CDH artifacts -->
  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>
  2.2 Inside <dependencies></dependencies>, change the junit dependency as follows:

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
3. Create a class and write a UDF, HelloUDF.java, with the following code:

  package org.apache.hadoop.hive.ql.udf;

  import org.apache.hadoop.hive.ql.exec.UDF;
  import org.apache.hadoop.io.Text;

  public class HelloUDF extends UDF {
      // Hive calls evaluate once per row; it prefixes the input with "Hello:".
      public Text evaluate(Text input) {
          return new Text("Hello:" + input);
      }

      // Local smoke test, so the UDF can be run from the IDE without Hive.
      public static void main(String[] args) {
          HelloUDF helloUDF = new HelloUDF();
          Text rs = helloUDF.evaluate(new Text("zhangsan"));
          System.out.println(rs.toString());
      }
  }
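
The UDF base class does not declare evaluate itself. Hive locates the method by name through reflection, which is why the method in HelloUDF has to be public and called evaluate. The sketch below imitates that kind of lookup using only the JDK, so it runs without Hadoop or Hive on the classpath. EvaluateLookupSketch, HelloLike, and callEvaluate are hypothetical names made up for illustration, and String stands in for Text. Hive's real resolver also deals with overloads and type conversion, which this sketch ignores.

```java
import java.lang.reflect.Method;

public class EvaluateLookupSketch {

    // Hypothetical stand-in for HelloUDF; String replaces Hadoop's Text
    // so the sketch needs nothing beyond the JDK.
    public static class HelloLike {
        public String evaluate(String input) {
            return "Hello:" + input;
        }
    }

    // Looks up a public method named "evaluate" that takes one String and
    // invokes it, loosely imitating a by-name reflective lookup.
    public static Object callEvaluate(Object udf, String arg) {
        try {
            Method m = udf.getClass().getMethod("evaluate", String.class);
            return m.invoke(udf, arg);
        } catch (ReflectiveOperationException e) {
            // No matching public evaluate(String), or the call itself failed.
            throw new IllegalStateException("no usable evaluate(String) method", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callEvaluate(new HelloLike(), "zhangsan"));
    }
}
```

If the method in HelloLike were renamed to something else, the lookup in callEvaluate would throw, which is roughly what happens when a UDF class offers no evaluate method for Hive to find.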
4. Test the UDF class: right-click the class and select Run 'HelloUDF.main()'. The console should print Hello:zhangsan.
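
Running main only exercises one non-null input. When a column value is NULL in a query, the argument is expected to reach evaluate as a Java null, and plain string concatenation would then yield "Hello:null". A common convention is to return null for null input instead. The sketch below shows that variant under the assumption that null-in, null-out is the desired behavior. NullSafeHelloSketch is a hypothetical name, and String again stands in for Text so the example runs on a bare JDK.

```java
public class NullSafeHelloSketch {

    // Hypothetical null-safe variant of evaluate: null in, null out.
    // String stands in for Hadoop's Text (an assumption for this sketch).
    public static String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return "Hello:" + input;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("zhangsan"));
        System.out.println(evaluate(null));
    }
}
```

The same guard can be carried over to HelloUDF by checking input for null before building the new Text.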

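5. Package:
   In the IDEA menu, choose View-->Tool Windows-->Maven Projects. In the Maven Projects window, expand [project name]-->Lifecycle-->package, then right-click package and choose Run Maven Build to start packaging.
   After a successful run, look for this line in the log: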
     [INFO] Building jar: D:\software\ruozedata_workspace\basic02-hive\target\hive-1.0.jar


From the ITPUB blog, link: http://blog.itpub.net/31511218/viewspace-2150099/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
