Hue: A Big Data Collaboration Framework
I. Overview
1. Reference documents
http://gethue.com/ — official site
http://github.com/cloudera/hue — source code
http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/ — Hue installation guide
2. Features:
- Free & open source
- Be productive
- 100% compatible
- Dynamic search dashboards with Solr (dynamic Solr integration)
- Spark and Hadoop notebooks
3. Architecture diagram
II. Installing and Deploying Hue
1. Download the source tarball (CDH 5.3.6 release):
http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6.tar.gz
2. Make sure the virtual machine can reach the Internet.
3. Install the system packages Hue depends on (the package list varies by Unix distribution; root privileges are required):
yum install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel
4. Extract the Hue source tarball into the target directory:
[root@xingyunfei001 app]# tar zxf hue-3.7.0-cdh5.3.6.tar.gz
5. Build from source:
[root@xingyunfei001 app]# cd hue-3.7.0-cdh5.3.6/
[root@xingyunfei001 hue-3.7.0-cdh5.3.6]# make apps
6. Edit the hue.ini configuration file:
vi /opt/app/hue-3.7.0-cdh5.3.6/desktop/conf/hue.ini
# Set this to a random string, the longer the better.
# This is used for secure hashing in the session store.
secret_key=qpbdxoewsqlkhztybvfidtvwekftusgdlofbcfghaswuicmqp
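The secret_key above can be any long random string. One quick way to generate a fresh one, sketched here with Python's standard library:

```python
import secrets
import string

# Generate a random alphanumeric secret for hue.ini's secret_key.
# 50+ characters is reasonable; longer is better.
alphabet = string.ascii_letters + string.digits
secret_key = "".join(secrets.choice(alphabet) for _ in range(60))
print(secret_key)
```

Paste the printed value into hue.ini; keep it stable afterwards, since it is used for session hashing.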
# Webserver listens on this address and port
http_host=xingyunfei001.com.cn
http_port=8888
# Time zone name
time_zone=Asia/Shanghai
7. Start Hue:
[hadoop001@xingyunfei001 hue-3.7.0-cdh5.3.6]$ build/env/bin/supervisor
8. Verify in a browser:
http://xingyunfei001.com.cn:8888/accounts/login/?next=/
III. Integrating Hue with Hadoop 2.x
1. Edit Hadoop's hdfs-site.xml to enable WebHDFS:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
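With dfs.webhdfs.enabled set to true, the NameNode exposes a REST API under /webhdfs/v1 (port 50070 in this Hadoop version), which is exactly what Hue's webhdfs_url will point at below. A minimal sketch of how such a request URL is assembled (the host name and path here are illustrative placeholders):

```python
from urllib.parse import urlencode

def webhdfs_url(host, path, op, port=50070, **params):
    """Build a WebHDFS REST URL, e.g. for the LISTSTATUS or OPEN operation."""
    query = urlencode({"op": op, **params})
    return "http://%s:%d/webhdfs/v1%s?%s" % (host, port, path, query)

# List /tmp as the hue user (run e.g. via curl against a live cluster).
url = webhdfs_url("xingyunfei001.com.cn", "/tmp", "LISTSTATUS", **{"user.name": "hue"})
print(url)
```

Fetching that URL with curl on a running cluster returns a JSON directory listing, which is a quick way to confirm WebHDFS is on.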
2. Edit Hadoop's core-site.xml so the hue user may impersonate other users:
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
3. Edit Hue's hue.ini:
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://xingyunfei001.com.cn:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://xingyunfei001.com.cn:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# Default umask for file and directory creation, specified in an octal value.
## umask=022
# Directory of the Hadoop configuration
hadoop_hdfs_home=/opt/app/hadoop_2.5.0_cdh
hadoop_bin=/opt/app/hadoop_2.5.0_cdh/bin
hadoop_conf_dir=/opt/app/hadoop_2.5.0_cdh/etc/hadoop
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=xingyunfei001.com.cn
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
## logical_name=
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://xingyunfei001.com.cn:8088
# URL of the ProxyServer API
proxy_api_url=http://xingyunfei001.com.cn:8088
# URL of the HistoryServer API
history_server_api_url=http://xingyunfei001.com.cn:19888
# In secure mode (HTTPS), if SSL certificates from Resource Manager's
# Rest Server have to be verified against certificate authority
## ssl_cert_ca_verify=False
# HA support by specifying multiple clusters
# e.g.
# [[[ha]]]
# Resource Manager logical name (required for HA)
## logical_name=my-rm-name
4. Restart HDFS and the MapReduce JobHistory server:
[hadoop001@xingyunfei001 hadoop_2.5.0_cdh]$ sbin/start-all.sh
[hadoop001@xingyunfei001 hadoop_2.5.0_cdh]$ sbin/mr-jobhistory-daemon.sh start historyserver
5. Restart the Hue server:
[hadoop001@xingyunfei001 hue-3.7.0-cdh5.3.6]$ build/env/bin/supervisor
6. Verify the result.
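Before blaming hue.ini, it is worth confirming that the endpoints it references (NameNode 8020/50070, ResourceManager 8032/8088, JobHistory 19888) are actually listening. A plain TCP probe suffices; the sketch below demos against a local stand-in listener rather than a real cluster:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demo: a local listener standing in for e.g. the ResourceManager on 8032.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))          # let the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
result = port_open("127.0.0.1", demo_port)
listener.close()
print(result)
```

In practice, call `port_open("xingyunfei001.com.cn", 8020)` and friends from the Hue host to rule out firewall or daemon-startup problems.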
IV. Integrating Hue with Hive
1. Configure hue.ini:
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=xingyunfei001.com.cn
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/opt/app/hive_0.13.1_cdh/conf
# Timeout in seconds for thrift calls to Hive service
server_conn_timeout=120
# Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
# If false, Hue will use the FetchResults() thrift call instead.
## use_get_log_api=true
# Set a LIMIT clause when browsing a partitioned table.
# A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
## browse_partitioned_table_limit=250
# A limit to the number of rows that can be downloaded from a query.
# A value of -1 means there will be no limit.
# A maximum of 65,000 is applied to XLS downloads.
## download_row_limit=1000000
# Hue will try to close the Hive query when the user leaves the editor page.
# This will free all the query resources in HiveServer2, but also make its results inaccessible.
## close_queries=false
# Thrift version to use when communicating with HiveServer2
## thrift_version=5
2. Edit Hive's hive-site.xml to configure the metastore server:
<property>
<name>hive.metastore.uris</name>
<value>thrift://xingyunfei001.com.cn:9083</value>
</property>
3. Start the metastore server first, then HiveServer2:
nohup bin/hive --service metastore &
[hadoop001@xingyunfei001 hive_0.13.1_cdh]$ bin/hiveserver2
4. Open up permissions on /tmp in HDFS:
[hadoop001@xingyunfei001 hadoop_2.5.0_cdh]$ bin/hdfs dfs -chmod -R o+x /tmp
5. Verify the configuration with a test query:
select id,url,referer from track_log limit 10;
V. Integrating Hue with an RDBMS
1. Edit hue.ini:
[[databases]]
# sqlite configuration.
[[[sqlite]]]
# Name to show in the UI.
nice_name=SQLite
# For SQLite, name defines the path to the database.
name=/opt/app/hue-3.7.0-cdh5.3.6/desktop/desktop.db
# Database backend to use.
engine=sqlite
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
# mysql, oracle, or postgresql configuration.
[[[mysql]]]
# Name to show in the UI.
nice_name="My SQL DB"
# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, Name is instance of the Oracle server. For express edition
# this is 'xe' by default.
## name=mysqldb
# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
engine=mysql
# IP or hostname of the database to connect to.
host=xingyunfei001.com.cn
# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
port=3306
# Username to authenticate with when connecting to the database.
user=root
# Password matching the username to authenticate with when
# connecting to the database.
password=root
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
2. Restart Hue:
build/env/bin/supervisor
3. Verify the configuration.
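Since the [[[sqlite]]] entry points Hue at desktop/desktop.db, one sanity check is to open that file with Python's sqlite3 module and list its tables. The sketch below runs against an in-memory stand-in database with one dummy table instead of the real desktop.db:

```python
import sqlite3

# In practice: sqlite3.connect("/opt/app/hue-3.7.0-cdh5.3.6/desktop/desktop.db")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT)")  # stand-in table
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
conn.close()
print(tables)
```

Against the real desktop.db you should see Hue's Django tables; the configured databases also appear in the Hue UI under the DB Query application.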