Integrating StreamPark 2.1.5 with DataSophon

Posted by xiongshengxiao on 2024-11-28

This guide walks through building a streampark-2.1.5 installation package for DataSophon.

Download and extract the StreamPark 2.1.5 package

Download from the StreamPark official site:

wget -O  /opt/datasophon/DDP/packages/apache-streampark_2.12-2.1.5-incubating-bin.tar.gz https://www.apache.org/dyn/closer.lua/incubator/streampark/2.1.5/apache-streampark_2.12-2.1.5-incubating-bin.tar.gz?action=download

cd /opt/datasophon/DDP/packages/

tar -xzvf apache-streampark_2.12-2.1.5-incubating-bin.tar.gz

Rename the package directory

The name must match decompressPackageName in service_ddl.json.

mv apache-streampark_2.12-2.1.5-incubating-bin streampark-2.1.5

Modify conf/config.yaml (optional)

  • Update the connection settings

Open conf/config.yaml, find the spring section, and change profiles.active to mysql, as follows:

vi /opt/datasophon/DDP/packages/streampark-2.1.5/conf/config.yaml

spring:
  profiles.active: mysql #[h2,pgsql,mysql]
  application.name: StreamPark
  devtools.restart.enabled: false
  mvc.pathmatch.matching-strategy: ant_path_matcher
  servlet:
    multipart:
      enabled: true
      max-file-size: 500MB
      max-request-size: 500MB
  aop.proxy-target-class: true
  messages.encoding: utf-8
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
  main:
    allow-circular-references: true
    banner-mode: off

Modify the streampark-2.1.5/bin/jvm_opts.sh file

Add the prometheus_javaagent entry (around line 21), as follows:

vi /opt/datasophon/DDP/packages/streampark-2.1.5/bin/jvm_opts.sh

# Add the following line
-javaagent:$APP_HOME/jmx/jmx_prometheus_javaagent-0.16.1.jar=10086:$APP_HOME/jmx/prometheus_config.yml

If you are running an older release such as StreamPark 2.1.1, the prometheus_javaagent entry must be added directly in the bin/streampark.sh file instead; in that case, add the content below.

  • In streampark.sh, add the prometheus_javaagent entry to DEFAULT_OPTS (around line 271), as follows:

vi /opt/datasophon/DDP/packages/streampark-2.1.5/bin/streampark.sh

DEFAULT_OPTS="""
  -ea
  -server
  -javaagent:$APP_HOME/jmx/jmx_prometheus_javaagent-0.16.1.jar=10086:$APP_HOME/jmx/prometheus_config.yml
  -Xms1024m
  -Xmx1024m
  -Xmn256m
  -XX:NewSize=100m
  -XX:+UseConcMarkSweepGC
  -XX:CMSInitiatingOccupancyFraction=70
  -XX:ThreadStackSize=512
  -Xloggc:${APP_HOME}/logs/gc.log
  """

Note: why does jvm_opts.sh need only this one extra line, and none of the other options? According to feedback from the community group, the other parameters were never actually used.
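In fact, streampark.sh reads jvm_opts.sh line by line and keeps only lines that begin with -, which is why the javaagent entry must sit on its own line starting with -. A minimal sketch of that parsing loop, using a throwaway temp file in place of the real jvm_opts.sh:

```shell
# Reproduce streampark.sh's jvm_opts.sh parsing: only lines starting with "-"
# are appended to JVM_ARGS; comments and blank lines are ignored.
JVM_OPTS_FILE=$(mktemp)   # stand-in for $APP_HOME/bin/jvm_opts.sh
cat > "$JVM_OPTS_FILE" <<'EOF'
# memory settings
-Xms1024m
-Xmx1024m
-javaagent:$APP_HOME/jmx/jmx_prometheus_javaagent-0.16.1.jar=10086:$APP_HOME/jmx/prometheus_config.yml
EOF

JVM_ARGS=""
while read -r line; do
  if [[ "$line" == -* ]]; then
    JVM_ARGS="${JVM_ARGS} $line"
  fi
done < "$JVM_OPTS_FILE"
rm -f "$JVM_OPTS_FILE"

echo "JVM_ARGS:$JVM_ARGS"
```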

Modify the streampark-2.1.5/bin/streampark.sh file

vi /opt/datasophon/DDP/packages/streampark-2.1.5/bin/streampark.sh

Adding prometheus_javaagent here is no longer necessary: StreamPark 2.1.5 has externalized the JVM configuration into jvm_opts.sh, so simply follow the jvm_opts.sh step above and skip this sub-step.

Modify the start function

  • In the start function, on the line after local workspace=... (around line 390), add mkdir -p $workspace, as follows:
   local workspace=`$_RUNJAVA -cp "$APP_LIB/*" $BASH_UTIL --get_yaml "streampark.workspace.local" "$CONFIG"`
   mkdir -p $workspace
   if [[ ! -d $workspace ]]; then
     echo_r "ERROR: streampark.workspace.local: \"$workspace\" is an invalid path, please reconfigure in $CONFIG"
     echo_r "NOTE: \"streampark.workspace.local\" should not be set under APP_HOME($APP_HOME) directory. Set it to a secure directory outside of APP_HOME."
     exit 1;
   fi

  if [[ ! -w $workspace ]] || [[ ! -r $workspace ]]; then
      echo_r "ERROR: streampark.workspace.local: \"$workspace\" Permission denied! "
      exit 1;
  fi
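The reason for the added mkdir -p: on a freshly provisioned node the workspace directory does not exist yet, so the -d check would abort the start. The same existence and permission tests can be tried on any path (a temp directory here, for illustration):

```shell
# Create the workspace up front, then run the same checks streampark.sh performs.
workspace="$(mktemp -d)/streampark-workspace"   # illustrative path
mkdir -p "$workspace"

if [[ ! -d $workspace ]]; then
  echo "invalid workspace path: $workspace"
  exit 1
fi
if [[ ! -w $workspace ]] || [[ ! -r $workspace ]]; then
  echo "workspace permission denied: $workspace"
  exit 1
fi
echo "workspace ready: $workspace"
```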

Modify the status function

  • In the status function (around line 564), add exit 1, as follows:
status() {
  # shellcheck disable=SC2155
  # shellcheck disable=SC2006
  local PID=$(get_pid)
  if [[ $PID -eq 0 ]]; then
    echo_r "StreamPark is not running"
	exit 1
  else
    echo_g "StreamPark is running pid is: $PID"
  fi
}
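The exit 1 matters because DataSophon's statusRunner invokes bin/streampark.sh status and judges the role's health by the script's exit code; without it the script exits 0 even when StreamPark is down. A toy version showing the difference (get_pid is replaced by a function argument, and return plays the role that exit 1 plays in the real script):

```shell
# Simplified status: DataSophon treats exit code 0 as "running", non-zero as "stopped".
status() {
  local PID=$1            # stand-in for $(get_pid)
  if [[ $PID -eq 0 ]]; then
    echo "StreamPark is not running"
    return 1              # the added "exit 1" in streampark.sh plays this role
  else
    echo "StreamPark is running pid is: $PID"
  fi
}

status 0
echo "exit code when stopped: $?"
status 12345
echo "exit code when running: $?"
```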

The complete streampark.sh file is provided below for reference:

#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#
# -----------------------------------------------------------------------------
# Control Script for the StreamPark Server
#
# Environment Variable Prerequisites
#
#   APP_HOME   May point at your StreamPark "build" directory.
#
#   APP_BASE   (Optional) Base directory for resolving dynamic portions
#                   of a StreamPark installation.  If not present, resolves to
#                   the same directory that APP_HOME points to.
#
#   APP_CONF    (Optional) config path
#
#   APP_PID    (Optional) Path of the file which should contains the pid
#                   of the StreamPark startup java process, when start (fork) is
#                   used
# -----------------------------------------------------------------------------

# Bugzilla 37848: When no TTY is available, don't output to console
have_tty=0
# shellcheck disable=SC2006
if [[ "`tty`" != "not a tty" ]]; then
    have_tty=1
fi

 # Only use colors if connected to a terminal
if [[ ${have_tty} -eq 1 ]]; then
  PRIMARY=$(printf '\033[38;5;082m')
  RED=$(printf '\033[31m')
  GREEN=$(printf '\033[32m')
  YELLOW=$(printf '\033[33m')
  BLUE=$(printf '\033[34m')
  BOLD=$(printf '\033[1m')
  RESET=$(printf '\033[0m')
else
  PRIMARY=""
  RED=""
  GREEN=""
  YELLOW=""
  BLUE=""
  BOLD=""
  RESET=""
fi

echo_r () {
    # Color red: Error, Failed
    [[ $# -ne 1 ]] && return 1
    # shellcheck disable=SC2059
    printf "[%sStreamPark%s] %s$1%s\n"  $BLUE $RESET $RED $RESET
}

echo_g () {
    # Color green: Success
    [[ $# -ne 1 ]] && return 1
    # shellcheck disable=SC2059
    printf "[%sStreamPark%s] %s$1%s\n"  $BLUE $RESET $GREEN $RESET
}

echo_y () {
    # Color yellow: Warning
    [[ $# -ne 1 ]] && return 1
    # shellcheck disable=SC2059
    printf "[%sStreamPark%s] %s$1%s\n"  $BLUE $RESET $YELLOW $RESET
}

echo_w () {
    # Color white
    [[ $# -ne 1 ]] && return 1
    # shellcheck disable=SC2059
    printf "[%sStreamPark%s] %s$1%s\n"  $BLUE $RESET $WHITE $RESET
}

# OS specific support.  $var _must_ be set to either true or false.
cygwin=false
os400=false
hpux=false
# shellcheck disable=SC2006
case "`uname`" in
CYGWIN*) cygwin=true;;
OS400*) os400=true;;
HP-UX*) hpux=true;;
esac

# resolve links - $0 may be a softlink
PRG="$0"

while [[ -h "$PRG" ]]; do
  # shellcheck disable=SC2006
  ls=`ls -ld "$PRG"`
  # shellcheck disable=SC2006
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '/.*' > /dev/null; then
    PRG="$link"
  else
    # shellcheck disable=SC2006
    PRG=`dirname "$PRG"`/"$link"
  fi
done

# Get standard environment variables
# shellcheck disable=SC2006
PRG_DIR=`dirname "$PRG"`

# shellcheck disable=SC2006
# shellcheck disable=SC2164
APP_HOME=`cd "$PRG_DIR/.." >/dev/null; pwd`
APP_BASE="$APP_HOME"
APP_CONF="$APP_BASE"/conf
APP_LIB="$APP_BASE"/lib
APP_LOG="$APP_BASE"/logs
APP_PID="$APP_BASE"/.pid
APP_OUT="$APP_LOG"/streampark.out
# shellcheck disable=SC2034
APP_TMPDIR="$APP_BASE"/temp

# Ensure that any user defined CLASSPATH variables are not used on startup,
# but allow them to be specified in setenv.sh, in rare case when it is needed.
CLASSPATH=

if [[ -r "$APP_BASE/bin/setenv.sh" ]]; then
  # shellcheck disable=SC1090
  . "$APP_BASE/bin/setenv.sh"
elif [[ -r "$APP_HOME/bin/setenv.sh" ]]; then
  # shellcheck disable=SC1090
  . "$APP_HOME/bin/setenv.sh"
fi

# For Cygwin, ensure paths are in UNIX format before anything is touched
if ${cygwin}; then
  # shellcheck disable=SC2006
  [[ -n "$JAVA_HOME" ]] && JAVA_HOME=`cygpath --unix "$JAVA_HOME"`
  # shellcheck disable=SC2006
  [[ -n "$JRE_HOME" ]] && JRE_HOME=`cygpath --unix "$JRE_HOME"`
  # shellcheck disable=SC2006
  [[ -n "$APP_HOME" ]] && APP_HOME=`cygpath --unix "$APP_HOME"`
  # shellcheck disable=SC2006
  [[ -n "$APP_BASE" ]] && APP_BASE=`cygpath --unix "$APP_BASE"`
  # shellcheck disable=SC2006
  [[ -n "$CLASSPATH" ]] && CLASSPATH=`cygpath --path --unix "$CLASSPATH"`
fi

# Ensure that neither APP_HOME nor APP_BASE contains a colon
# as this is used as the separator in the classpath and Java provides no
# mechanism for escaping if the same character appears in the path.
case ${APP_HOME} in
  *:*) echo "Using APP_HOME:   $APP_HOME";
       echo "Unable to start as APP_HOME contains a colon (:) character";
       exit 1;
esac
case ${APP_BASE} in
  *:*) echo "Using APP_BASE:   $APP_BASE";
       echo "Unable to start as APP_BASE contains a colon (:) character";
       exit 1;
esac

# For OS400
if ${os400}; then
  # Set job priority to standard for interactive (interactive - 6) by using
  # the interactive priority - 6, the helper threads that respond to requests
  # will be running at the same priority as interactive jobs.
  COMMAND='chgjob job('${JOBNAME}') runpty(6)'
  system "${COMMAND}"

  # Enable multi threading
  export QIBM_MULTI_THREADED=Y
fi

# Get standard Java environment variables
if ${os400}; then
  # -r will Only work on the os400 if the files are:
  # 1. owned by the user
  # 2. owned by the PRIMARY group of the user
  # this will not work if the user belongs in secondary groups
  # shellcheck disable=SC1090
  . "$APP_HOME"/bin/setclasspath.sh
else
  if [[ -r "$APP_HOME"/bin/setclasspath.sh ]]; then
    # shellcheck disable=SC1090
    . "$APP_HOME"/bin/setclasspath.sh
  else
    echo "Cannot find $APP_HOME/bin/setclasspath.sh"
    echo "This file is needed to run this program"
    exit 1
  fi
fi

# Add on extra jar files to CLASSPATH
# shellcheck disable=SC2236
if [[ ! -z "$CLASSPATH" ]]; then
  CLASSPATH="$CLASSPATH":
fi
CLASSPATH="$CLASSPATH"

# For Cygwin, switch paths to Windows format before running java
if ${cygwin}; then
  # shellcheck disable=SC2006
  JAVA_HOME=`cygpath --absolute --windows "$JAVA_HOME"`
  # shellcheck disable=SC2006
  JRE_HOME=`cygpath --absolute --windows "$JRE_HOME"`
  # shellcheck disable=SC2006
  APP_HOME=`cygpath --absolute --windows "$APP_HOME"`
  # shellcheck disable=SC2006
  APP_BASE=`cygpath --absolute --windows "$APP_BASE"`
  # shellcheck disable=SC2006
  CLASSPATH=`cygpath --path --windows "$CLASSPATH"`
fi

# get jdk version, return version as an Integer.
JDK_VERSION=$("$_RUNJAVA" -version 2>&1 | grep -i 'version' | head -n 1 | cut -d '"' -f 2)
MAJOR_VER=$(echo "$JDK_VERSION" 2>&1 | cut -d '.' -f 1)
[[ $MAJOR_VER -eq 1 ]] && MAJOR_VER=$(echo "$JDK_VERSION" 2>&1 | cut -d '.' -f 2)
MIN_VERSION=8

if [[ $MAJOR_VER -lt $MIN_VERSION ]]; then
  echo "JDK Version: \"${JDK_VERSION}\", the version cannot be lower than 1.8"
  exit 1
fi

if [[ -z "$USE_NOHUP" ]]; then
  if $hpux; then
    USE_NOHUP="true"
  else
    USE_NOHUP="false"
  fi
fi
unset NOHUP
if [[ "$USE_NOHUP" = "true" ]]; then
  NOHUP="nohup"
fi

CONFIG="${APP_CONF}/application.yml"
# shellcheck disable=SC2006
if [[ -f "$CONFIG" ]] ; then
  echo_y """[WARN] in the \"conf\" directory, found the \"application.yml\" file. The \"application.yml\" file is deprecated.
     For compatibility, this application.yml will be used preferentially. The latest configuration file is \"config.yaml\". It is recommended to use \"config.yaml\".
     Note: \"application.yml\" will be completely deprecated in version 2.2.0. """
else
  CONFIG="${APP_CONF}/config.yaml"
  if [[ ! -f "$CONFIG" ]] ; then
    echo_r "can not found config.yaml in \"conf\" directory, please check."
    exit 1;
  fi
fi

BASH_UTIL="org.apache.streampark.console.base.util.BashJavaUtils"
APP_MAIN="org.apache.streampark.console.StreamParkConsoleBootstrap"
JVM_OPTS_FILE=${APP_HOME}/bin/jvm_opts.sh

JVM_ARGS=""
if [[ -f $JVM_OPTS_FILE ]]; then
  while read line
  do
      if [[ "$line" == -* ]]; then
        JVM_ARGS="${JVM_ARGS} $line"
      fi
  done < $JVM_OPTS_FILE
fi

JAVA_OPTS=${JAVA_OPTS:-"${JVM_ARGS}"}
JAVA_OPTS="$JAVA_OPTS -XX:HeapDumpPath=${APP_HOME}/logs/dump.hprof"
JAVA_OPTS="$JAVA_OPTS -Xloggc:${APP_HOME}/logs/gc.log"
[[ $MAJOR_VER -gt $MIN_VERSION ]] && JAVA_OPTS="$JAVA_OPTS --add-opens java.base/jdk.internal.loader=ALL-UNNAMED --add-opens jdk.zipfs/jdk.nio.zipfs=ALL-UNNAMED"
[[ $MAJOR_VER -ge 17 ]] && JAVA_OPTS="$JAVA_OPTS -Djava.security.manager=allow"

SERVER_PORT=$($_RUNJAVA -cp "$APP_LIB/*" $BASH_UTIL --get_yaml "server.port" "$CONFIG")
# ----- Execute The Requested Command -----------------------------------------

print_logo() {
  printf '\n'
  printf '      %s    _____ __                                             __       %s\n'          $PRIMARY $RESET
  printf '      %s   / ___// /_________  ____ _____ ___  ____  ____ ______/ /__     %s\n'          $PRIMARY $RESET
  printf '      %s   \__ \/ __/ ___/ _ \/ __ `/ __ `__ \/ __ \  __ `/ ___/ //_/     %s\n'          $PRIMARY $RESET
  printf '      %s  ___/ / /_/ /  /  __/ /_/ / / / / / / /_/ / /_/ / /  / ,<        %s\n'          $PRIMARY $RESET
  printf '      %s /____/\__/_/   \___/\__,_/_/ /_/ /_/ ____/\__,_/_/  /_/|_|       %s\n'          $PRIMARY $RESET
  printf '      %s                                   /_/                            %s\n\n'        $PRIMARY $RESET
  printf '      %s   Version:  2.1.5 %s\n'                                                         $BLUE   $RESET
  printf '      %s   WebSite:  https://streampark.apache.org%s\n'                                  $BLUE   $RESET
  printf '      %s   GitHub :  http://github.com/apache/streampark%s\n\n'                          $BLUE   $RESET
  printf '      %s   ──────── Apache StreamPark, Make stream processing easier ô~ô!%s\n\n'         $PRIMARY  $RESET
  if [[ "$1"x == "start"x ]]; then
    printf '      %s                   http://localhost:%s %s\n\n'                                 $PRIMARY $SERVER_PORT   $RESET
  fi
}

# shellcheck disable=SC2120
get_pid() {
  if [[ -f "$APP_PID" ]]; then
    if [[ -s "$APP_PID" ]]; then
      # shellcheck disable=SC2155
      # shellcheck disable=SC2006
      local PID=`cat "$APP_PID"`
      kill -0 $PID >/dev/null 2>&1
      # shellcheck disable=SC2181
      if [[ $? -eq 0 ]]; then
        echo $PID
        exit 0
      fi
    else
      rm -f "$APP_PID" >/dev/null 2>&1
    fi
  fi

  # shellcheck disable=SC2006
  if [[ "${SERVER_PORT}"x == ""x ]]; then
    echo_r "server.port is required, please check $CONFIG"
    exit 1;
  else
     # shellcheck disable=SC2006
      # shellcheck disable=SC2155
      local used=`$_RUNJAVA -cp "$APP_LIB/*" $BASH_UTIL --check_port "$SERVER_PORT"`
      if [[ "${used}"x == "used"x ]]; then
        # shellcheck disable=SC2006
        local PID=`jps -l | grep "$APP_MAIN" | awk '{print $1}'`
        # shellcheck disable=SC2236
        if [[ ! -z $PID ]]; then
          echo "$PID"
        else
          echo 0
        fi
      else
        echo 0
      fi
  fi
}

# shellcheck disable=SC2120
start() {
  # shellcheck disable=SC2006
  local PID=$(get_pid)

  if [[ $PID -gt 0 ]]; then
    # shellcheck disable=SC2006
    echo_r "StreamPark is already running pid: $PID , start aborted!"
    exit 1
  fi

  # Bugzilla 37848: only output this if we have a TTY
  if [[ ${have_tty} -eq 1 ]]; then
    echo_w "Using APP_BASE:   $APP_BASE"
    echo_w "Using APP_HOME:   $APP_HOME"
    if [[ "$1" = "debug" ]] ; then
      echo_w "Using JAVA_HOME:   $JAVA_HOME"
    else
      echo_w "Using JRE_HOME:   $JRE_HOME"
    fi
    echo_w "Using APP_PID:   $APP_PID"
  fi

   # shellcheck disable=SC2006
   local workspace=`$_RUNJAVA -cp "$APP_LIB/*" $BASH_UTIL --get_yaml "streampark.workspace.local" "$CONFIG"`
   mkdir -p $workspace
   if [[ ! -d $workspace ]]; then
     echo_r "ERROR: streampark.workspace.local: \"$workspace\" is an invalid path, please reconfigure in $CONFIG"
     echo_r "NOTE: \"streampark.workspace.local\" should not be set under APP_HOME($APP_HOME) directory. Set it to a secure directory outside of APP_HOME."
     exit 1;
   fi

  if [[ ! -w $workspace ]] || [[ ! -r $workspace ]]; then
      echo_r "ERROR: streampark.workspace.local: \"$workspace\" Permission denied! "
      exit 1;
  fi

  if [[ "${HADOOP_HOME}"x == ""x ]]; then
    echo_y "WARN: HADOOP_HOME is undefined on your system env."
  else
    echo_w "Using HADOOP_HOME:   ${HADOOP_HOME}"
  fi

  #
  # classpath options:
  # 1): java env (lib and jre/lib)
  # 2): StreamPark
  # 3): hadoop conf
  # shellcheck disable=SC2091
  local APP_CLASSPATH=".:${JAVA_HOME}/lib:${JAVA_HOME}/jre/lib"
  # shellcheck disable=SC2206
  # shellcheck disable=SC2010
  local JARS=$(ls "$APP_LIB"/*.jar | grep -v "$APP_LIB/streampark-flink-shims_.*.jar$")
  # shellcheck disable=SC2128
  for jar in $JARS;do
    APP_CLASSPATH=$APP_CLASSPATH:$jar
  done

  if [[ -n "${HADOOP_CONF_DIR}" ]] && [[ -d "${HADOOP_CONF_DIR}" ]]; then
    echo_w "Using HADOOP_CONF_DIR:   ${HADOOP_CONF_DIR}"
    APP_CLASSPATH+=":${HADOOP_CONF_DIR}"
  else
    APP_CLASSPATH+=":${HADOOP_HOME}/etc/hadoop"
  fi

  echo_g "JAVA_OPTS:  ${JAVA_OPTS}"

  eval $NOHUP $_RUNJAVA $JAVA_OPTS \
    -classpath "$APP_CLASSPATH" \
    -Dapp.home="${APP_HOME}" \
    -Djava.io.tmpdir="$APP_TMPDIR" \
    -Dlogging.config="${APP_CONF}/logback-spring.xml" \
    $APP_MAIN "$@" >> "$APP_OUT" 2>&1 "&"

    local PID=$!
    local IS_NUMBER="^[0-9]+$"

    # Add to pid file if successful start
    if [[ ${PID} =~ ${IS_NUMBER} ]] && kill -0 $PID > /dev/null 2>&1 ; then
        echo $PID > "$APP_PID"
        # shellcheck disable=SC2006
        echo_g "StreamPark start successful. pid: $PID"
    else
        echo_r "StreamPark start failed."
        exit 1
    fi
}

# shellcheck disable=SC2120
start_docker() {
  # Bugzilla 37848: only output this if we have a TTY
  if [[ ${have_tty} -eq 1 ]]; then
    echo_w "Using APP_BASE:   $APP_BASE"
    echo_w "Using APP_HOME:   $APP_HOME"
    if [[ "$1" = "debug" ]] ; then
      echo_w "Using JAVA_HOME:   $JAVA_HOME"
    else
      echo_w "Using JRE_HOME:   $JRE_HOME"
    fi
    echo_w "Using APP_PID:   $APP_PID"
  fi

  if [[ "${HADOOP_HOME}"x == ""x ]]; then
    echo_y "WARN: HADOOP_HOME is undefined on your system env,please check it."
  else
    echo_w "Using HADOOP_HOME:   ${HADOOP_HOME}"
  fi

  # classpath options:
  # 1): java env (lib and jre/lib)
  # 2): StreamPark
  # 3): hadoop conf
  # shellcheck disable=SC2091
  local APP_CLASSPATH=".:${JAVA_HOME}/lib:${JAVA_HOME}/jre/lib"
  # shellcheck disable=SC2206
  # shellcheck disable=SC2010
  local JARS=$(ls "$APP_LIB"/*.jar | grep -v "$APP_LIB/streampark-flink-shims_.*.jar$")
  # shellcheck disable=SC2128
  for jar in $JARS;do
    APP_CLASSPATH=$APP_CLASSPATH:$jar
  done

  if [[ -n "${HADOOP_CONF_DIR}" ]] && [[ -d "${HADOOP_CONF_DIR}" ]]; then
    echo_w "Using HADOOP_CONF_DIR:   ${HADOOP_CONF_DIR}"
    APP_CLASSPATH+=":${HADOOP_CONF_DIR}"
  else
    APP_CLASSPATH+=":${HADOOP_HOME}/etc/hadoop"
  fi

  JAVA_OPTS="$JAVA_OPTS -XX:-UseContainerSupport"

  echo_g "JAVA_OPTS:  ${JAVA_OPTS}"

  $_RUNJAVA $JAVA_OPTS \
    -classpath "$APP_CLASSPATH" \
    -Dapp.home="${APP_HOME}" \
    -Djava.io.tmpdir="$APP_TMPDIR" \
    -Dlogging.config="${APP_CONF}/logback-spring.xml" \
    $APP_MAIN

}

# shellcheck disable=SC2120
stop() {
  # shellcheck disable=SC2155
  # shellcheck disable=SC2006
  local PID=$(get_pid)

  if [[ $PID -eq 0 ]]; then
    echo_r "StreamPark is not running. stop aborted."
    exit 1
  fi

  shift

  local SLEEP=3

  # shellcheck disable=SC2006
  echo_g "StreamPark stopping with the PID: $PID"

  kill -9 "$PID"

  while [ $SLEEP -ge 0 ]; do
    # shellcheck disable=SC2046
    # shellcheck disable=SC2006
    kill -0 "$PID" >/dev/null 2>&1
    # shellcheck disable=SC2181
    if [[ $? -gt 0 ]]; then
      rm -f "$APP_PID" >/dev/null 2>&1
      if [[ $? != 0 ]]; then
        if [[ -w "$APP_PID" ]]; then
          cat /dev/null > "$APP_PID"
        else
          echo_r "The PID file could not be removed."
        fi
      fi
      echo_g "StreamPark stopped."
      break
    fi

    if [[ $SLEEP -gt 0 ]]; then
       sleep 1
    fi
    # shellcheck disable=SC2006
    # shellcheck disable=SC2003
    SLEEP=`expr $SLEEP - 1 `
  done

  if [[ "$SLEEP" -lt 0 ]]; then
     echo_r "StreamPark has not been killed completely yet. The process might be waiting on some system call or might be UNINTERRUPTIBLE."
  fi
}

status() {
  # shellcheck disable=SC2155
  # shellcheck disable=SC2006
  local PID=$(get_pid)
  if [[ $PID -eq 0 ]]; then
    echo_r "StreamPark is not running"
	exit 1
  else
    echo_g "StreamPark is running pid is: $PID"
  fi
}

restart() {
  # shellcheck disable=SC2119
  stop
  # shellcheck disable=SC2119
  start
}

main() {
  case "$1" in
    "start")
        shift
        start "$@"
        [[ $? -eq 0 ]] && print_logo "start"
        ;;
    "start_docker")
        print_logo
        start_docker
        ;;
    "stop")
        print_logo
        stop
        ;;
    "status")
        print_logo
        status
        ;;
    "restart")
        restart
        [[ $? -eq 0 ]] && print_logo "start"
        ;;
    *)
        echo_r "Unknown command: $1"
        echo_w "Usage: streampark.sh ( commands ... )"
        echo_w "commands:"
        echo_w "  start \$conf              Start StreamPark with application config."
        echo_w "  stop                      Stop StreamPark, wait up to 3 seconds and then use kill -KILL if still running"
        echo_w "  start_docker              start in docker or k8s mode"
        echo_w "  status                    StreamPark status"
        echo_w "  restart \$conf            restart StreamPark with application config."
        exit 0
        ;;
  esac
}

main "$@"

Add the jmx directory (copied from the Hadoop package bundled with DataSophon):

cp -r /opt/datasophon/hadoop-3.3.3/jmx /opt/datasophon/DDP/packages/streampark-2.1.5/
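The copied jmx directory is expected to contain jmx_prometheus_javaagent-0.16.1.jar and prometheus_config.yml, matching the javaagent line added earlier. If your Hadoop package does not ship this directory, a minimal prometheus_config.yml that exports all MBeans looks roughly like this (a sketch; add filtering rules as needed):

```yaml
---
lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
  # export every MBean attribute as-is; tighten this pattern in production
  - pattern: ".*"
```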

Download the MySQL 8 driver jar into the lib directory

(StreamPark stopped bundling the MySQL driver at some point, so it must be added manually.)

wget -O /opt/datasophon/DDP/packages/streampark-2.1.5/lib/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar

Copy StreamPark's mysql-schema.sql and mysql-data.sql scripts out for later use:

cp /opt/datasophon/DDP/packages/streampark-2.1.5/script/schema/mysql-schema.sql /opt/datasophon/DDP/packages/streampark_mysql-schema.sql

cp /opt/datasophon/DDP/packages/streampark-2.1.5/script/data/mysql-data.sql /opt/datasophon/DDP/packages/streampark_mysql-data.sql

Create the tarball and generate its md5

tar -czf streampark-2.1.5.tar.gz streampark-2.1.5
md5sum streampark-2.1.5.tar.gz | awk '{print $1}' > streampark-2.1.5.tar.gz.md5
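Note that DataSophon expects the .md5 file to contain only the bare hash, which is why awk '{print $1}' strips the filename (plain md5sum -c will not accept this format). A quick way to double-check the package before distribution, demonstrated on a throwaway directory so the commands are safe to try anywhere:

```shell
# Build a dummy package the same way and verify the stored hash matches a fresh one.
cd "$(mktemp -d)"
mkdir streampark-2.1.5 && echo demo > streampark-2.1.5/README
tar -czf streampark-2.1.5.tar.gz streampark-2.1.5
md5sum streampark-2.1.5.tar.gz | awk '{print $1}' > streampark-2.1.5.tar.gz.md5

stored=$(cat streampark-2.1.5.tar.gz.md5)
fresh=$(md5sum streampark-2.1.5.tar.gz | awk '{print $1}')
if [ "$stored" = "$fresh" ]; then
  echo "md5 OK"
else
  echo "md5 MISMATCH"
fi
```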

Modify the service_ddl.json configuration file

# Update the StreamPark version references
vi /opt/datasophon/datasophon-manager-1.2.1/conf/meta/DDP-1.2.1/STREAMPARK/service_ddl.json
{
  "name": "STREAMPARK",
  "label": "StreamPark",
  "description": "流處理極速開發框架,流批一體&湖倉一體的雲原生平臺,一站式流處理計算平臺",
  "version": "2.1.5",
  "sortNum": 13,
  "dependencies":[],
  "packageName": "streampark-2.1.5.tar.gz",
  "decompressPackageName": "streampark-2.1.5",
  "roles": [
    {
      "name": "StreamPark",
      "label": "StreamPark",
      "roleType": "master",
      "cardinality": "1",
      "logFile": "logs/streampark.out",
      "jmxPort": 10086,
      "startRunner": {
        "timeout": "60",
        "program": "bin/startup.sh",
        "args": [
        ]
      },
      "stopRunner": {
        "timeout": "600",
        "program": "bin/shutdown.sh",
        "args": [
        ]
      },
      "statusRunner": {
        "timeout": "60",
        "program": "bin/streampark.sh",
        "args": [
          "status"
        ]
      },
      "restartRunner": {
        "timeout": "60",
        "program": "bin/streampark.sh",
        "args": [
          "restart"
        ]
      },
      "externalLink": {
        "name": "StreamPark Ui",
        "label": "StreamPark Ui",
        "url": "http://${host}:${serverPort}"
      }
    }
  ],
  "configWriter": {
    "generators": [
      {
        "filename": "config.yaml",
        "configFormat": "custom",
        "outputDirectory": "conf",
        "templateName": "streampark.ftl",
        "includeParams": [
          "databaseUrl",
          "username",
          "password",
          "serverPort",
          "hadoopUserName",
          "workspaceLocal",
          "workspaceRemote"
        ]
      }
    ]
  },
  "parameters": [
    {
      "name": "databaseUrl",
      "label": "StreamPark資料庫地址",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "jdbc:mysql://${apiHost}:3306/streampark?useSSL=false&useUnicode=true&characterEncoding=UTF-8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8"
    },
    {
      "name": "username",
      "label": "StreamPark資料庫使用者名稱",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "root"
    },
    {
      "name": "password",
      "label": "StreamPark資料庫密碼",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "root"
    },
    {
      "name": "serverPort",
      "label": "StreamPark服務埠",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "10000"
    },
    {
      "name": "hadoopUserName",
      "label": "StreamPark Hadoop操作使用者",
      "description": "",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "root"
    },
    {
      "name": "workspaceLocal",
      "label": "StreamPark本地工作空間目錄",
      "description": "自行建立,用於存放專案原始碼,構建的目錄等",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "/data/streampark/workspace"
    },
    {
      "name": "workspaceRemote",
      "label": "StreamPark HDFS工作空間目錄",
      "description": "HDFS工作空間目錄",
      "configType": "map",
      "required": true,
      "type": "input",
      "value": "",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "hdfs://${dfs.nameservices}/user/yarn/nodeLabelsstreampark"
    }
  ]
}
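A syntax error in service_ddl.json will break service metadata loading after the manager restarts, so it is worth validating the JSON first. A sketch using python3's built-in json.tool, shown against a tiny stand-in file; point it at the real path on the manager node:

```shell
# Validate JSON syntax before restarting datasophon-api.
DDL=$(mktemp)   # stand-in for .../STREAMPARK/service_ddl.json
cat > "$DDL" <<'EOF'
{"name": "STREAMPARK", "version": "2.1.5", "decompressPackageName": "streampark-2.1.5"}
EOF

if python3 -m json.tool "$DDL" > /dev/null 2>&1; then
  result="valid"
else
  result="INVALID"
fi
echo "service_ddl.json: $result"
rm -f "$DDL"
```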

Modify the streampark.ftl file on every node

vi /opt/datasophon/datasophon-worker/conf/templates/streampark.ftl

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

logging:
  level:
    root: info

server:
  port: ${serverPort}
  session:
    # The user's login session has a validity period. If it exceeds this time, the user will be automatically logout
    # unit: s|m|h|d, s: second, m:minute, h:hour, d: day
    ttl: 2h # unit[s|m|h|d], e.g: 24h, 2d....
  undertow: # see: https://github.com/undertow-io/undertow/blob/master/core/src/main/java/io/undertow/Undertow.java
    buffer-size: 1024
    direct-buffers: true
    threads:
      io: 16
      worker: 256

# system database, default h2, mysql|pgsql|h2
datasource:
  dialect: mysql  #h2, mysql, pgsql
  h2-data-dir: ~/streampark/h2-data # if datasource.dialect is h2, you can configure the data dir
  # if datasource.dialect is mysql or pgsql, you need to configure the following connection information
  # mysql/postgresql/h2 connect user
  username: ${username}
  # mysql/postgresql/h2 connect password
  password: ${password}
  # mysql/postgresql connect jdbcURL
  # mysql example: datasource.url: jdbc:mysql://localhost:3306/streampark?useUnicode=true&characterEncoding=UTF-8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8
  # postgresql example: jdbc:postgresql://localhost:5432/streampark?stringtype=unspecified
  url: ${databaseUrl}

streampark:
  workspace:
    # Local workspace, storage directory of clone projects and compiled projects. Do not set it under $APP_HOME; set it to a directory outside of $APP_HOME.
    local: ${workspaceLocal}
    # The root hdfs path of the jars, Same as yarn.provided.lib.dirs for flink on yarn-application and Same as --jars for spark on yarn
    remote: ${workspaceRemote}
  proxy:
    # lark proxy address, default https://open.feishu.cn
    lark-url:
    # hadoop yarn proxy path, e.g: knox process address https://streampark.com:8443/proxy/yarn
    yarn-url:
  yarn:
    # flink on yarn or spark on yarn, monitoring job status from yarn, it is necessary to set hadoop.http.authentication.type
    http-auth: 'simple'  # default simple, or kerberos
  # flink on yarn or spark on yarn, HADOOP_USER_NAME
  hadoop-user-name: ${hadoopUserName}
  project:
    # Number of projects allowed to be running at the same time , If there is no limit, -1 can be configured
    max-build: 16
  #openapi white-list, You can define multiple openAPI, separated by spaces(" ") or comma(,).
  openapi.white-list:

# flink on yarn or spark on yarn, when the hadoop cluster enable kerberos authentication, it is necessary to set Kerberos authentication parameters.
security:
  kerberos:
    login:
      debug: false
      enable: false
      keytab:
      krb5:
      principal:
    ttl: 2h # unit [s|m|h|d]

# sign streampark with ldap.
ldap:
  base-dn: dc=streampark,dc=com  # Login Account
  enable: false  # ldap enabled
  username: cn=Manager,dc=streampark,dc=com
  password: streampark
  urls: ldap://99.99.99.99:389 #AD server IP, default port 389
  user:
    email-attribute: mail
    identity-attribute: uid

Restart

Restart the worker on every node:

sh /opt/datasophon/datasophon-worker/bin/datasophon-worker.sh restart worker

Restart the api on the master node:

sh /opt/datasophon/datasophon-manager-1.2.1/bin/datasophon-api.sh restart api

Manually create the database and run the initialization SQL

Initialize the StreamPark database:

mysql -u root -p -e "CREATE DATABASE streampark DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci"

Run the SQL scripts copied earlier into /opt/datasophon/DDP/packages to create and seed the streampark tables (schema first, then data):

use streampark;
source /opt/datasophon/DDP/packages/streampark_mysql-schema.sql
source /opt/datasophon/DDP/packages/streampark_mysql-data.sql
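The order matters: streampark_mysql-data.sql inserts into tables that streampark_mysql-schema.sql creates, so the schema script must run first. The same initialization can be done non-interactively from the shell (a sketch; credentials and paths follow the earlier steps, and -p will prompt for the password):

```shell
# Apply the copied SQL scripts in order, skipping any that are missing.
PKG=/opt/datasophon/DDP/packages
for f in "$PKG/streampark_mysql-schema.sql" "$PKG/streampark_mysql-data.sql"; do
  if [ -f "$f" ]; then
    mysql -u root -p streampark < "$f"
  else
    echo "missing SQL script: $f" >&2
  fi
done
```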

Install StreamPark

Add the StreamPark service in the DataSophon console.

Assign the StreamPark role, choosing which node machine StreamPark should be installed on.

Click Next, then adjust the StreamPark configuration to match your environment.

With that, StreamPark 2.1.5 is successfully integrated into DataSophon!
