Flink: Using the Shell Terminal

Posted by 蔡大遠 on 2020-10-21

Preface

In Spark, you can use the shell to debug and test Scala code interactively. Flink ships a similar shell.

scala-shell local mode

Flink ships with an integrated interactive Scala Shell. It can be used in both local mode and cluster mode.

To use the shell with an integrated Flink cluster, simply run:

bin/start-scala-shell.sh local

Note: this command brings up Flink's execution environment itself, so you do not need to start a Flink cluster beforehand.

The scala-shell built-in environments

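The shell supports both Batch and Streaming processing. On startup, two separate execution environments are pre-bound automatically. Use the variables "benv" and "senv" to access the Batch and Streaming environments respectively.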

Using the Batch environment

Running wordcount in the Scala shell

Start the shell: bin/start-scala-shell.sh local
scala> val text = benv.fromElements(
     |   "To be, or not to be,--that is the question:--",
     |   "Whether 'tis nobler in the mind to suffer",
     |   "The slings and arrows of outrageous fortune",
     |   "Or to take arms against a sea of troubles,")
text: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@479f738a

scala> val counts = text
counts: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@479f738a

scala> val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = org.apache.flink.api.scala.AggregateDataSet@44f4c619

scala> counts.print()
(a,1)
(against,1)
(and,1)
(arms,1)
(arrows,1)
(be,2)
(fortune,1)
(in,1)
(is,1)
(mind,1)
(nobler,1)
(not,1)
(of,2)
(or,2)
(outrageous,1)
(question,1)
(sea,1)
(slings,1)
(suffer,1)
(take,1)
(that,1)
(the,3)
(tis,1)
(to,4)
(troubles,1)
(whether,1)

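The print() call automatically submits the job to the JobManager for execution and shows the result in the terminal.
You can also write the result to a file. In that case, however, you need to call execute to run your program: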

Scala-Flink> benv.execute("MyProgram")

Note: the output is printed to the terminal only in local mode. In cluster mode, it is not printed to the terminal.
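To make the batch pipeline above easier to follow, here is a minimal sketch of the same steps using plain Scala collections, with no Flink dependency. The object name BatchWordCountSketch and the helper wordCount are hypothetical names introduced only for illustration. The sketch mimics what flatMap, map, groupBy(0) and sum(1) compute, not how Flink executes them.

```scala
object BatchWordCountSketch {
  // Plain-collections analogue of:
  //   text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
  // (hypothetical helper, not part of the Flink API)
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\W+").toSeq) // lowercase each line, split on non-word characters
      .map(word => (word, 1))                     // pair every word with an initial count of 1
      .groupBy(_._1)                              // group the pairs by the word (tuple field 0)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the counts (tuple field 1)

  def main(args: Array[String]): Unit = {
    val text = Seq(
      "To be, or not to be,--that is the question:--",
      "Whether 'tis nobler in the mind to suffer",
      "The slings and arrows of outrageous fortune",
      "Or to take arms against a sea of troubles,")
    // Sort by word so the printed order is stable
    wordCount(text).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Unlike the Flink version, this runs entirely in one JVM thread and returns a Map directly, so nothing has to trigger execution.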

Using the Streaming environment

Computing wordcount with the DataStream API in the Scala shell

scala> val textStreaming = senv.fromElements(
     |   "To be, or not to be,--that is the question:--",
     |   "Whether 'tis nobler in the mind to suffer",
     |   "The slings and arrows of outrageous fortune",
     |   "Or to take arms against a sea of troubles,")
textStreaming: org.apache.flink.streaming.api.scala.DataStream[String] = org.apache.flink.streaming.api.scala.DataStream@22717282

scala> val countsStreaming = textStreaming .flatMap { _.toLowerCase.split("\\W+") } .map { (_, 1) }.keyBy(0).sum(1)
countsStreaming: org.apache.flink.streaming.api.scala.DataStream[(String, Int)] = org.apache.flink.streaming.api.scala.DataStream@4daa4a5a

scala> countsStreaming.print()
res7: org.apache.flink.streaming.api.datastream.DataStreamSink[(String, Int)] = org.apache.flink.streaming.api.datastream.DataStreamSink@7d957c96

scala> senv.execute("Streaming Wordcount")
(to,1)
(be,1)
(or,1)
(not,1)
(to,2)
(be,2)
(that,1)
(is,1)
(the,1)
(question,1)
(whether,1)
(tis,1)
(nobler,1)
(in,1)
(the,2)
(mind,1)
(to,3)
(suffer,1)
(the,3)
(slings,1)
(and,1)
(arrows,1)
(of,1)
(outrageous,1)
(fortune,1)
(or,2)
(to,4)
(take,1)
(arms,1)
(against,1)
(a,1)
(sea,1)
(of,2)
(troubles,1)
res8: org.apache.flink.api.common.JobExecutionResult = org.apache.flink.api.common.JobExecutionResult@7c950d4f

Note: in the Streaming environment, print() does not trigger execution by itself. The job only runs once senv.execute is called.
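The streaming output above differs from the batch output: each incoming word emits an updated running count, which is why (to,1), (to,2), (to,3) and (to,4) all appear. The following is a minimal single-threaded simulation of that behaviour in plain Scala, with no Flink dependency. The names StreamingWordCountSketch and runningCounts are hypothetical and used only for illustration. It approximates the observable result of keyBy(0).sum(1), not Flink's actual state handling.

```scala
object StreamingWordCountSketch {
  // Emits one (word, runningCount) pair per incoming word, in arrival order.
  // (hypothetical helper, not part of the Flink API)
  def runningCounts(lines: Seq[String]): Seq[(String, Int)] = {
    val words = lines.flatMap(_.toLowerCase.split("\\W+").toSeq)
    // Carry two things through the fold: the per-word state and the records emitted so far
    val (_, emitted) =
      words.foldLeft((Map.empty[String, Int], Vector.empty[(String, Int)])) {
        case ((state, out), word) =>
          val updated = state.getOrElse(word, 0) + 1 // update the state kept for this key
          (state.updated(word, updated), out :+ (word -> updated))
      }
    emitted
  }

  def main(args: Array[String]): Unit = {
    val text = Seq(
      "To be, or not to be,--that is the question:--",
      "Whether 'tis nobler in the mind to suffer")
    runningCounts(text).foreach(println)
  }
}
```

In a real Flink job running with parallelism greater than one, records for different keys may interleave in a different order. The simulation assumes a single ordered stream.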

bin/start-scala-shell.sh --help

The help output of this command shows that it also supports the remote and yarn cluster modes, and that it can add external dependency JARs.
