Spark SQL, regular expressions, regexp_replace

Posted by weixin_34019929 on 2018-06-03
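import org.apache.spark.sql.functions.{col, regexp_replace}

// Build the alternation BLACK|WHITE|RED|GREEN|BLUE from the color names.
val simpleColors = Seq("black", "white", "red", "green", "blue")
val regexString = simpleColors.map(_.toUpperCase).mkString("|")
// Replace every matched color name in Description with the literal "COLOR".
df.select(
  regexp_replace(col("Description"), regexString, "COLOR").as("color_clean"),
  col("Description")
).show(2)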
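// Same query in SQL (df registered as the view dfTable); the pattern is uppercase because the match is case-sensitive.
spark.sql("""SELECT regexp_replace(Description, 'BLACK|WHITE|RED|GREEN|BLUE', 'COLOR')
  AS color_clean, Description FROM dfTable""").show(2)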
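To see what the pattern does without starting a Spark session, here is a minimal local sketch. It assumes that regexp_replace follows Java regular-expression semantics, so plain String.replaceAll is used as a stand-in. The names RegexDemo and clean, and the sample descriptions, are hypothetical and only serve the illustration.

```scala
object RegexDemo {
  // Same construction as in the snippet: an uppercase alternation of color names.
  val simpleColors: Seq[String] = Seq("black", "white", "red", "green", "blue")
  val regexString: String = simpleColors.map(_.toUpperCase).mkString("|")

  // Hypothetical helper: local stand-in for regexp_replace(col, regexString, "COLOR").
  def clean(description: String): String =
    description.replaceAll(regexString, "COLOR")

  def main(args: Array[String]): Unit = {
    println(regexString)
    // Uppercase color names are replaced.
    println(clean("WHITE METAL LANTERN"))
    // Lowercase text is left untouched because the match is case-sensitive.
    println(clean("white metal lantern"))
  }
}
```

If the Description values may be in mixed case, one option is to prefix the pattern with the inline flag (?i), which Java regular expressions support for case-insensitive matching.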
