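The Way of Spark Mastery (Basics): Linux Fundamentals for Big Data Development, Section 15: Basic Regular Expressions (Part 1)

Posted by 五柳-先生 on 2015-11-15

References: 鳥哥的LINUX私房菜基礎學習篇 (3rd edition) 
Linux Shell Scripting Cookbook

Main contents of this section

  1. Basic regular expressions

1. Basic regular expressions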

(1) ^ start-of-line anchor

^ matches the start of a line. For example, '^Spark' matches every line that begins with Spark.

<pre><code># grep -n prints the line number of each matching line
root@sparkslave02:~/ShellLearning# grep -n '^Spark' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
3:Spark is a fast and general cluster computing system for Big Data. It provides
22:Spark is built using [Apache Maven](http://maven.apache.org/).
53:Spark also comes with several sample programs in the `examples` directory.
83:Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
</code></pre>
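
To try the start-of-line anchor without the Spark README at hand, the sketch below pipes three made-up lines into grep. The sample text and the variable names `sample` and `start_matches` are assumptions for illustration only.

```shell
# Hypothetical sample text standing in for README.md
sample=$(printf '%s\n' '# Apache Spark' 'Spark is fast' 'Uses Spark SQL')

# ^ anchors the pattern at the start of the line; -n prints line numbers
start_matches=$(printf '%s\n' "$sample" | grep -n '^Spark')
echo "$start_matches"   # prints: 2:Spark is fast
```

Only the second line is selected, because the other two contain Spark but do not begin with it.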

(2) $ end-of-line anchor

$ matches the end of a line. For example, 'Spark$' matches every line that ends with Spark.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n 'Spark$' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
20:## Building Spark</code></pre>
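
The two anchors can also be combined. As a small sketch on made-up input (the sample lines and variable names are hypothetical), the pattern '^$' selects empty lines, since nothing may appear between the start and the end of the line.

```shell
# Hypothetical sample text; the second line is empty
sample=$(printf '%s\n' '# Apache Spark' '' 'Spark is fast' 'Building Spark')

# $ anchors the pattern at the end of the line
end_matches=$(printf '%s\n' "$sample" | grep -n 'Spark$')
echo "$end_matches"

# ^ immediately followed by $ matches only empty lines; -c counts matching lines
blank_count=$(printf '%s\n' "$sample" | grep -c '^$')
echo "$blank_count"
```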

(3) . matches any single character

For example, Spa.k matches Spark, Spaak, and so on.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n 'Spa.k' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
3:Spark is a fast and general cluster computing system for Big Data. It provides
6:rich set of higher-level tools including Spark SQL for SQL and
(remaining output omitted)
</code></pre>
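
Because an unescaped . stands for any character, a literal dot has to be written as \. instead. The sketch below contrasts the two on made-up lines; the sample text and variable names are assumptions.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'a.b' 'axb' 'ab')

# Unescaped . matches any single character, so a.b and axb both match
any_char=$(printf '%s\n' "$sample" | grep -n 'a.b')
echo "$any_char"

# \. matches only a literal dot
literal_dot=$(printf '%s\n' "$sample" | grep -n 'a\.b')
echo "$literal_dot"   # prints: 1:a.b
```

The line ab matches neither pattern, because both require exactly one character between a and b.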

The command above does not match lowercase spark. To match it as well, use:

<pre><code># The -i option ignores case
root@sparkslave02:~/ShellLearning# grep -in 'Spa.k' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md</code></pre>
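
A quick way to see the effect of -i is to run the same pattern with and without it. This is a sketch on hypothetical input, and the variable names are illustrative.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'Spark' 'spark' 'SPARK' 'Spaak' 'Spak')

# Case-sensitive: only lines with a capital S followed by lowercase pa.k
case_sensitive=$(printf '%s\n' "$sample" | grep -n 'Spa.k')
echo "$case_sensitive"

# -i ignores case for every letter in the pattern
ignore_case=$(printf '%s\n' "$sample" | grep -in 'Spa.k')
echo "$ignore_case"
```

Spak is never selected, since the . still requires one character between Spa and k.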

(4) [] matches any one of the enclosed characters

[Ss]park matches only Spark and spark.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n '[Ss]park' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
3:Spark is a fast and general cluster computing system for Big Data. It provides
6:rich set of higher-level tools including Spark SQL for SQL and DataFrames,
8:and Spark Streaming for stream processing.
10:&lt;http://spark.apache.org/&gt;
(remaining output omitted)
</code></pre>
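
Unlike -i, a bracket expression relaxes the match for one position only. The sketch below uses made-up lines and the -o option (available in GNU grep, which is assumed here) to print just the matched text.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'Spark and spark' 'SPARK' 'park')

# Only the first character may vary between S and s
bracket_lines=$(printf '%s\n' "$sample" | grep -n '[Ss]park')
echo "$bracket_lines"   # prints: 1:Spark and spark

# -o prints each matched piece on its own line
matched_text=$(printf '%s\n' "$sample" | grep -o '[Ss]park')
echo "$matched_text"
```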

(5) [^] matches any one character that is not listed in the brackets

For example, '[^T]he' does not match The, but it matches the, che, and so on.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n '[^T]he' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
</code></pre>
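
One detail worth checking by hand is that [^T] still has to consume a character, so he at the very start of a line does not match. The sample lines and the variable name below are hypothetical.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'The end' 'the end' 'he' 'ache')

# [^T] matches exactly one character that is not T, followed by he
negated=$(printf '%s\n' "$sample" | grep -n '[^T]he')
echo "$negated"
```

The lines "The end" and "he" are not selected: in the first, he is preceded by T, and in the second, nothing precedes it.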

(6) [-] matches any one character in the given range

For example, [a-h]he matches only ahe, bhe, che, ..., hhe. It does not match ihe, the, and so on.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n '[a-h]he' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
6:rich set of higher-level tools including Spark SQL for SQL and DataFrames,
10:&lt;http://spark.apache.org/&gt;
16:guide, on the [project web page](http://spark.apache.org/documentation.html)
</code></pre>
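
The output above is explained by substrings such as che in Apache and ghe in higher. A small sketch on made-up words shows the same thing. Setting LC_ALL=C is an assumption added here so that the range a-h follows plain ASCII order regardless of the system locale.

```shell
# Hypothetical sample words
sample=$(printf '%s\n' 'Apache' 'higher' 'the' 'she')

# LC_ALL=C makes [a-h] mean exactly the ASCII letters a through h
range_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[a-h]he')
echo "$range_matches"
```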

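(7) ? matches zero or one occurrence

For example, t?he matches only he and the. It never matches tthe as a whole, although grep still selects a line containing tthe because that line contains the.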

<pre><code># In grep's default basic syntax, ? acts as a quantifier only when written as \?
root@sparkslave02:~/ShellLearning# grep -n 't\?he' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
6:rich set of higher-level tools including Spark SQL for SQL and DataFrames,
10:&lt;http://spark.apache.org/&gt;
15:You can find the latest Spark documentation, including a programming
16:guide, on the [project web page](http://spark.apache.org/documentation.html)
(remaining output omitted)
</code></pre>
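
Since grep selects any line that contains a match, the effect of the quantifier is easier to see when the whole line must match. The sketch below assumes GNU grep (where \? is supported in basic syntax) and uses -x, which requires the pattern to cover the entire line. The sample words are made up.

```shell
# Hypothetical sample words
sample=$(printf '%s\n' 'he' 'the' 'tthe' 'hat')

# Substring search: tthe is selected because it contains the
contains=$(printf '%s\n' "$sample" | grep -n 't\?he')
echo "$contains"

# -x requires the whole line to match, so tthe is rejected
whole_line=$(printf '%s\n' "$sample" | grep -nx 't\?he')
echo "$whole_line"
```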

(8) + matches one or more occurrences

'S+park' matches Spark, SSpark, SSSpark, and so on.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n 'S\+park' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
</code></pre>
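
The same quantifier can be spelled in several ways. As a sketch assuming GNU grep, the three commands below should select the same lines: \+ in basic syntax, + with -E (extended syntax), and the portable form SS*. The sample words and variable names are hypothetical.

```shell
# Hypothetical sample words
sample=$(printf '%s\n' 'park' 'Spark' 'SSpark')

# Basic syntax: the quantifier is written \+ (a GNU extension)
bre_plus=$(printf '%s\n' "$sample" | grep -n 'S\+park')
# Extended syntax (-E): the quantifier is written +
ere_plus=$(printf '%s\n' "$sample" | grep -nE 'S+park')
# Portable equivalent: one S followed by zero or more S
portable=$(printf '%s\n' "$sample" | grep -n 'SS*park')

echo "$bre_plus"
```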

(9) * matches zero or more occurrences

'S*park' matches park, Spark, SSpark, SSSpark, and so on.

<pre><code>root@sparkslave02:~/ShellLearning# grep -n 'S*park' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
3:Spark is a fast and general cluster computing system for Big Data. It provides
6:rich set of higher-level tools including Spark SQL for SQL and DataFrames,
8:and Spark Streaming for stream processing.
10:&lt;http://spark.apache.org/&gt;
15:You can find the latest Spark documentation, including a programming
(remaining output omitted)</code></pre>
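
Because S* may match nothing at all, 'S*park' selects every line that contains park, which is why lowercase spark URLs appear in the output above. The sketch below, on made-up input and assuming GNU grep for -o, also shows that the quantifier takes as many S characters as are available.

```shell
# Hypothetical sample words
sample=$(printf '%s\n' 'park' 'Spark' 'xSSpark')

# Zero S characters are allowed, so all three lines are selected
star_lines=$(printf '%s\n' "$sample" | grep -n 'S*park')
echo "$star_lines"

# -o shows the matched text: both S characters are included
longest=$(printf '%s\n' 'xSSpark' | grep -o 'S*park')
echo "$longest"   # prints: SSpark
```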

(10) {n} matches exactly n occurrences

For example, [a-z]{3} matches any three lowercase letters and is equivalent to [a-z][a-z][a-z].

<pre><code>root@sparkslave02:~/ShellLearning# grep -n '[a-z]\{3\}' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
1:# Apache Spark
3:Spark is a fast and general cluster computing system for Big Data. It provides
</code></pre>
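
The stated equivalence can be checked directly. In this sketch the sample lines are made up, and LC_ALL=C is an added assumption so that [a-z] covers only the ASCII lowercase letters.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'ab' 'abc' 'AB12cd' 'abcd')

# \{3\} repeats the preceding bracket expression exactly three times
brace=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[a-z]\{3\}')
# The same pattern written out by hand
expanded=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[a-z][a-z][a-z]')

echo "$brace"
```

The line abcd is selected as well, because it contains three consecutive lowercase letters.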

(11) Other bounded repetitions

{n,} matches at least n occurrences 
{n,m} matches at least n and at most m occurrences
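
As a brief sketch of both forms, the commands below use -x so that the whole line must fit within the bounds. The braces are escaped because grep's default basic syntax is assumed, and the sample lines are hypothetical.

```shell
# Hypothetical sample lines: one to four a characters
sample=$(printf '%s\n' 'a' 'aa' 'aaa' 'aaaa')

# \{2,\}: at least two occurrences
at_least_two=$(printf '%s\n' "$sample" | grep -nx 'a\{2,\}')
echo "$at_least_two"

# \{2,3\}: at least two and at most three occurrences
two_to_three=$(printf '%s\n' "$sample" | grep -nx 'a\{2,3\}')
echo "$two_to_three"
```

Without -x, the second command would also select aaaa, since that line contains a run of three a characters.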

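(13) The escape character \

On Ubuntu Linux, grep treats ?, +, (, ), {, and } as ordinary characters when they appear without a backslash in a regular expression. To use them as regular expression operators, escape them with \, as the earlier examples show.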
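
The sketch below illustrates this rule with +, and contrasts it with the -E option, where the roles are reversed. GNU grep is assumed, and the sample lines and variable names are made up.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'a+b' 'aab')

# Basic syntax: a bare + is an ordinary character
bre_literal=$(printf '%s\n' "$sample" | grep -n 'a+b')
# Basic syntax: \+ is the "one or more" operator
bre_operator=$(printf '%s\n' "$sample" | grep -n 'a\+b')
# Extended syntax (-E): a bare + is the operator, \+ is the ordinary character
ere_operator=$(printf '%s\n' "$sample" | grep -nE 'a+b')
ere_literal=$(printf '%s\n' "$sample" | grep -nE 'a\+b')

echo "$bre_literal"    # prints: 1:a+b
echo "$bre_operator"   # prints: 2:aab
```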

(14) () groups a sequence of characters

For example, Sp\(ar\)\?k matches Spark and Spk.

<pre><code>root@sparkslave02:~/ShellLearning# echo "Spark Spk Spak" | grep -n 'Sp\(ar\)\?k'
1:Spark Spk Spak
</code></pre>
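
In the run above the whole line is printed, so it is not obvious which words matched. As a sketch assuming GNU grep, adding -o prints only the matched pieces. The variable name `grouped` is illustrative.

```shell
# -o prints each match on its own line instead of the whole input line
grouped=$(echo 'Spark Spk Spak' | grep -o 'Sp\(ar\)\?k')
echo "$grouped"
```

Spak is not printed, because the group ar is optional only as a unit: either both characters appear or neither does.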

(15) Hands-on: matching URLs

<pre><code>root@sparkslave02:~/ShellLearning/Chapter15# grep -n '[A-Za-z]*://[A-Za-z]*\.\(\([A-Za-z]*\)\.\?\)*' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md
</code></pre>

The full example above can be built up in the following steps: 
(1) Match the scheme, http://

<pre><code>root@sparkslave02:~/ShellLearning/Chapter15# grep -n '[A-Za-z]*://' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md</code></pre>

(2) Match the domain name

<pre><code>root@sparkslave02:~/ShellLearning/Chapter15# grep -n '[A-Za-z]*://[A-Za-z]*\.[A-Za-z]*' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md</code></pre>

(3) Handle the repeated part

<pre><code>root@sparkslave02:~/ShellLearning/Chapter15# grep -n '[A-Za-z]*://[A-Za-z]*\.\(\([A-Za-z]*\)\.\?\)*' /hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md</code></pre>
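
To see what each step adds, the three patterns can be run with -o on a single line. This is a sketch assuming GNU grep. The input is line 16 of the README as shown in the earlier output, and the variable names are illustrative.

```shell
# One line taken from the README output shown earlier
line='guide, on the [project web page](http://spark.apache.org/documentation.html)'

# Step 1: letters followed by ://
scheme=$(printf '%s\n' "$line" | grep -o '[A-Za-z]*://')
# Step 2: add one dot-separated pair of labels
domain=$(printf '%s\n' "$line" | grep -o '[A-Za-z]*://[A-Za-z]*\.[A-Za-z]*')
# Step 3: allow any number of further labels
full=$(printf '%s\n' "$line" | grep -o '[A-Za-z]*://[A-Za-z]*\.\(\([A-Za-z]*\)\.\?\)*')

echo "$scheme"
echo "$domain"
echo "$full"
```

The final pattern stops at the first character that is neither a letter nor a dot, so the path after the host name is not included, and host names containing digits or hyphens would be cut short.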

Reposted from: http://blog.csdn.net/lovehuangjiaju/article/details/48952457
