In a SQL statement, WHERE must come before GROUP BY

Posted by 一隻勤奮愛思考的豬 on 2018-08-08

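GROUP BY works on the rows that the earlier part of the query has already selected: the WHERE filter is applied first, and only the rows that pass it are grouped and aggregated.
So any condition on individual rows must be written in a WHERE clause placed before GROUP BY. A condition on the grouped result, such as a count, goes in a HAVING clause after GROUP BY.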

select litigant_name,count(1) as defendant_judgedoc_cnt from df1 where litigant_type = '被告' group by litigant_name 
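
A minimal sketch of the query above that runs without a Spark cluster. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL (an assumption made for portability), and the rows loaded into df1 are made-up sample data, not the original dataset.

```python
import sqlite3

# In-memory database standing in for the Spark table df1 (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("create table df1 (litigant_name text, litigant_type text)")
conn.executemany(
    "insert into df1 values (?, ?)",
    [
        ("A", "被告"),
        ("A", "被告"),
        ("A", "原告"),
        ("B", "被告"),
        ("C", "原告"),
    ],
)

# WHERE filters rows first; GROUP BY then groups only the rows that remain.
query = """
    select litigant_name, count(1) as defendant_judgedoc_cnt
    from df1
    where litigant_type = '被告'
    group by litigant_name
    order by litigant_name
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('A', 2), ('B', 1)]
```

The '原告' rows are dropped before grouping, so "C" does not appear at all and "A" is counted twice rather than three times.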

If WHERE is written after GROUP BY, Spark SQL fails to parse the statement and reports the following error:
[192.168.31.10] out:     select litigant_name,count(1) as defendant_judgedoc_cnt from df1 group by litigant_name where litigant_type = '被告'""")
[192.168.31.10] out:   File "/opt/local/spark-2.0.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/session.py", line 543, in sql
[192.168.31.10] out:   File "/opt/local/spark-2.0.1-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
[192.168.31.10] out:   File "/opt/local/spark-2.0.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 73, in deco
[192.168.31.10] out: pyspark.sql.utils.ParseException: u"\nmismatched input 'where' expecting {<EOF>, ',', '.', '[', 'GROUPING', 'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 'IS', 'WINDOW', 'WITH', 'UNION', 'EXCEPT', 'INTERSECT', EQ, '<=>', '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', '^', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 2, pos 100)\n\n== SQL ==\n\n            select litigant_name,count(1) as defendant_judgedoc_cnt from df1 group by litigant_name where litigant_type = '\u88ab\u544a'\n----------------------------------------------------------------------------------------------------^^^\n"
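
The parser's list of expected tokens includes 'HAVING' but not 'WHERE', which reflects the fixed clause order: WHERE, then GROUP BY, then HAVING. The sketch below reproduces the failure and shows the accepted form. It uses Python's built-in sqlite3 module rather than Spark (an assumption, so the exception type and message differ from Spark's ParseException), and the table contents are hypothetical sample rows.

```python
import sqlite3

# In-memory stand-in for the Spark table df1 (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("create table df1 (litigant_name text, litigant_type text)")
conn.executemany(
    "insert into df1 values (?, ?)",
    [("A", "被告"), ("A", "被告"), ("B", "被告"), ("C", "原告")],
)

# WHERE placed after GROUP BY: SQLite rejects this as a syntax error too.
bad_query = (
    "select litigant_name, count(1) as defendant_judgedoc_cnt "
    "from df1 group by litigant_name where litigant_type = '被告'"
)
try:
    conn.execute(bad_query)
    misplaced_where_failed = False
except sqlite3.OperationalError as exc:
    misplaced_where_failed = True
    print("rejected:", exc)

# Row-level filter in WHERE (before GROUP BY), group-level filter in HAVING (after).
good_query = """
    select litigant_name, count(1) as defendant_judgedoc_cnt
    from df1
    where litigant_type = '被告'
    group by litigant_name
    having count(1) >= 2
"""
rows = conn.execute(good_query).fetchall()
print(rows)  # [('A', 2)]
```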
