我們知道sql執行是一個複雜的過程,從sql到邏輯計劃,到物理計劃,規則重組,優化,執行引擎,都是很複雜的。尤其是優化一節,更是內容繁多。那麼,是否我們本篇要來討論這個問題呢?答案是否定的,我們只特定場景的特定優化問題。
1. 應用場景描述
應用場景是:我們提供一個功能,允許使用者從提供的欄位列表中,選擇任意欄位,然後配置好規則,後端根據這些規則,查詢出相應的主鍵資料出來。可以理解為簡單的視覺化查詢元件。
2. 整體思路解析
一般地,為了讓前端規則支援任意配置,我們基本很難做到一種特定的資料結構,將每個查詢條件拆分到一個個的rdb表欄位中。所以,簡單地,就是讓前端超脫提交一個使用者的整個查詢規則上來就好了,這和資料庫中的sql其實是一個道理,只是這裡只有where條件。所以,我們要做的就是,如何根據where條件,構建出相應的sql問題了。
只有where條件規則的模式,既有好處也有不好的。首先說不好處,就是隻有where條件,如何構建其他部分相對麻煩點,但因為我們蛤考慮主鍵查詢,所以相對簡單。其次好處是,我們可以在這個where條件的後面,任意轉換成各種資料庫查詢,即我們儲存層是可替換的,這就給了我們很多想像的空間。比如,最開始業務量小,我們用簡單rdb,後來可以換成es,再可以換成其他資料庫。這就很方便了。
那麼,如何從一串where條件中,提取出關鍵資訊,然後反解出整體概念呢?很簡單,只需要將條件分詞,分析一下就知道了。更直接的,我們從條件中提取出相應的後設資料資訊,再根據這些後設資料就可以推匯出上下文了。然後,再連線上where條件,自然就構建出完整的查詢語句了。
但是有個問題,如果我們的整體where結構是不變的,那麼好辦,直接拼接或者簡單的改寫區域性即可。但,如果我們想拆解成儘可能小的執行步驟呢?也就是本文標題指向,想盡可能把符合一個表的條件放在一塊,甚至分解一家大的where條件為多個子查詢,那又該如何呢?
解題思路:
1. 分解出所有的後設資料資訊;
2. 構建出所有查詢的抽象語法樹,樹儘可能簡單;
3. 根據後設資料資訊,將同表的查詢條件聚合在一起;
4. 根據條件運算優先順序,將相同優化級的條件聚合在一起;
5. 重新構建整體sql;
說得有點抽象,來個圖表示一下。
規則定義為:a1.f1 > 1 and a2.f2 < 0 and a1.f3 <> 'x' or a2.f4 > 'q'
原始規則樹如下:
改寫後的規則如下:
3. 實現步驟細節
大體上就是幹這麼一件事,以及如何幹成。看起來好像蠻簡單的,相信聰明如你也許早就做出來了。但是我還是要提醒大家,需要注意的事情其實也有那麼幾件。
1. 分詞倒是簡單,但也得做;
2. 如何構建原始二叉樹?注意需要保持各優先順序問題;
3. 改寫的前提是什麼?不是所有改寫都能成立;
也許,我們應該用到一些開源的框架,以便讓我們事半功倍。比如,我使用calcite框架的實現思路,將各單詞構建出二叉樹,因為calcite中有各種測試好的優先順序定義,我只拿過來就即可。比如我暫且定義優先順序如下:
/** * 獲取各操作符的優先順序定義 * and -> 24 * or -> 22 * <,<=,=,>,>= -> 30 * NOT -> 26 * IN -> 32 * IS_NULL -> 58 * IS_NOT_NULL -> - * LIKE -> 32 * BETWEEN -> 32 * +,- -> 40 * /,* -> 60 * between..and -> 90 * * @see PrecedenceClimbingParser #highest() calcite 優先順序定義 * org.apache.calcite.sql.fun.SqlStdOperatorTable */ private static int getTokenPrecedence(SyntaxToken token) { if(token.getTokenType() == SyntaxTokenTypeEnum.COMPARE_OPERATOR) { return 30; } String keyword = token.getRawWord(); if("and".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_AND); return 24; } if("or".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_OR); return 22; } if("in".equalsIgnoreCase(keyword) || "not in".equalsIgnoreCase(keyword)) { return 32; } if("is".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_IS); return 58; } if("like".equalsIgnoreCase(keyword) || "not like".equalsIgnoreCase(keyword)) { return 32; } if("+".equalsIgnoreCase(keyword) || "-".equalsIgnoreCase(keyword)) { return 40; } if("*".equalsIgnoreCase(keyword) || "/".equalsIgnoreCase(keyword)) { return 60; } if(token.getTokenType() == SyntaxTokenTypeEnum.SQL_BETWEEN_AND) { return 90; } // 非操作符 return -1; }
然後,我需要假設場景再簡單些,比如所有小規則都被括號包裹(這其實就不太靈活了,不過沒關係,至少我們實踐初期是可行的)。這樣的話,構建原始規則樹就容易了。
/** * 按照括號進行語法分隔構建語法樹 * * @param rawList 原始單詞列表 * @param treeList 目標樹存放處 * @return 當次處理讀取的單詞數 */ private static int buildByPart(List<Object> rawList, List<PrecedenceClimbingParser.Token> treeList) { int len = rawList.size(); PrecedenceClimbingParser parser; PrecedenceClimbingParser.Builder builder = new PrecedenceClimbingParser.Builder(); int i = 0; for (i = 0; i < len; i++) { Object stmtOrToken = rawList.get(i); if(stmtOrToken instanceof SyntaxStatement) { String stmtRawWord = stmtOrToken.toString(); if(stmtRawWord.startsWith("between ") || stmtRawWord.startsWith("case ")) { if(stmtRawWord.startsWith("between ")) { SyntaxToken op = new SyntaxToken("[between]", SyntaxTokenTypeEnum.SQL_BETWEEN_AND); addOperatorToParserBuilder(builder, op); } else if(stmtRawWord.startsWith("case ")) { SyntaxToken op = new SyntaxToken("[case..when]", SyntaxTokenTypeEnum.SQL_CASE_WHEN); addOperatorToParserBuilder(builder, op); } builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken)); // 立即觸發一次運算構建 parser = builder.build(); parser.partialParse(); PrecedenceClimbingParser.Token token = parser.all().get(0); treeList.add(token); continue; } builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken)); continue; } SyntaxToken raw1 = (SyntaxToken)rawList.get(i); if("in".equals(raw1.getRawWord()) || ("not".equals(raw1.getRawWord()) && "in".equals(((SyntaxToken)rawList.get(i + 1)).getRawWord()))) { // field in (1, 2, 3...) i++; if("not".equals(raw1.getRawWord())) { ++i; raw1 = new SyntaxToken("not in", SyntaxTokenTypeEnum.SQL_NOT_IN); } if("(".equals(rawList.get(i).toString())) { List<MySqlNode> inList = new ArrayList<>(); do { ++i; Object stmtOrToken2 = rawList.get(i); if(stmtOrToken2 instanceof SyntaxStatement) { builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken2)); continue; } SyntaxToken nextToken = (SyntaxToken) stmtOrToken2; if(")".equals(nextToken.getRawWord())) { break; } if(",".equals(nextToken.getRawWord())) { continue; } inList.add(new MySqlLiteral(nextToken.getTokenType().name(), nextToken)); } while (i < len); // 新增 in 解析 addOperatorToParserBuilder(builder, raw1); // 新增 in item 解析 MySqlNodeList nodeList = new MySqlNodeList(inList); builder.atom(nodeList); continue; } // 位置還原,理論上已經出現不支援的語法 i--; } if ("(".equals(raw1.getRawWord())) { // 遞迴進入 int skipLen = buildByPart(rawList.subList(i + 1, len), treeList); i += skipLen + 1; PrecedenceClimbingParser.Token innerToken = treeList.get(treeList.size() - 1); if(innerToken instanceof PrecedenceClimbingParser.Call) { PrecedenceClimbingParser.Call call = (PrecedenceClimbingParser.Call) innerToken; builder.call(call.op, call.args.get(0), call.args.get(1)); continue; } else if(innerToken != null && innerToken.type == PrecedenceClimbingParser.Type.ATOM) { builder.atom(innerToken.o); continue; } log.warn("非call的構造返回,請檢查, {}", innerToken); throw new BizException("非call的構造返回"); } if(")".equals(raw1.getRawWord())) { // 彈出返回 parser = builder.build(); parser.partialParse(); if(parser.all().isEmpty()) { throw new BizException("不支援空括號(無規則)配置,請檢查!"); } PrecedenceClimbingParser.Token token = parser.all().get(0); treeList.add(token); break; } if("not".equals(raw1.getRawWord())) { if("like".equals(((SyntaxToken)rawList.get(i + 1)).getRawWord())) { ++i; raw1 = new SyntaxToken("not like", SyntaxTokenTypeEnum.SQL_NOT_LIKE); } } addOperatorToParserBuilder(builder, raw1); } // 構建全量的語句 parser = builder.build(); parser.partialParse(); PrecedenceClimbingParser.Token token = parser.all().get(0); if(treeList.size() == 1) { treeList.add(0, token); } else { if(treeList.isEmpty()) { log.warn("規則解析失敗: 構造語法語法樹失敗, 樹節點為空,可能是不帶括號導致:{}", token.toString()); throw new BizException(300427, "規則解析失敗: 構造語法語法樹失敗, 請確認括號問題"); } treeList.set(0, token); } return i; }
最後是重寫的問題,重寫我們的目標是同表規則儘可能重組到一塊,以便可以更少次數的遍歷表資料。但是,如果優先順序不同,則不能合併,因為這會改變原始語義。這個問題,說起來簡單,也就是通過遍歷所有節點,然後重組語句完成。但實現起來可能還是有很多不連貫的地方。需要的可以參考後續的完整例項。
4. 完整規則重組實現參考
完整的實現樣例參考如下:(主要為了將普通寬表的查詢語句改寫成以bitmap和寬表共同組成的查詢語句)
4.1. 語法解析為樹結構(分詞及樹優先順序構建)
@Slf4j public class SyntaxParser { /** * 嚴格模式解析語法, 解析為樹狀node結構 * * @see #parse(String, boolean) */ public static MyRuleSqlNodeParsedClause parseAsTree(String rawClause) { log.info("開始解析: " + rawClause); if(rawClause == null) { log.warn("不支援空規則配置:{}, 或者解析分詞出錯,請排查", rawClause); throw new BizException("不支援空規則配置:" + rawClause); } List<SyntaxToken> tokens = tokenize(rawClause, true); if(tokens.isEmpty()) { log.warn("不支援空規則配置:{}, 或者解析分詞出錯,請排查", rawClause); throw new BizException("不支援空規則配置:" + rawClause); } Map<String, FieldInfoDescriptor> myIdList = enhanceTokenType(tokens); List<PrecedenceClimbingParser.Token> root = new ArrayList<>(); List<Object> coalesceSpecialTokenTokenList = flattenTokenByStmt(tokens); // list -> tree 轉換, (a=1) 會得到 (a=1),(a=1), 但取0值就沒問題了 buildByPart(coalesceSpecialTokenTokenList, root); MySqlNode node = convert(root.get(0)); return new MyRuleSqlNodeParsedClause(myIdList, node, tokens); } /** * 將token樹轉換為 node 樹,方便後續操作 * * @param token token 詞法樹 * @return node 樹 */ private static MySqlNode convert(PrecedenceClimbingParser.Token token) { switch (token.type) { case ATOM: Object o = token.o; if(o instanceof MySqlNode) { return (MySqlNode) o; } if(o instanceof SyntaxToken) { return new MySqlLiteral(((SyntaxToken) o).getTokenType().name(), (SyntaxToken) o); } return (MySqlNode) o; case CALL: final PrecedenceClimbingParser.Call call = (PrecedenceClimbingParser.Call) token; final List<MySqlNode> list = new ArrayList<>(); for (PrecedenceClimbingParser.Token arg : call.args) { list.add(convert(arg)); } MySqlOperator operator = (MySqlOperator)call.op.o; return new MySqlBasicCall(operator, list.toArray(new MySqlNode[0])); default: log.warn("語法解析錯誤,非預期的token型別:{}, {}", token.type, token); throw new BizException("非預期的token型別" + token.type); } } /** * 按照括號進行語法分隔構建語法樹 * * @param rawList 原始單詞列表 * @param treeList 目標樹存放處 * @return 當次處理讀取的單詞數 */ private static int buildByPart(List<Object> rawList, List<PrecedenceClimbingParser.Token> treeList) { int len = rawList.size(); PrecedenceClimbingParser parser; PrecedenceClimbingParser.Builder builder = new PrecedenceClimbingParser.Builder(); int i = 0; for (i = 0; i < len; i++) { Object stmtOrToken = rawList.get(i); if(stmtOrToken instanceof SyntaxStatement) { String stmtRawWord = stmtOrToken.toString(); if(stmtRawWord.startsWith("between ") || stmtRawWord.startsWith("case ")) { if(stmtRawWord.startsWith("between ")) { SyntaxToken op = new SyntaxToken("[between]", SyntaxTokenTypeEnum.SQL_BETWEEN_AND); addOperatorToParserBuilder(builder, op); } else if(stmtRawWord.startsWith("case ")) { SyntaxToken op = new SyntaxToken("[case..when]", SyntaxTokenTypeEnum.SQL_CASE_WHEN); addOperatorToParserBuilder(builder, op); } builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken)); // 立即觸發一次運算構建 parser = builder.build(); parser.partialParse(); PrecedenceClimbingParser.Token token = parser.all().get(0); treeList.add(token); continue; } builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken)); continue; } SyntaxToken raw1 = (SyntaxToken)rawList.get(i); if("in".equals(raw1.getRawWord()) || ("not".equals(raw1.getRawWord()) && "in".equals(((SyntaxToken)rawList.get(i + 1)).getRawWord()))) { // field in (1, 2, 3...) i++; if("not".equals(raw1.getRawWord())) { ++i; raw1 = new SyntaxToken("not in", SyntaxTokenTypeEnum.SQL_NOT_IN); } if("(".equals(rawList.get(i).toString())) { List<MySqlNode> inList = new ArrayList<>(); do { ++i; Object stmtOrToken2 = rawList.get(i); if(stmtOrToken2 instanceof SyntaxStatement) { builder.atom(new MySqlCustomHandler((SyntaxStatement)stmtOrToken2)); continue; } SyntaxToken nextToken = (SyntaxToken) stmtOrToken2; if(")".equals(nextToken.getRawWord())) { break; } if(",".equals(nextToken.getRawWord())) { continue; } inList.add(new MySqlLiteral(nextToken.getTokenType().name(), nextToken)); } while (i < len); // 新增 in 解析 addOperatorToParserBuilder(builder, raw1); // 新增 in item 解析 MySqlNodeList nodeList = new MySqlNodeList(inList); builder.atom(nodeList); continue; } // 位置還原,理論上已經出現不支援的語法 i--; } if ("(".equals(raw1.getRawWord())) { // 遞迴進入 int skipLen = buildByPart(rawList.subList(i + 1, len), treeList); i += skipLen + 1; PrecedenceClimbingParser.Token innerToken = treeList.get(treeList.size() - 1); if(innerToken instanceof PrecedenceClimbingParser.Call) { PrecedenceClimbingParser.Call call = (PrecedenceClimbingParser.Call) innerToken; builder.call(call.op, call.args.get(0), call.args.get(1)); continue; } else if(innerToken != null && innerToken.type == PrecedenceClimbingParser.Type.ATOM) { builder.atom(innerToken.o); continue; } log.warn("非call的構造返回,請檢查, {}", innerToken); throw new BizException("非call的構造返回"); } if(")".equals(raw1.getRawWord())) { // 彈出返回 parser = builder.build(); parser.partialParse(); if(parser.all().isEmpty()) { throw new BizException("不支援空括號(無規則)配置,請檢查!"); } PrecedenceClimbingParser.Token token = parser.all().get(0); treeList.add(token); break; } if("not".equals(raw1.getRawWord())) { if("like".equals(((SyntaxToken)rawList.get(i + 1)).getRawWord())) { ++i; raw1 = new SyntaxToken("not like", SyntaxTokenTypeEnum.SQL_NOT_LIKE); } } addOperatorToParserBuilder(builder, raw1); } // 構建全量的語句 parser = builder.build(); parser.partialParse(); PrecedenceClimbingParser.Token token = parser.all().get(0); if(treeList.size() == 1) { treeList.add(0, token); } else { if(treeList.isEmpty()) { log.warn("規則解析失敗: 構造語法語法樹失敗, 樹節點為空,可能是不帶括號導致:{}", token.toString()); throw new BizException(300427, "規則解析失敗: 構造語法語法樹失敗, 請確認括號問題"); } treeList.set(0, token); } return i; } /** * 新增一個操作符到builder中 * * @param builder parser builder * @param raw1 原始單詞 */ private static void addOperatorToParserBuilder(PrecedenceClimbingParser.Builder builder, SyntaxToken raw1) { int prec = getTokenPrecedence(raw1); if(prec == -1) { builder.atom(new MySqlLiteral(raw1.getTokenType().name(), raw1)); } else { PrecedenceClimbingParser.Token lastToken = builder.getLastToken(); if(lastToken != null && lastToken.o instanceof MySqlOperator) { throw new BizException(300423, "規則配置錯誤:【" + lastToken.toString() + "】後配置了另一符號【" + raw1.getRawWord() + "】"); } builder.infix(new MySqlOperator(raw1.getRawWord(), raw1.getTokenType()), prec, true); } } /** * 獲取各操作符的優先順序定義 * and,xand -> 24 * or,xor -> 22 * <,<=,=,>,>= -> 30 * NOT -> 26 * IN -> 32 * IS_NULL -> 58 * IS_NOT_NULL -> - * LIKE -> 32 * BETWEEN -> 32 * +,- -> 40 * /,* -> 60 * between..and -> 90 * * @param token 給定單詞 * @see PrecedenceClimbingParser #highest() calcite 優先順序定義 * org.apache.calcite.sql.fun.SqlStdOperatorTable */ private static int getTokenPrecedence(SyntaxToken token) { if(token.getTokenType() == SyntaxTokenTypeEnum.COMPARE_OPERATOR) { return 30; } String keyword = token.getRawWord(); if("and".equalsIgnoreCase(keyword) || "xand".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_AND); return 24; } if("or".equalsIgnoreCase(keyword) || "xor".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_OR); return 22; } if("in".equalsIgnoreCase(keyword) || "not in".equalsIgnoreCase(keyword)) { return 32; } if("is".equalsIgnoreCase(keyword)) { token.changeTokenType(SyntaxTokenTypeEnum.SQL_IS); return 58; } if("like".equalsIgnoreCase(keyword) || "not like".equalsIgnoreCase(keyword)) { return 32; } if("+".equalsIgnoreCase(keyword) || "-".equalsIgnoreCase(keyword)) { return 40; } if("*".equalsIgnoreCase(keyword) || "/".equalsIgnoreCase(keyword)) { return 60; } if(token.getTokenType() == SyntaxTokenTypeEnum.SQL_BETWEEN_AND) { return 90; } // 非操作符 return -1; } /** * 複用原有小語句翻譯能力,其他語句則保留原樣,進行樹的構造 * * @param tokens 原始片語列表 * @return 帶stmt實現的token列表 */ private static List<Object> flattenTokenByStmt(List<SyntaxToken> tokens) { List<Object> treesFlat = new ArrayList<>(tokens.size()); for (int i = 0; i < tokens.size(); i++) { SyntaxToken token = tokens.get(i); String word = token.getRawWord(); SyntaxTokenTypeEnum tokenType = token.getTokenType(); SyntaxStatement branch = null; switch (tokenType) { case FUNCTION_SYS_CUSTOM: String funcName = word.substring(0, word.indexOf('(')); SyntaxStatementHandlerFactory handlerFactory = SyntaxSymbolTable.getUdfHandlerFactory(funcName); branch = handlerFactory.newHandler(token, i, tokenType, tokens); break; case KEYWORD_SYS_CUSTOM: // 替換關鍵字資訊 branch = SyntaxSymbolTable.getSysKeywordHandlerFactory() .newHandler(token, i, tokenType, tokens); break; case KEYWORD_SQL: // 替換關鍵字資訊 branch = SyntaxSymbolTable.getSqlKeywordHandlerFactory() .newHandler(token, i, tokenType, tokens); break; default: treesFlat.add(token); break; } if(branch != null) { i += branch.getTokensSize() - 1; treesFlat.add(branch); } } return treesFlat; } /** * 語義增強處理 * * 加強token型別描述,並返回 myId 資訊 */ private static Map<String, FieldInfoDescriptor> enhanceTokenType(List<SyntaxToken> tokens) { Map<String, FieldInfoDescriptor> myIdList = new LinkedHashMap<>(); for (int i = 0; i < tokens.size(); i++) { SyntaxToken token = tokens.get(i); String word = token.getRawWord().toLowerCase(); SyntaxTokenTypeEnum newTokenType = token.getTokenType(); switch (token.getTokenType()) { case WORD_NORMAL: if (word.startsWith("$")) { newTokenType = SyntaxTokenTypeEnum.MY_ID; myIdList.put(word, null); } else if (NumberUtils.isCreatable(word)) { newTokenType = SyntaxTokenTypeEnum.WORD_NUMBER; } else { newTokenType = SyntaxSymbolTable.keywordTypeOf(word); } if (newTokenType == SyntaxTokenTypeEnum.WORD_NORMAL) { // 以fieldKey形式儲存欄位資訊,用於反查資料 if (!"not".equals(word) && !"is".equals(word) && !"and".equals(word) && !"or".equals(word) && !"null".equals(word) && !"case".equals(word) && !"when".equals(word) && !"then".equals(word) && !"else".equals(word) && !"end".equals(word) && !"from".equals(word) && !"xand".equals(word) && !"xor".equals(word) && !"between".equals(word) && !"in".equals(word) && !"like".equals(word) ) { myIdList.put(word, null); newTokenType = SyntaxTokenTypeEnum.MY_ID_NAME; } } if("is".equals(word) && "not".equals(tokens.get(i + 1).getRawWord())) { if("null".equals(tokens.get(i + 2).getRawWord())) { SyntaxToken notNullToken = new SyntaxToken("not null", SyntaxTokenTypeEnum.SQL_NOT_NULL); tokens.remove(i + 1); tokens.set(i + 1, notNullToken); i += 1; } // throw new BizException("is not 後面只能跟 null"); } token.changeTokenType(newTokenType); break; case WORD_STRING: // 被引號包圍的關鍵字,如 '%#{monthpart}%' String innerSysCustomKeyword = CommonUtil.readSplitWord( word.toCharArray(), 1, "#{", "}"); if (innerSysCustomKeyword.length() > 3) { newTokenType = SyntaxTokenTypeEnum.KEYWORD_SYS_CUSTOM; } token.changeTokenType(newTokenType); break; case FUNCTION_NORMAL: newTokenType = SyntaxSymbolTable.functionTypeOf(word); List<SyntaxToken> params = parseFunctionParams(token.getRawWord()); params.forEach(r -> { String fieldName = null; SyntaxTokenTypeEnum paramTokenType = r.getTokenType(); if (paramTokenType == SyntaxTokenTypeEnum.MY_ID || paramTokenType == SyntaxTokenTypeEnum.MY_ID_NAME) { fieldName = r.getRawWord(); } else if (paramTokenType == SyntaxTokenTypeEnum.WORD_ARRAY) { fieldName = parseExtendMyFieldInfo(r); } else if (paramTokenType == SyntaxTokenTypeEnum.MY_ID_EXTEND) { // 函式中的擴充套件,無開關強制解析 fieldName = parseExtendMyFieldInfo(r, false); if(fieldName != null) { fieldName = fieldName.toLowerCase(); } } if (fieldName != null) { myIdList.put(r.getRawWord(), null); } }); token.changeTokenType(newTokenType); break; case WORD_ARRAY: case MY_ID_EXTEND: String fieldName = parseExtendMyFieldInfo(token); if (fieldName != null) { myIdList.put(fieldName, null); } break; } } return myIdList; } /** * 解析字串函式的引數列表 (可解析內嵌函式,但並未標記) * * @param func 如: substring(a_field ,0 ,1, 'abc') * @return [a_field, 0, 1, 'abc'] */ public static List<SyntaxToken> parseFunctionParams(String func) { String paramStr = func.substring(func.indexOf("(") + 1, func.lastIndexOf(")")); List<StringBuilder> paramList = new ArrayList<>(); StringBuilder wrapParam = null; boolean sQuotation = false; boolean lastSpace = false; boolean lastComma = false; // 前置空格,忽略 paramStr = paramStr.trim(); for(int i = 0; i < paramStr.length(); i++) { char ch = paramStr.charAt(i); if(i == 0) { wrapParam = new StringBuilder().append(ch); paramList.add(wrapParam); if(ch == '\'') { sQuotation = !sQuotation; } continue; } if(ch == '\'') { lastComma = false; lastSpace = false; wrapParam.append(ch); sQuotation = !sQuotation; continue; } if(sQuotation) { wrapParam.append(ch); continue; } if(ch == ' ') { if(lastSpace) { continue; } lastSpace = true; continue; } if(ch == ',') { if(lastComma) { throw new BizException("函式中含有連續多個分隔號:,"); } wrapParam = new StringBuilder(); paramList.add(wrapParam); lastComma = true; lastSpace = false; continue; } lastComma = false; lastSpace = false; if(ch == '(' || ch == ')') { wrapParam = new StringBuilder().append(ch); paramList.add(wrapParam); wrapParam = new StringBuilder(); paramList.add(wrapParam); continue; } wrapParam.append(ch); } List<SyntaxToken> paramTokenList = new ArrayList<>(); for (StringBuilder p1 : paramList) { if(p1.length() == 0) { continue; } String p1Str = p1.toString(); char ch = p1Str.charAt(0); if(ch == '\'' || ch == '"') { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.WORD_STRING)); continue; } if(ch == '$') { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.MY_ID)); continue; } if(NumberUtils.isCreatable(p1Str)) { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.WORD_NUMBER)); continue; } if(p1Str.contains("['")) { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.WORD_ARRAY)); continue; } if(p1Str.equals("(")) { // 將上一個函式名,拼接上當前括號,作為分隔符 int lastIndex = paramTokenList.size() - 1; SyntaxToken lastParam = paramTokenList.get(lastIndex); paramTokenList.set(lastIndex, new SyntaxToken(lastParam.getRawWord() + p1Str, SyntaxTokenTypeEnum.FUNCTION_NORMAL)); continue; } if(p1Str.equals(")")) { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.CLAUSE_SEPARATOR)); continue; } if("current_timestamp".equalsIgnoreCase(p1Str)) { paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.KEYWORD_SYS_CUSTOM)); continue; } // 忽略其他關鍵字,直接認為是欄位資訊 paramTokenList.add( new SyntaxToken(p1Str, SyntaxTokenTypeEnum.MY_ID_NAME)); } return paramTokenList; } /** * 查詢語句分詞操作 * * 拆分為單個細粒度的詞如: * 單詞 * 分隔符 * 運算子 * 陣列 * 函式 * * @param rawClause 原始查詢語句 * @param strictMode 是否是嚴格模式, true:是, false:否 * @return token化的單詞 */ private static List<SyntaxToken> tokenize(String rawClause, boolean strictMode) { char[] clauseItr = rawClause.toCharArray(); List<SyntaxToken> parsedTokenList = new ArrayList<>(); Stack<ClauseLineNumTable> specialSeparatorStack = new Stack<>(); int clauseLength = clauseItr.length; StringBuilder field; String fieldGot; char nextChar; outer: for (int i = 0; i < clauseLength; ) { char currentChar = clauseItr[i]; switch (currentChar) { case '\'': case '\"': fieldGot = readSplitWord(clauseItr, i, currentChar, currentChar); i += fieldGot.length(); parsedTokenList.add( new SyntaxToken(fieldGot, SyntaxTokenTypeEnum.WORD_STRING)); continue outer; case '[': case ']': case '(': case ')': case '{': case '}': case ',': if(specialSeparatorStack.empty()) { specialSeparatorStack.push( ClauseLineNumTable.newData(i, currentChar)); parsedTokenList.add( new SyntaxToken(currentChar, SyntaxTokenTypeEnum.CLAUSE_SEPARATOR)); break; } parsedTokenList.add( new SyntaxToken(currentChar, SyntaxTokenTypeEnum.CLAUSE_SEPARATOR)); char topSpecial = specialSeparatorStack.peek().getKeyword().charAt(0); if(topSpecial == '(' && currentChar == ')' || topSpecial == '[' && currentChar == ']' || topSpecial == '{' && currentChar == '}') { specialSeparatorStack.pop(); break; } if(',' != currentChar) { specialSeparatorStack.push( ClauseLineNumTable.newData(i, currentChar)); } break; case ' ': case '\t': case '\r': case '\n': // 空格忽略 break; case '@': nextChar = clauseItr[i + 1]; // @{} 擴充套件, 暫不解析, 原樣返回 if(nextChar == '{') { fieldGot = CommonUtil.readSplitWord(clauseItr, i, "@{", "}@"); i += fieldGot.length(); parsedTokenList.add( new SyntaxToken(fieldGot, SyntaxTokenTypeEnum.MY_ID_EXTEND)); continue outer; } break; case '#': nextChar = clauseItr[i + 1]; // #{} 系統關鍵字標識 if(nextChar == '{') { fieldGot = CommonUtil.readSplitWord(clauseItr, i, "#{", "}"); i += fieldGot.length(); parsedTokenList.add( new SyntaxToken(fieldGot, SyntaxTokenTypeEnum.KEYWORD_SYS_CUSTOM)); continue outer; } break; case '+': case '-': case '*': case '/': nextChar = clauseItr[i + 1]; if(currentChar == '-' && nextChar >= '0' && nextChar <= '9') { StringBuilder numberBuff = new StringBuilder(currentChar + "" + nextChar); ++i; while ((i + 1) < clauseLength){ nextChar = clauseItr[i + 1]; if(nextChar >= '0' && nextChar <= '9' || nextChar == '.') { ++i; numberBuff.append(nextChar); continue; } break; } parsedTokenList.add( new SyntaxToken(numberBuff.toString(), SyntaxTokenTypeEnum.WORD_NUMBER)); break; } parsedTokenList.add( new SyntaxToken(currentChar, SyntaxTokenTypeEnum.SIMPLE_MATH_OPERATOR)); break; case '=': case '>': case '<': case '!': // >=, <=, !=, <> nextChar = clauseItr[i + 1]; if(nextChar == '=' || currentChar == '<' && nextChar == '>') { ++i; parsedTokenList.add( new SyntaxToken(currentChar + "" + nextChar, SyntaxTokenTypeEnum.COMPARE_OPERATOR)); break; } parsedTokenList.add( new SyntaxToken(currentChar, SyntaxTokenTypeEnum.COMPARE_OPERATOR)); break; default: field = new StringBuilder(); SyntaxTokenTypeEnum tokenType = SyntaxTokenTypeEnum.WORD_NORMAL; do { currentChar = clauseItr[i]; field.append(currentChar); if(i + 1 < clauseLength) { // 去除函式前置名後置空格 if(SyntaxSymbolTable.isUdfPrefix(field.toString())) { do { if(clauseItr[i + 1] != ' ') { break; } ++i; } while (i + 1 < clauseLength); } nextChar = clauseItr[i + 1]; if(nextChar == '(') { fieldGot = readSplitWordWithQuote(clauseItr, i + 1, nextChar, ')'); field.append(fieldGot); tokenType = SyntaxTokenTypeEnum.FUNCTION_NORMAL; i += fieldGot.length(); break; } if(nextChar == '[') { fieldGot = readSplitWord(clauseItr, i + 1, nextChar, ']'); field.append(fieldGot); tokenType = SyntaxTokenTypeEnum.WORD_ARRAY; i += fieldGot.length(); break; } if(isSpecialChar(nextChar) // 多個關鍵詞情況, 實際上以上解析應以字元或數字作為判定 || nextChar == '#') { // 嚴格模式下,要求 -+ 符號前後必須帶空格, 即會將所有字母后緊連的 -+ 視為字元連線號 // 非嚴格模式下, 即只要是分隔符即停止字元解析(非標準分隔) if(!strictMode || nextChar != '-' && nextChar != '+') { break; } } ++i; continue; } break; } while (i < clauseLength); parsedTokenList.add( new SyntaxToken(field.toString(), tokenType)); break; } // 正常單字解析迭代 i++; } if(!specialSeparatorStack.empty()) { ClauseLineNumTable lineNumTableTop = specialSeparatorStack.peek(); throw new BizException("組合規則配置錯誤:檢測到未閉合的符號, near '" + lineNumTableTop.getKeyword()+ "' at column " + lineNumTableTop.getColumnNum()); } return parsedTokenList; } /** * 從源陣列中讀取某類詞資料 * * @param src 資料來源 * @param offset 要搜尋的起始位置 offset * @param openChar word 的開始字元,用於避免迴圈巢狀 如: '(' * @param closeChar word 的閉合字元 如: ')' * @return 解析出的字元 * @throws BizException 解析不到正確的單詞時丟擲 */ private static String readSplitWord(char[] src, int offset, char openChar, char closeChar) throws BizException { StringBuilder builder = new StringBuilder(); for (int i = offset; i < src.length; i++) { if(openChar == src[i]) { int aroundOpenCharNum = -1; do { builder.append(src[i]); // 注意 openChar 可以 等於 closeChar if(src[i] == openChar) { aroundOpenCharNum++; } if(src[i] == closeChar) { aroundOpenCharNum--; } } while (++i < src.length && (aroundOpenCharNum > 0 || src[i] != closeChar)); if(aroundOpenCharNum > 0 || (openChar == closeChar && aroundOpenCharNum != -1)) { throw new BizException("syntax error, un closed clause near '" + builder.toString() + "' at column " + --i); } builder.append(closeChar); return builder.toString(); } } // 未找到匹配 return " "; } /** * 從源陣列中讀取某類詞資料 (將 'xx' 作為一個單詞處理) * * @param src 資料來源 * @param offset 要搜尋的起始位置 offset * @param openChar word 的開始字元,用於避免迴圈巢狀 如: '(' * @param closeChar word 的閉合字元 如: ')' * @return 解析出的字元 * @throws BizException 解析不到正確的單詞時丟擲 */ private static String readSplitWordWithQuote(char[] src, int offset, char openChar, char closeChar) throws BizException { StringBuilder builder = new StringBuilder(); for (int i = offset; i < src.length; i++) { if(openChar == src[i]) { int aroundOpenCharNum = -1; do { char ch = src[i]; if(ch == '\'') { String strQuoted = readSplitWord(src, i, ch, ch); builder.append(strQuoted); i += strQuoted.length() - 1; continue; } builder.append(ch); // 注意 openChar 可以 等於 closeChar if(ch == openChar) { aroundOpenCharNum++; } if(ch == closeChar) { aroundOpenCharNum--; } } while (++i < src.length && (aroundOpenCharNum > 0 || src[i] != closeChar)); if(aroundOpenCharNum > 0 || (openChar == closeChar && aroundOpenCharNum != -1)) { throw new BizException("syntax error, un closed clause near '" + builder.toString() + "' at column " + --i); } builder.append(closeChar); return builder.toString(); } } // 未找到匹配 return " "; } /** * 檢測字元是否特殊運算子 * * @param value 給定檢測字元 * @return true:是特殊字元, false:普通 */ private static boolean isSpecialChar(char value) { return SyntaxSymbolTable.OPERATOR_ALL.indexOf(value) != -1; } }
4.2. 規則重組優化
@Slf4j public class MyRuleSqlNodeParsedClause extends MyParsedClause { /** * 規則語法樹 */ private MySqlNode binTreeRoot; public MyRuleSqlNodeParsedClause(Map<String, FieldInfoDescriptor> idMapping, MySqlNode binTreeRoot, List<SyntaxToken> rawTokens) { super(idMapping, null, rawTokens); this.binTreeRoot = binTreeRoot; } /** * 生成一個空的解析類 */ private static MyRuleSqlNodeParsedClause EMPTY_CLAUSE = new MyRuleSqlNodeParsedClause( Collections.emptyMap(), null, Collections.emptyList()); public static MyRuleSqlNodeParsedClause emptyParsedClause() { return EMPTY_CLAUSE; } /** * 轉換語言表示式 (新的實現) * * @param sqlType sql型別 * @see MyDialectTypeEnum * @return 翻譯後的sql語句 */ @Override public String translateTo(MyDialectTypeEnum sqlType) { boolean needHandleWhitespace = false; String targetCode = binTreeRoot.toString(); log.info("翻譯成目標語言:{}, targetCode: {}", sqlType, targetCode); return targetCode; } /** * 專用實現翻譯成ck sql (未經優化版本的) * * @return ck sql (bitmap) */ public String translateToFullCKSql(boolean onlyCnt) { resetFlagContainers(); ClickHouseSqlBuilder sqlBuilder = new ClickHouseSqlBuilder(); if(binTreeRoot instanceof MySqlBasicCall) { // 混合型規則配置操作 visitCallNode((MySqlBasicCall) binTreeRoot, sqlBuilder); } else if(binTreeRoot instanceof MySqlCustomHandler) { log.warn("純自定義函式轉換尚未開發完成,請等待佳音:{}", binTreeRoot.toString()); throw new BizException(300132, "暫不支援的操作哦:" + binTreeRoot.toString()); } else { log.error("不支援的轉換規則:{}", binTreeRoot.toString()); throw new BizException(300131, "不支援的轉換規則:" + binTreeRoot.toString()); } QueryTableDto primaryTable = queryTableDtoList.get(0); sqlBuilder.from(primaryTable.getTableAlia(), null); for (int i = 0; i < queryTableDtoList.size(); i++) { QueryTableDto tableDto = queryTableDtoList.get(i); if(i == 0) { continue; } sqlBuilder.join(tableDto.getTableAlia(), null).on(tableDto.getTableAlia(), primaryTable.getTableAlia() + "." + primaryTable.getJoinField() + "=" + tableDto.getTableAlia() + "." + tableDto.getJoinField()); } // arrayJoin(bitmapToArray(bitmapOrCardinality(user0, user1))) as list if(onlyCnt) { sqlBuilder.select("bitmapCardinality(" + bitmapFunctionNodeContainer.get(binTreeRoot) + ")", "cnt"); } else { sqlBuilder.select("arrayJoin(bitmapToArray(" + bitmapFunctionNodeContainer.get(binTreeRoot) + "))", "cust_no"); // select 其餘欄位,該場景僅適用於預覽,取數並不適合 queryTableDtoList.forEach(tableDto -> tableDto.getQueryFields().forEach(r -> { if (r.getAlia().equals("join_id")) { return; } // groupBitmapState(uv) ... 之類的運算輔助欄位忽略 if(r.getField().contains("(")) { return; } sqlBuilder.select(r.getAlia()); })); } sqlBuilder.limit(100); return sqlBuilder.build(); } /** * 當前正在構建的邏輯計劃 */ private List<SqlNodeSingleTableLogicalPlan> logicalContainer = new ArrayList<>(); /** * 整體sql 構建輔助結構 */ private AtomicInteger tableCounter = new AtomicInteger(0); private List<QueryTableDto> queryTableDtoList = new ArrayList<>(); private Map<MySqlNode /* node */, String /* bitmap function or param */> bitmapFunctionNodeContainer = new HashMap<>(); private Map<MySqlNode /* node */, Boolean /* reverse bitmap */> reverseBitmapFlagContainer = new HashMap<>(); /** * 重置各容器,避免變數汙染 */ private void resetFlagContainers() { logicalContainer = new ArrayList<>(); tableCounter = new AtomicInteger(0); bitmapFunctionNodeContainer = new HashMap<>(); reverseBitmapFlagContainer = new HashMap<>(); queryTableDtoList = new ArrayList<>(); } /** * 激進優化版本的生成sql方法 * * @param onlyCnt 是否是進行計數 * @return 生成的完整sql */ public String translateFullCkSqlProf(boolean onlyCnt) { resetFlagContainers(); LogicalPlan logicalPlan = binToRelConvert(); ClickHouseSqlBuilder sqlBuilder = new ClickHouseSqlBuilder(); return translateCkByLogicalPlan(logicalPlan, sqlBuilder, onlyCnt); } // 根據邏輯計劃生成ck-sql private String translateCkByLogicalPlan(LogicalPlan logicalPlan, ClickHouseSqlBuilder sqlBuilder, boolean onlyCnt) { int joinMethod = ApolloUtil.getIntProperty("manage_ck_preview_join_method", 1); String bitmapFieldPri = buildSqlByLogical(logicalPlan, sqlBuilder, joinMethod); QueryTableDto primaryTable = queryTableDtoList.get(0); sqlBuilder.from(primaryTable.getTableAlia(), null); for (int i = 0; i < queryTableDtoList.size(); i++) { QueryTableDto tableDto = queryTableDtoList.get(i); if(i == 0) { continue; } sqlBuilder.join(tableDto.getTableAlia(), null).on(tableDto.getTableAlia(), primaryTable.getTableAlia() + "." + primaryTable.getJoinField() + "=" + tableDto.getTableAlia() + "." + tableDto.getJoinField()); } // arrayJoin(bitmapToArray(bitmapOrCardinality(user0, user1))) as list if(joinMethod == 1) { if(onlyCnt) { sqlBuilder.select("bitmapCardinality(" + bitmapFieldPri + ")", "cnt"); } else { sqlBuilder.select("arrayJoin(bitmapToArray(" + bitmapFieldPri + "))", "cust_no"); // select 其餘欄位,該場景僅適用於預覽,取數並不適合 queryTableDtoList.forEach(tableDto -> tableDto.getQueryFields().forEach(r -> { if(r.getAlia() == null) { sqlBuilder.select(r.getField()); return; } if (r.getAlia().equals("join_id")) { return; } // groupBitmapState(uv) ... 之類的運算輔助欄位忽略 if(r.getField().contains("(")) { return; } sqlBuilder.select(r.getAlia()); })); } } else if(joinMethod == 2) { } sqlBuilder.limit(100); // todo: 其他附加欄位如何取值?? return sqlBuilder.build(); } // 可以使用bitmap式的join,也可以使用寬表式的join。 // 1: 使用bitmap式join,即全部使用1 as join_id 進行關聯,寬表使用bitmapBuild()進行轉換 // 2. 使用寬表進行join,即將bitmap展開為array多行,而寬表則使用主鍵進行join // 3. 待定備用,全部為bitmap時,使用bitmap,全部為寬表時,使用寬表join private String buildSqlByLogical(LogicalPlan plan, ClickHouseSqlBuilder sqlBuilder, int joinMethod) { if(plan instanceof LineUpTwoTablePlan) { // 多表查詢 LineUpTwoTablePlan twoTablePlan = (LineUpTwoTablePlan)plan; String leftPri = buildSqlByLogical(twoTablePlan.getLeft(), sqlBuilder, joinMethod); String rightPri = buildSqlByLogical(twoTablePlan.getRight(), sqlBuilder, joinMethod); // 此處join僅僅是為了處理 bitmap 關係, 其他join關係已在單表時處理完成 // groupBitmap(), 左值如何得,右值如何得? return joinLeftAndRight(leftPri, rightPri, twoTablePlan.getJoinType(), joinMethod); } if(plan instanceof SqlNodeSingleTableLogicalPlan) { SqlNodeSingleTableLogicalPlan planReal = (SqlNodeSingleTableLogicalPlan) plan; // sqlBuilder.with(planReal.getTableName()) // .select(plan.fields()); if(planReal.isUseBitmap()) { if(planReal.getRoot() instanceof MySqlCustomHandler) { MySqlCustomHandler handler = (MySqlCustomHandler) planReal.getRoot(); String where = handler.getStmt().translateTo( MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); int tableCounterIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableCounterIndex; String userBitmapAlias = "user" + tableCounterIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapMergeState(uv)", userBitmapAlias)); sqlBuilder.with(tableAlias, fields, planReal.getTableName() + " where " + where); // 只能拉取客戶號欄位,其他欄位會由於bitmap的加入而損失 QueryTableDto tableInfoOld = planReal.getTableInfo(); planReal.setTableAlias(tableAlias); QueryTableDto tableInfoNew = new QueryTableDto(tableInfoOld.getTableName(), tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableInfoNew); return userBitmapAlias; } MySqlBasicCall root = (MySqlBasicCall)planReal.getRoot(); MySqlNode left = root.getOperands()[0]; SyntaxToken NameToken = null; if(left.getKind() == SyntaxTokenTypeEnum.STATEMENT_HANDLER) { List<SyntaxToken> IdList = ((MySqlCustomHandler)left).getHoldingMyIdList(); NameToken = IdList.get(0); } else { NameToken = ((MySqlLiteral)left).getValue(); } FieldInfoDescriptor fieldInfoDescriptor = CommonMyConditionAstHandler.getFieldInfoByToken( NameToken, getMyIdMapping()); return buildSimpleBitmapTmpTable(root, left, root.getOperands()[1], fieldInfoDescriptor, sqlBuilder); } // 使用寬表進行join // 以下為錯誤的樣例,因為只考慮到一個單一的條件情況,實際上該規則下包含了n個條件組合 if(joinMethod == 1) { int tableCounterIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableCounterIndex; // 使用新的別名,生成寬表別名 QueryTableDto tableInfoOld = planReal.getTableInfo(); if(tableInfoOld == null) { log.warn("未獲取到表相關資訊:{}, 請檢查是否是apollo配置,或者讀取欄位後設資料異常", planReal.getRoot()); throw new BizException(300329, "未獲取到表相關資訊:" + planReal.getRoot()); } String priField = tableInfoOld.getJoinField(); tableInfoOld.setTableAlia(tableAlias); planReal.setTableAlias(tableInfoOld.getTableAlia()); MySqlNode curWhereRoot = planReal.getRoot(); planReal.setTableAlias(tableAlias); String userBitmapAlias = "user" + tableCounterIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapState(toUInt64OrZero(" + priField + "))", userBitmapAlias)); QueryTableDto tableInfoNew = new QueryTableDto(tableInfoOld.getTableName(), tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableInfoNew); // 只能拉取客戶號欄位,其他欄位會由於bitmap的加入而損失 planReal.setTableInfo(tableInfoNew); // todo: hive寬錶轉換為ck寬表,加分散式表分割槽字尾... // todo: 將寬表確定提前至每個寬表生成時 String wideTableName = planReal.getTableName(); if(wideTableName.contains(".")) { wideTableName = wideTableName.substring(wideTableName.lastIndexOf('.') + 1); } sqlBuilder.with(tableAlias, fields, wideTableName + " where " + translateRuleNodePartAsPlainText(curWhereRoot)); // fields.add(QueryFieldDto.newField(Name, Name)); return userBitmapAlias; } if(joinMethod == 2) { return null; } } throw new BizException(300232, "不支援的表型別:" + plan); } /** * 將規則節點翻譯為普通文字(即where條件簡單翻譯) * * @param root 當前節點 * @return 翻譯後的規則,如 $123 = 1 將翻譯成 cust_no = 1 */ private String translateRuleNodePartAsPlainText(MySqlNode root) { if(!(root instanceof MySqlBasicCall)) { if(root.getKind() == SyntaxTokenTypeEnum._ID || root.getKind() == SyntaxTokenTypeEnum._ID_NAME) { FieldInfoDescriptor fieldInfo = CommonMyConditionAstHandler.getFieldInfoByToken( ((MySqlLiteral) root).getValue(), getMyIdMapping()); if(fieldInfo == null) { // 這種情況一般upp會攔截 log.warn("主鍵id未找到對應後設資料資訊或規則配置錯誤:{}, 如錯誤地將值解析為主鍵id,這個情況一般是因為數值類右值被配置了字串值", root.toString()); throw new BizException(300235, "主鍵id未找到對應後設資料資訊或規則配置錯誤:" + root.toString()); } return fieldInfo.getMyName(); } if(root instanceof MySqlCustomHandler) { MySqlCustomHandler handler = (MySqlCustomHandler) root; SyntaxStatement stmt = handler.getStmt(); return stmt.translateTo(MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); } return root.toString(); } MySqlBasicCall rootCall = (MySqlBasicCall) root; MySqlNode[] childs = rootCall.getOperands(); if(childs.length != 2) { log.warn("規則不支援非2個運算子, 個數不支援:{}, root:{}", childs.length, root); throw new BizException(300130, "規則不支援非2個運算子"); } String subRulePlainLeft = translateRuleNodePartAsPlainText(childs[0]); String subRulePlainRight = translateRuleNodePartAsPlainText(childs[1]); MySqlOperator operator = rootCall.getOperator(); switch (operator.getKind()) { case SQL_AND: case SQL_AND_NOT: return subRulePlainLeft + " and " + subRulePlainRight; case SQL_OR: return "( " + subRulePlainLeft + " or " + subRulePlainRight + " )"; case SIMPLE_MATH_OPERATOR: // 簡單四則運算,忽略,可能記錄些東西 default: // 語法已被後續handler處理,此處僅為佔位符 if(isOperatorAsReplaceHolder(operator)) { return "( " + subRulePlainLeft + " " + subRulePlainRight + " )"; } return "( " + subRulePlainLeft + " " + operator.toString() + " " + subRulePlainRight + " )"; } } /** * 判定符號是否是佔位符,或者識別符號,此種情況邏輯會被後續或字首token處理 * 如 between..and.. 的 between * * @param operator 當前符號 * @return true:是 */ private boolean isOperatorAsReplaceHolder(MySqlOperator operator) { return operator.getKind() == SyntaxTokenTypeEnum.SQL_BETWEEN_AND || operator.getKind() == SyntaxTokenTypeEnum.SQL_CASE_WHEN; } // 兩個bitmap 節點的邏輯連線主鍵欄位處理 private String joinLeftAndRight(String leftPri, String rightPri, SyntaxTokenTypeEnum joinType, int joinMethod) { if(joinMethod == 2) { return null; } if(joinType == SyntaxTokenTypeEnum.SQL_AND) { return "bitmapAnd(" + leftPri + ", " + rightPri + ")"; } if(joinType == SyntaxTokenTypeEnum.SQL_OR) { return "bitmapOr(" + leftPri + ", " + rightPri + ")"; } if(joinType == SyntaxTokenTypeEnum.SQL_AND_NOT) { return "bitmapAndnot(" + leftPri + ", " + rightPri + ")"; } throw new BizException(300233, "不支援的邏輯運算:" + joinType); } /** * 普通sqlNode 轉換成 帶一定關係型的節點表示(邏輯計劃) * */ private LogicalPlan binToRelConvert() { // 1. 遍歷到所有葉子節點 // 2. 如果葉子節點是表bitmap表示,則立即生成一個邏輯計劃,即當前節點就是一個臨時表查詢 // 3. 如果是寬表節點,則標記當前節點未被使用,且暫不生成邏輯計劃,繼續向上查詢 // 4. 如果向上回滾時,有以下情況之一的,立即將之前的節點劃為一個邏輯計劃結束,並開啟一個新的邏輯計劃 // 4.1. 當前節點是bitmap, 立即結束之前的計劃,並生成一個當前計劃,並立即結束 // 4.2. 如果是寬表,但與上一節點或者當前計劃中對應的表不是一同一個的,結束之前的計劃,開啟新的; // 4.3. 如果當前節點是右子節點時,且上一邏輯計劃已結束,則立即將當前節點(寬表)作為臨時表新的邏輯計劃生成並結束; // 最後,再進行遍歷邏輯計劃時,就不會存在太多表關聯問題了(當然比最開始全是寬表的情況還是會多出很多臨時表出來) // 不用遞迴 ClickHouseSqlBuilder sqlBuilder = new ClickHouseSqlBuilder(); LogicalPlan logicalPlan = null; if(binTreeRoot instanceof MySqlBasicCall) { // 混合型規則配置操作 markNodeDepth((MySqlBasicCall) binTreeRoot, sqlBuilder); logicalPlan = buildCallNodeLogical((MySqlBasicCall) binTreeRoot); } else if(binTreeRoot instanceof MySqlCustomHandler) { // 單自定義函式類操作如 ui_lbs_range('', '') > 10 logicalPlan = buildLogicalWithCustomHandlerNode( (MySqlCustomHandler) binTreeRoot); } else { log.error("不支援的轉換規則:{}", binTreeRoot.toString()); throw new BizException(300131, "不支援的轉換規則:" + binTreeRoot.toString()); } log.info("result:{}, logicalPlan:{}", binTreeRoot, logicalPlan); return logicalPlan; } /** * 標記各節點的深度,為下一步處理做準備 * * @param root 根節點 */ private LogicalPlan buildCallNodeLogical(MySqlBasicCall root) { MySqlNode[] childs = root.getOperands(); if(childs.length != 2) { log.warn("規則不支援非2個運算子, 個數不支援:{}, root:{}", childs.length, root); throw new BizException(300130, "規則不支援非2個運算子"); } LogicalPlan[] curPlans = new LogicalPlan[2]; for (int i = 0; i < childs.length; i++) { MySqlNode node = childs[i]; // todo: 如何判定當前節點是葉子節點? if (node instanceof MySqlBasicCall) { curPlans[i] = buildCallNodeLogical((MySqlBasicCall) node); continue; } // todo: 自定義函式待完善 if (node instanceof MySqlCustomHandler) { curPlans[i] = buildLogicalWithCustomHandlerNode((MySqlCustomHandler) node); // 此處有問題,待解決 continue; } // node.setDistanceFromLeaf(0); // if(node instanceof ATOM) { // // 以你為起點 // } } // 1. 遍歷到所有葉子節點 // 2. 如果葉子節點是表bitmap表示,則立即生成一個邏輯計劃,即當前節點就是一個臨時表查詢 // 3. 如果是寬表節點,則標記當前節點未被使用,且暫不生成邏輯計劃,繼續向上查詢 // 4. 如果向上回滾時,有以下情況之一的,立即將之前的節點劃為一個邏輯計劃結束,並開啟一個新的邏輯計劃 // 4.1. 當前節點是bitmap, 立即結束之前的計劃,並生成一個當前計劃,並立即結束 // 4.2. 如果是寬表,但與上一節點或者當前計劃中對應的表不是一同一個的,結束之前的計劃,開啟新的; // 4.3. 如果當前節點是右子節點時,且上一邏輯計劃已結束,則立即將當前節點(寬表)作為臨時表新的邏輯計劃生成並結束; // 最後,再進行遍歷邏輯計劃時,就不會存在太多表關聯問題了(當然比最開始全是寬表的情況還是會多出很多臨時表出來) // 不用遞迴 return buildLogicalWithBinaryOperator(root, curPlans[0], curPlans[1], childs[0], childs[1]); } /** * udf類規則解析生成 邏輯計劃 */ private LogicalPlan buildLogicalWithCustomHandlerNode(MySqlCustomHandler handler) { SyntaxStatement stmt = handler.getStmt(); MyRuleStmtMixedType mixedType = stmt.getStmtType(); if(mixedType == MyRuleStmtMixedType.UDF_HANDLER || mixedType == MyRuleStmtMixedType.LBS_FUNCTION) { String partialSql = stmt.translateTo( MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); log.info("ck partial function sql:{}", partialSql); List<SyntaxToken> IdContainer = handler.getHoldingMyIdList(); checkMyIdNum(IdContainer); // 僅有一主鍵id運算的情況下,where條件就是整個語句(無需拆分) String where = partialSql; FieldInfoDescriptor fieldInfoDescriptor = CommonMyConditionAstHandler.getFieldInfoByToken( IdContainer.get(0), getMyIdMapping()); return newLogicalPlan(handler, fieldInfoDescriptor); // 根據主鍵後設資料資訊,構建外圍sql } log.warn("CK不支援的函式操作:{}, 或者純關鍵詞解析(值解析),無需生成plan", handler.toString()); // throw new BizException(300436, "CK不支援的函式操作:" + handler.toString()); return null; } /** * 對細粒度的二元操作,構建sql * * @param root 當前節點root * @param leftPlan 左子樹邏輯計劃,葉子節點時為空 * @param rightPlan 右子樹邏輯計劃,葉子節點時為空 * @param left 左子節點 * @param right 右子節點 */ private LogicalPlan buildLogicalWithBinaryOperator( MySqlBasicCall root, LogicalPlan leftPlan, LogicalPlan rightPlan, MySqlNode left, MySqlNode right) { MySqlOperator operator = root.getOperator(); // 如 (a > 1) and (b <= '2021-09-08') // 走bitmapAndOr() 函式 SyntaxTokenTypeEnum joinType = operator.getKind(); switch (operator.getKind()) { case SQL_AND: // 做 剔除操作 if(reverseBitmapFlagContainer.get(right) != null && reverseBitmapFlagContainer.get(right)) { joinType = SyntaxTokenTypeEnum.SQL_AND_NOT; } return mergePlan(root, leftPlan, rightPlan, joinType); case SQL_OR: if(reverseBitmapFlagContainer.get(right) != null && reverseBitmapFlagContainer.get(right)) { log.warn("不支援的剔除操作, 或 不屬於白名單 型別的操作不支援, {}", root.toString()); throw new BizException(300209, "不支援的剔除操作"); } return mergePlan(root, leftPlan, rightPlan, joinType); case SIMPLE_MATH_OPERATOR: // 簡單四則運算,忽略,可能記錄些東西 log.warn("暫不支援四則運算的表示式:{}", root.toString()); throw new BizException(300520, "暫不支援四則運算的表示式:" + root.toString()); } // a +-*/ 1 四則運算,暫時忽略 // 如 a > 1, 走子查詢邏輯 // String sql = "SELECT 1 AS join_id, groupBitmapState(uv) AS users1 FROM _string_distribute WHERE name = '%s<主鍵名>' and value <operator> '%s<主鍵值>'"; if(left.getKind() == SyntaxTokenTypeEnum._ID || left.getKind() == SyntaxTokenTypeEnum._ID_NAME || left.getKind() == SyntaxTokenTypeEnum._ID_EXTEND) { // todo: 解析出主鍵後設資料資訊,決定使用哪一個型別的bitmap表,或者寬表 FieldInfoDescriptor fieldInfo = CommonMyConditionAstHandler.getFieldInfoByToken( ((MySqlLiteral)left).getValue(), getMyIdMapping()); if(fieldInfo == null) { log.warn("未找到主鍵後設資料資訊:{},請檢查", left.toString()); throw new BizException(300518, "未找到主鍵後設資料資訊:" + left.toString()); } if(fieldInfo.isWhitelistMy()) { String rightValue = right.toString(); if("0".equals(rightValue)) { reverseBitmapFlagContainer.put(root, true); } } // 剔除問題請參考原有實現 return newLogicalPlan(root, fieldInfo); } else if(left.getKind() == SyntaxTokenTypeEnum.STATEMENT_HANDLER) { MySqlCustomHandler handler = (MySqlCustomHandler)left; List<SyntaxToken> IdList = handler.getHoldingMyIdList(); checkMyIdNum(IdList); FieldInfoDescriptor fieldInfo = CommonMyConditionAstHandler.getFieldInfoByToken( IdList.get(0), getMyIdMapping()); if(fieldInfo == null) { log.warn("未找到主鍵後設資料資訊:{},請檢查", left.toString()); throw new BizException(300518, "未找到主鍵後設資料資訊:" + left.toString()); } // 剔除問題請參考原有實現 return newLogicalPlan(root, fieldInfo); } throw new BizException(300519, "分支判定錯誤, 未知的主鍵id:" + root.toString()); } // 合併兩個單表查詢計劃 private LogicalPlan mergePlan(MySqlNode mergeOp, LogicalPlan left, LogicalPlan right, SyntaxTokenTypeEnum joinType) { // 最末的子節點規則合併 if(left instanceof SqlNodeSingleTableLogicalPlan && right instanceof SqlNodeSingleTableLogicalPlan) { SqlNodeSingleTableLogicalPlan leftReal = (SqlNodeSingleTableLogicalPlan) left; SqlNodeSingleTableLogicalPlan rightReal = (SqlNodeSingleTableLogicalPlan) right; if(!leftReal.isUseBitmap() && !rightReal.isUseBitmap() && leftReal.getTableName().equals(rightReal.getTableName())) { MySqlNode[] operands = new MySqlNode[2]; operands[0] = leftReal.getRoot(); operands[1] = rightReal.getRoot(); MySqlBasicCall root = new MySqlBasicCall(new MySqlOperator( joinType.name(), joinType), operands); leftReal.setRoot(root); leftReal.addFields(rightReal.getFields()); leftReal.getTableInfo().getQueryFields().addAll(rightReal.getFields()); leftReal.setLogicalOp(joinType); return leftReal; } return new LineUpTwoTablePlan(left, right, joinType); } // 左子樹為葉子節點或者已經經過合併後的結點 if(left instanceof SqlNodeSingleTableLogicalPlan) { // 遞迴搜尋右節點,如果找到同級別寬表,則返回right, 否則合併left+right int backoffMergeDepth = mergeDeeplyRuleByExchangeLaw(mergeOp, (SqlNodeSingleTableLogicalPlan) left, (LineUpTwoTablePlan) right, joinType); if(backoffMergeDepth == -1) { return new LineUpTwoTablePlan(left, right, joinType); } return right; } // 右子樹為葉子節點或者已經經過合併後的結點 if(right instanceof SqlNodeSingleTableLogicalPlan) { // 遞迴搜尋左節點,如果找到同級別寬表,則返回left, 否則合併left+right int backoffMergeDepth = mergeDeeplyRuleByExchangeLaw(mergeOp, (SqlNodeSingleTableLogicalPlan) right, (LineUpTwoTablePlan) left, joinType); if(backoffMergeDepth == -1) { return new LineUpTwoTablePlan(left, right, joinType); } return left; } return new LineUpTwoTablePlan(left, right, joinType); } /** * 多層次同寬表合併邏輯計劃 * * @param mergeOp 當前計劃節點合併方式,實際影響和 joinType 類似 * @param leafPlan 葉子計劃(待合併的計劃) * @param masterPlan 主控計劃(主分支,其他計劃往其上合併) * @param joinType 本次計算型別 * @return 被合併成功的深度(-1代表未合併成功,需外部再處理) */ private int mergeDeeplyRuleByExchangeLaw(MySqlNode mergeOp, SqlNodeSingleTableLogicalPlan leafPlan, LineUpTwoTablePlan masterPlan, SyntaxTokenTypeEnum joinType) { if(masterPlan == null) { log.warn("主規則為空,請檢查:{} -> {}, 可能遇到了不解析的規則", joinType, mergeOp); throw new BizException(300326, "主規則為空,請檢查:" + joinType + " -> " + mergeOp); } if(leafPlan == null) { log.warn("葉子規則為空,請檢查:{} -> {}, 可能遇到了不解析的規則", joinType, mergeOp); throw new BizException(300326, "葉子規則為空,請檢查:" + joinType + " -> " + mergeOp); } if(leafPlan.isUseBitmap()) { return -1; } int depth = 0; String leafTableName = leafPlan.getTableName(); if (masterPlan.getJoinType() == joinType) { depth++; // 先遞迴左節點 if(masterPlan.getLeft() instanceof SqlNodeSingleTableLogicalPlan) { SqlNodeSingleTableLogicalPlan targetPlan = (SqlNodeSingleTableLogicalPlan) masterPlan.getLeft(); if(leafTableName.equals(targetPlan.getTableName())) { mergeSingleTablePlan(targetPlan, leafPlan, mergeOp, joinType); return depth; } } else if(masterPlan.getLeft() instanceof LineUpTwoTablePlan) { int findDepth = mergeDeeplyRuleByExchangeLaw(mergeOp, leafPlan, (LineUpTwoTablePlan)masterPlan.getLeft(), joinType); if(findDepth != -1) { return depth + findDepth; } } // 再遞迴右節點 if(masterPlan.getRight() instanceof SqlNodeSingleTableLogicalPlan) { SqlNodeSingleTableLogicalPlan targetPlan = (SqlNodeSingleTableLogicalPlan) masterPlan.getRight(); if(leafTableName.equals(targetPlan.getTableName())) { mergeSingleTablePlan(targetPlan, leafPlan, mergeOp, joinType); return depth; } } else if(masterPlan.getRight() instanceof LineUpTwoTablePlan) { int findDepth = mergeDeeplyRuleByExchangeLaw(mergeOp, leafPlan, (LineUpTwoTablePlan)masterPlan.getRight(), joinType); if(findDepth != -1) { return depth + findDepth; } } } return -1; } /** * 合併單表查詢計劃 * * @param target 目標plan(往上合併) * @param from 要合併的計劃 * @param fromRoot 要合併的節點root */ private void mergeSingleTablePlan(SqlNodeSingleTableLogicalPlan target, SqlNodeSingleTableLogicalPlan from, MySqlNode fromRoot, SyntaxTokenTypeEnum joinType) { // todo: 注意論證左右節點順序交換是否影響結果? MySqlNode[] operands = new MySqlNode[2]; operands[0] = target.getRoot(); operands[1] = from.getRoot(); MySqlBasicCall root = new MySqlBasicCall(new MySqlOperator( joinType.name(), joinType), operands); target.setRoot(root); target.addFields(from.getFields()); target.getTableInfo().getQueryFields().addAll(from.getFields()); } /** * 補充邏輯計劃資訊 * * @param plan 計劃描述 * @param fieldInfo 要補充的欄位資訊 */ private void fullFillLogicalInfo(SqlNodeSingleTableLogicalPlan plan, FieldInfoDescriptor fieldInfo) { // go... QueryTableDto tableDto = fieldInfo.getThemeTableInfo(); if(tableDto != null) { plan.setTableInfo(tableDto); plan.addFields(tableDto.getQueryFields()); if(StringUtils.isBlank(plan.getTableName())) { plan.setTableName(tableDto.getTableName()); } } } private SqlNodeSingleTableLogicalPlan newLogicalPlan(MySqlNode root, FieldInfoDescriptor fieldInfo) { if(fieldInfo.isUseBitmap() && fieldInfo.getBitmapTable() == null) { fieldInfo.setBitmapTable(tryGetMyBitmapTableName(fieldInfo)); } SqlNodeSingleTableLogicalPlan logicalPlan = new SqlNodeSingleTableLogicalPlan(root); logicalPlan.setTableName(fieldInfo.getBitmapTable()); logicalPlan.setUseBitmap(fieldInfo.getUseBitmap()); fullFillLogicalInfo(logicalPlan, fieldInfo); logicalContainer.add(logicalPlan); return logicalPlan; } /** * 標記各節點的深度,為下一步處理做準備 * * @param root 根節點 * @param sqlBuilder x */ private void markNodeDepth(MySqlBasicCall root, ClickHouseSqlBuilder sqlBuilder) { MySqlNode[] childs = root.getOperands(); for (int i = 0; i < childs.length; i++) { MySqlNode node = childs[i]; // todo: 如何判定當前節點是葉子節點? if (node instanceof MySqlBasicCall) { markNodeDepth((MySqlBasicCall) node, sqlBuilder); root.setDistanceFromLeaf(Math.max(root.getDistanceFromLeaf(), node.getDistanceFromLeaf())); continue; } if (node instanceof MySqlCustomHandler) { // 此處有問題,待解決 node.setDistanceFromLeaf(root.getDistanceFromLeaf()); continue; } } if(childs.length != 2) { log.warn("規則不支援非2個運算子, 個數不支援:{}, root:{}", childs.length, root); throw new BizException(300130, "規則不支援非2個運算子"); } root.setDistanceFromLeaf(root.getDistanceFromLeaf() + 1); } /** * 邏輯計劃統一介面 */ private static class LogicalPlan { } /** * 將兩個查詢條件連線起來的執行計劃 (相當於join) * * 可聚合多個子條件 */ private class LineUpTwoTablePlan extends LogicalPlan { private LogicalPlan left; private LogicalPlan right; /** * 連線方式, 僅允許為 SQL_AND, SQL_OR , 其他方式一律不支援 */ private SyntaxTokenTypeEnum joinType; public LineUpTwoTablePlan(LogicalPlan left, LogicalPlan right, SyntaxTokenTypeEnum joinType) { this.left = left; this.right = right; this.joinType = joinType; } public LogicalPlan getLeft() { return left; } public LogicalPlan getRight() { return right; } public SyntaxTokenTypeEnum getJoinType() { return joinType; } @Override public String toString() { return "JoinTwoTable#" + "[ " + left.toString() + " \n\t" + joinType.name() + " \n\t" + right.toString() + " ]"; } } /** * 單表執行計劃 */ private class SqlNodeSingleTableLogicalPlan extends LogicalPlan { /** * 當前邏輯計劃的根節點,如 (((a > 1) and b = '2') or (c <> 'x')) * 相當於規則子表示式,此為規則分組 */ private MySqlNode root; /** * 代表當前表聚合了幾個子表示式,提示性屬性,無實際用處 */ private int nodeCnt; /** * 是否使用bitmap表,1:bitmap表, 0:寬表 */ private int useBitmap; private String tableName; /** * 表別名 */ private String tableAlias; /** * 臨時生成的表順序號 */ private int tableIndex; private Set<QueryFieldDto> fields; private QueryTableDto tableInfo; private SqlNodeSingleTableLogicalPlan child; private SqlNodeSingleTableLogicalPlan parent; /** * 與其兄弟節點的運算方式,and/or/andnot */ private SyntaxTokenTypeEnum logicalOp; public SqlNodeSingleTableLogicalPlan(MySqlNode root) { this.root = root; } public MySqlNode getRoot() { return root; } public void setRoot(MySqlNode root) { this.root = root; } public int getNodeCnt() { return nodeCnt; } public void setNodeCnt(int nodeCnt) { this.nodeCnt = nodeCnt; } public boolean isUseBitmap() { return useBitmap == 1; } public void setUseBitmap(int useBitmap) { this.useBitmap = useBitmap; } public String getTableName() { return tableName; } public void setTableName(String tableName) { this.tableName = tableName; } public Set<QueryFieldDto> getFields() { return fields; } public void setFields(Set<QueryFieldDto> fields) { this.fields = fields; } public void addFields(Collection<QueryFieldDto> fields) { if(this.fields != null) { this.fields.addAll(fields); return; } this.fields = new LinkedHashSet<>(); this.fields.addAll(fields); } public QueryTableDto getTableInfo() { return tableInfo; } public void setTableInfo(QueryTableDto tableInfo) { this.tableInfo = tableInfo; } public SqlNodeSingleTableLogicalPlan getChild() { return child; } public void setChild(SqlNodeSingleTableLogicalPlan child) { this.child = child; } public SqlNodeSingleTableLogicalPlan getParent() { return parent; } public void setParent(SqlNodeSingleTableLogicalPlan parent) { this.parent = parent; } public SyntaxTokenTypeEnum getLogicalOp() { return logicalOp; } public void setLogicalOp(SyntaxTokenTypeEnum logicalOp) { this.logicalOp = logicalOp; } public String getTableAlias() { return tableAlias; } public void setTableAlias(String tableAlias) { this.tableAlias = tableAlias; } public int getTableIndex() { return tableIndex; } public void setTableIndex(int tableIndex) { this.tableIndex = tableIndex; } @Override public String toString() { StringBuilder sb = new StringBuilder("SingleTableScan#") .append(tableName) .append("").append("[ "); int i = 0; for (QueryFieldDto r : fields) { if(i++ == 0) { sb.append(r.getField()); if(r.getAlia() != null) { sb.append(" => ").append(r.getAlia()); } continue; } sb.append(",").append(r.getField()); if(r.getAlia() != null) { sb.append(" => ").append(r.getAlia()); } } sb.append(" where ").append(root.toString()).append(" ]"); return sb.toString(); } } /** * 獲取解析樹 */ public MySqlNode getBinTreeRoot() { return binTreeRoot; } /** * 處理普通的 三元操作節點 * * @param root root * @param sqlBuilder sql構造容器,所有sql結構直接寫入該處 */ private void visitCallNode(MySqlBasicCall root, ClickHouseSqlBuilder sqlBuilder) { MySqlNode[] childs = root.getOperands(); for (MySqlNode node : childs) { if(node instanceof MySqlBasicCall) { visitCallNode((MySqlBasicCall) node, sqlBuilder); continue; } if(node instanceof MySqlCustomHandler) { buildSqlWithCustomHandlerNode((MySqlCustomHandler) node, sqlBuilder); continue; } } if(childs.length != 2) { log.warn("規則不支援非2個運算子, 個數不支援:{}, root:{}", childs.length, root); throw new BizException(300130, "規則不支援非2個運算子"); } buildSqlWithBinaryOperator(root, childs[0], childs[1], sqlBuilder); } /** * udf類規則解析生成 sql */ private void buildSqlWithCustomHandlerNode(MySqlCustomHandler handler, ClickHouseSqlBuilder sqlBuilder) { SyntaxStatement stmt = handler.getStmt(); MyRuleStmtMixedType mixedType = stmt.getStmtType(); if(mixedType == MyRuleStmtMixedType.UDF_HANDLER || mixedType == MyRuleStmtMixedType.LBS_FUNCTION) { String partialSql = stmt.translateTo( MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); log.info("ck partial function sql:{}", partialSql); List<SyntaxToken> IdContainer = handler.getHoldingMyIdList(); checkMyIdNum(IdContainer); // 僅有一主鍵id運算的情況下,where條件就是整個語句(無需拆分) String where = partialSql; FieldInfoDescriptor fieldInfoDescriptor = CommonMyConditionAstHandler.getFieldInfoByToken( IdContainer.get(0), getMyIdMapping()); if(fieldInfoDescriptor.isUseBitmap()) { String bitmapTableName = getMyBitmapTableName(fieldInfoDescriptor); fieldInfoDescriptor.setBitmapTable(bitmapTableName); buildBitmapTmpTableWithWhere(handler, fieldInfoDescriptor, where, sqlBuilder); return; } else { buildWideBitmapTmpTableWithWhere(handler, fieldInfoDescriptor, where, sqlBuilder); return; } // 根據主鍵後設資料資訊,構建外圍sql } } /** * 寬表型的條件生成bitmap臨時表方案 */ private void buildWideBitmapTmpTableWithWhere(MySqlCustomHandler root, FieldInfoDescriptor fieldInfoDescriptor, String whereCondition, ClickHouseSqlBuilder sqlBuilder) { QueryTableDto themeTableInfo = getQueryWideTableInfoFromMyMeta(fieldInfoDescriptor); String wideTableName = themeTableInfo.getTableName(); String custNoField = themeTableInfo.getJoinField(); // 主鍵名,欄位名,注意map結構資料 String Name = fieldInfoDescriptor.getMyName(); // $123 轉換,暫定樣板 // bitmap表結構完全一樣 String sqlAfterFrom = String.format("(SELECT %s, bitmapBuild(groupArray(toUInt64(%s))) uv " + " FROM %s WHERE %s)", Name, custNoField, wideTableName, whereCondition); int tableIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableIndex; String userBitmapAlias = "user" + tableIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapState(uv)", userBitmapAlias)); fields.add(QueryFieldDto.newField(Name, Name)); sqlBuilder.with(tableAlias, fields, sqlAfterFrom); QueryTableDto tableDto = new QueryTableDto(tableAlias, tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableDto); bitmapFunctionNodeContainer.put(root, userBitmapAlias); } // 獲取主鍵所在的寬表資訊 private QueryTableDto getQueryWideTableInfoFromMyMeta(FieldInfoDescriptor fieldInfoDescriptor) { QueryTableDto themeTableInfo = fieldInfoDescriptor.getThemeTableInfo(); if(themeTableInfo == null) { List<ThemeIndex> themeIndexList = themeIndexService.getByThemeID(fieldInfoDescriptor.getThemeId()); if(themeIndexList.isEmpty()) { throw new BizException(300203, "主題索引資訊未找到, 請先配置"); } ThemeIndex themeIndex = themeIndexList.get(0); String tableName = themeIndex.getValue(); if(tableName.contains(".")) { tableName = StringUtils.substringAfterLast(tableName, "."); } String joinField = themeIndex.getCustField(); // no dt specify, all in one themeTableInfo = new QueryTableDto(tableName, themeIndex.getAlias(), joinField, true); fieldInfoDescriptor.setThemeTableInfo(themeTableInfo); } return themeTableInfo; } /** * 構建bitmap臨時表(使用where條件) (如: udf函式處理) * * @param root 使用的運算節點 * @param Info 主鍵後設資料資訊 * @param whereCondition where查詢條件 * @param sqlBuilder ck sql容器 */ private void buildBitmapTmpTableWithWhere(MySqlCustomHandler root, FieldInfoDescriptor Info, String whereCondition, ClickHouseSqlBuilder sqlBuilder) { String bitmapTableName = Info.getBitmapTable(); String Name = Info.getMyName(); String sqlAfterFrom = String.format("%s WHERE %s", bitmapTableName, whereCondition); int tableIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableIndex; String userBitmapAlias = "user" + tableIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapMergeState(uv)", userBitmapAlias)); // fields.add(QueryFieldDto.newField("value", Name)); sqlBuilder.with(tableAlias, fields, sqlAfterFrom); QueryTableDto tableDto = new QueryTableDto(tableAlias, tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableDto); bitmapFunctionNodeContainer.put(root, userBitmapAlias); } /** * 暫只支援一個主鍵id進行帶入運算 */ private void checkMyIdNum(List<SyntaxToken> IdList) { Map<String, Object> idMapping = new HashMap<>(); IdList.forEach(r -> { Object v = idMapping.putIfAbsent(r.getRawWord(), r); if(v != null) { log.warn("主鍵id:{} 被引用多次,請檢查是否是正常情況", r); } }); if(idMapping.size() > 1) { throw new BizException(300206, "暫不支援函式中帶多個主鍵id"); } } /** * 對細粒度的二元操作,構建sql * * @param root 當前節點root * @param left 左子節點 * @param right 右子節點 * @param sqlBuilder sql構建器 */ private void buildSqlWithBinaryOperator(MySqlBasicCall root, MySqlNode left, MySqlNode right, ClickHouseSqlBuilder sqlBuilder) { MySqlOperator operator = root.getOperator(); // 如 (a > 1) and (b <= '2021-09-08') // 走bitmapAndOr() 函式 String leftBitmapParam = bitmapFunctionNodeContainer.get(left); String rightBitmapParam = bitmapFunctionNodeContainer.get(right); List<String> params = new ArrayList<>(); String currentBitmapFunc; switch (operator.getKind()) { case SQL_AND: params.add(leftBitmapParam); params.add(rightBitmapParam); // 做 剔除操作 if(reverseBitmapFlagContainer.get(right) != null && reverseBitmapFlagContainer.get(right)) { currentBitmapFunc = "bitmapAndnot(" + leftBitmapParam + "," + rightBitmapParam + ")"; } else { currentBitmapFunc = "bitmapAnd(" + leftBitmapParam + "," + rightBitmapParam + ")"; } bitmapFunctionNodeContainer.put(root, currentBitmapFunc); return; case SQL_OR: params.add(leftBitmapParam); params.add(rightBitmapParam); if(reverseBitmapFlagContainer.get(right) != null && reverseBitmapFlagContainer.get(right)) { log.warn("不支援的剔除操作, 或 不屬於白名單 型別的操作不支援, {}", root.toString()); throw new BizException(300209, "不支援的剔除操作"); } else { currentBitmapFunc = "bitmapOr(" + leftBitmapParam + "," + rightBitmapParam + ")"; } bitmapFunctionNodeContainer.put(root, currentBitmapFunc); return; case SIMPLE_MATH_OPERATOR: // 簡單四則運算,忽略,可能記錄些東西 log.warn("暫不支援四則運算的表示式:{}", root.toString()); throw new BizException(300211, "暫不支援四則運算的表示式:" + root.toString()); } // a +-*/ 1 四則運算,暫時忽略 // 如 a > 1, 走子查詢邏輯 // String sql = "SELECT 1 AS join_id, groupBitmapState(uv) AS users1 FROM _string_distribute WHERE name = '%s<主鍵名>' and value <operator> '%s<主鍵值>'"; if(left.getKind() == SyntaxTokenTypeEnum._ID || left.getKind() == SyntaxTokenTypeEnum._ID_NAME) { // todo: 解析出主鍵後設資料資訊,決定使用哪一個型別的bitmap表,或者寬表 FieldInfoDescriptor fieldInfoDescriptor = CommonMyConditionAstHandler.getFieldInfoByToken( ((MySqlLiteral)left).getValue(), getMyIdMapping()); if(fieldInfoDescriptor == null) { log.warn("未找到主鍵後設資料資訊:{},請檢查", left.toString()); } else if(fieldInfoDescriptor.getUseBitmap() == 1) { String bitmapTableName = getMyBitmapTableName(fieldInfoDescriptor); fieldInfoDescriptor.setBitmapTable(bitmapTableName); buildSimpleBitmapTmpTable(root, left, right, fieldInfoDescriptor, sqlBuilder); return; } // 寬表的bitmap構建方式 else { buildWideBitmapTmpTable(root, left, right, fieldInfoDescriptor, sqlBuilder); return; } } // $123 轉換,暫定樣板 // 以下僅試用於單測樣例,在正式執行時不得執行 String sqlAfterFrom = "_string_distribute WHERE name = '" + left.toString() + "' and value " + operator.toString() + " " + right.toString() + ""; int tableIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableIndex; String userBitmapAlias = "user" + tableIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapState(uv)", userBitmapAlias)); sqlBuilder.with(tableAlias, fields, sqlAfterFrom); QueryTableDto tableDto = new QueryTableDto(tableAlias, tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableDto); bitmapFunctionNodeContainer.put(root, userBitmapAlias); } /** * 獲取主鍵所在的bitmap表名(如果不是bitmap型主鍵,丟擲異常) * * @param fieldInfoDescriptor 主鍵後設資料資訊 * @return 主鍵所在bitmap表名 */ private String getMyBitmapTableName(FieldInfoDescriptor fieldInfoDescriptor) { if(!fieldInfoDescriptor.isUseBitmap()) { throw new BizException(300208, String.format("當前主鍵:%s不支援使用bitmap運算", fieldInfoDescriptor.getMyName())); } return tryGetMyBitmapTableName(fieldInfoDescriptor); } /** * 不丟擲異常地獲取bitmap表名 * * @see #getMyBitmapTableName(FieldInfoDescriptor) */ private String tryGetMyBitmapTableName(FieldInfoDescriptor fieldInfoDescriptor) { if(fieldInfoDescriptor.getBitmapTable() != null) { return fieldInfoDescriptor.getBitmapTable(); } // 白名單統一一張bitmap表 if(fieldInfoDescriptor.isWhitelistMy()) { String tableName = ApolloUtil.getProperty( "ck_whitelist_all_table_name", "bm_whitelist_all_bitmap"); fieldInfoDescriptor.setBitmapTable(tableName); return tableName; } /** * tinyint * smallint * int * bigint * float * double * decimal * timestamp * date * string * varchar * char * boolean * binary * map */ String bitmapTableName = null; String dataType = fieldInfoDescriptor.getDataType(); String distributeSuffix = "_Distribute_20210418"; if(dataType.equals("int") || dataType.endsWith("int")) { bitmapTableName = "ch__int"; } else if(dataType.equals("float") || dataType.equals("double") || dataType.equals("decimal")) { bitmapTableName = "ch__double"; } else if(dataType.equals("date") || dataType.equals("timestamp")) { bitmapTableName = "ch__date"; } else { bitmapTableName = "ch__string"; } // 如果是 is not null 型別,則到 ch__common 表中查詢 bitmapTableName += distributeSuffix; fieldInfoDescriptor.setBitmapTable(bitmapTableName); return bitmapTableName; } /** * 非空使用者bitmap集全表 (特殊表) */ private String getNotNullBitmapTableName() { String bitmapTableName = "ch__common"; return bitmapTableName + "_Distribute_20210418"; } /** * bitmap表的構建方式 * * @param fieldInfoDescriptor 單個主鍵後設資料資訊 */ private String buildSimpleBitmapTmpTable(MySqlBasicCall root, MySqlNode left, MySqlNode right, FieldInfoDescriptor fieldInfoDescriptor, ClickHouseSqlBuilder sqlBuilder) { String bitmapTableName = fieldInfoDescriptor.getBitmapTable(); // todo: 此處僅假設只有一個主鍵左值,並無四則運算和函式運算 String Name = fieldInfoDescriptor.getMyName(); String sqlAfterFrom = null; int tableIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableIndex; String userBitmapAlias = "user" + tableIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); if(fieldInfoDescriptor.isWhitelistMy()) { String existOrNot = right.toString(); // xand $123 = 1, 且屬於白名單,只需選出該白名單bitmap即可 // xor $123 = 1, 或屬於白名單,結論是確定的 // xand $123 = 0, 且不屬於白名單,需要做排除運算,這是個難題, 使用 andnot 上層處理 // xor $123 = 0, 或不屬於白名單,需要上層運算改變,此處語義複雜,暫不實現 // 剔除屬於白名單,屬於白名單,不屬於白名單,剔除不屬於白名單 // SELECT arrayJoin(bitmapToArray(bitmapAndnot(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])))) exclude_whitelist sqlAfterFrom = String.format("%s WHERE name = '%s'", bitmapTableName, Name); fields.add(QueryFieldDto.newField("cust_no_bitmap_v", userBitmapAlias)); if("0".equals(existOrNot)) { reverseBitmapFlagContainer.put(root, true); } } else { // $123 轉換,暫定樣板 // bitmap表結構完全一樣 // is not null, 讀取另一張特殊表 if(right.getKind() == SyntaxTokenTypeEnum.SQL_NOT_NULL) { bitmapTableName = getNotNullBitmapTableName(); sqlAfterFrom = String.format("%s WHERE name = '%s'", bitmapTableName, Name); fields.add(QueryFieldDto.newField("groupBitmapMergeState(uv)", userBitmapAlias)); } else { String rightValue = translateRightNodeValue(right); MySqlNode operator = root.getOperator(); // 如 (substring($123, 1, 10) = '2xx') if(left instanceof MySqlCustomHandler) { String leftSql = ((MySqlCustomHandler)left).getStmt().translateTo( MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); String NameCond = "name = '" + Name + "'"; if(!leftSql.contains(NameCond)) { leftSql = NameCond + " and " + leftSql; } sqlAfterFrom = String.format("%s WHERE %s %s %s", bitmapTableName, leftSql, operator.toString(), rightValue); } else { sqlAfterFrom = String.format("%s WHERE name = '%s' and value %s %s", bitmapTableName, Name, operator.toString(), rightValue); } fields.add(QueryFieldDto.newField("groupBitmapMergeState(uv)", userBitmapAlias)); } // mergeState後,無法獲取具體的欄位值了 // fields.add(QueryFieldDto.newField("value", Name)); } sqlBuilder.with(tableAlias, fields, sqlAfterFrom); QueryTableDto tableDto = new QueryTableDto(tableAlias, tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableDto); bitmapFunctionNodeContainer.put(root, userBitmapAlias); return userBitmapAlias; } private static ThemeIndexService themeIndexService = SpringContextUtil.getBean(ThemeIndexService.class); /** * 寬表型的條件生成bitmap臨時表方案 */ private String buildWideBitmapTmpTable(MySqlBasicCall root, MySqlNode left, MySqlNode right, FieldInfoDescriptor fieldInfoDescriptor, ClickHouseSqlBuilder sqlBuilder) { QueryTableDto themeTableInfo = getQueryWideTableInfoFromMyMeta(fieldInfoDescriptor); String bitmapTableName = themeTableInfo.getTableName(); String custNoField = themeTableInfo.getJoinField(); String Name = fieldInfoDescriptor.getMyName(); // bitmap表結構完全一樣 MySqlNode operator = root.getOperator(); String rightValue = translateRightNodeValue(right); String sqlAfterFrom = String.format(" %s " + " WHERE %s %s %s", bitmapTableName, Name, operator.toString(), rightValue); int tableIndex = tableCounter.getAndIncrement(); String tableAlias = "t" + tableIndex; String userBitmapAlias = "user" + tableIndex; Set<QueryFieldDto> fields = new HashSet<>(); fields.add(QueryFieldDto.newField("1", "join_id")); fields.add(QueryFieldDto.newField("groupBitmapState(toUInt64OrZero(" + custNoField + "))", userBitmapAlias)); // fields.add(QueryFieldDto.newField(Name, Name)); sqlBuilder.with(tableAlias, fields, sqlAfterFrom); QueryTableDto tableDto = new QueryTableDto(tableAlias, tableAlias, null, fields, "join_id", null, false); queryTableDtoList.add(tableDto); bitmapFunctionNodeContainer.put(root, userBitmapAlias); return userBitmapAlias; } /** * 處理右節點值問題 (普通資料或簡單關鍵詞函式) * * @param right 右節點 * @return 翻譯後的值 */ private String translateRightNodeValue(MySqlNode right) { String rightValue = null; if(right.getKind() == SyntaxTokenTypeEnum.STATEMENT_HANDLER) { MySqlCustomHandler handler = (MySqlCustomHandler)right; rightValue = handler.getStmt().translateTo( MyDialectTypeEnum.CLICK_HOUSE, getMyIdMapping()); log.info("ck partial function sql:{}", rightValue); } else { rightValue = right.toString(); } return rightValue; } }
4.3. 單元測試
@Test public void testNodeReGroup() { MyRuleSqlNodeParsedClause parsedClause; MySqlTranslator translator; String ckSql; String ruleEl; MySqlTranslator.turnOnUseUnifiedParserSwitch(); ruleEl = "( substring ( field_2436 ,4 ,8 ) = '-08-' ) and ( substring ( field_4515 ,0 ,1 ) = '2' ) or ( substring ( field_27103 ,1 ,2 ) = '9' )"; testBinToRelConvertInner(ruleEl, null); ruleEl = "( field_461043 is not null ) and ( field_3728 is not null ) xor ( field_467492 = 1 )"; testBinToRelConvertInner(ruleEl, null); try { ruleEl = "(field_4749 >= '2017-09-01') or (field_4748 >= '2017-09-01')"; testBinToRelConvertInner(ruleEl, null); Assert.fail("ck支援lbs運算了?"); } catch (Exception e) { e.printStackTrace(); Assert.assertThat(e.getMessage(), containsString("主鍵資訊未找到,請檢查:field_4748")); } // is not null 處理 ruleEl = "( field_43221 is not null)"; testBinToRelConvertInner(ruleEl, null); ruleEl = "((field_469011 = 1 ) xor (field_469016 = 1 )) and ((field_40467 = '322') and (field_466911 = '0') and (substring(field_46789 ,0 ,2) in ('33')) and (field_40243 >= #{day-90}) and (field_40243 >= '20210704'))"; testBinToRelConvertInner(ruleEl, null); try { // 空規則測試 ruleEl = " "; testBinToRelConvertInner(ruleEl, null); Assert.fail("空規則解析了?"); } catch (Exception e) { e.printStackTrace(); Assert.assertThat(e.getMessage(), containsString("不支援空規則配置")); } try { // 空規則測試 ruleEl = "( )"; testBinToRelConvertInner(ruleEl, null); Assert.fail("空括號解析了?"); } catch (Exception e) { e.printStackTrace(); Assert.assertThat(e.getMessage(), containsString("不支援空括號")); } try { ruleEl = "( field_464592 is not null ) or ( field_472351 > 1 ) xor ( field_25100 = 1 )"; testBinToRelConvertInner(ruleEl, null); Assert.fail("field_472351 主鍵後設資料資訊找到了?"); } catch (Exception e) { e.printStackTrace(); System.err.println("發現異常,忽略錯誤,報錯資訊待完善,暫不捕獲"); } try { ruleEl = "(field_3826 > Y) and (field_2475 = 1)"; testBinToRelConvertInner(ruleEl, null); Assert.fail("Y被解析了?"); } catch (Exception e) { e.printStackTrace(); Assert.assertThat(e.getMessage(), containsString("主鍵id未找到對應後設資料資訊或規則配置錯誤:Y")); } ruleEl = "( ( field_13428 between 500000 and 1999999.999 ) and ( field_37272 < 1999999.9999 ) and ( field_39866 < 1999999.9999 ) ) or ( ( field_13428 < 1999999.9999 ) and ( field_37272 between 500000 and 1999999.9999 ) and ( field_39866 < 1999999.9999 ) ) or ( ( field_13428 < 1999999.9999 ) and ( field_37272 < 1999999.9999 ) and ( field_39866 between 500000 and 1999999.9999 ) ) or ( field_3728 in ( '600103317730' , '600103331257' , '600205279354' ) )"; testBinToRelConvertInner(ruleEl, null); ruleEl = "( field_472691 = #{day+0} )"; testBinToRelConvertInner(ruleEl, null); ruleEl = "( field_471875 > 1 ) and ( field_27050 = 'Y' ) or field_471874 <> '3' or field_471831 < 'code'"; testBinToRelConvertInner(ruleEl, null); ruleEl = "( field_471875 > 1 ) and ( field_27050 = 'Y' ) or (field_471874 <> '3' and (field_471874 like 'abc%' or field_471874 < 'x')) or field_471831 < 'code'"; testBinToRelConvertInner(ruleEl, null); ruleEl = "( field_471875 > 1 ) and ( field_27050 = 'Y' ) or field_471831 < '__code' or (field_471874 <> '3' and (field_471874 like 'abc%' or (field_471874 < 'x' and field_471874 > 2))) or field_471831 < 'code'"; testBinToRelConvertInner(ruleEl, null); MySqlTranslator.turnOffUseUnifiedParserSwitch(); } private void testBinToRelConvertInner(String ruleEl, String expectSql) { MyRuleSqlNodeParsedClause parsedClause; MySqlTranslator translator; String ckSql; parsedClause = MySyntaxParser.parseAsTree(ruleEl); parsedClause.resolveMyIdFromDB(); ckSql = parsedClause.translateFullCkSqlProf(false); System.out.println("ckSql:" + ckSql); }
4.4. 從calicite中摘取的 precedence util 參考
/** * Parser that takes a collection of tokens (atoms and operators) * and groups them together according to the operators' precedence * and associativity. */ public class PrecedenceClimbingParser { private Token first; private Token last; private PrecedenceClimbingParser(List<Token> tokens) { Token p = null; for (Token token : tokens) { if (p != null) { p.next = token; } else { first = token; } token.previous = p; token.next = null; p = token; } last = p; } public Token atom(Object o) { return new Token(Type.ATOM, o, -1, -1); } public Call call(Op op, ImmutableList<Token> args) { return new Call(op, args); } public Op infix(Object o, int precedence, boolean left) { return new Op(Type.INFIX, o, precedence * 2 + (left ? 0 : 1), precedence * 2 + (left ? 1 : 0)); } public Op prefix(Object o, int precedence) { return new Op(Type.PREFIX, o, -1, precedence * 2); } public Op postfix(Object o, int precedence) { return new Op(Type.POSTFIX, o, precedence * 2, -1); } public SpecialOp special(Object o, int leftPrec, int rightPrec, Special special) { return new SpecialOp(o, leftPrec * 2, rightPrec * 2, special); } public Token parse() { partialParse(); if (first != last) { throw new AssertionError("could not find next operator to reduce: " + this); } return first; } public void partialParse() { for (;;) { Op op = highest(); if (op == null) { return; } final Token t; switch (op.type) { case POSTFIX: t = call(op, ImmutableList.of(op.previous)); replace(t, op.previous.previous, op.next); break; case PREFIX: t = call(op, ImmutableList.of(op.next)); replace(t, op.previous, op.next.next); break; case INFIX: if(op.previous == null) { throw new BizException(300421, "規則配置有誤: 【" + op.toString() + "】的左邊未配置值"); } if(op.next == null) { throw new BizException(300422, "規則配置有誤: 【" + op.toString() + "】的右邊未配置值"); } t = call(op, ImmutableList.of(op.previous, op.next)); replace(t, op.previous.previous, op.next.next); break; case SPECIAL: Result r = ((SpecialOp) op).special.apply(this, (SpecialOp) op); Objects.requireNonNull(r); replace(r.replacement, r.first.previous, r.last.next); break; default: throw new AssertionError(); } // debug: System.out.println(this); } } @Override public String toString() { List<Token> all = all(); StringBuilder b = new StringBuilder(all.get(0).toString()); for (int i = 0; i < all.size(); i++) { Token t = all.get(i); b.append(", ").append(t.toString()); } return b.toString(); } /** Returns a list of all tokens. */ public List<Token> all() { return new TokenList(); } private void replace(Token t, Token previous, Token next) { t.previous = previous; t.next = next; if (previous == null) { first = t; } else { previous.next = t; } if (next == null) { last = t; } else { next.previous = t; } } private Op highest() { int p = -1; Op highest = null; for (Token t = first; t != null; t = t.next) { if ((t.left > p || t.right > p) && (t.left < 0 || t.left >= prevRight(t.previous)) && (t.right < 0 || t.right >= nextLeft(t.next))) { p = Math.max(t.left, t.right); highest = (Op) t; } } return highest; } /** Returns the right precedence of the preceding operator token. */ private int prevRight(Token token) { for (; token != null; token = token.previous) { if (token.type == Type.POSTFIX) { return Integer.MAX_VALUE; } if (token.right >= 0) { return token.right; } } return -1; } /** Returns the left precedence of the following operator token. */ private int nextLeft(Token token) { for (; token != null; token = token.next) { if (token.type == Type.PREFIX) { return Integer.MAX_VALUE; } if (token.left >= 0) { return token.left; } } return -1; } public String print(Token token) { return token.toString(); } /** Token type. */ public enum Type { ATOM, CALL, PREFIX, INFIX, POSTFIX, SPECIAL } /** A token: either an atom, a call to an operator with arguments, * or an unmatched operator. */ public static class Token { Token previous; Token next; public final Type type; public final Object o; final int left; final int right; Token(Type type, Object o, int left, int right) { this.type = type; this.o = o; this.left = left; this.right = right; } @Override public String toString() { return o.toString(); } protected StringBuilder print(StringBuilder b) { return b.append(o); } public Token copy() { return new Token(type, o, left, right); } } /** An operator token. */ public static class Op extends Token { Op(Type type, Object o, int left, int right) { super(type, o, left, right); } @Override public Token copy() { return new Op(type, o, left, right); } } /** An token corresponding to a special operator. */ public static class SpecialOp extends Op { public final Special special; SpecialOp(Object o, int left, int right, Special special) { super(Type.SPECIAL, o, left, right); this.special = special; } @Override public Token copy() { return new SpecialOp(o, left, right, special); } } /** A token that is a call to an operator with arguments. */ public static class Call extends Token { public final Op op; public final ImmutableList<Token> args; Call(Op op, ImmutableList<Token> args) { super(Type.CALL, null, -1, -1); this.op = op; this.args = args; } @Override public Token copy() { return new Call(op, args); } @Override public String toString() { return print(new StringBuilder()).toString(); } protected StringBuilder print(StringBuilder b) { switch (op.type) { case PREFIX: b.append('('); printOp(b, false, true); args.get(0).print(b); return b.append(')'); case POSTFIX: b.append('('); args.get(0).print(b); return printOp(b, true, false).append(')'); case INFIX: b.append('('); args.get(0).print(b); printOp(b, true, true); args.get(1).print(b); return b.append(')'); case SPECIAL: printOp(b, false, false) .append('('); b.append("special"); // for (Ord<Token> arg : Ord.zip(args)) { // if (arg.i > 0) { // b.append(", "); // } // arg.e.print(b); // } return b.append(')'); default: throw new AssertionError(); } } private StringBuilder printOp(StringBuilder b, boolean leftSpace, boolean rightSpace) { String s = op.o.toString(); if (leftSpace) { b.append(' '); } b.append(s); if (rightSpace) { b.append(' '); } return b; } } /** Callback defining the behavior of a special function. */ public interface Special { /** Given an occurrence of this operator, identifies the range of tokens to * be collapsed into a call of this operator, and the arguments to that * call. */ Result apply(PrecedenceClimbingParser parser, SpecialOp op); } /** Result of a call to {@link Special#apply}. */ public static class Result { final Token first; final Token last; final Token replacement; public Result(Token first, Token last, Token replacement) { this.first = first; this.last = last; this.replacement = replacement; } } /** Fluent helper to build a parser containing a list of tokens. */ public static class Builder { final List<Token> tokens = new ArrayList<>(); private final PrecedenceClimbingParser dummy = new PrecedenceClimbingParser(ImmutableList.of()); private Builder add(Token t) { tokens.add(t); return this; } public Builder atom(Object o) { return add(dummy.atom(o)); } public Builder call(Op op, Token arg0, Token arg1) { return add(dummy.call(op, ImmutableList.of(arg0, arg1))); } public Builder infix(Object o, int precedence, boolean left) { return add(dummy.infix(o, precedence, left)); } public Builder prefix(Object o, int precedence) { return add(dummy.prefix(o, precedence)); } public Builder postfix(Object o, int precedence) { return add(dummy.postfix(o, precedence)); } public Builder special(Object o, int leftPrec, int rightPrec, Special special) { return add(dummy.special(o, leftPrec, rightPrec, special)); } /** * 獲取最末一個token 資訊,輔助外部處理 */ public Token getLastToken() { if(tokens.isEmpty()) { return null; } return tokens.get(tokens.size() - 1); } public PrecedenceClimbingParser build() { return new PrecedenceClimbingParser(tokens); } } /** List view onto the tokens in a parser. The view is semi-mutable; it * supports {@link List#remove(int)} but not {@link List#set} or * {@link List#add}. */ private class TokenList extends AbstractList<Token> { @Override public Token get(int index) { for (Token t = first; t != null; t = t.next) { if (index-- == 0) { return t; } } throw new IndexOutOfBoundsException(); } @Override public int size() { int n = 0; for (Token t = first; t != null; t = t.next) { ++n; } return n; } @Override public Token remove(int index) { Token t = get(index); if (t.previous == null) { first = t.next; } else { t.previous.next = t.next; } if (t.next == null) { last = t.previous; } else { t.next.previous = t.previous; } return t; } @Override public Token set(int index, Token element) { final Token t = get(index); element.previous = t.previous; if (t.previous == null) { first = element; } else { t.previous.next = element; } element.next = t.next; if (t.next == null) { last = element; } else { t.next.previous = element; } return t; } } } // End PrecedenceClimbingParser.java
以上,可以說是沒有用現有大師輪子的情況下,自行實現了一個小的重組優化,也許現實意義並不太大,但是思考過程絕對是有用的。