貼一下第二期的第二道題。

第二期題目參考：http://www.itpub.net/thread-1403356-1-1.html

版主newkid點評參考：http://www.itpub.net/thread-1411980-1-1.html

簡單描述一下問題：

問題2(40分)：

在一個N*N的棋盤中，每行放置M個球，每列、每個45度的斜線上最多放置M個球，其中5<=N<=6,1<=M<=(N-1),現要求出每個M,N組合中最多擺放球的不同的擺法的個數（包括重複和不重複的，分別輸出）。請用最多兩條SQL語句得到以下結果：（以M=2, N=5為例）

兩條SQL的輸出格式：

SQL1:

M N AllCnt

2 5 92

SQL2:

M N NoReptCnt

2 5 xx

一條SQL的輸出格式：

M N AllCnt NoReptCnt

2 5 92 xx

Oracle變數定義如下（以M=2, N=5為例）：

var m number;

exec :m:=2;

var n number;

exec :n:=5;

這道題其實就是在第一題的基礎上要求了擴充套件性，雖然題目要求5<=N<=6，但是真正擴充套件性好的解題應該是2<=N，當然由於演算法和機器效能的限制，可能N最多也就支援到7左右。

SQL> VAR N NUMBER
SQL> VAR M NUMBER
SQL> EXEC :N := 5

PL/SQL 過程已成功完成。

SQL> EXEC :M := 2

PL/SQL 過程已成功完成。

SQL> SET PAGES 100 LINES 120 TIMING ON

SQL> with i as
2 (select rownum i from dual connect by rownum <= :n),
3 j as
4 (select rownum j from dual connect by rownum <= :n),
5 position as
6 (select i, j
7 from i, j),
8 b as
9 (select rownum - 1 b from dual connect by rownum <= 2),
10 b_line as
11 (select replace(sys_connect_by_path(b, ','), ',', '') line
12 from b
13 where level = :n
14 connect by level <= :n),
15 lines as
16 (select rownum, line from b_line
17 where instr(line, 1, 1, :m) > 0
18 and instr(line, 0, 1, :n - :m) > 0),
19 lines_result as
20 (select replace(sys_connect_by_path(line, ','), ',', '') result
21 from lines
22 where level = :n
23 connect by level <= :n)
24 select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, res2, res3, res4, res5, res6, res7, res8)) NoReptCnt
25 from (
26 select result res1, reverse(result) res2,
27 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + 1, :n)), ',', '') from (select i from i order by i desc)) res3,
28 (select replace(wmsys.wm_concat(reverse(substr(result, (i-1)*:n + 1, :n))), ',', '') from (select i from i order by i)) res4,
29 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i)) res5,
30 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i)) res6,
31 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i desc)) res7,
32 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i desc)) res8
33 from (
34 select result, j col, (j-i) l, (j+i) r, sum(substr(result, (i-1)*:n + j, 1)) c
35 from lines_result, position
36 group by grouping sets ((result, j), (result, (j-i)), (result, (j+i))))
37 group by result
38 having max(case when l is null and r is null then c end) = :m
39 and max(case when col is null and l is null then c end) <= :m
40 and max(case when col is null and r is null then c end) <= :m
41 );

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
2 5 92 14

已用時間: 00: 00: 22.43

這是最初的SQL，以實現為主要目的。主要的思路仍然是仿照第一題的似乎，只不過這裡不能確定多少個表進行連線，因此構造的工作由樹形查詢來完成。

首先仍然是確定每行兩個球，然後將符合條件的行構造陣列。將資料結果和位置進行關聯，分別對j、j+i和j-i進行聚集，聚集的結果分別對應列、左對角線和右對角線，然後利用having語句過濾不滿足條件的列，最終的結果就是符合條件的陣列。

在符合條件的陣列上，分別進行7種變形，然後最終透過和第一題一樣的count distinct greatest方式，獲取不衝突的值。

雖然結果滿足要求，但是本來滿足條件的陣列就不少，還多次和N*N的陣列進行笛卡爾積，肯定效能不會太好。另外這種方式大量利用了wmsys.wm_concat，效能不但較低，而且這個函式不屬於Oracle的標準函式，不一定在所有的資料庫中都會存在。

當前這個SQL跑M=2，N=5需要將近23秒，跑M=2，N=6基本上就出不來結果了。

SQL> with i as
2 (select rownum i from dual connect by rownum <= :n),
3 j as
4 (select rownum j from dual connect by rownum <= :n),
5 position as
6 (select i, j from i, j),
7 trim as
8 (select sys_connect_by_path(level -1, ' ') str
9 from dual
10 where level = :m + 1
11 connect by level <= :m + 1),
12 b as
13 (select rownum - 1 b from dual connect by rownum <= 2),
14 b_line as
15 (select sys_connect_by_path(b, ',') line
16 from b
17 where level = :n
18 connect by level <= :n),
19 lines as
20 (select rownum, replace(line, ',', '') line from b_line
21 where instr(line, 1, 1, :m) > 0
22 and instr(line, 0, 1, :n - :m) > 0),
23 lines_result as
24 (select replace(sys_connect_by_path(line, ','), ',', '') result
25 from lines
26 where level = :n
27 connect by level <= :n)
28 select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
29 from (
30 select result res1,
31 (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
32 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j desc, i)) res3,
33 (select replace(wmsys.wm_concat(substr(result, (i-1)*:n + j, 1)), ',' ,'') from (select i, j from position order by j, i)) res4
34 from lines_result, i
35 group by result
36 having sum(substr(result, (i-1)*:n + 1, :n)) = to_number(lpad(:m, :n, :m))
37 and ltrim(sum(substr(result, (i-1)*:n + 1, :n)*power(10, i-1)), (select str from trim)) is null
38 and ltrim(sum(substr(result, (i-1)*:n + 1, :n)*power(10, :n-i)), (select str from trim)) is null
39 );

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
2 5 92 14

已用時間: 00: 00: 03.48
SQL> exec :n := 6

PL/SQL 過程已成功完成。

已用時間: 00: 00: 00.00
SQL> /

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
2 6 1097 155

已用時間: 00: 08: 23.27

這是第一次大規模最佳化後的結果，跑M=2，N=5需要3秒多，是原始SQL的1/7，而M=2,N=6也可以得到結果，但是用了8分鐘之久，速度仍然太慢。

這裡最佳化主要在兩個方面，首先去掉了對陣列和位置的笛卡爾積，而變成了與行的笛卡爾積，這使得最終結果小了N倍，同時把GROUPING SET聚集方式又修改為ltrim方式。然後最佳化了RESULT的變形，除了本身結果外，只需要在獲得3種變形，然後利用REVERSE獲得另外的4種變形。

當前的效率仍然很低，主要原因在於使用樹形查詢沒有辦法在執行的中間將不符合條件的結果過濾掉，為了解決這個問題，只能使用11.2的新特性遞迴WITH語句。

遞迴WITH在11.2剛出來的時候，作為新特性研究了一下，不過並不深入。現在別說遞迴WITH，就是WITH語句，甚至SELECT語句，寫的都少得多了。所以基本上仍然處於現學現賣的過程，經過若干次小的最佳化之後，提交的最終SQL如下：

SQL> with trim as
2 (select sys_connect_by_path(level - 1, ' ') str
3 from dual
4 where level = :m + 1
5 connect by level <= :m + 1),
6 line_count as
7 (select (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :n)
8 / (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :m)
9 / (select round(power(2, sum(log(2, rownum)))) from dual connect by rownum <= :n - :m)
10 from dual),
11 b as
12 (select rownum - 1 b from dual connect by rownum <= 2),
13 b_line (line, cnt) as
14 (select b || '', b from b
15 union all
16 select b || line, b + cnt
17 from b, b_line
18 where cnt <= :m)
19 cycle line set dup_line to 'y' default 'n',
20 lines as
21 (select line
22 from b_line
23 where length(line) = :n
24 and cnt = :m
25 and rownum <= (select * from line_count)),
26 line_result (result, cnt, r, l) as
27 (select line || '', to_number(line), to_number(line), to_number(line)
28 from lines
29 union all
30 select line || result, to_number(line) + cnt, to_number(line) + 10*r, to_number(line) + l/10
31 from lines, line_result
32 where ltrim(cnt, (select str from trim)) is null
33 and ltrim(r, (select str from trim)) is null
34 and ltrim(l, (select str || '.' from trim)) is null
35 and instr(case when length(result) >= :n*(:n-:m+1) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '0', 1) = 0
36 and instr(case when :n-:m+2 < :n and length(result) >= :n*(:n-:m+2) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '1', 1) = 0)
37 cycle result set dup_result to 'y' default 'n',
38 result as
39 (select result, cnt, r, l from line_result
40 where rownum <= power((select * from line_count), :n))
41 select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
42 from (
43 select result res1,
44 (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
45 (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + ceil(level/:n), 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res3,
46 (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + :n-ceil(level/:n)+1, 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res4
47 from result
48 where length(result) = :n*:n
49 and cnt = to_number(lpad(:m, :n, :m))
50 and ltrim(r, (select str from trim)) is null
51 and ltrim(l, (select str || '.' from trim)) is null
52 );

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
2 6 1097 155

已用時間: 00: 00: 06.26
SQL> exec :m := 3

PL/SQL 過程已成功完成。

已用時間: 00: 00: 00.00
SQL> /

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
3 6 14412 1811

已用時間: 00: 03: 25.54

採用遞迴WITH的最大好處就是可以在每層迭代的同時過濾掉不滿足條件的資料，使得查詢速度有了質的飛躍，執行M=2，N=6現在只需要6秒多點，不過最耗時的M=3，N=6還需要3分半鐘，不是很令人滿意。

簡單介紹一下思路：trim字句是獲取ltrim的字串；line_count用於計算Cnm的值；兩層遞迴WITH查詢分別構造行記錄和陣列記錄，其中第二個構造陣列結果的遞迴WITH，還同時求出了列和兩個對角線的和，在遞迴的過程中可以用LTRIM函式過濾列和對角線來清除不滿足條件的記錄。

對獲得的結果，利用樹形查詢替代了wmsys.wm_concat對結果進行轉換，獲取另外三種變形，然後利用REVERSE獲取其他四種變形，最終利用COUNT DISTINCT GREATEST來獲得不重複的記錄。

由於最近工作上的事情很多，導致第二題沒有足夠的時間進行思考和最佳化，因此雖然明知道效能不是很滿意，也沒有辦法只能提交了。

不過在做第三題的時候，發現第二題還有很多可以最佳化的地方，只需要做些簡單的修改，在遞迴的時候就可以過濾更多的無用資料：

SQL> with trim as
2 (select sys_connect_by_path(level - 1, ' ') str
3 from dual
4 where level = :m + 1
5 connect by level <= :m + 1),
6 b as
7 (select rownum - 1 b from dual connect by rownum <= 2),
8 b_line (n, line, cnt) as
9 (select 1 n, to_char(b), b from b
10 union all
11 select n + 1, line || b, cnt + b
12 from b, b_line
13 where n < :n
14 and cnt <= :m),
15 lines as
16 (select line
17 from b_line
18 where cnt = :m
19 and length(line) = :n),
20 line_result (n, result, cnt, r, l) as
21 (select 1 n, line, to_number(line), to_number(line), to_number(line)
22 from lines
23 union all
24 select n + 1, result || line, to_number(line) + cnt, to_number(line) + 10*r, to_number(line) + l/10
25 from lines, line_result
26 where n < :n
27 and ltrim(to_number(line) + cnt, (select str from trim)) is null
28 and ltrim(to_number(line) + 10*r, (select str from trim)) is null
29 and ltrim(to_number(line) + l/10, (select str || '.' from trim)) is null
30 and instr(case when length(result) >= :n*(:n-:m+1) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '0', 1) = 0
31 and instr(case when :n-:m+2 < :n and length(result) >= :n*(:n-:m+2) then to_char(cnt, rpad('0', :n, '9')) else '9' end, '1', 1) = 0)
32 select :m M, :n N, count(*) AllCnt, count(distinct greatest(res1, reverse(res1), res2, reverse(res2), res3, reverse(res3), res4, reverse(res4))) NoReptCnt
33 from (
34 select result res1,
35 (select replace(sys_connect_by_path(reverse(substr(result, (level-1)*:n + 1, :n)), ','), ',', '') from dual where level = :n connect by level <= :n) res2,
36 (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + ceil(level/:n), 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res3,
37 (select replace(sys_connect_by_path(substr(result, mod(level-1,:n)*:n + :n-ceil(level/:n)+1, 1), ','), ',', '') from dual where level = :n*:n connect by level <= :n*:n) res4
38 from line_result
39 where length(result) = :n*:n
40 and cnt = to_number(lpad(:m, :n, :m))
41 and ltrim(r, (select str from trim)) is null
42 and ltrim(l, (select str || '.' from trim)) is null
43 );

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
2 6 1097 155

已用時間: 00: 00: 01.37
SQL> exec :m := 3

PL/SQL 過程已成功完成。

已用時間: 00: 00: 00.00
SQL> /

M N ALLCNT NOREPTCNT
---------- ---------- ---------- ----------
3 6 14412 1811

已用時間: 00: 00: 25.14

速度又提高了將近8倍。對於M=3，N=6而言，26秒的結果應該算是比較滿意了，而且程式碼長度減少了1/5。

ITPUB SQL大賽第二期（二）

相關文章