External Sorting

Posted by 壹頁書 on 2014-05-23
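If the file to be sorted is larger than the available memory, it has to be sorted chunk by chunk: each sorted chunk is written to its own file, and the sorted chunk files are merged at the end.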

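Phase 1: split and sort
1. Turn the large unsorted source file into small sorted files
    1.1 Determine where to cut the large file
    1.2 The main thread reads the source file chunk by chunk and hands each chunk to the sort thread
    1.3 The sort thread sorts the chunk
    1.4 The writer thread writes the sorted data to a file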
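
Step 1.1 can be sketched on its own as follows. This is a minimal illustration, not the code used later in this post: the class name ChunkBoundaries and the method split are hypothetical, and the sketch assumes a text file with one number per line terminated by '\n'. It moves each tentative cut forward to the next newline so that no line is cut in half.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class ChunkBoundaries {
    /**
     * Returns {offset, length} pairs covering the whole file. Every chunk ends
     * right after a '\n' or at the end of the file. May return fewer chunks
     * than requested, because a cut can move past the next tentative cut.
     */
    static List<long[]> split(Path file, int pieces) throws IOException {
        List<long[]> chunks = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            long step = Math.max(1, (length + pieces - 1) / pieces); // ceiling division
            long start = 0;
            while (start < length) {
                long end = Math.min(start + step, length);
                if (end < length) {
                    // Extend the chunk until the line that contains the cut is complete.
                    raf.seek(end);
                    int b;
                    while ((b = raf.read()) != -1) {
                        end++;
                        if (b == '\n') {
                            break;
                        }
                    }
                }
                chunks.add(new long[] {start, end - start});
                start = end;
            }
        }
        return chunks;
    }
}
```

Because the number of chunks can end up smaller than the requested number, anything that waits for "all chunks" should count the chunks actually returned.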


Phase 2: merge
    Merge the temporary files into the final result file
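
The idea of the merge phase can be shown in memory with a min-heap. This is only a sketch under assumptions: the class KWayMerge is hypothetical, the sorted runs are plain arrays instead of temporary files, and a PriorityQueue is used where the implementation in this post re-sorts a list of workers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, for illustration only.
class KWayMerge {
    /** Merges already-sorted runs into one sorted list. */
    static List<Long> merge(List<long[]> runs) {
        // Each heap entry is a cursor {runIndex, positionInRun}, ordered by the value it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> Long.compare(runs.get(a[0])[a[1]], runs.get(b[0])[b[1]]));
        for (int r = 0; r < runs.size(); r++) {
            if (runs.get(r).length > 0) {
                heap.add(new int[] {r, 0});
            }
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] cursor = heap.poll();              // cursor at the smallest remaining value
            long[] run = runs.get(cursor[0]);
            out.add(run[cursor[1]]);
            if (cursor[1] + 1 < run.length) {
                heap.add(new int[] {cursor[0], cursor[1] + 1}); // advance within the same run
            }
        }
        return out;
    }
}
```

With k runs, each output value costs O(log k) heap work, whereas re-sorting the list of runs for every value costs more per step.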

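For a Java program, note that phase 1 is IO-intensive, so give the Old generation as much memory as possible,
while phase 2 is CPU-intensive, so watch the Young generation and keep GC from consuming too much CPU.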

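Technical details worth knowing beforehand:
1. Splitting large files in Java
http://blog.itpub.net/29254281/viewspace-1161173/

2. Memory-mapped files in Java
http://blog.itpub.net/29254281/viewspace-1162157/

3. Barriers in Java (CyclicBarrier)
http://blog.itpub.net/29254281/viewspace-1164727/

4. Observer pattern, producer/consumer pattern

5. JVM monitoring and GC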

Test environment:
Dual-core CPU, 1 GB Java heap, sorting 200 million random long values. A text file with 200 million random longs is typically about 4 GB.
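
A test file of this kind could be produced with a sketch like the one below. The class RandomLongFile and its parameters are hypothetical; the only assumption taken from this post is the format, one decimal long per line. A random long needs up to 19 digits plus an optional minus sign and a newline, so roughly 20 bytes per line, which is where the 4 GB for 200 million values comes from.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Hypothetical helper, for illustration only.
class RandomLongFile {
    /** Writes count random longs to target, one decimal number per line. */
    static void generate(Path target, long count, long seed) throws IOException {
        Random random = new Random(seed); // fixed seed so that a run can be repeated
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            for (long i = 0; i < count; i++) {
                writer.write(Long.toString(random.nextLong()));
                writer.write('\n');
            }
        }
    }
}
```

Calling RandomLongFile.generate(path, 200000000L, 1L) would produce an input of the size used here; start with a much smaller count when trying it out.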




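Overall design
The main thread is responsible for splitting the source file.
The blocking function returns the chunks of the source file according to the configured number of pieces.
Its main purpose is to make sure every chunk contains whole lines, so that no number is cut in half.
The unsorted data read from a chunk is wrapped in a Sorter object and submitted to the sort thread pool.
The sorted data is wrapped in a Writer object and handed to the writer thread pool, which writes the sorted chunk to a temporary file.
After writing its chunk, each thread waits at a CyclicBarrier; once all chunks have been written, the merge starts.
Note in particular that all resources must be released before waiting at the barrier, so that the JVM GC can reclaim the memory.
In the write method of the Writer object this is the code just before barrier.await(): it sets data to null, closes the channel, and sets buffer to null.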
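
The barrier step described above can be reduced to the following sketch. The class BarrierSketch is hypothetical and nothing is written to disk; it only shows the order of events: every task drops its reference to the chunk, then waits, and the barrier action (the place where the merge would start) runs once after the last task has arrived.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, for illustration only.
class BarrierSketch {
    /** Runs `pieces` fake writer tasks; returns how many had finished when the barrier action ran. */
    static int run(int pieces) throws InterruptedException {
        AtomicInteger written = new AtomicInteger();
        AtomicInteger seenByMerge = new AtomicInteger(-1);
        // The barrier action runs once, in the last thread that arrives.
        CyclicBarrier barrier = new CyclicBarrier(pieces, () -> seenByMerge.set(written.get()));
        // One thread per party: a thread parked in await() cannot pick up another task.
        ExecutorService pool = Executors.newFixedThreadPool(pieces);
        for (int i = 0; i < pieces; i++) {
            pool.submit(() -> {
                long[] chunk = new long[1 << 16]; // stands in for a sorted chunk
                // ... the chunk would be written to its temporary file here ...
                written.incrementAndGet();
                chunk = null;                     // drop the reference before blocking
                return barrier.await();
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return seenByMerge.get();
    }
}
```

If the pool had fewer threads than the barrier has parties, the tasks that are still queued could never run and the barrier would never trip.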

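Merging is very CPU-intensive.
Because every chunk is already sorted, a single thread is enough to find the smallest number across the chunks and put it into a BlockingQueue.
A second thread keeps taking data from the queue and writes it sequentially to the target file.
This is the job of the Merger object. The process uses the observer pattern and the producer/consumer pattern.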
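
The hand-off between the merging thread and the writing thread can be sketched as below. This is an illustration under assumptions, not the Merger from this post: the class QueueHandoff is hypothetical, values travel in small batches, and the end of the stream is signalled with a marker object instead of a flag. With a marker, the consumer never has to guess whether an empty queue means "done" or "not yet".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper, for illustration only.
class QueueHandoff {
    // End-of-stream marker; compared by identity, so it cannot clash with real data.
    private static final long[] END = new long[0];

    /** Passes sorted values through a bounded queue and returns what the consumer received. */
    static List<Long> transfer(long[] sorted, int batchSize) throws InterruptedException {
        BlockingQueue<long[]> queue = new ArrayBlockingQueue<>(4);
        List<Long> written = new ArrayList<>(); // stands in for the result file
        Thread consumer = new Thread(() -> {
            try {
                for (long[] batch = queue.take(); batch != END; batch = queue.take()) {
                    for (long value : batch) {
                        written.add(value);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        // Producer side: put() blocks while the queue is full, which throttles the merge.
        for (int from = 0; from < sorted.length; from += batchSize) {
            int to = Math.min(sorted.length, from + batchSize);
            queue.put(Arrays.copyOfRange(sorted, from, to));
        }
        queue.put(END);
        consumer.join(); // after join() the consumer's writes are visible to this thread
        return written;
    }
}
```

Batching also reduces the number of queue operations and avoids boxing every single value into a Long.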

Implementation:
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class Controller {
    public static void main(String[] args) throws IOException {
        Controller c = new Controller(new File("/home/lihuilin/桌面/t.txt"), 15, "/home/lihuilin/桌面/");
    }

    // Thread pool that sorts the chunks
    private final ExecutorService sortThread;
    // Thread pool that writes the sorted chunks to files
    private final ExecutorService writerThread;
    // Barrier: once every chunk has been written to its file, the merge starts
    private final CyclicBarrier barrier;
    // The large unsorted source file
    private final File file;
    // Requested number of chunks
    private final int pieces;
    // Output directory
    private final String outDir;
    // Temporary files that hold the sorted chunks
    private final List<File> outFileList = new ArrayList<File>();

    public Controller(File file, int pieces, final String outDir) throws IOException {
        final long start = System.currentTimeMillis();
        this.file = file;
        this.pieces = pieces;
        this.outDir = outDir;
        // Compute the chunks first: the barrier must wait for exactly as many
        // writers as there are chunks, which can be fewer than pieces.
        List<Point> points = blocking(file, pieces);
        sortThread = Executors.newFixedThreadPool(1);
        // The writer pool must have at least as many threads as there are chunks.
        // A thread blocked in CyclicBarrier.await() is not returned to the pool, so
        // later chunks would find no free thread. The extra thread runs the Merger.
        writerThread = Executors.newFixedThreadPool(points.size() + 1);
        this.barrier = new CyclicBarrier(points.size(), new Runnable() {

            @Override
            public void run() {
                long end = System.currentTimeMillis();
                System.out.println("Total time before merge: " + (end - start) / 1000 + "s");
                // Merge the sorted temporary chunk files
                Merger merger = new Merger(outFileList, outDir);
                java.util.concurrent.Future<?> writing = writerThread.submit(merger);
                try {
                    merger.merge();
                    // Wait until the consumer has written the whole result file
                    writing.get();
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                } catch (java.util.concurrent.ExecutionException e) {
                    e.printStackTrace();
                }

                writerThread.shutdown();
                sortThread.shutdown();
                end = System.currentTimeMillis();
                System.out.println("Total external sort time: " + (end - start) / 1000 + "s");
            }
        });
        action(points);
    }

    // Reads every chunk on the main thread and submits it to the sort pool
    private void action(List<Point> points) {
        for (Point p : points) {
            Spilter spilter = new MappedByteBufferSpilter(file, p);
            long[] data = spilter.spilt();
            Sorter s = new Sorter(data, p, writerThread, barrier, outFileList);
            sortThread.submit(s);
        }
    }

    // Splits the file into at most piece chunks, each ending at a line boundary
    private List<Point> blocking(File file, int piece) throws IOException {
        List<Point> result = new ArrayList<Point>();
        // Offsets of the newline that ends each chunk; -1 stands for "before the file"
        List<Long> list = new ArrayList<Long>();
        list.add(-1L);
        long length = file.length();
        long step = length / piece;
        long index = 0;
        RandomAccessFile in = new RandomAccessFile(file, "r");
        try {
            for (int i = 0; i < piece; i++) {
                if (index + step < length) {
                    index = index + step;
                    in.seek(index);
                    // Move forward to the next newline (byte 10) so that no line is cut
                    int b;
                    while ((b = in.read()) != -1 && b != 10) {
                        index = index + 1;
                    }
                    if (b == -1) {
                        // No newline before the end of the file: the rest is the last chunk
                        break;
                    }
                    list.add(index);
                    index++;
                }
            }
        } finally {
            in.close();
        }
        // Close the last chunk at the end of the file, unless a cut already sits there
        long last = list.get(list.size() - 1);
        if (last != length - 1) {
            list.add(length - 1);
        }
        for (int i = 0; i < list.size() - 1; i++) {
            long skipSize = list.get(i) + 1;
            long l = list.get(i + 1) - list.get(i);
            result.add(new Point(skipSize, l, outDir));
        }
        return result;
    }

}

class Merger implements Runnable {
    private final List<Worker> workerList = new ArrayList<Worker>();
    private String outDir = null;
    // Bounded queue between the merging thread (producer) and the writing thread (consumer)
    private BlockingQueue<Long> queue = new LinkedBlockingQueue<Long>(1000);
    private volatile boolean finished = false;

    public Merger(List<File> outFileList, String outDir) {
        for (File file : outFileList) {
            Worker worker = new Worker(file, workerList);
            workerList.add(worker);
        }
        this.outDir = outDir;
    }

    // Consumer: takes numbers from the queue and appends them to the result file
    @Override
    public void run() {
        try {
            System.out.println("Reading the queue, writing the target file");
            BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(outDir + "result.txt"), 50 * 1024 * 1024);
            try {
                while (!finished || !queue.isEmpty()) {
                    // poll with a timeout instead of take(): take() could block forever if
                    // finished is set right after the last element has been consumed
                    Long l = queue.poll(100, java.util.concurrent.TimeUnit.MILLISECONDS);
                    if (l != null) {
                        bos.write((l + "\n").getBytes());
                    }
                }
                bos.flush();
            } finally {
                bos.close();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    // Producer: repeatedly picks the worker with the smallest current value
    public void merge() throws IOException, InterruptedException {
        while (workerList.size() != 0) {
            Collections.sort(workerList);
            Worker worker = workerList.get(0);
            Long data = worker.poll();
            if (data == null) {
                // This chunk is exhausted
                workerList.remove(worker);
            } else {
                queue.put(data);
            }
        }
        finished = true;
    }

    // Cursor over one sorted temporary file (8 bytes per long)
    private class Worker implements Comparable<Worker> {
        private long data;
        private MappedByteBuffer buffer = null;
        private List<Worker> workerList = null;
        private boolean eof = false;

        Worker(File file, List<Worker> workerList) {
            try {
                RandomAccessFile rFile = new RandomAccessFile(file, "r");
                FileChannel channel = rFile.getChannel();
                buffer = channel.map(MapMode.READ_ONLY, 0, channel.size());
                // The mapping stays valid after the channel is closed
                channel.close();
                rFile.close();
                this.workerList = workerList;
                data = buffer.getLong();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Current smallest value of this chunk
        public long peek() {
            return data;
        }

        // Returns the current value and advances; returns null once the chunk is exhausted
        public Long poll() {
            long result = data;
            if (buffer.position() != buffer.limit()) {
                data = buffer.getLong();
            } else {
                if (eof == false) {
                    // result is the last value of the chunk; the next call returns null
                    eof = true;
                } else {
                    return null;
                }
            }

            return result;
        }

        @Override
        public int compareTo(Worker o) {
            if (this.peek() > o.peek()) {
                return 1;
            } else if (this.peek() < o.peek()) {
                return -1;
            } else {
                return 0;
            }
        }
    }

}

interface Spilter {
    public long[] spilt();
}

class Sorter implements Runnable {
    long[] data;
    Point p;
    ExecutorService writerThread;
    List<File> outFileList;
    CyclicBarrier barrier;

    public Sorter(long[] data, Point p, ExecutorService writerThread, CyclicBarrier barrier, List<File> outFileList) {
        this.data = data;
        this.p = p;
        this.outFileList = outFileList;
        this.barrier = barrier;
        this.writerThread = writerThread;
    }

    public long[] sort() {
        System.out.println("\tStart sorting: " + p);
        long start = System.currentTimeMillis();
        Arrays.sort(this.data);
        long end = System.currentTimeMillis();
        System.out.println("\tFinished sorting: " + p + ", elapsed: " + (end - start) / 1000);
        return this.data;
    }

    // Sorts the chunk, then hands it to the writer pool
    @Override
    public void run() {
        Writer writer = new MappedByteBufferWriter(sort(), p, barrier, outFileList);
        writerThread.submit(writer);
    }
}

interface Writer extends Runnable {
    public void write();
}

class MappedByteBufferWriter implements Writer {
    // Sequence number for temporary file names; only touched by the single sort thread
    private static int FLAG = 1;
    private CyclicBarrier barrier = null;
    private File outfile = null;
    private Point point = null;

    private long[] data = null;
    private List<File> outFileList = null;

    public MappedByteBufferWriter(long[] data, Point point, CyclicBarrier barrier, List<File> outFileList) {
        this.data = data;
        this.point = point;
        this.outfile = new File(point.getOutDir() + FLAG + ".txt");
        this.barrier = barrier;
        this.outFileList = outFileList;
        FLAG++;
    }

    @Override
    public void write() {
        try {
            System.out.println("\t\tStart writing: " + point);
            long start = System.currentTimeMillis();
            FileChannel channel = new RandomAccessFile(this.outfile, "rw").getChannel();
            // 8 bytes per long; computed as long to avoid int overflow
            MappedByteBuffer buffer = channel.map(MapMode.READ_WRITE, 0, 8L * this.data.length);
            for (int i = 0; i < data.length; i++) {
                buffer.putLong(data[i]);
            }
            buffer.force();
            long end = System.currentTimeMillis();
            System.out.println("\t\tFinished writing: " + point + ", elapsed: " + (end - start) / 1000);
            synchronized (outFileList) {
                outFileList.add(outfile);
            }
            // Release everything before waiting at the barrier so that GC can reclaim it
            this.data = null;
            channel.close();
            buffer = null;
            barrier.await();
        } catch (IOException ex) {
            ex.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (BrokenBarrierException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void run() {
        this.write();
    }

}

class MappedByteBufferSpilter implements Spilter {
    private File file;
    private Point point;

    public MappedByteBufferSpilter(File file, Point p) {
        this.file = file;
        this.point = p;
    }

    // Reads one chunk of the source file and parses it into a long array
    @Override
    public long[] spilt() {
        System.out.println("Start reading: " + point);
        long start = System.currentTimeMillis();
        long[] result = null;
        try {

            FileChannel in = new RandomAccessFile(file, "r").getChannel();
            MappedByteBuffer inBuffer = in.map(MapMode.READ_ONLY, point.getSkipSize(), point.getLength());

            byte[] data = new byte[inBuffer.limit()];
            inBuffer.get(data);
            result = new long[getObjectSize(data)];
            int resultIndex = 0;
            int index = 0;
            int first = 0;
            while (index < data.length) {
                if (data[index] == 10) {
                    // Bytes first..index-1 hold one number; byte 10 is the newline
                    String str = new String(data, first, index - first);
                    result[resultIndex] = Long.parseLong(str);
                    resultIndex++;
                    first = index + 1;
                }

                index++;
            }
            if (first < data.length) {
                // Last line of the file without a trailing newline
                result[resultIndex] = Long.parseLong(new String(data, first, data.length - first));
            }
            in.close();

        } catch (IOException ex) {
            ex.printStackTrace();
        }
        long end = System.currentTimeMillis();
        System.out.println("Finished reading: " + point + ", elapsed: " + (end - start) / 1000);
        return result;
    }

    // Counts the numbers in the chunk: one per newline, plus an unterminated last line
    private int getObjectSize(byte[] data) {
        int size = 0;
        for (byte b : data) {
            if (b == 10) {
                size++;
            }
        }
        if (data.length > 0 && data[data.length - 1] != 10) {
            size++;
        }
        return size;
    }

}

// One chunk of the source file: start offset, length in bytes, and the output directory
class Point {
    public Point(long skipSize, long length, String outDir) {
        if (length > Integer.MAX_VALUE) {
            throw new RuntimeException("length overflow");
        }
        this.skipSize = skipSize;
        this.length = (int) length;
        this.outDir = outDir;
    }

    @Override
    public String toString() {
        return "Point [skipSize=" + skipSize + ", length=" + length + "]";
    }

    private long skipSize;
    private int length;
    private String outDir;

    public String getOutDir() {
        return outDir;
    }

    public long getSkipSize() {
        return skipSize;
    }

    public int getLength() {
        return length;
    }
}
Run:
[lihuilin@lihuilin 桌面]$ java Controller
Start reading: Point [skipSize=0, length=271726519]
Finished reading: Point [skipSize=0, length=271726519], elapsed: 8
Start reading: Point [skipSize=271726519, length=271726515]
Start sorting: Point [skipSize=0, length=271726519]
Finished sorting: Point [skipSize=0, length=271726519], elapsed: 3
Start writing: Point [skipSize=0, length=271726519]
Finished writing: Point [skipSize=0, length=271726519], elapsed: 2
Finished reading: Point [skipSize=271726519, length=271726515], elapsed: 9
Start sorting: Point [skipSize=271726519, length=271726515]
Start reading: Point [skipSize=543453034, length=271726511]
Finished sorting: Point [skipSize=271726519, length=271726515], elapsed: 4
Start writing: Point [skipSize=271726519, length=271726515]
Finished writing: Point [skipSize=271726519, length=271726515], elapsed: 3
Finished reading: Point [skipSize=543453034, length=271726511], elapsed: 9
Start reading: Point [skipSize=815179545, length=271726515]
Start sorting: Point [skipSize=543453034, length=271726511]
Finished sorting: Point [skipSize=543453034, length=271726511], elapsed: 3
Start writing: Point [skipSize=543453034, length=271726511]
Finished writing: Point [skipSize=543453034, length=271726511], elapsed: 5
Finished reading: Point [skipSize=815179545, length=271726515], elapsed: 13
Start reading: Point [skipSize=1086906060, length=271726524]
Start sorting: Point [skipSize=815179545, length=271726515]
Finished sorting: Point [skipSize=815179545, length=271726515], elapsed: 3
Start writing: Point [skipSize=815179545, length=271726515]
Finished writing: Point [skipSize=815179545, length=271726515], elapsed: 5
Finished reading: Point [skipSize=1086906060, length=271726524], elapsed: 13
Start reading: Point [skipSize=1358632584, length=271726507]
Start sorting: Point [skipSize=1086906060, length=271726524]
Finished sorting: Point [skipSize=1086906060, length=271726524], elapsed: 3
Start writing: Point [skipSize=1086906060, length=271726524]
Finished writing: Point [skipSize=1086906060, length=271726524], elapsed: 5
Finished reading: Point [skipSize=1358632584, length=271726507], elapsed: 12
Start reading: Point [skipSize=1630359091, length=271726523]
Start sorting: Point [skipSize=1358632584, length=271726507]
Finished sorting: Point [skipSize=1358632584, length=271726507], elapsed: 3
Start writing: Point [skipSize=1358632584, length=271726507]
Finished writing: Point [skipSize=1358632584, length=271726507], elapsed: 5
Finished reading: Point [skipSize=1630359091, length=271726523], elapsed: 13
Start reading: Point [skipSize=1902085614, length=271726514]
Start sorting: Point [skipSize=1630359091, length=271726523]
Finished sorting: Point [skipSize=1630359091, length=271726523], elapsed: 3
Start writing: Point [skipSize=1630359091, length=271726523]
Finished writing: Point [skipSize=1630359091, length=271726523], elapsed: 5
Finished reading: Point [skipSize=1902085614, length=271726514], elapsed: 13
Start reading: Point [skipSize=2173812128, length=271726519]
Start sorting: Point [skipSize=1902085614, length=271726514]
Finished sorting: Point [skipSize=1902085614, length=271726514], elapsed: 3
Start writing: Point [skipSize=1902085614, length=271726514]
Finished writing: Point [skipSize=1902085614, length=271726514], elapsed: 5
Finished reading: Point [skipSize=2173812128, length=271726519], elapsed: 13
Start reading: Point [skipSize=2445538647, length=271726516]
Start sorting: Point [skipSize=2173812128, length=271726519]
Finished sorting: Point [skipSize=2173812128, length=271726519], elapsed: 3
Start writing: Point [skipSize=2173812128, length=271726519]
Finished writing: Point [skipSize=2173812128, length=271726519], elapsed: 5
Finished reading: Point [skipSize=2445538647, length=271726516], elapsed: 12
Start reading: Point [skipSize=2717265163, length=271726517]
Start sorting: Point [skipSize=2445538647, length=271726516]
Finished sorting: Point [skipSize=2445538647, length=271726516], elapsed: 3
Start writing: Point [skipSize=2445538647, length=271726516]
Finished writing: Point [skipSize=2445538647, length=271726516], elapsed: 5
Finished reading: Point [skipSize=2717265163, length=271726517], elapsed: 13
Start reading: Point [skipSize=2988991680, length=271726517]
Start sorting: Point [skipSize=2717265163, length=271726517]
Finished sorting: Point [skipSize=2717265163, length=271726517], elapsed: 3
Start writing: Point [skipSize=2717265163, length=271726517]
Finished writing: Point [skipSize=2717265163, length=271726517], elapsed: 5
Finished reading: Point [skipSize=2988991680, length=271726517], elapsed: 12
Start reading: Point [skipSize=3260718197, length=271726516]
Start sorting: Point [skipSize=2988991680, length=271726517]
Finished sorting: Point [skipSize=2988991680, length=271726517], elapsed: 3
Start writing: Point [skipSize=2988991680, length=271726517]
Finished writing: Point [skipSize=2988991680, length=271726517], elapsed: 5
Finished reading: Point [skipSize=3260718197, length=271726516], elapsed: 12
Start reading: Point [skipSize=3532444713, length=271726515]
Start sorting: Point [skipSize=3260718197, length=271726516]
Finished sorting: Point [skipSize=3260718197, length=271726516], elapsed: 3
Start writing: Point [skipSize=3260718197, length=271726516]
Finished writing: Point [skipSize=3260718197, length=271726516], elapsed: 5
Finished reading: Point [skipSize=3532444713, length=271726515], elapsed: 12
Start reading: Point [skipSize=3804171228, length=271726376]
Start sorting: Point [skipSize=3532444713, length=271726515]
Finished sorting: Point [skipSize=3532444713, length=271726515], elapsed: 3
Start writing: Point [skipSize=3532444713, length=271726515]
Finished writing: Point [skipSize=3532444713, length=271726515], elapsed: 4
Finished reading: Point [skipSize=3804171228, length=271726376], elapsed: 12
Start sorting: Point [skipSize=3804171228, length=271726376]
Finished sorting: Point [skipSize=3804171228, length=271726376], elapsed: 2
Start writing: Point [skipSize=3804171228, length=271726376]
Finished writing: Point [skipSize=3804171228, length=271726376], elapsed: 3
Total time before merge: 190s
Reading the queue, writing the target file
Total external sort time: 398s
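JVM monitoring:
[Figure: JVM monitoring screenshot]

The monitoring shows that phase 1 is IO-intensive and needs a lot of memory,
while phase 2, which merges the sorted chunks, puts heavy pressure on the CPU. Make sure the GC threads do not take too much CPU, which means the Young generation must not be too small.

In the figure above, everything after FGC 55 belongs to phase 2; the number of Young GCs clearly increases.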
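
GC activity can also be read from inside the program, for example once before the merge and once after it. The following is a small sketch with a hypothetical class GcSnapshot; it relies on the standard java.lang.management API and makes no assumption about which collectors the JVM uses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
class GcSnapshot {
    /** Total number of collections over all collectors of this JVM so far. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }
}
```

The difference between two snapshots gives the number of collections and the GC time spent in one phase.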

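Verification:
A test data set containing the numbers 1 to 100 was used first, and it was sorted correctly.
After sorting a large file, the result can be verified with the Linux sort command.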
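
The same check can be done in Java by streaming through the result file once. The class SortedCheck below is a hypothetical sketch; it assumes the output format used in this post, one decimal long per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class SortedCheck {
    /** Returns the 1-based line number of the first out-of-order value, or -1 if the file is sorted. */
    static long firstViolation(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            long lineNumber = 0;
            long previous = Long.MIN_VALUE;
            for (String line = reader.readLine(); line != null; line = reader.readLine()) {
                lineNumber++;
                long value = Long.parseLong(line.trim());
                if (value < previous) {
                    return lineNumber;
                }
                previous = value;
            }
            return -1;
        }
    }
}
```

This only confirms the order. To confirm that no number was lost, the line count of the result also has to match the line count of the input.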


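On performance optimization
External sorting essentially means writing sorted small files and then merging those small files into one sorted target file.
So the total time should be roughly twice the time needed to copy the file.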

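But...
I recall that back at the 15th Institute, Mr. Wu, on a configuration roughly comparable to mine, needed only about 230 s.
Two points to watch when optimizing
1. Memory-mapped files
    For copying files, and for writing chunk files and reading them back, memory-mapped files avoid copying memory between kernel space and user space. They also let off-heap operating system memory serve as a cache.
2. Keep GC from consuming CPU
    When reading and writing the files, the conversion between bytes and long currently goes through String as an intermediate type.

    Later, bytes and long could be converted into each other directly, without String, which would avoid the extra GC.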
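
Point 1 can be illustrated with a plain file copy through mapped buffers. This is a sketch, not a tuned implementation: the class MappedCopy and the 64 MB window size are assumptions, and the file is mapped window by window because a single mapping is limited to Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
class MappedCopy {
    private static final long WINDOW = 64L * 1024 * 1024; // assumed window size

    /** Copies source to target through memory-mapped windows. */
    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                     StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long size = in.size();
            for (long position = 0; position < size; position += WINDOW) {
                long length = Math.min(WINDOW, size - position);
                MappedByteBuffer from = in.map(MapMode.READ_ONLY, position, length);
                // Mapping beyond the current end of the target file extends the file.
                MappedByteBuffer to = out.map(MapMode.READ_WRITE, position, length);
                to.put(from);
                to.force(); // flush this window to the storage device
            }
        }
    }
}
```

The time such a copy takes on the test machine is a useful baseline for the "twice the copy time" estimate above.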

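The first point is easy to think of. The second is something Mr. Wu did and I did not implement, which may be why my version is slower.
I will add this detail later when I have time.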
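
A direct conversion between ASCII bytes and long, without any String in between, could look like the sketch below. It is an assumption about how the second point might be implemented, not Mr. Wu's code: the class ByteLongCodec is hypothetical, the input must be a valid decimal number, and there is no overflow check.

```java
// Hypothetical helper, for illustration only.
class ByteLongCodec {
    /** Parses the decimal number stored in data[from, to); an optional leading '-' is supported. */
    static long parse(byte[] data, int from, int to) {
        boolean negative = data[from] == '-';
        long value = 0;
        for (int i = negative ? from + 1 : from; i < to; i++) {
            // Accumulate as a negative number so that Long.MIN_VALUE fits as well.
            value = value * 10 - (data[i] - '0');
        }
        return negative ? value : -value;
    }

    /** Writes value as decimal ASCII plus '\n' into out, starting at pos; returns the next free position. */
    static int format(long value, byte[] out, int pos) {
        if (value < 0) {
            out[pos++] = '-';
        }
        long n = value < 0 ? value : -value; // work on the non-positive form
        int digits = 1;
        for (long t = n / 10; t != 0; t /= 10) {
            digits++;
        }
        for (int i = pos + digits - 1; i >= pos; i--) {
            out[i] = (byte) ('0' - (n % 10)); // n % 10 is between -9 and 0
            n /= 10;
        }
        pos += digits;
        out[pos++] = '\n';
        return pos;
    }
}
```

A long needs at most 21 bytes in this format (sign, 19 digits, newline), so the output buffer can be sized up front and reused for every value.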
Because memory-mapped files are used, it is best to clear the cache before every run in order to avoid measurement errors.


From the ITPUB blog, link: http://blog.itpub.net/29254281/viewspace-1167988/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
