Determining a file's type from its header bytes in Java

Posted by atlantisholic on 2011-04-07
 

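import java.io.FileInputStream;
import java.io.IOException;

/**
 * Converts a byte array to a lowercase hexadecimal string.
 * @param src the bytes to convert
 * @return two hex digits per byte, or null if src is null or empty
 */
public static String bytesToHexString(byte[] src) {
    if (src == null || src.length == 0) {
        return null;
    }
    StringBuilder stringBuilder = new StringBuilder(src.length * 2);
    for (int i = 0; i < src.length; i++) {
        int v = src[i] & 0xFF;              // treat the byte as unsigned
        String hv = Integer.toHexString(v);
        if (hv.length() < 2) {
            stringBuilder.append('0');      // pad to two digits
        }
        stringBuilder.append(hv);
    }
    return stringBuilder.toString();
}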
 
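/**
 * Determines the real type of an image file from the first bytes of its stream.
 * @param is a stream positioned at the start of the file
 * @return the detected extension, or the header as uppercase hex if no signature matches
 */
public static String getTypeByStream(FileInputStream is) {
    byte[] b = new byte[4];
    try {
        // read() may return fewer than 4 bytes; any unread bytes stay 0
        is.read(b, 0, b.length);
    } catch (IOException e) {
        e.printStackTrace();
    }
    String type = bytesToHexString(b).toUpperCase();
    // A signature sits at the start of the file, so match with startsWith;
    // contains() would also accept the bytes at a later position.
    if (type.startsWith("FFD8FF")) {
        return "jpg";
    } else if (type.startsWith("89504E47")) {
        return "png";
    } else if (type.startsWith("47494638")) {
        return "gif";
    } else if (type.startsWith("49492A00")) {
        return "tif";
    } else if (type.startsWith("424D")) {
        return "bmp";
    }
    return type;
}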

public static void main(String[] args) throws Exception {
//    String src = "D:/workspace//8129.jpg";
//    String src = "D:/workspace//temp/1.gif";
    String src = "D:/workspace//temp/2.bmp";
    // try-with-resources closes the stream once the header has been read
    try (FileInputStream is = new FileInputStream(src)) {
//        byte[] b = new byte[4];
//        is.read(b, 0, b.length);
//        System.out.println(bytesToHexString(b));

        String type = getTypeByStream(is);
        System.out.println(type);
    }
    /*
     * JPEG (jpg), header: FFD8FF
     * PNG (png), header: 89504E47
     * GIF (gif), header: 47494638
     * TIFF (tif), header: 49492A00
     * Windows Bitmap (bmp), header: 424D
     */
}
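
The listing above turns the header into a hex string before comparing. As an alternative, here is a minimal sketch that compares the bytes directly and loops until the requested number of bytes has been read, since a single read() call is allowed to return fewer. The class name HeaderReader and its method names are illustrative assumptions, not part of the original post.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class HeaderReader {

    /** Reads up to n bytes; the result is shorter if the stream ends early. */
    static byte[] readHeader(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(buf, off, n - off);
            if (r < 0) {
                break; // end of stream
            }
            off += r;
        }
        return Arrays.copyOf(buf, off);
    }

    /** True if data begins with the given unsigned byte values. */
    static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the image extension for a header, or null if it is not recognised. */
    static String imageType(byte[] header) {
        if (startsWith(header, 0xFF, 0xD8, 0xFF)) return "jpg";
        if (startsWith(header, 0x89, 0x50, 0x4E, 0x47)) return "png";
        if (startsWith(header, 0x47, 0x49, 0x46, 0x38)) return "gif";
        if (startsWith(header, 0x49, 0x49, 0x2A, 0x00)) return "tif";
        if (startsWith(header, 0x42, 0x4D)) return "bmp";
        return null;
    }
}
```

A caller would combine the two steps as HeaderReader.imageType(HeaderReader.readHeader(is, 4)). Returning null for an unknown header, rather than the hex string, is a design choice of this sketch.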

==================================

Identify the type from the file header: read the first few bytes of the file directly and compare them with known signatures.
The headers of common file types are listed below (hexadecimal):
JPEG (jpg), header: FFD8FF
PNG (png), header: 89504E47
GIF (gif), header: 47494638
TIFF (tif), header: 49492A00
Windows Bitmap (bmp), header: 424D
CAD (dwg), header: 41433130
Adobe Photoshop (psd), header: 38425053
Rich Text Format (rtf), header: 7B5C727466
XML (xml), header: 3C3F786D6C
HTML (html), header: 68746D6C3E
Email [thorough only] (eml), header: 44656C69766572792D646174653A
Outlook Express (dbx), header: CFAD12FEC5FD746F
Outlook (pst), header: 2142444E
MS Word/Excel (xls or doc), header: D0CF11E0
MS Access (mdb), header: 5374616E64617264204A
WordPerfect (wpd), header: FF575043
PostScript (eps or ps), header: 252150532D41646F6265
Adobe Acrobat (pdf), header: 255044462D312E
Quicken (qdf), header: AC9EBD8F
Windows Password (pwl), header: E3828596
ZIP Archive (zip), header: 504B0304
RAR Archive (rar), header: 52617221
Wave (wav), header: 57415645 ("WAVE", found at offset 8, after the RIFF tag 52494646 and a 4-byte size)
AVI (avi), header: 41564920 ("AVI ", likewise at offset 8 of a RIFF file)
Real Audio (ram), header: 2E7261FD
Real Media (rm), header: 2E524D46
MPEG (mpg), header: 000001BA
MPEG (mpg), header: 000001B3
Quicktime (mov), header: 6D6F6F76 ("moov", not necessarily at offset 0)
Windows Media (asf), header: 3026B2758E66CF11
MIDI (mid), header: 4D546864
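
A list like this lends itself to a table-driven lookup. The following is a minimal sketch covering only a few of the entries; the class SignatureTable, its methods, and the offset field are illustrative assumptions. The offset of 8 for WAV and AVI reflects the RIFF layout (a "RIFF" tag and a 4-byte size come first), which is an assumption added here rather than something the list itself states.

```java
import java.util.ArrayList;
import java.util.List;

final class SignatureTable {

    /** One known signature: the type label, where the magic bytes sit, and the bytes. */
    private static final class Signature {
        final String type;
        final int offset;
        final byte[] magic;

        Signature(String type, int offset, String hex) {
            this.type = type;
            this.offset = offset;
            this.magic = hexToBytes(hex);
        }

        boolean matches(byte[] header) {
            if (header.length < offset + magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (header[offset + i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private static final List<Signature> SIGNATURES = new ArrayList<>();
    static {
        SIGNATURES.add(new Signature("jpg", 0, "FFD8FF"));
        SIGNATURES.add(new Signature("png", 0, "89504E47"));
        SIGNATURES.add(new Signature("gif", 0, "47494638"));
        SIGNATURES.add(new Signature("tif", 0, "49492A00"));
        SIGNATURES.add(new Signature("bmp", 0, "424D"));
        SIGNATURES.add(new Signature("pdf", 0, "255044462D312E"));
        SIGNATURES.add(new Signature("zip", 0, "504B0304"));
        SIGNATURES.add(new Signature("rar", 0, "52617221"));
        SIGNATURES.add(new Signature("wav", 8, "57415645")); // assumed offset 8 (RIFF)
        SIGNATURES.add(new Signature("avi", 8, "41564920")); // assumed offset 8 (RIFF)
    }

    /** Parses a hex string with an even number of digits into bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Returns the first matching type label, or null if nothing matches. */
    static String detect(byte[] header) {
        for (Signature s : SIGNATURES) {
            if (s.matches(header)) {
                return s.type;
            }
        }
        return null;
    }
}
```

With this table, at least the first 12 bytes of the file need to be read so that the offset-8 entries can match. Short signatures such as the two-byte 424D are weak evidence on their own, so a match is best treated as a hint rather than proof.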

GRAPHICS FILES

Adobe Photoshop File (.psd)
00000000  38 42 50 53 00 01 00 00 00 00 00 00 00 04 00 00  |8BPS............|
00000010  0b 71 00 00 10 dd 00 08 00 03 00 00 00 00 00 00  |.q...?..........|
00000020  6f c4 38 42 49 4d 04 04 00 00 00 00 00 07 1c 02  |o?8BIM..........|

JPEG image (.jpg)
00000000  ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48  |????..JFIF.....H|
00000010  00 48 00 00 ff db 00 43 00 06 04 05 06 05 04 06  |.H..??.C........|
00000020  06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e  |................|

PNG image (.png)
00000000  89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52  |.PNG........IHDR|
00000010  00 00 03 20 00 00 02 58 08 06 00 00 00 9a 76 82  |... ...X......v.|
00000020  70 00 00 0c d9 69 43 43 50 69 63 63 00 00 78 da  |p...?iCCPicc..x?|

GIF image (.gif)
00000000  47 49 46 38 39 61 10 00 10 00 b3 0d 00 3f 3f 3f  |GIF89a....?..???|
00000010  bf bf bf 2a 2a 2a 55 55 55 7f 7f 7f 15 15 15 40  |???***UUU......@|
00000020  40 40 60 60 60 c0 c0 c0 2f 2f 2f 90 90 90 ff ff  |@@```???///...??|

Adobe Illustrator File (.ai)
00000000  25 50 44 46 2d 31 2e 34 0d 25 e2 e3 cf d3 0d 0a  |%PDF-1.4.%????..|
00000010  31 20 30 20 6f 62 6a 3c 3c 2f 50 61 67 65 73 20  |1 0 obj<</Pages |

MUSIC FILES

MP3 Music Track (.mp3)
00000000  49 44 33 03 00 00 00 00 00 6f 54 49 54 32 00 00  |ID3......oTIT2..|
00000010  00 0e 00 00 00 54 68 65 20 4f 74 68 65 72 20 4d  |.....The Other M|
00000020  61 6e 54 52 43 4b 00 00 00 02 00 00 00 33 54 50  |anTRCK.......3TP|

WAV file (.wav)
00000000  52 49 46 46 62 b7 01 00 57 41 56 45 66 6d 74 20  |RIFFb?..WAVEfmt |
00000010  10 00 00 00 01 00 01 00 44 ac 00 00 88 58 01 00  |........D?...X..|
00000020  02 00 10 00 64 61 74 61 3e b7 01 00 57 01 bd 01  |....data>?..W.?.|

AIFF file (.aif)
00000000  46 4f 52 4d 00 2a ef cc 41 49 46 46 43 4f 4d 54  |FORM.*??AIFFCOMT|
00000010  00 00 01 c2 00 01 00 00 00 00 00 00 00 12 43 72  |...?..........Cr|
00000020  65 61 74 6f 72 3a 20 4c 6f 67 69 63 20 50 72 6f  |eator: Logic Pro|

TEXT FILES

Text file (often .txt, but not always)
00000000  48 69 20 65 76 65 72 79 6f 6e 65 2c 0a 0a 48 65  |Hi everyone,..He|
00000010  72 65 20 61 72 65 20 73 6f 6d 65 20 63 68 61 6e  |re are some chan|
00000020  67 65 73 20 74 68 61 74 20 77 69 6c 6c 20 68 61  |ges that will ha|

Microsoft Word/Office (.doc, .xls)
00000000  d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00  |??.....?........|
00000010  00 00 00 00 00 00 00 00 3e 00 03 00 fe ff 09 00  |........>...??..|
00000020  06 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00  |................|

Adobe PDF (.pdf) - Very similar to Adobe Illustrator and other PostScript formats
00000000  25 50 44 46 2d 31 2e 34 0d 25 e2 e3 cf d3 0d 0a  |%PDF-1.4.%????..|
00000010  36 20 30 20 6f 62 6a 20 3c 3c 2f 4c 69 6e 65 61  |6 0 obj <</Linea|

ANIMATION & VIDEO

Flash movie (.swf)
00000000  43 57 53 08 ac 43 00 00 78 9c ed 7a 77 58 93 c9  |CWS.?C..x.?zwX.?|
00000010  d6 f8 49 25 f4 80 94 50 0d 45 4a 00 e9 45 b0 04  |??I%?..P.EJ.?E?.|
00000020  44 44 45 e9 55 d0 80 44 01 11 10 11 01 75 0d bd  |DDE?U?.D.....u.?|

Quicktime Movie (.mov)
00000000  00 00 00 20 66 74 79 70 71 74 20 20 20 05 03 00  |... ftypqt   ...|
00000010  71 74 20 20 00 00 00 00 00 00 00 00 00 00 00 00  |qt  ............|
00000020  00 00 03 55 6d 6f 6f 76 00 00 00 6c 6d 76 68 64  |...Umoov...lmvhd|
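
Dumps in this offset / hex / ASCII layout are handy when checking an unfamiliar file against the signature list. Below is a minimal sketch of a formatter that produces a similar layout from a byte array. The class name HexDump is an illustrative assumption, and the sketch prints every non-printable byte as "." rather than reproducing the exact characters shown in the dumps above.

```java
final class HexDump {

    /** Formats data as rows of 16 bytes: offset, hex values, then an ASCII column. */
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int base = 0; base < data.length; base += 16) {
            out.append(String.format("%08x ", base));
            StringBuilder ascii = new StringBuilder();
            for (int i = 0; i < 16; i++) {
                if (base + i < data.length) {
                    int v = data[base + i] & 0xFF;   // unsigned byte value
                    out.append(String.format(" %02x", v));
                    // printable ASCII is shown as-is, everything else as '.'
                    ascii.append(v >= 0x20 && v < 0x7F ? (char) v : '.');
                } else {
                    out.append("   ");               // pad a short final row
                }
            }
            out.append("  |").append(ascii).append("|\n");
        }
        return out.toString();
    }
}
```

Passing the first 48 bytes of a file to HexDump.dump yields three rows, matching the length of the samples in this post.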


 

Source: "ITPUB Blog", link: http://blog.itpub.net/23071790/viewspace-691870/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
