java 猜測 檔案編碼

abc81163788發表於2020-12-03
  1. TikaEncodingDetector

Dependency:

<dependency>
  <groupId>org.apache.any23</groupId>
  <artifactId>apache-any23-encoding</artifactId>
  <version>1.1</version>
</dependency>

Sample:

public static Charset guessCharset(InputStream is) throws IOException {
  return Charset.forName(new TikaEncodingDetector().guessEncoding(is));    
}
  1. GuessEncoding
    Dependency:
<dependency>
  <groupId>org.codehaus.guessencoding</groupId>
  <artifactId>guessencoding</artifactId>
  <version>1.4</version>
  <type>jar</type>
</dependency>

Sample:

  public static Charset guessCharset2(File file) throws IOException {
    return CharsetToolkit.guessEncoding(file, 4096, StandardCharsets.UTF_8);
  }

https://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream

相關文章