Converting doc to docx (Java + Python)

Posted by PyJava老鸟 on 2024-07-08

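This article converts doc files to docx with the help of Python. I looked into open-source tools and libraries, but their conversion results were not satisfactory, so I chose Python and call it from Java.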

1. /resources/convert.py (place the .py file under resources)

import argparse
from doc2docx import convert

def convert_doc_to_docx(docFilePath, docxFilePath):
    # Delegate the actual conversion to doc2docx
    convert(docFilePath, docxFilePath)

if __name__ == "__main__":
    # Two positional arguments: the source .doc and the target .docx
    parser = argparse.ArgumentParser(description='Convert a .doc file to .docx')
    parser.add_argument('input', help='Input .doc file path')
    parser.add_argument('output', help='Output .docx file path')
    args = parser.parse_args()

    convert_doc_to_docx(args.input, args.output)
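Since the Java side in step 3 only looks at the exit code, it may help if the script checks its arguments and reports failures through distinct exit codes and a message on stderr. The following is a minimal sketch of that idea, not the author's script: the names `check_paths` and `main` and the `EXIT_*` values are assumptions made up for this example.

```python
import argparse
import os
import sys

# Hypothetical exit codes, chosen for this sketch only.
# 2 is also the code argparse itself uses for usage errors.
EXIT_OK = 0
EXIT_BAD_INPUT = 2
EXIT_CONVERT_FAILED = 3


def check_paths(doc_path, docx_path):
    """Return an error message if the arguments look wrong, else None."""
    if not os.path.isfile(doc_path):
        return f"input file not found: {doc_path}"
    if not doc_path.lower().endswith(".doc"):
        return f"input is not a .doc file: {doc_path}"
    if not docx_path.lower().endswith(".docx"):
        return f"output is not a .docx file: {docx_path}"
    return None


def main(argv=None):
    parser = argparse.ArgumentParser(description="Convert a .doc file to .docx")
    parser.add_argument("input", help="Input .doc file path")
    parser.add_argument("output", help="Output .docx file path")
    args = parser.parse_args(argv)

    error = check_paths(args.input, args.output)
    if error:
        print(error, file=sys.stderr)
        return EXIT_BAD_INPUT

    try:
        # Imported here so that argument errors are still reported
        # when doc2docx is not installed yet.
        from doc2docx import convert
        convert(args.input, args.output)
    except Exception as exc:  # any failure is reported via the exit code
        print(f"conversion failed: {exc}", file=sys.stderr)
        return EXIT_CONVERT_FAILED
    return EXIT_OK
```

To use this as convert.py, the file would end with `sys.exit(main())` under the usual `if __name__ == "__main__":` guard, so that the return value becomes the process exit code seen by Java.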

2. Java code: installPythonPackage

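    // Requires: import java.io.IOException;
    private static void installPythonPackage() {
        // inheritIO() forwards pip's output to the console, so the child
        // process cannot block on a full output buffer
        ProcessBuilder pb = new ProcessBuilder("pip", "install", "doc2docx").inheritIO();

        try {
            Process process = pb.start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                System.out.println("Package installation failed with exit code: " + exitCode);
            } else {
                System.out.println("Package installed successfully.");
            }
        } catch (IOException e) {
            System.out.println("An error occurred during package installation:");
            e.printStackTrace();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still see it
            Thread.currentThread().interrupt();
            System.out.println("Package installation was interrupted.");
        }
    }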
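Running `pip install` on every start is slow, and a bare `pip` on the PATH may belong to a different interpreter than the `python` that later runs the script. One possible alternative, sketched here in Python so it can sit next to convert.py, is to check whether the module is importable and install it only when it is missing. The helper names `is_installed` and `ensure_package` are hypothetical and not part of the original code.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


def ensure_package(module_name, pip_name=None):
    """Install the package with pip only when the module is missing.

    Returns True if the module is importable afterwards.
    """
    if is_installed(module_name):
        return True
    # "sys.executable -m pip" installs into the interpreter that is running
    # this script, not into whichever pip happens to be first on PATH.
    cmd = [sys.executable, "-m", "pip", "install", pip_name or module_name]
    result = subprocess.run(cmd)
    if result.returncode != 0:
        return False
    # Make the import system notice the freshly installed package.
    importlib.invalidate_caches()
    return is_installed(module_name)
```

A call such as `ensure_package("doc2docx")` at the top of the script would then replace the unconditional install on the Java side.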

3. Java code: convertDocToDocx

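    // Requires: java.io.File, java.io.IOException, java.io.InputStream,
    // java.nio.file.Files, java.nio.file.Path
    // FileUtils is assumed to be org.apache.commons.io.FileUtils (Apache Commons IO)
    public static void convertDocToDocx(String docFilePath, String docxFilePath) throws IOException, InterruptedException {
        // Open the script bundled in the resources as an input stream
        InputStream in = Doc2DocxUtil.class.getClassLoader().getResourceAsStream("convert.py");
        if (in == null) {
            throw new IllegalArgumentException("Script file not found");
        }

        // Create a temporary file
        Path temp = Files.createTempFile("script", ".py");
        File tempFile = temp.toFile();
        // Make sure the temporary file is deleted when the JVM exits
        tempFile.deleteOnExit();
        // Copy the resource into the temporary file (this also closes the stream)
        FileUtils.copyInputStreamToFile(in, tempFile);

        // inheritIO() forwards the script's output to the console,
        // so the child process cannot block on a full output buffer
        ProcessBuilder pb = new ProcessBuilder("python", tempFile.getAbsolutePath(), docFilePath, docxFilePath).inheritIO();
        Process p = pb.start();
        int exitCode = p.waitFor();
        if (exitCode != 0) {
            throw new RuntimeException("Python script execution failed with exit code " + exitCode);
        }
    }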
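When the Java call fails, the exception carries only the exit code. To see what the script itself reported, it can be useful to launch it the same way outside the JVM: interpreter, script path, then the two file paths. The sketch below does that from Python. The helper name `run_script` and the 120-second default timeout are assumptions chosen for illustration.

```python
import subprocess
import sys


def run_script(script_path, *script_args, timeout=120):
    """Run a Python script as a child process and collect its result.

    Returns (exit_code, stdout, stderr). Raises subprocess.TimeoutExpired
    if the script does not finish within `timeout` seconds.
    """
    completed = subprocess.run(
        [sys.executable, script_path, *script_args],
        capture_output=True,  # keep stdout and stderr for inspection
        text=True,            # decode the output to str
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr
```

For example, `code, out, err = run_script("convert.py", r"D:\yy\xxx.doc", r"D:\yy\xxx.docx")` returns the same exit code the Java method would check, together with any error text the script printed.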

4. Add a static initializer block to the Doc2DocxUtil class

    static{
        installPythonPackage();
    }

5. main method

    public static void main(String[] args) throws Exception {
        String docFilePath = "D:\\yy\\xxx.doc";
        String docxFilePath = "D:\\yy\\xxx.docx";

        convertDocToDocx(docFilePath, docxFilePath);
    }
