Python: Generating a PDF from a Python String

Posted by jclian91 on 2019-05-17

Original article: https://www.cnblogs.com/jclian91/p/10880822.html

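At work today I ran into a requirement: how to turn a Python string into a PDF. For example, the Python string '這是測試檔案' ("this is a test file") should become a PDF that contains the text '這是測試檔案'.
After some searching, I decided to use wkhtmltopdf, a tool that converts HTML to PDF. It can be downloaded from https://wkhtmltopdf.org/downloads.html ; download and install the build that matches your system. Once wkhtmltopdf is installed, install pdfkit, the third-party Python module that wraps it: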

pip install pdfkit
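
Since pdfkit only wraps the wkhtmltopdf executable, conversion fails when that executable cannot be found. The following is a minimal sketch of checking for it up front, assuming the binary is named wkhtmltopdf. The helper names find_wkhtmltopdf and make_config are hypothetical and not part of pdfkit; pdfkit.configuration(wkhtmltopdf=...) is the library call being wrapped.

```python
import shutil


def find_wkhtmltopdf(name="wkhtmltopdf"):
    """Return the full path of the wkhtmltopdf executable, or None if it is not on PATH."""
    return shutil.which(name)


def make_config(path=None):
    """Build a pdfkit configuration that points at an explicit wkhtmltopdf binary.

    pdfkit is imported lazily so the lookup helper above works without it.
    """
    import pdfkit  # third-party: pip install pdfkit

    path = path or find_wkhtmltopdf()
    if path is None:
        raise FileNotFoundError("wkhtmltopdf was not found on PATH")
    return pdfkit.configuration(wkhtmltopdf=path)


if __name__ == "__main__":
    print("wkhtmltopdf found at:", find_wkhtmltopdf())
```

The returned object can then be passed along with a conversion, for example pdfkit.from_string(html, './test.pdf', configuration=make_config()).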

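This article covers the following questions:

How to generate a PDF from a Python string;
How to generate a table in a PDF;
How to deal with slow PDF generation.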

How to generate a PDF from a Python string

The approach is to embed the Python string in HTML code. Note that line breaks must be written as <br> tags. Sample code:

import pdfkit

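# Text to include in the PDF (the first part means "This is a test file.")
content = '這是一個測試檔案。' + '<br>' + 'Hello from Python!'

# Wrap the text in a minimal HTML page; the UTF-8 charset is needed for the Chinese text
html = '<html><head><meta charset="UTF-8"></head>' \
       '<body><div align="center"><p>%s</p></div></body></html>' % content

# Convert to PDF
pdfkit.from_string(html, './test.pdf')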

The output is as follows:

Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done

The generated test.pdf looks like this:

[Figure: screenshot of the generated test.pdf]
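
The <br> rule above can be wrapped in a small helper. The following is a minimal sketch, not the author's code: text_to_html is a hypothetical name, and the escaping step with html.escape is an added assumption for strings that may contain characters such as < or &, which would otherwise be read as markup.

```python
import html


def text_to_html(text):
    """Wrap plain text in a minimal UTF-8 HTML page.

    Markup characters are escaped and newlines become <br> tags.
    """
    escaped = html.escape(text)           # &, <, > and quotes become entities
    body = escaped.replace("\n", "<br>")  # HTML ignores raw newlines
    return (
        '<html><head><meta charset="UTF-8"></head>'
        '<body><div align="center"><p>%s</p></div></body></html>' % body
    )


if __name__ == "__main__":
    page = text_to_html("這是一個測試檔案。\nHello from Python! 1 < 2")
    print(page)
    # To write the PDF (requires pdfkit and wkhtmltopdf):
    # import pdfkit; pdfkit.from_string(page, "./test.pdf")
```

With this helper the string can keep ordinary newlines, and the HTML-specific details stay in one place.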

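How to generate a table in a PDF

Next, consider how to turn a csv file into a table in a PDF. The idea is again to go through HTML. Part of the sample iris.csv file is shown below:

[Figure: excerpt of iris.csv]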

The Python code that converts the csv file into a table in a PDF is as follows:

import pdfkit

# Read the csv file, one stripped line per row
with open('iris.csv', 'r') as f:
    lines = [line.strip() for line in f]

# Build the HTML table: fixed cell width, total width = cell width * number of columns
td_width = 100
n_cols = len(lines[0].split(','))
content = '<table width="%s" border="1" cellspacing="0px" style="border-collapse:collapse">' % (td_width * n_cols)

# One <tr> per csv line, one <td> per comma-separated value
for line in lines:
    tr = '<tr>' + ''.join('<td width="%d">%s</td>' % (td_width, cell) for cell in line.split(',')) + '</tr>'
    content += tr

content += '</table>'

html = '<html><head><meta charset="UTF-8"></head>' \
       '<body><div align="center">%s</div></body></html>' % content

# Convert to PDF
pdfkit.from_string(html, './iris.pdf')

The generated PDF file is iris.pdf; part of it is shown below:

[Figure: excerpt of the table in iris.pdf]
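
Splitting each line on commas works for a simple file such as iris.csv, but it breaks on quoted fields that contain commas. The following is a hedged variant, not the author's code, that builds the same table with the standard csv module and escapes each cell; csv_to_html_table is a hypothetical helper name.

```python
import csv
import html
import io


def csv_to_html_table(csv_text, td_width=100):
    """Convert CSV text into an HTML <table> string, one <tr> per CSV row."""
    # csv.reader handles quoted fields; empty rows are skipped
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return "<table></table>"
    n_cols = max(len(row) for row in rows)
    parts = [
        '<table width="%d" border="1" cellspacing="0px" '
        'style="border-collapse:collapse">' % (td_width * n_cols)
    ]
    for row in rows:
        cells = "".join(
            '<td width="%d">%s</td>' % (td_width, html.escape(cell)) for cell in row
        )
        parts.append("<tr>%s</tr>" % cells)
    parts.append("</table>")
    return "".join(parts)


if __name__ == "__main__":
    sample = 'name,note\n"Smith, J.",a<b\n'
    print(csv_to_html_table(sample))
```

The returned string can be placed inside the same HTML wrapper as before and passed to pdfkit.from_string.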

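Dealing with slow PDF generation

Generating PDF files with pdfkit is convenient, but it has one sizable drawback: it is slow. A simple test shows this: generate 100 PDF files, each containing the text “這是第*份測試檔案！” ("This is test file No. *!"). The Python code is as follows: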

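import os
import pdfkit
import time

start_time = time.time()

# Make sure the output directory exists before writing into it
os.makedirs('./test', exist_ok=True)

for i in range(100):
    # Text of file i+1 ("This is test file No. i+1!")
    content = '這是第%d份測試檔案！' % (i + 1)
    html = '<html><head><meta charset="UTF-8"></head>' \
           '<body><div align="center">%s</div></body></html>' % content

    # Convert to PDF
    pdfkit.from_string(html, './test/%s.pdf' % (i + 1))

end_time = time.time()

print('Total time: %s seconds.' % (end_time - start_time))

In this program, generating the 100 PDF files took about 192 seconds in total. The output is as follows:

......
Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done
Total time: 191.9226369857788 seconds.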

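To speed things up, we can use multithreading, mainly through the concurrent.futures module. Threads help here because each pdfkit call spends most of its time waiting on an external wkhtmltopdf process. The complete Python code is as follows: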

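import os
import pdfkit
import time
from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED

start_time = time.time()

# Make sure the output directory exists before writing into it
os.makedirs('./test', exist_ok=True)

# Function: generate one PDF
def convert_2_pdf(i):
    # Text of file i+1 ("This is test file No. i+1!")
    content = '這是第%d份測試檔案！' % (i + 1)
    html = '<html><head><meta charset="UTF-8"></head>' \
           '<body><div align="center">%s</div></body></html>' % content

    # Convert to PDF
    pdfkit.from_string(html, './test/%s.pdf' % (i + 1))


# Generate the PDFs with a thread pool
executor = ThreadPoolExecutor(max_workers=10)  # tune max_workers (the number of threads) as needed
# submit() arguments: the function first, then the arguments passed to it (more than one is allowed)
future_tasks = [executor.submit(convert_2_pdf, i) for i in range(100)]
# Block until every task has finished before moving on
wait(future_tasks, return_when=ALL_COMPLETED)

end_time = time.time()
print('Total time: %s seconds.' % (end_time - start_time))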

In this program, generating the 100 PDF files took about 41 seconds in total, which is clearly much faster.
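
One caveat with the thread pool: wait() only blocks until the tasks finish, and an exception raised inside a worker stays stored in its future until result() is called, so a failed conversion can go unnoticed. The following is a minimal sketch of collecting both results and failures. run_in_threads and fake_convert are hypothetical names, and fake_convert merely stands in for a function that calls pdfkit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_threads(func, items, max_workers=10):
    """Call func(item) for every item in a thread pool.

    Returns (results, errors): two dicts keyed by item, so one failed
    conversion neither goes unnoticed nor stops the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(func, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()  # re-raises the worker's exception
            except Exception as exc:  # record the failure and keep going
                errors[item] = exc
    return results, errors


def fake_convert(i):
    """Stand-in for a pdfkit call; fails on one input to show error handling."""
    if i == 3:
        raise ValueError("simulated conversion failure")
    return "./test/%d.pdf" % (i + 1)


if __name__ == "__main__":
    done, failed = run_in_threads(fake_convert, range(5), max_workers=2)
    print(sorted(done), failed)
```

In real use, a function that builds the HTML and calls pdfkit.from_string, like convert_2_pdf, would take the place of fake_convert.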

Note: you are welcome to follow the author's WeChat official account, Python爬蟲與演算法 (WeChat ID: easy_web_scrape).
