Batch-generating font images: data for a font recognition model

Published by 陈作立的博客 on 2024-12-09

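As we all know, text comes in many typefaces, and fonts are made available to the operating system as font files. When a task calls for images of text rendered in specific fonts, how can we generate them quickly?

This post shows how to generate font images quickly from the font files that ship with the operating system, using Python's pillow package.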

  1. Font file paths on each operating system
    Windows, Linux, and macOS:
import os
import sys

dirs = []
if sys.platform == "win32":
    # check the windows font repository
    # NOTE: must use uppercase WINDIR, to work around bugs in
    # 1.5.2's os.environ.get()
    windir = os.environ.get("WINDIR")
    if windir:
        dirs.append(os.path.join(windir, "fonts"))
elif sys.platform in ("linux", "linux2"):
    data_home = os.environ.get("XDG_DATA_HOME")
    if not data_home:
        # The freedesktop spec defines the following default directory for
        # when XDG_DATA_HOME is unset or empty. This user-level directory
        # takes precedence over system-level directories.
        data_home = os.path.expanduser("~/.local/share")
    xdg_dirs = [data_home]

    data_dirs = os.environ.get("XDG_DATA_DIRS")
    if not data_dirs:
        # Similarly, defaults are defined for the system-level directories
        data_dirs = "/usr/local/share:/usr/share"
    xdg_dirs += data_dirs.split(":")

    dirs += [os.path.join(xdg_dir, "fonts") for xdg_dir in xdg_dirs]
elif sys.platform == "darwin":
    dirs += [
        "/Library/Fonts",
        "/System/Library/Fonts",
        os.path.expanduser("~/Library/Fonts"),
    ]
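
On Linux, font files often sit in subdirectories of these locations (for example under a `truetype` folder), and some of the candidate directories may not exist at all. The following is a minimal sketch of a recursive lookup that tolerates both cases. The helper name `find_font_files` and the constant `FONT_EXTENSIONS` are hypothetical and do not appear in the original code; the sketch only illustrates one way to turn the directory list above into a list of font files.

```python
import os

# Hypothetical constant: the same two extensions the post filters on.
FONT_EXTENSIONS = (".ttf", ".ttc")


def find_font_files(font_dirs):
    """Return sorted (font_path, font_name) pairs found under font_dirs.

    Hypothetical helper: walks each directory recursively and skips
    directories that do not exist on this machine.
    """
    found = []
    for font_dir in font_dirs:
        if not os.path.isdir(font_dir):
            continue  # e.g. ~/.local/share/fonts is often absent
        for root, _subdirs, files in os.walk(font_dir):
            for file_name in files:
                stem, ext = os.path.splitext(file_name)
                # Compare in lowercase so ".TTF" and ".ttf" both match
                if ext.lower() in FONT_EXTENSIONS:
                    found.append((os.path.join(root, file_name), stem))
    return sorted(found)
```

Calling `find_font_files(dirs)` with the list built above would return every matching file below those directories, not just the files at the top level.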
  2. Generating images with pillow

import os
import random

import nltk
from PIL import Image, ImageDraw, ImageFont

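# Download the inaugural address corpus from nltk; it supplies the sample text
nltk.download('inaugural')

# gen_images() uses the three names below, but the post does not define them.
# The values here are placeholders (assumptions); adjust them as needed.
IMAGES_PER_FONT = 10            # the loop comment below mentions 50, reduced to 10
FONT_EXCEPTS = set()            # font names to skip; the post lists none
GEN_IMAGES_DIR = "gen_images"   # output directory; hypothetical name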

def wrap_text(text, line_length=4):
    """Wraps the provided text every 'line_length' words."""
    words = text.split()
    return "\n".join([" ".join(words[i:i + line_length]) for i in range(0, len(words), line_length)])


def random_prose_text(line_length=4):
    """Returns a random 800-character snippet from the nltk inaugural corpus."""
    corpus = nltk.corpus.inaugural.raw()
    start = random.randint(0, len(corpus) - 800)
    end = start + 800
    return wrap_text(corpus[start:end], line_length=line_length)


def gen_images():
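    # Make sure the output directory exists before saving images
    os.makedirs(GEN_IMAGES_DIR, exist_ok=True)

    # Collect (font_path, font_name) pairs from the font directories
    font_files = []
    for font_dir in dirs:
        if not os.path.isdir(font_dir):
            # Some candidate directories may not exist on this machine
            continue
        for font_file in os.listdir(font_dir):
            # Match the extension case-insensitively (.ttf/.TTF, .ttc/.TTC)
            if font_file.lower().endswith(('.ttf', '.ttc')):
                font_path = os.path.join(font_dir, font_file)
                font_name = os.path.splitext(font_file)[0]
                font_files.append((font_path, font_name))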

    # Generate images for each font file
    for font_path, font_name in font_files:
        # Output the font name so we can see the progress
        print(font_path, font_name)

        # Counter for the image filename
        j = 0
        # Originally 50 images per font; IMAGES_PER_FONT was reduced to 10 for speed
        for i in range(IMAGES_PER_FONT):
            # Random font size
            font_size = random.choice(range(18, 72))

            if font_path.endswith('.ttc'):
                # ttc fonts have multiple fonts in one file, so we need to specify which one we want
                font = ImageFont.truetype(font_path, font_size, index=0)
            elif font_name in FONT_EXCEPTS:
                continue
            else:
                # ttf fonts have only one font in the file
                font = ImageFont.truetype(font_path, font_size)

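            # Estimate how many words fit on an 800px line, using the width of 'x'
            # as the average character width and assuming about 5 characters per word.
            # max(1, ...) guards against fonts that report a zero width for 'x'.
            font_avg_char_width = max(1, font.getbbox('x')[2])
            words_per_line = max(1, int(800 / (font_avg_char_width * 5)))
            prose_sample = random_prose_text(line_length=words_per_line)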

            # print("generate font image: " + str(prose_sample))
            for text in [prose_sample]:
                img = Image.new('RGB', (800, 400), color="white")  # Canvas size
                draw = ImageDraw.Draw(img)

                # Random offsets, but ensuring that text isn't too far off the canvas
                offset_x = random.randint(-20, 10)
                offset_y = random.randint(-20, 10)

                # vary the line height
                line_height = random.uniform(0, 1.25) * font_size
                draw.text((offset_x, offset_y), text, fill="black", font=font, spacing=line_height)

                j += 1
                output_file = os.path.join(GEN_IMAGES_DIR, f"{font_name}_{j}.png")
                img.save(output_file)
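
Since the images are meant as data for a font recognition model, each one needs a label. The output files above are named `{font_name}_{j}.png`, so the label can be recovered from the filename. The following is a minimal sketch, using only the standard library, that writes a CSV manifest of filename and font name. The function name `build_manifest` and the CSV column names are hypothetical; the original project may organize its labels differently.

```python
import csv
import os


def build_manifest(image_dir, manifest_path):
    """Write a CSV of (filename, font_name) rows and return the row count.

    Assumes files are named "<font_name>_<counter>.png", the pattern
    used by gen_images() above. Hypothetical helper, not from the post.
    """
    rows = []
    for file_name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(file_name)
        if ext.lower() != ".png" or "_" not in stem:
            continue
        # rsplit keeps underscores that belong to the font name itself
        font_name, counter = stem.rsplit("_", 1)
        if not counter.isdigit():
            continue
        rows.append((file_name, font_name))

    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "font_name"])
        writer.writerows(rows)
    return len(rows)
```

Splitting on the last underscore matters because font names can contain underscores themselves; only the trailing counter is removed.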

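The full source code is recorded here:
https://github.com/chenzuoli/font-identifier

This code is based on the open-source project: https://huggingface.co/gaborcselle/font-identifier

That's all for now; this post will keep being updated.

Writing down problems is itself a form of practice.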
