Video deduplication key technique: intelligent video shot segmentation. Automatic cutting with the TransNetV2 model. A Python automation script.

Posted by 小红帽大灰狼 on 2024-11-22

Video deduplication and intelligent shot segmentation are important techniques in modern video processing, especially in content creation and content management. Below I explain the key concepts and relevant techniques, show how the TransNetV2 model can drive automatic cutting, and finish with a complete Python automation script.

Key Concepts

  1. Video deduplication: identifying and removing duplicate segments from a video to cut redundant content and improve its watchability and information density (a minimal detection sketch follows this list).
  2. Intelligent shot segmentation: splitting a video into shots (scenes), each representing an independent scene or event, which makes analysis and editing far more convenient.
  3. TransNetV2: a deep-learning model built for video analysis and processing; by learning the temporal and spatial features of the footage, it can reliably locate where each shot begins and ends.
  4. Python automation script: a Python program that runs video-processing tasks such as shot segmentation and deduplication automatically, reducing manual work.
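
To make concept 1 concrete, here is a minimal detection sketch. It samples one frame per second with moviepy, fingerprints each frame with a perceptual average hash, and reports pairs of timestamps whose fingerprints nearly match. The third-party imagehash package and the distance threshold of 5 are illustrative assumptions, not something TransNetV2 provides:

import imagehash  # pip install imagehash
from PIL import Image
from moviepy.editor import VideoFileClip


def find_duplicate_seconds(video_path, threshold=5):
    """Return pairs of timestamps (in seconds) whose frames look alike."""
    clip = VideoFileClip(video_path)
    # sample one frame per second and fingerprint each one
    hashes = [(t, imagehash.average_hash(Image.fromarray(frame)))
              for t, frame in enumerate(clip.iter_frames(fps=1))]
    clip.close()
    duplicates = []
    for i in range(len(hashes)):
        for j in range(i + 1, len(hashes)):
            # subtracting two ImageHash objects gives their Hamming distance;
            # a small distance means the frames look nearly identical
            if hashes[i][1] - hashes[j][1] <= threshold:
                duplicates.append((hashes[i][0], hashes[j][0]))
    return duplicates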

Deduplication Methods

There are plenty of video-deduplication tools on the market, but we want deduplication integrated into our own script, so using other software is not an option; in other words, we have to implement it ourselves.

The main deduplication methods are the following (a moviepy sketch of a few of them appears right after the list):

  1. Frame dropping
  2. Frame interpolation
  3. Cropping
  4. Changing the resolution
  5. Changing the aspect ratio
  6. Adding an intro and outro
  7. Changing the frame rate
  8. Changing the bitrate
  9. Adjusting brightness, contrast, and saturation
  10. Changing the playback speed
  11. Adding a video mask
  12. Adding an image mask
  13. Adding stickers
  14. Adding transitions
  15. Adding a background image or video
  16. Light-sweep effects
  17. Changing the background-music volume
  18. Adding an extra audio track
  19. Mirror flipping
  20. Picture-in-picture
  21. Editing the metadata
  22. Adding a watermark
  23. Shuffling the shot order
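
Most of these boil down to a handful of moviepy calls, so they can live in the same script as the segmentation code below. Here is a minimal sketch chaining four of the methods above (19 mirror flip, 10 speed change, 9 color adjustment, 3 cropping); the parameter values are illustrative assumptions and should be tuned so the result stays watchable:

from moviepy.editor import VideoFileClip
import moviepy.video.fx.all as vfx


def dedup_transform(input_path, output_path):
    clip = VideoFileClip(input_path)
    clip = clip.fx(vfx.mirror_x)            # 19. mirror flip
    clip = clip.fx(vfx.speedx, 1.05)        # 10. play 5% faster
    clip = clip.fx(vfx.colorx, 1.03)        # 9. scale color values by 3%
    clip = clip.fx(vfx.crop, x1=8, y1=8,    # 3. shave 8 px off each edge
                   x2=clip.w - 8, y2=clip.h - 8)
    clip.write_videofile(output_path, codec='libx264')
    clip.close()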

Python Automation Script Example

Below is an example Python automation script that uses the TransNetV2 model for intelligent shot segmentation. It consists of two files, main.py and transnetv2.py:

main.py

import os

from moviepy.editor import VideoFileClip
from transnetv2 import TransNetV2


if __name__ == '__main__':

    video_path = input('Enter the path to the video file\n')
    while not os.path.isfile(video_path):
        video_path = input('Please enter a valid video file path\n')
    video_name = os.path.basename(video_path)
    video_name_without_ext = os.path.splitext(video_name)[0]
    video_folder = os.path.dirname(video_path)
    output_folder = os.path.join(video_folder, video_name_without_ext)
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    model = TransNetV2()
    video_frames, single_frame_predictions, all_frame_predictions = model.predict_video_2(video_path)
    scenes = model.predictions_to_scenes(single_frame_predictions)

    video_clip = VideoFileClip(video_path)
    for i, (start, end) in enumerate(scenes):
        start_time = start / video_clip.fps
        end_time = end / video_clip.fps
        segment_clip = video_clip.subclip(start_time, end_time)
        output_path = os.path.join(output_folder, f'{video_name_without_ext}_{i+1}.mp4')
        segment_clip.write_videofile(output_path, codec='libx264', fps=video_clip.fps)
    video_clip.close()

    input('\nAll tasks finished. Press Enter to exit...')
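
Note that predictions_to_scenes returns frame indices, so the script divides them by clip.fps to get the start and end times that subclip expects; each segment is then re-encoded with libx264 at the source frame rate, which is slow but keeps every segment independently playable.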


transnetv2.py

import math

import os

os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'  # disable oneDNN custom ops
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'   # hide TensorFlow INFO-level log messages

import numpy as np
import tensorflow as tf
from moviepy.editor import VideoFileClip


class TransNetV2:

    def __init__(self, model_dir=None):
        if model_dir is None:
            # model_dir = os.path.join(os.path.dirname(__file__), "transnetv2-weights/")
            model_dir = "transnetv2-weights/"
            if not os.path.isdir(model_dir):
                raise FileNotFoundError(f"[TransNetV2] ERROR: {model_dir} is not a directory.")
            else:
                print(f"[TransNetV2] Using weights from {model_dir}.")

        self._input_size = (27, 48, 3)
        try:
            self._model = tf.saved_model.load(model_dir)
        except OSError as exc:
            raise IOError(f"[TransNetV2] It seems that files in {model_dir} are corrupted or missing. "
                          f"Re-download them manually and retry. For more info, see: "
                          f"https://github.com/soCzech/TransNetV2/issues/1#issuecomment-647357796") from exc

    def predict_raw(self, frames: np.ndarray):
        assert len(frames.shape) == 5 and frames.shape[2:] == self._input_size, \
            "[TransNetV2] Input shape must be [batch, frames, height, width, 3]."
        frames = tf.cast(frames, tf.float32)

        logits, dict_ = self._model(frames)
        single_frame_pred = tf.sigmoid(logits)
        all_frames_pred = tf.sigmoid(dict_["many_hot"])

        return single_frame_pred, all_frames_pred

    def predict_frames(self, frames: np.ndarray):
        assert len(frames.shape) == 4 and frames.shape[1:] == self._input_size, \
            "[TransNetV2] Input shape must be [frames, height, width, 3]."

        def input_iterator():
            # return windows of size 100 where the first/last 25 frames are from the previous/next batch
            # the first and last window must be padded by copies of the first and last frame of the video
            no_padded_frames_start = 25
            no_padded_frames_end = 25 + 50 - (len(frames) % 50 if len(frames) % 50 != 0 else 50)  # 25 - 74

            start_frame = np.expand_dims(frames[0], 0)
            end_frame = np.expand_dims(frames[-1], 0)
            padded_inputs = np.concatenate(
                [start_frame] * no_padded_frames_start + [frames] + [end_frame] * no_padded_frames_end, 0
            )

            ptr = 0
            while ptr + 100 <= len(padded_inputs):
                out = padded_inputs[ptr:ptr + 100]
                ptr += 50
                yield out[np.newaxis]

        predictions = []

        for inp in input_iterator():
            single_frame_pred, all_frames_pred = self.predict_raw(inp)
            predictions.append((single_frame_pred.numpy()[0, 25:75, 0],
                                all_frames_pred.numpy()[0, 25:75, 0]))

            print("\r[TransNetV2] Processing video frames {}/{}".format(
                min(len(predictions) * 50, len(frames)), len(frames)
            ), end="")

        print("\n")

        single_frame_pred = np.concatenate([single_ for single_, all_ in predictions])
        all_frames_pred = np.concatenate([all_ for single_, all_ in predictions])

        return single_frame_pred[:len(frames)], all_frames_pred[:len(frames)]  # remove extra padded frames

    def predict_video(self, video_fn: str):
        try:
            import ffmpeg
        except ModuleNotFoundError:
            raise ModuleNotFoundError("For `predict_video` function `ffmpeg` needs to be installed in order to extract "
                                      "individual frames from video file. Install `ffmpeg` command line tool and then "
                                      "install python wrapper by `pip install ffmpeg-python`.")

        print("[TransNetV2] Extracting frames from {}".format(video_fn))
        video_stream, err = ffmpeg.input(video_fn).output(
            "pipe:", format="rawvideo", pix_fmt="rgb24", s="48x27"
        ).run(capture_stdout=True, capture_stderr=True)

        video = np.frombuffer(video_stream, np.uint8).reshape([-1, 27, 48, 3])
        return (video, *self.predict_frames(video))

    def predict_video_2(self, video_fn: str):
        print("[TransNetV2] Extracting frames from {}".format(video_fn))
        clip = VideoFileClip(video_fn, target_resolution=(27, 48))
        duration = math.floor(clip.duration * 10) / 10  # round down to 0.1 s so we never seek past the end
        fps = clip.fps  # frame rate of the source video
        frames = []
        for t in range(0, int(duration * fps)):
            frame = clip.get_frame(t / fps)  # grab the frame at this timestamp
            if len(frame) != 0:  # skip empty frames
                frames.append(frame)
        video = np.array(frames)
        return video, *self.predict_frames(video)

    @staticmethod
    def predictions_to_scenes(predictions: np.ndarray, threshold: float = 0.5):
        predictions = (predictions > threshold).astype(np.uint8)

        scenes = []
        t, t_prev, start = -1, 0, 0
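        # walk the 0/1 sequence: a 1->0 edge opens a new scene and a 0->1 edge
        # closes it; the trailing `if t == 0` check emits the final scene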
        for i, t in enumerate(predictions):
            if t_prev == 1 and t == 0:
                start = i
            if t_prev == 0 and t == 1 and i != 0:
                scenes.append([start, i])
            t_prev = t
        if t == 0:
            scenes.append([start, i])

        # just fix if all predictions are 1
        if len(scenes) == 0:
            return np.array([[0, len(predictions) - 1]], dtype=np.int32)

        return np.array(scenes, dtype=np.int32)

    @staticmethod
    def visualize_predictions(frames: np.ndarray, predictions):
        from PIL import Image, ImageDraw

        if isinstance(predictions, np.ndarray):
            predictions = [predictions]

        ih, iw, ic = frames.shape[1:]
        width = 25

        # pad frames so that length of the video is divisible by width
        # pad frames also by len(predictions) pixels in width in order to show predictions
        pad_with = width - len(frames) % width if len(frames) % width != 0 else 0
        frames = np.pad(frames, [(0, pad_with), (0, 1), (0, len(predictions)), (0, 0)])

        predictions = [np.pad(x, (0, pad_with)) for x in predictions]
        height = len(frames) // width

        img = frames.reshape([height, width, ih + 1, iw + len(predictions), ic])
        img = np.concatenate(np.split(
            np.concatenate(np.split(img, height), axis=2)[0], width
        ), axis=2)[0, :-1]

        img = Image.fromarray(img)
        draw = ImageDraw.Draw(img)

        # iterate over all frames
        for i, pred in enumerate(zip(*predictions)):
            x, y = i % width, i // width
            x, y = x * (iw + len(predictions)) + iw, y * (ih + 1) + ih - 1

            # we can visualize multiple predictions per single frame
            for j, p in enumerate(pred):
                color = [0, 0, 0]
                color[(j + 1) % 3] = 255

                value = round(p * (ih - 1))
                if value != 0:
                    draw.line((x + j, y, x + j, y - value), fill=tuple(color), width=1)
        return img


def main():
    import sys
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("files", type=str, nargs="+", help="path to video files to process")
    parser.add_argument("--weights", type=str, default=None,
                        help="path to TransNet V2 weights, tries to infer the location if not specified")
    parser.add_argument('--visualize', action="store_true",
                        help="save a png file with prediction visualization for each extracted video")
    args = parser.parse_args()

    model = TransNetV2(args.weights)
    for file in args.files:
        if os.path.exists(file + ".predictions.txt") or os.path.exists(file + ".scenes.txt"):
            print(f"[TransNetV2] {file}.predictions.txt or {file}.scenes.txt already exists. "
                  f"Skipping video {file}.", file=sys.stderr)
            continue

        video_frames, single_frame_predictions, all_frame_predictions = \
            model.predict_video(file)

        predictions = np.stack([single_frame_predictions, all_frame_predictions], 1)
        np.savetxt(file + ".predictions.txt", predictions, fmt="%.6f")

        scenes = model.predictions_to_scenes(single_frame_predictions)
        np.savetxt(file + ".scenes.txt", scenes, fmt="%d")

        if args.visualize:
            if os.path.exists(file + ".vis.png"):
                print(f"[TransNetV2] {file}.vis.png already exists. "
                      f"Skipping visualization of video {file}.", file=sys.stderr)
                continue

            pil_image = model.visualize_predictions(
                video_frames, predictions=(single_frame_predictions, all_frame_predictions))
            pil_image.save(file + ".vis.png")


if __name__ == "__main__":
    main()
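
transnetv2.py also has a command-line entry point of its own: running python transnetv2.py my_video.mp4 --visualize (my_video.mp4 being a placeholder) writes my_video.mp4.predictions.txt and my_video.mp4.scenes.txt next to the input, plus my_video.mp4.vis.png when --visualize is passed. This path goes through predict_video, which needs the ffmpeg command-line tool and the ffmpeg-python wrapper, as the error message in the code explains.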

Complete source code
Download: https://pan.quark.cn/s/935a6f314af5