The Diffusion Transformer (DiT) family has a major new entry, "Flag-DiT", which this time aims to cover images, video, audio, and 3D in a single framework.
Paper: https://arxiv.org/pdf/2405.05945
GitHub: https://github.com/Alpha-VLLM/Lumina-T2X
Model download: https://huggingface.co/Alpha-VLLM/Lumina-T2I/tree/main
Paper title: Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Demo 1: http://106.14.2.150:10021/
Demo 2: http://106.14.2.150:10022/