[Weekly Read] What is prompt-tuning?

Posted by Aikoin on 2024-04-30

Original link: https://research.ibm.com/blog/what-is-ai-prompt-tuning

Original author: Kim Martineau

(Yay, a woman author!)

This post is based on a popular-science blog entry from IBM. Besides introducing hand-crafted hard prompts, AI-designed soft prompts made of vectors or numbers, and prefix-tuning, which injects soft prompts into different layers of the model, it also covers several emerging directions in prompt-tuning research, which are quite interesting.

For example, bringing the idea of multi-task transfer learning into prompt design: hey, could we design one universal prompt that learns the knowledge shared across different tasks? Bam, out came the MPT paper.

The world keeps changing and new knowledge never stops coming. With tasks arriving one after another, how do we design a prompt that keeps learning new things without forgetting the old? Bam, out came CODA-Prompt. It can not only correct mistakes promptly, it does so without retaining personal data, which is the benefit of applying continual learning to "come-and-go" data streams.

The last one is impressive: using prompt design to correct the model bias introduced by "unequal" real-world data. IBM published two papers on this at NeurIPS 2022. The first, FairIJ, identifies the most biased data points in the training set and gets the model to set them aside via prompts appended to the original prompt. The second, FairReprogram, takes a similar approach; search for the original paper if you're interested.

Prompt-tuning not only lowers the cost of retraining large models, it can also correct a model's behavior. The downside is the lack of interpretability, though being a black box is a common ailment of deep models anyway.

I've put some excerpts and highlights from the original article below. Reading the original is the most authentic way to go, so I won't translate them here.

---------------------------------

Prompt-tuning originated with large language models but has since expanded to other foundation models, like transformers that handle other sequential data types, including audio and video. Prompts may be snippets of text, streams of speech, or blocks of pixels in a still image or video.

Hand-crafted prompts were quickly replaced by superior AI-designed prompts consisting of strings of numbers. In a paper the following year, Google researchers introduced so-called “soft” prompts designed by an AI that outperformed human-engineered “hard” prompts.
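To make "strings of numbers" concrete, here is a minimal soft prompt-tuning sketch in PyTorch, assuming a Hugging Face-style causal LM interface (`get_input_embeddings`, `inputs_embeds`); the token count and initialization scale are illustrative choices of mine, not from the article. The backbone stays frozen and only the prepended prompt embeddings are trained:

```python
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Wraps a frozen LM with n_tokens trainable prompt embeddings."""
    def __init__(self, backbone, n_tokens=20):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the large model entirely
        d_model = backbone.get_input_embeddings().embedding_dim
        # The soft prompt: just a learnable (n_tokens, d_model) matrix.
        self.soft_prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, input_ids):
        tok_emb = self.backbone.get_input_embeddings()(input_ids)
        batch = tok_emb.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the learned embeddings; gradients flow only into them.
        inputs_embeds = torch.cat([prompt, tok_emb], dim=1)
        return self.backbone(inputs_embeds=inputs_embeds)
```

Only `soft_prompt` receives gradient updates, so adapting the model to a new task means training a few thousand numbers rather than billions of weights.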

Around the same time, Stanford researchers introduced prefix-tuning, another automated prompt-design method that allows the model to learn one task after another. Prefix-tuning combines soft prompts with prompts injected into layers of the deep learning model for added flexibility. Though prompt-tuning is more efficient, both techniques let you freeze the model and skip expensive retraining.
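A rough sketch of the layer-injection idea: in addition to an input-level prompt, each attention layer gets its own trainable key/value "prefix" that every token can attend to. The single-layer module below is a simplified assumption of mine (in actual prefix-tuning the q/k/v projections belong to the frozen model and only the prefixes train):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrefixAttention(nn.Module):
    """One attention layer with a trainable key/value prefix."""
    def __init__(self, d_model=256, n_prefix=10):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # Learned per-layer prefix: extra keys/values injected into attention.
        self.prefix_k = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)

    def forward(self, x):  # x: (batch, seq, d_model)
        b = x.size(0)
        q = self.q(x)
        # Concatenate the prefix in front of the real keys and values.
        k = torch.cat([self.prefix_k.expand(b, -1, -1), self.k(x)], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), self.v(x)], dim=1)
        attn = F.softmax(q @ k.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)
        return attn @ v
```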

Unlike hard prompts, AI-designed soft prompts are unrecognizable to the human eye. Each prompt consists of an embedding, or string of numbers, that distills knowledge from the larger model. High level or task specific, the prompt acts as a substitute for additional training data. Researchers recently estimated that a good language classifier prompt is worth hundreds to thousands of extra data points.

One drawback of prompt-tuning is its lack of interpretability. The AI discovers prompts optimized for a given task but can’t explain why it chose those embeddings. Like deep learning models themselves, soft prompts are opaque.

One area is multi-task learning. Foundation models often need to pivot quickly, from answering customer questions to identifying negative comments in online reviews. Rather than design a unique prompt for each task, researchers are discovering ways to create universal prompts that can be easily recycled.

“Think of it as applying multi-task transfer learning to prompts,” said Panda. “You learn a single prompt that consolidates task-shared knowledge so you can quickly adapt the model.”

In an upcoming paper at the International Conference on Learning Representations (ICLR), Panda and his colleagues show that their Multi-task Prompt Tuning (MPT) method outperformed other methods, and even did better than models fine-tuned on task-specific data.
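My reading of the MPT trick, sketched below with arbitrary shapes: one task-shared prompt holds the consolidated knowledge, and each task only adds a tiny rank-one matrix that modulates it element-wise. Treat this as an illustration of the decomposition idea, not the paper's exact parametrization:

```python
import torch
import torch.nn as nn

class MultiTaskPrompt(nn.Module):
    """One shared prompt, modulated per task by a rank-one matrix."""
    def __init__(self, n_tasks, n_tokens=20, d_model=256):
        super().__init__()
        # Task-shared knowledge lives in a single prompt matrix.
        self.shared = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)
        # Each task adds only two small vectors (rank-one modulation).
        self.u = nn.Parameter(torch.randn(n_tasks, n_tokens))
        self.v = nn.Parameter(torch.randn(n_tasks, d_model))

    def forward(self, task_id):
        # Hadamard product: the shared prompt scaled element-wise per task.
        modulation = torch.outer(self.u[task_id], self.v[task_id])
        return self.shared * modulation  # (n_tokens, d_model) task prompt
```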

Another up-and-coming area of research involves finding prompts on the fly as an AI model continually learns new tasks and concepts. Acquiring new knowledge involves updating the model on new data, but sometimes old knowledge gets overwritten in what’s known as catastrophic forgetting.

In a pre-print paper on arXiv, IBM researchers show that a technique called CODA-Prompt can discover prompts for consecutive, never-seen-before tasks, like classifying drawings, followed by paintings and photos, without the model forgetting what it originally learned.

This type of flexible prompt for continual learning allows you to fix mistakes as they arise, without retaining the data and running afoul of privacy laws. “Mistakes might be observed in a chat session from user data,” said Leonid Karlinsky, an IBM researcher at the MIT-IBM Lab who co-developed the technique. “CODA-Prompt lets you correct the mistakes without holding on to that personal data.”
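As a rough illustration of how a prompt can keep growing without overwriting old knowledge: the prompt is assembled per input as a weighted sum of learned components, and a new task appends fresh components while old ones are frozen. Everything below (the keys, the cosine weighting, the shapes) is a simplified assumption based on the paper's description, not CODA-Prompt's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComposedPrompt(nn.Module):
    """Prompt assembled at inference as a weighted sum of components."""
    def __init__(self, n_components=16, n_tokens=8, d_model=256):
        super().__init__()
        self.components = nn.Parameter(
            torch.randn(n_components, n_tokens, d_model) * 0.02)
        self.keys = nn.Parameter(torch.randn(n_components, d_model) * 0.02)

    def forward(self, query):  # query: (batch, d_model) from a frozen encoder
        # Cosine similarity between the input's features and each key.
        weights = F.cosine_similarity(
            query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        # Each input gets its own prompt: a weighted mix of the components.
        return torch.einsum('bc,ctd->btd', weights, self.components)

# Continual learning: for each new task, append fresh components and freeze
# the old ones, so earlier prompts are not overwritten.
```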

Finally, prompt-tuning also shows promise as a quick and low-cost tool to mitigate algorithmic bias. Because AI models are trained on real-world data, they inevitably absorb society’s biases, which can lead to decisions that perpetuate and exacerbate inequities in everything from healthcare to hiring. IBM researchers recently presented a pair of papers at the 2022 NeurIPS conference aimed at counteracting race and gender bias in large language and vision models using AI-designed prompts.

One of the researchers’ methods, called FairIJ, identifies the most biased data points in the model’s training set and has the model set them aside via prompts appended to the model’s original prompts. Tested on a salary-prediction task, a model tuned with FairIJ achieved more accurate, less biased results than several top bias-mitigation methods, the researchers found.

Prompt-tuning not only shrinks the cost of tailoring large models to new applications, said IBM's Cox, it can correct the model's behavior — in this case, mitigating bias.
