Large Model Study Notes: The Attention Mechanism

Posted by dudu on 2024-11-24
  • Understanding Query, Key, Value in Transformers and LLMs

This self-attention process is at the core of what makes transformers so powerful. It allows every word (or token) to dynamically adjust its importance based on the surrounding context, leading to a more accurate and nuanced understanding as the representation passes through the successive layers of the network.
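As a concrete illustration of the query/key/value idea described above, here is a minimal sketch of scaled dot-product self-attention in NumPy. The shapes, variable names, and random projection matrices are illustrative assumptions and are not taken from the linked articles.

```python
# Minimal scaled dot-product self-attention sketch.
# Assumptions: toy shapes, random weights, single head (not from the linked articles).
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (seq_len, d_model) token embeddings.
    Q = X @ W_q                            # queries: what each token is looking for
    K = X @ W_k                            # keys:    what each token offers for matching
    V = X @ W_v                            # values:  the content that gets aggregated
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # (seq_len, seq_len) pairwise similarities
    weights = softmax(scores, axis=-1)     # each row is a distribution over all tokens
    return weights @ V                     # context-weighted mixture of values per token

# Toy usage: 4 tokens, model dimension 8, head dimension 4 (all hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 4): one context-mixed vector per token
```

The attention weights computed here are exactly the per-token "importance" scores mentioned above: each row shows how much a given token attends to every other token in the sequence.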

  • Must-Read Starter Guide to Mastering Attention Mechanisms in Machine Learning

    • Soft Attention
    • Hard Attention
    • Self-Attention
    • Global Attention
  • Explainable AI: Visualizing Attention in Transformers
