A Gentle Introduction to Attention and Transformer Models
This post is divided into three parts; they are:

• Origination of the Transformer Model
• The Transformer Architecture
• Variations of the Transformer Architecture

The Transformer architecture originated from the 2017 paper "Attention Is All You Need" by Vaswani et al.
