Tiny Transformer-Style Attention Example
What this example teaches:
This tiny demo shows the core transformer idea:
1. Each earlier word has a key
This helps decide how relevant that word is to the current position.
2. Each earlier word also has a value
This carries useful meaning.
3. The current position creates a query
This asks:
“Which earlier word matters most right now?”
4. Attention weights are computed
The model gives more weight to the more useful tokens.
5. The weighted information is combined
That combined result helps make a prediction.
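The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up toy vectors (two "earlier words" with hand-picked keys and values), not the code from the attached zip:

```python
import numpy as np

# Toy "earlier words": each has a key (used for matching)
# and a value (the meaning it carries).
keys = np.array([[1.0, 0.0],     # key of word A
                 [0.0, 1.0]])    # key of word B
values = np.array([[10.0, 0.0],  # value of word A
                   [0.0, 20.0]]) # value of word B

# Step 3: the current position creates a query.
# Here it is chosen to "look like" word B's key.
query = np.array([0.1, 0.9])

# Step 4: score each earlier word with a scaled dot product,
# then softmax the scores into attention weights that sum to 1.
scores = keys @ query / np.sqrt(keys.shape[1])
weights = np.exp(scores) / np.exp(scores).sum()

# Step 5: combine the values, weighted by attention.
output = weights @ values

print(weights)  # word B gets the larger weight
print(output)   # the output is pulled toward word B's value
```

Because the query is closer to word B's key, word B receives the larger attention weight, so the combined output is dominated by word B's value.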
Simple Transformer example
Attachment: simple_transformer.zip (760 Bytes)