Kseniya Parkhamchuk

KV Cache Explained

Visualizing how transformers avoid recomputing past key/value pairs.


1. Input Tokens

The input sequence is split into tokens (here, words separated by spaces), and the model processes them one at a time.

2. K/V Vector Generation (Current Token)

Only the current token's key and value vectors are computed at this step; every earlier token's pair is already sitting in the cache.
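A minimal numpy sketch of this step. The projection matrices W_k and W_v, the dimensions, and the cache layout are illustrative assumptions, not taken from the demo:

import numpy as np

d_model, d_head = 8, 8  # illustrative sizes, not from the demo

rng = np.random.default_rng(0)
W_k = rng.normal(size=(d_model, d_head))  # key projection (hypothetical weights)
W_v = rng.normal(size=(d_model, d_head))  # value projection (hypothetical weights)

kv_cache = {"K": [], "V": []}  # grows by one entry per processed token

def generate_kv(x_t):
    """Project the current token's hidden state to a key/value pair
    and append it to the cache; past tokens are never reprojected."""
    k_t = x_t @ W_k
    v_t = x_t @ W_v
    kv_cache["K"].append(k_t)
    kv_cache["V"].append(v_t)
    return k_t, v_t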

3. Attention Calculation (Using Cache)

The current query attends to its own K/V pairs and all pairs from the cache.
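Continuing the sketch above, a hedged illustration of this step using standard scaled dot-product attention; q_t is assumed to be the current token's query vector, and kv_cache is the structure from the previous sketch:

def attend(q_t):
    """Scaled dot-product attention for the current query against
    every cached key/value pair (including the current token's own)."""
    K = np.stack(kv_cache["K"])               # (t, d_head)
    V = np.stack(kv_cache["V"])               # (t, d_head)
    scores = K @ q_t / np.sqrt(K.shape[-1])   # one score per cached position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over all positions so far
    return weights @ V                        # attention output, shape (d_head,)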

4. KV Cache (Stored Past K/V Pairs)

Stores K/V pairs from past tokens to be reused.

It starts empty and grows by one K/V pair per processed token.
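To get a feel for what storing these pairs costs, here is a back-of-the-envelope footprint calculation; all model dimensions below are illustrative assumptions, not taken from any particular model:

# Rough KV-cache size: 2 (K and V) * layers * heads * head_dim
# * sequence_length * bytes_per_element.
layers, heads, head_dim = 32, 32, 128   # assumed configuration
seq_len, bytes_fp16 = 4096, 2           # assumed context length, fp16 storage

cache_bytes = 2 * layers * heads * head_dim * seq_len * bytes_fp16
print(f"{cache_bytes / 2**30:.1f} GiB")  # 2.0 GiB for this configuration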

Explanation

Without the cache, every decoding step would recompute key and value vectors for the whole prefix, so the cost of generating each new token would grow with the sequence length. Storing each token's K/V pair once and reusing it means every step only computes the projections for the newest token, trading memory for compute.