Commit 3e4dd3f

Merge branch 'master' of github.com:glouppe/info8010-deep-learning
2 parents eb5c692 + 0ea0fef commit 3e4dd3f

1 file changed: 2 additions, 2 deletions

lecture7.md

Lines changed: 2 additions & 2 deletions
@@ -311,7 +311,7 @@ $$a(\mathbf{q}, \mathbf{k}) = \frac{\mathbf{q}^T \mathbf{k}}{\sqrt{d}}.$$
 class: middle
 
 For $n$ queries $\mathbf{Q} \in \mathbb{R}^{n \times d}$, keys $\mathbf{K} \in \mathbb{R}^{m \times d}$ and values $\mathbf{V} \in \mathbb{R}^{m \times v}$, the **scaled dot-product attention** layer computes an output tensor
-$$\mathbf{Y} = \underbrace{\text{softmax}\left(\frac{\mathbf{QK}^T)}{\sqrt{d}}\right)}\_{\text{attention matrix}\, \mathbf{A}}\mathbf{V} \in \mathbb{R}^{n \times v}.$$
+$$\mathbf{Y} = \underbrace{\text{softmax}\left(\frac{\mathbf{QK}^T}{\sqrt{d}}\right)}\_{\text{attention matrix}\, \mathbf{A}}\mathbf{V} \in \mathbb{R}^{n \times v}.$$
 
 ---
 
@@ -732,4 +732,4 @@ Decision Transformer: Reinforcement Learning via Sequence Modeling.
 class: end-slide, center
 count: false
 
-The end.
+The end.
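
For readers checking the formula touched by the first hunk, the scaled dot-product attention layer can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the function name `scaled_dot_product_attention`, the toy shapes, and the random inputs are assumptions, and the softmax is assumed to act row-wise (over the $m$ keys), which the slide does not state explicitly.

```python
import numpy as np


def scaled_dot_product_attention(Q, K, V):
    """Sketch of Y = softmax(Q K^T / sqrt(d)) V.

    Q: (n, d) queries, K: (m, d) keys, V: (m, v) values.
    Returns Y with shape (n, v) and the attention matrix A with shape (n, m).
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n, m) scaled dot products
    # Subtract the row-wise max before exponentiating, for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    A = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax (assumed)
    return A @ V, A


# Toy shapes chosen only for illustration.
rng = np.random.default_rng(0)
n, m, d, v = 3, 5, 4, 2
Q = rng.normal(size=(n, d))
K = rng.normal(size=(m, d))
V = rng.normal(size=(m, v))

Y, A = scaled_dot_product_attention(Q, K, V)
print(Y.shape, A.shape)
```

Each row of `A` sums to 1, so each output row of `Y` is a convex combination of the rows of `V`. In particular, an all-zero query gives uniform weights and returns the mean of the values.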
