Commit 5b0d27d

- Publish draft 0 of Stable diffusion
1 parent 9ff01fc commit 5b0d27d

File tree

51 files changed

+701
-1
lines changed

Diff for: _posts/2022-09-06-generalist-agent.md

+256
@@ -0,0 +1,256 @@
1+
---
2+
layout: prediction_post
3+
published: False
4+
title: The Illustrated Generalist Agent (Gato)
5+
---
6+
7+
8+
Could you train a single machine learning model on hundreds of tasks spanning text, computer vision, video games, and robot control? In this post and video, we go over DeepMind’s Gato, which does this with a model that is simpler and smaller than you might think. It’s a GPT-like model that learns over 600 tasks. It opens the door to World Scope 4, as discussed in the Experience Grounds Language video.
9+
10+
11+
<div class="img-div" markdown="0">
12+
<img src="/images/gato/.png" />
13+
<br />
14+
15+
</div>
16+
17+
18+
<div class="img-div" markdown="0">
19+
<img src="/images/gato/gato-paper-figure-1.png" />
20+
<br />
21+
Figure 1 from the paper
22+
</div>
23+
24+
25+
<div class="img-div" markdown="0">
26+
<img src="/images/gato/gato-paper-figure-2.png" />
27+
<br />
28+
Figure 2 from the paper
29+
</div>
30+
31+
32+
# Modalities map
33+
34+
35+
<div class="img-div" markdown="0">
36+
<img src="/images/gato/GPT-modalities.png" />
37+
<br />
38+
GPT
39+
</div>
40+
41+
42+
43+
<div class="img-div" markdown="0">
44+
<img src="/images/gato/bert - modalities.png" />
45+
<br />
46+
BERT
47+
</div>
48+
49+
50+
<div class="img-div" markdown="0">
51+
<img src="/images/gato/GAN - modalities.png" />
52+
<br />
53+
GAN
54+
</div>
55+
56+
<div class="img-div" markdown="0">
57+
<img src="/images/gato/clip modalities.png" />
58+
<br />
59+
CLIP
60+
</div>
61+
62+
<div class="img-div" markdown="0">
63+
<img src="/images/gato/Dalle stable diffusion image gen modalities.png" />
64+
<br />
65+
DALL-E / Stable Diffusion
66+
</div>
67+
68+
69+
<div class="img-div" markdown="0">
70+
<img src="/images/gato/gato modalities.png" />
71+
<br />
72+
Gato
73+
</div>
74+
75+
76+
<div class="img-div" markdown="0">
77+
<img src="/images/gato/gato-modalities-sequences.png" />
78+
<br />
79+
Gato sequences
80+
</div>
81+
82+
83+
84+
85+
<div class="img-div" markdown="0">
86+
<img src="/images/gato/gato-modalities-sequences.png" />
87+
<br />
88+
Figure 3 from the paper
89+
</div>
90+
91+
92+
93+
94+
<div class="img-div" markdown="0">
95+
<img src="/images/gato/table-1-datasets-gato.png" />
96+
<br />
97+
Table 1 from the paper - datasets
98+
</div>
99+
100+
101+
<div class="img-div" markdown="0">
102+
<img src="/images/gato/gato-paper-figure-4.png" />
103+
<br />
104+
Figure 4 from the paper
105+
</div>
106+
107+
108+
109+
## Performance and results
110+
111+
<div class="img-div" markdown="0">
112+
<img src="/images/gato/gato-paper-figure-5.png" />
113+
<br />
114+
Figure 5 from the paper
115+
</div>
116+
117+
118+
<div class="img-div" markdown="0">
119+
<img src="/images/gato/figure-5-explainer-1-at-0.png" />
120+
<br />
121+
Figure 5 from the paper at 0
122+
</div>
123+
124+
125+
126+
<div class="img-div" markdown="0">
127+
<img src="/images/gato/figure-5-explainer-2-at-50.png" />
128+
<br />
129+
Figure 5 from the paper at 50
130+
</div>
131+
132+
<div class="img-div" markdown="0">
133+
<img src="/images/gato/figure-5-explainer-3-at-100.png" />
134+
<br />
135+
Figure 5 from the paper at 100
136+
</div>
137+
138+
<div class="img-div" markdown="0">
139+
<img src="/images/gato/experts-vs-gato-scores.png" />
140+
<br />
141+
GATO vs. Experts scoring
142+
</div>
143+
144+
145+
## Tokenization
146+
147+
148+
<div class="img-div" markdown="0">
149+
<img src="/images/gato/text-tokens.png" />
150+
<br />
151+
Text tokenization
152+
</div>
153+
154+
155+
<div class="img-div" markdown="0">
156+
<img src="/images/gato/image-tokens.png" />
157+
<br />
158+
Image tokenization
159+
</div>
160+
161+
162+
163+
<div class="img-div" markdown="0">
164+
<img src="/images/gato/text-plus-images.png" />
165+
<br />
166+
Text + Image tokenization
167+
</div>
168+
169+
170+
171+
<div class="img-div" markdown="0">
172+
<img src="/images/gato/image-captioning.png" />
173+
<br />
174+
Text + Image tokenization - image captioning
175+
</div>
176+
177+
178+
## Discrete values
179+
180+
<div class="img-div" markdown="0">
181+
<img src="/images/gato/text-images-discrete-inputs.png" />
182+
<br />
183+
Text + Image tokenization - discrete inputs
184+
</div>
185+
186+
<div class="img-div" markdown="0">
187+
<img src="/images/gato/discrete-actions.png" />
188+
<br />
189+
190+
</div>
191+
192+
193+
<div class="img-div" markdown="0">
194+
<img src="/images/gato/actions-embeddings.png" />
195+
<br />
196+
197+
</div>
198+
199+
200+
201+
<div class="img-div" markdown="0">
202+
<img src="/images/gato/hadoken-sequence.png" />
203+
<br />
204+
205+
</div>
206+
207+
208+
209+
210+
# Timesteps & episodes
211+
212+
<div class="img-div" markdown="0">
213+
<img src="/images/gato/atari-image-action.png" />
214+
<br />
215+
Image + controller
216+
</div>
217+
218+
219+
<div class="img-div" markdown="0">
220+
<img src="/images/gato/image-action-timesteps.png" />
221+
<br />
222+
Image + controller
223+
</div>
224+
225+
226+
227+
<div class="img-div" markdown="0">
228+
<img src="/images/gato/.png" />
229+
<br />
230+
Image + controller vector sequence
231+
</div>
232+
233+
234+
## Continuous values
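
A minimal sketch of how continuous values (for example, joint torques) might be turned into tokens. It assumes the scheme described in the Gato paper: mu-law encode each value, clip to [-1, 1], discretize into 1024 uniform bins, and shift the bin index past a 32,000-entry text vocabulary. The constants and the function names `mu_law_encode` and `tokenize_continuous` are illustrative assumptions, not code from the paper.

```python
import numpy as np

# Assumed constants, taken from the description in the Gato paper.
MU = 100.0               # mu-law compression parameter
M = 256.0                # mu-law scaling parameter
NUM_BINS = 1024          # number of uniform bins over [-1, 1]
TEXT_VOCAB_SIZE = 32000  # continuous tokens are shifted past the text vocabulary


def mu_law_encode(x):
    """Compress values so that large magnitudes are squashed logarithmically."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.log(np.abs(x) * MU + 1.0) / np.log(M * MU + 1.0)


def tokenize_continuous(x):
    """Map continuous values to integer token ids in [32000, 33024)."""
    encoded = np.clip(mu_law_encode(x), -1.0, 1.0)
    # Rescale [-1, 1] to [0, NUM_BINS] and take the bin index.
    bins = np.floor((encoded + 1.0) / 2.0 * NUM_BINS).astype(np.int64)
    # The value 1.0 would land in bin 1024, so clamp it into the last bin.
    bins = np.clip(bins, 0, NUM_BINS - 1)
    return bins + TEXT_VOCAB_SIZE
```

Calling `tokenize_continuous([-0.3, 0.0, 2.5])` returns one integer id per input value, so a vector of continuous observations or actions becomes a short run of tokens that can sit in the same sequence as text and image tokens.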
235+
236+
237+
238+
## Native and non-native modalities
239+
240+
241+
[Translating ]
242+
243+
<div class="img-div" markdown="0">
244+
<img src="/images/gato/.png" />
245+
<br />
246+
Expert sequences
247+
</div>
248+
249+
250+
251+
252+
<div class="img-div" markdown="0">
253+
<img src="/images/gato/.png" />
254+
<br />
255+
256+
</div>
