Pretrained transformers <a href="https://w4ngatang.github.io/static/papers/superglue.pdf" target="_blank" rel="noopener noreferrer">dominate</a> most NLP tasks;</li>
<li style="margin-top: 12px;">
Bigger CNNs <a href="https://arxiv.org/abs/1912.11370" target="_blank" rel="noopener noreferrer">perform better</a> at computer vision;</li>
<li style="margin-top: 12px;">
GPT-3 has <a href="https://arxiv.org/abs/2005.14165" target="_blank" rel="noopener noreferrer">175B</a> parameters and <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2006.16668">still counting!</a></li>
</ul>
With transfer learning, these large models can harness nearly unlimited raw data to improve performance on <a href="https://paperswithcode.com/task/language-modelling" target="_blank" rel="noopener noreferrer">academic benchmarks</a> and to solve <a href="https://medium.com/towards-artificial-intelligence/crazy-gpt-3-use-cases-232c22142044" target="_blank" rel="noopener noreferrer">new, unexpected</a> tasks.
That said, training large neural networks isn't cheap. The hardware used for the <a href="https://arxiv.org/abs/1909.08053" target="_blank" rel="noopener noreferrer">previous largest</a> language model cost over $25 million. A single training run for GPT-3 will set you back <a href="https://lambdalabs.com/blog/demystifying-gpt-3/" target="_blank" rel="noopener noreferrer">at least $4.6M</a> in cloud GPUs. As a result, researchers can't contribute to state-of-the-art deep learning models, and practitioners can't build applications without <a href="https://blogs.microsoft.com/ai/openai-azure-supercomputer" target="_blank" rel="noopener noreferrer">being supported</a> by a megacorporation. If we want the future of AI to be open, it can't be privately owned.
<a class="github-button" href="https://github.com/learning-at-home/hivemind" data-size="large" data-show-count="true" aria-label="Star learning-at-home/hivemind on GitHub">Browse the code</a>