Commit 38b809c ("Applying LLMs"), 1 parent dd25c44

4 files changed: +113 −1 lines changed

_posts/2016-12-14-visual-interactive-guide-basics-neural-networks.md (+1 −1)

@@ -6,7 +6,7 @@ title: A Visual and Interactive Guide to the Basics of Neural Networks
 <span class="discussion">Discussions:
 <a href="https://news.ycombinator.com/item?id=13183171" class="hn-link">Hacker News (63 points, 8 comments)</a>, <a href="https://www.reddit.com/r/programming/comments/5igdix/a_visual_and_interactive_guide_to_the_basics_of/" class="">Reddit r/programming (312 points, 37 comments)</a></span>
 <br />
-<span class="discussion">Translations: <a href="https://rr0.org/people/a/AlammarJay/visual-interactive-guide-basics-neural-networks/index_fr.html">French</a>, <a href="https://camporeale.github.io/guia-interactiva-visual-conceptos-basicos-redes-neuronales/">Spanish</a>
+<span class="discussion">Translations: <a href="https://ai-ds.thakaa.sa/post/dlyl-mry-y-wtfaaly-l-ssyt-lshbkt-laasbwny">Arabic</a>, <a href="https://rr0.org/people/a/AlammarJay/visual-interactive-guide-basics-neural-networks/index_fr.html">French</a>, <a href="https://camporeale.github.io/guia-interactiva-visual-conceptos-basicos-redes-neuronales/">Spanish</a>
 </span>

New file (+103 lines):

@@ -0,0 +1,103 @@
---
layout: prediction_post
published: True
title: Applying massive language models in the real world with Cohere
---
A little less than a year ago, I joined the awesome <a href="https://cohere.ai">Cohere</a> team. The company trains massive language models (both GPT-like and BERT-like) and offers them as an API (which also supports finetuning). Its founders include Google Brain alumni, among them co-authors of the original Transformers paper. It's a fascinating role where I get to help companies and developers put these massive models to work solving real-world problems.

I love that I get to share some of the intuitions developers need to start problem-solving with these models. Even though I've worked closely with pretrained Transformers for the past several years (for this blog and in developing <a href="https://github.com/jalammar/ecco">Ecco</a>), I'm enjoying the convenience of problem-solving with managed language models, as it frees me from the constraints of model loading/deployment and memory/GPU management.

These are some of the articles I wrote, and collaborated with colleagues on, over the last few months:
### <a href="https://docs.cohere.ai/intro-to-llms/">Intro to Large Language Models with Cohere</a>
<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/intro-to-llms/"><img src="https://docs.cohere.ai/img/intro-llms/text-to-text-or-embedding-language-model.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>This is a high-level intro to large language models for people who are new to them. It establishes the difference between generative (GPT-like) and representation (BERT-like) models and gives example use cases for each.</p>
<p>This was one of the first articles I got to write. It's extracted from a much larger document I wrote to explore some of the visual language for explaining how these models are applied.</p>
</div>
</div>
### <a href="https://docs.cohere.ai/prompt-engineering-wiki/">A visual guide to prompt engineering</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/prompt-engineering-wiki/"><img src="/images/cohere/language-model-input-prompt.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>Massive GPT models open the door to a new way of programming. If you structure the input text in the right way, you can get useful (and often fascinating) results for a lot of tasks (e.g., text classification, copywriting, summarization, etc.).
</p>
<p>This article visually demonstrates four principles for creating prompts effectively.</p>
</div>
</div>
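As a rough sketch of the idea (the prompt text and helper below are my own illustration, not from the article), a prompt often combines a task description, a few examples, and the input to complete, and the model continues the pattern:

```python
# Illustrative only: assemble a few-shot classification prompt.
# The task description, examples, and query here are made up for the demo.
def build_prompt(task_description, examples, query):
    lines = [task_description, ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")  # the model is asked to fill in what comes next
    return "\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each movie review as Positive or Negative.",
    [("I loved this film!", "Positive"), ("A total waste of time.", "Negative")],
    "The acting was superb.",
)
```

Ending the prompt mid-pattern (right after `Label:`) nudges the model to emit only the label rather than free-form text.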
### <a href="https://docs.cohere.ai/text-summarization-example/">Text Summarization</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/text-summarization-example/"><img src="https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/summarization.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>This is a walkthrough of creating a simple summarization system. It links to a Jupyter notebook with the code to start experimenting with text generation and summarization.</p>
<p>The end of the notebook shows an important idea I want to spend more time on in the future: how to rank/filter/select the best from among multiple generations.</p>
</div>
</div>
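To give a flavor of that rank/filter/select idea, here is a toy sketch (my own, not the notebook's approach): score each candidate summary by its word overlap with the source text and keep the best one. Real systems would use stronger signals, such as embedding similarity or a scoring model.

```python
# Toy ranking of multiple generated summaries by word overlap with the source.
# Illustrative only; the texts below are made up.
def rank_generations(source, candidates):
    source_words = set(source.lower().split())

    def overlap(candidate):
        words = set(candidate.lower().split())
        return len(words & source_words) / max(len(words), 1)

    return sorted(candidates, key=overlap, reverse=True)

source = "The quarterly report shows revenue grew while costs fell."
candidates = [
    "Bananas are yellow.",
    "Revenue grew and costs fell this quarter.",
]
best = rank_generations(source, candidates)[0]
```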
### <a href="https://docs.cohere.ai/semantic-search/">Semantic Search</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/semantic-search/"><img src="https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/basic-semantic-search-overview.png?3" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>Semantic search has to be one of the most exciting applications of sentence embedding models. This tutorial implements a "similar questions" functionality using sentence embeddings and a vector search library.</p>
<p>The vector search library used here is <a href="https://github.com/spotify/annoy">Annoy</a> from Spotify. There are a bunch of others out there; <a href="https://github.com/facebookresearch/faiss">Faiss</a> is widely used, and I experiment with <a href="https://github.com/lmcinnes/pynndescent">PyNNDescent</a> as well.</p>
</div>
</div>
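Conceptually, libraries like Annoy approximate the brute-force lookup sketched below: embed the query, then return the stored vectors with the highest cosine similarity. This pure-Python version (with toy 2-D vectors standing in for real sentence embeddings) is just to show the shape of the computation.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest(query, embeddings, k=3):
    # Indices of the k stored embeddings most similar to the query.
    ranked = sorted(range(len(embeddings)),
                    key=lambda i: cosine(query, embeddings[i]),
                    reverse=True)
    return ranked[:k]

# Toy "question embeddings" -- real ones come from an embedding model/API.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
result = nearest([1.0, 0.05], embeddings, k=2)
```

Approximate-nearest-neighbor libraries trade a little accuracy for dramatically better scaling than this O(n) scan.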
### <a href="https://docs.cohere.ai/finetuning-representation-models/">Finetuning Representation Models</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/finetuning-representation-models/"><img src="https://docs.cohere.ai/img/finetuning-rep/semantic-embed-labeled.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>Finetuning tends to produce the best results language models can achieve. This article explains the intuitions behind finetuning representation/sentence embedding models. I've added a couple more visuals in the <a href="https://twitter.com/JayAlammar/status/1490712428686024705">Twitter thread</a>.</p>
<p>The research in this area is very interesting. I've greatly enjoyed papers like <a href="https://arxiv.org/abs/1908.10084">Sentence-BERT</a> and <a href="https://arxiv.org/abs/2007.00808">Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval</a>.</p>
</div>
</div>
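One intuition behind this line of work (a generic sketch, not the article's or Cohere's training code) is the triplet/contrastive objective: an anchor embedding should land closer to a matching "positive" than to a "negative", by at least some margin.

```python
# A minimal triplet-loss sketch over plain Python lists standing in for
# embeddings. Illustrative only; real training uses a framework and batches.
def triplet_loss(anchor, positive, negative, margin=0.2):
    def dist(a, b):
        # Euclidean distance between two embedding vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    # Loss is zero once the positive is closer than the negative by `margin`.
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)
```

Minimizing this over many (anchor, positive, negative) triples is what pulls semantically similar sentences together in the embedding space.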
### <a href="https://docs.cohere.ai/token-picking/">Controlling Generation with top-k & top-p</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/token-picking/"><img src="https://docs.cohere.ai/img/token-picking/language-model-probability-distribution-output-tokens.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>This one is a little more technical. It explains the parameters you tweak to adjust a GPT's <i>decoding strategy</i> -- the method by which the system picks output tokens.
</p>
</div>
</div>
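The core of the two strategies can be sketched in a few lines (my own simplified version, operating on a token-to-probability dict rather than real model logits): top-k keeps the k most likely tokens, top-p keeps the smallest set whose cumulative probability reaches the threshold, and both renormalize before sampling.

```python
# Simplified top-k and top-p (nucleus) filtering over a next-token
# distribution. Illustrative only; real decoders work on logits.
def top_k(probs, k):
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}  # renormalized

def top_p(probs, p_threshold):
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= p_threshold:  # nucleus reached
            break
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

probs = {"the": 0.5, "a": 0.3, "banana": 0.15, "zebra": 0.05}
```

Sampling from the filtered, renormalized distribution is what trims off the long tail of unlikely (often nonsensical) tokens.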
### <a href="https://docs.cohere.ai/text-classification-embeddings/">Text Classification Using Embeddings</a>

<div class="row two-column-text">
<div class="col-md-6 col-xs-12">
<a href="https://docs.cohere.ai/text-classification-embeddings/"><img src="https://github.com/cohere-ai/notebooks/raw/main/notebooks/images/simple-classifier-embeddings.png" class="small-image"/></a>
</div>
<div class="col-md-6 col-xs-12">
<p>
This is a walkthrough of one of the most common use cases of embedding models -- text classification. It is similar to <a href="/a-visual-guide-to-using-bert-for-the-first-time/">A Visual Guide to Using BERT for the First Time</a>, but uses Cohere's API.
</p>
</div>
</div>
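The recipe is: embed each text, then train any simple classifier on the vectors. As a stand-in for the sklearn classifiers such walkthroughs typically use, here is a tiny nearest-centroid classifier over toy 2-D "embeddings" (the vectors and labels are invented for the demo):

```python
# Nearest-centroid classification on top of (toy) sentence embeddings.
def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(embedding, centroids):
    # Return the label whose class centroid is closest to the embedding.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(embedding, centroids[label]))

train = {
    "positive": [[0.9, 0.1], [0.8, 0.2]],
    "negative": [[0.1, 0.9], [0.2, 0.8]],
}
centroids = {label: centroid(vs) for label, vs in train.items()}
```

Because the embeddings already encode meaning, even a classifier this simple can work surprisingly well with few labeled examples.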
You can find these and upcoming articles in the <a href="https://docs.cohere.ai/">Cohere docs</a> and <a href="https://github.com/cohere-ai/notebooks">notebooks repo</a>. I have quite a number of experiments and interesting workflows I'd love to share in the coming weeks, so stay tuned!

style.scss (+9; large diff not rendered by default)
