Commit
1 parent f0589e2, commit 1c0a681. Showing 7 changed files with 214 additions and 9 deletions.
blog/claude-3s-exceptional-abilities-at-obscure-languages/content.md (17 additions, 0 deletions)
@@ -0,0 +1,17 @@
<post-date date="7 March 2024"/>

# Claude 3's exceptional abilities at obscure languages

Earlier this week, [Anthropic launched Claude 3](https://www.anthropic.com/news/claude-3-family), its next-generation family of LLMs. The models – Opus, Sonnet, and (soon-to-be-released) Haiku – have already made waves for their ability to trade blows with the previous state-of-the-art, GPT-4.

In my own testing, I've found Claude 3 to be quite capable and even worthy of hype to some extent. It's not a GPT-4 killer in a general sense, but, for example, [the Opus model matches GPT-4 in common programming tasks](/blog/testing-a-medley-of-local-llms-for-coding/).

## The meat

One standout aspect of the Claude 3 Opus model in particular is that it appears to be exceptionally good at reconstructing representations from uncommon data.

In a recent blog post, [I found that Opus is almost twice as good as GPT-4 and nearly five times as good as GPT-4 Turbo at generating code in an obscure variety of assembly language](/blog/llm-performance-in-retro-assembly-coding/). In theory, the difference could come from Anthropic scaling up its training data for obsolete assembly or related topics, but I lean toward the alternative explanation that Opus is more generally accurate at extrapolating from limited data.

User reports have also begun popping up of Opus being very capable at dealing with obscure human languages ([for example](https://www.reddit.com/r/singularity/comments/1b8603h/claude_3_opus_is_the_first_language_model_that/)). I can confirm that Opus is able to hold a conversation in a language that has very few native speakers and in which GPT-4 fails almost completely. The model does make mistakes, but its output is generally reasonable and understandable.

It's not clear whether this standout performance is an emergent ability or the result of a targeted effort by Anthropic to increase the representation of such languages in Claude, but to me it seems possible that the model is in fact an unusually strong extrapolator of uncommon representations. I assume we'll get a research paper or two on this at some point.
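
If you want to poke at this yourself, the simplest route is to send a prompt written in the target language to the Opus model and compare the reply with GPT-4's. The sketch below is only a minimal illustration, not a record of my actual test setup: it assumes the `anthropic` Python package and the `claude-3-opus-20240229` model ID, and the prompt itself is left as a placeholder since I'm not naming the language here.

```python
# Minimal sketch: send a prompt in a low-resource language to Claude 3 Opus.
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()

prompt = "..."  # placeholder: a question written in the low-resource language being tested

response = client.messages.create(
    model="claude-3-opus-20240229",  # Opus model ID at the time of writing
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)

# The reply arrives as a list of content blocks; print the text of the first one.
print(response.content[0].text)
```

Sending the same prompt to GPT-4 through OpenAI's API makes for a direct side-by-side comparison.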
blog/claude-3s-exceptional-abilities-at-obscure-languages/index.html (85 additions, 0 deletions)
@@ -0,0 +1,85 @@
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

<link rel="stylesheet" href="../+assets/blog.css">
<link rel="stylesheet" href="/assets/font-awesome-5-15-4/css/all.min.css">
<script defer src="/assets/font-awesome-5-15-4/attribution.js"></script>
<script defer src="../+assets/highlight.min.js"></script>
<script defer src="/dokki/distributable/dokki.js"></script>
<script type="module" src="../+assets/blog-post-widgets.js"></script>
<script type="module" src="../+assets/post-date.js"></script>

<style>
.dokki-table.results th.name {
writing-mode: vertical-lr;
font-weight: normal;
}
.dokki-table.results td:first-child {
white-space: pre-line;
min-width: 20rem;
}

.dokki-table.results td,
.dokki-table.results th {
width: 0px !important;
}
.dokki-table.results td.s0,
.dokki-table.results td.s1,
.dokki-table.results td.s2,
.dokki-table.results td.s3,
.dokki-table.results td.s {
text-align: center !important;
vertical-align: middle !important;
max-width: 0.85em;
}
.dokki-table.results td.s0 {
color: var(--dokkiCSS-page-inert-fg-color);
}
.dokki-table.results td.s1 {
background-color: rgba(0, 0, 0, 0.05);
}
.dokki-table.results td.s2 {
background-color: rgba(0, 0, 0, 0.1);
}
.dokki-table.results td.s3 {
background-color: rgba(0, 0, 0, 0.2);
}
</style>
</head>
<body>
<ths-feedback></ths-feedback>

<template id="dokki">
<dokki-document>
<dokki-header>
<template #caption>
Claude 3's exceptional abilities at obscure languages
</template>
<template #widgets>
<blog-post-widgets></blog-post-widgets>
</template>
</dokki-header>
<dokki-topics>

<post-date date="7 March 2024"></post-date>
<dokki-topic title="Claude 3's exceptional abilities at obscure languages">
<p>Earlier this week, <a href="https://www.anthropic.com/news/claude-3-family">Anthropic launched Claude 3</a>, its next-generation family of LLMs. The models – Opus, Sonnet, and (soon-to-be-released) Haiku – have already made waves for their ability to trade blows with the previous state-of-the-art, GPT-4.</p>
<p>In my own testing, I've found Claude 3 to be quite capable and even worthy of hype to some extent. It's not a GPT-4 killer in a general sense, but, for example, <a href="/blog/testing-a-medley-of-local-llms-for-coding/">the Opus model matches GPT-4 in common programming tasks</a>.</p>
<dokki-subtopic title="The meat">
<p>One standout aspect of the Claude 3 Opus model in particular is that it appears to be exceptionally good at reconstructing representations from uncommon data.</p>
<p>In a recent blog post, <a href="/blog/llm-performance-in-retro-assembly-coding/">I found that Opus is almost twice as good as GPT-4 and nearly five times as good as GPT-4 Turbo at generating code in an obscure variety of assembly language</a>. In theory, the difference could come from Anthropic scaling up its training data for obsolete assembly or related topics, but I lean toward the alternative explanation that Opus is more generally accurate at extrapolating from limited data.</p>
<p>User reports have also begun popping up of Opus being very capable at dealing with obscure human languages (<a href="https://www.reddit.com/r/singularity/comments/1b8603h/claude_3_opus_is_the_first_language_model_that/">for example</a>). I can confirm that Opus is able to hold a conversation in a language that has very few native speakers and in which GPT-4 fails almost completely. The model does make mistakes, but its output is generally reasonable and understandable.</p>
<p>It's not clear whether this standout performance is an emergent ability or the result of a targeted effort by Anthropic to increase the representation of such languages in Claude, but to me it seems possible that the model is in fact an unusually strong extrapolator of uncommon representations. I assume we'll get a research paper or two on this at some point.</p>
</dokki-subtopic></dokki-topic>

</dokki-topics>
</dokki-document>
</template>
</body>
</html>
blog/claude-3s-exceptional-abilities-at-obscure-languages/index.intermediate.html (66 additions, 0 deletions)
@@ -0,0 +1,66 @@
<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

<link rel="stylesheet" href="../+assets/blog.css">
<link rel="stylesheet" href="/assets/font-awesome-5-15-4/css/all.min.css">
<script defer src="/assets/font-awesome-5-15-4/attribution.js"></script>
<script defer src="../+assets/highlight.min.js"></script>
<script defer src="/dokki/distributable/dokki.js"></script>
<script type="module" src="../+assets/blog-post-widgets.js"></script>
<script type="module" src="../+assets/post-date.js"></script>

<style>
.dokki-table.results th.name {
writing-mode: vertical-lr;
font-weight: normal;
}
.dokki-table.results td:first-child {
white-space: pre-line;
min-width: 20rem;
}

.dokki-table.results td,
.dokki-table.results th {
width: 0px !important;
}
.dokki-table.results td.s0,
.dokki-table.results td.s1,
.dokki-table.results td.s2,
.dokki-table.results td.s3,
.dokki-table.results td.s {
text-align: center !important;
vertical-align: middle !important;
max-width: 0.85em;
}
.dokki-table.results td.s0 {
color: var(--dokkiCSS-page-inert-fg-color);
}
.dokki-table.results td.s1 {
background-color: rgba(0, 0, 0, 0.05);
}
.dokki-table.results td.s2 {
background-color: rgba(0, 0, 0, 0.1);
}
.dokki-table.results td.s3 {
background-color: rgba(0, 0, 0, 0.2);
}
</style>
</head>
<body>
<ths-feedback></ths-feedback>
<template dokki-document>
<section title>
Claude 3's exceptional abilities at obscure languages
</section>
<section widgets>
<blog-post-widgets/>
</section>
<section content>
<article src="content.md"></article>
</section>
</template>
</body>
</html>