Commit b35978a

Rewrite local model notes
1 parent 1a7a0b7 commit b35978a

1 file changed

+42
-35
lines changed

examples/Detoxify.ipynb

Lines changed: 42 additions & 35 deletions
@@ -189,10 +189,12 @@
  ]
  },
  {
- "metadata": {},
  "cell_type": "markdown",
- "source": "## Initializing TMaRCo",
- "id": "1eb7719e30054304"
+ "id": "1eb7719e30054304",
+ "metadata": {},
+ "source": [
+ "## Initializing TMaRCo"
+ ]
  },
  {
  "cell_type": "code",
@@ -205,43 +207,48 @@
  ]
  },
  {
- "metadata": {},
  "cell_type": "markdown",
+ "id": "3e16ee305f4983d9",
+ "metadata": {},
  "source": [
  "This will initialize `TMaRCo` using the default models, taken from HuggingFace.\n",
  "<div class=\"alert alert-info\">\n",
- "To use local models with TMaRCo, we need to have them in a local storage, accessible to TMaRCo, initialize separately, and pass them to TMaRCo's constructor.\n",
+ "To use local models with TMaRCo, we need to have the pre-initialized models in a local storage that is accessible to TMaRCo.\n",
  "</div>\n",
  "For instance, to use the default `facebook/bart-large` model, but locally. First, we would need to retrieve the model:"
- ],
- "id": "3e16ee305f4983d9"
+ ]
  },
  {
- "metadata": {},
  "cell_type": "code",
- "outputs": [],
  "execution_count": null,
+ "id": "614c9ff6f46a0ea9",
+ "metadata": {},
+ "outputs": [],
  "source": [
  "from huggingface_hub import snapshot_download\n",
  "\n",
  "snapshot_download(repo_id=\"facebook/bart-large\", local_dir=\"models/bart\")"
- ],
- "id": "614c9ff6f46a0ea9"
+ ]
  },
  {
- "metadata": {},
  "cell_type": "markdown",
- "source": "We now initialize the base model and tokenizer from local files and pass them to `TMaRCo`:",
- "id": "95bd792e757205d6"
+ "id": "95bd792e757205d6",
+ "metadata": {},
+ "source": [
+ "We now initialize the base model and tokenizer from local files and pass them to `TMaRCo`:"
+ ]
  },
  {
- "metadata": {},
  "cell_type": "code",
+ "execution_count": null,
+ "id": "f0f24485822a7c3f",
+ "metadata": {},
+ "outputs": [],
  "source": [
  "from transformers import BartForConditionalGeneration, BartTokenizer\n",
  "\n",
  "tokenizer = BartTokenizer.from_pretrained(\n",
- " \"models/bart\", # Or directory where the local model is stored \n",
+ " \"models/bart\", # Or directory where the local model is stored\n",
  " is_split_into_words=True, add_prefix_space=True\n",
  ")\n",
  "\n",
@@ -255,10 +262,7 @@
  "\n",
  "# Initialize TMaRCo with local models\n",
  "tmarco = TMaRCo(tokenizer=tokenizer, base_model=base)"
- ],
- "id": "f0f24485822a7c3f",
- "outputs": [],
- "execution_count": null
+ ]
  },
  {
  "cell_type": "code",
@@ -286,29 +290,30 @@
  ]
  },
  {
- "metadata": {},
  "cell_type": "markdown",
+ "id": "c113208c527c342e",
+ "metadata": {},
  "source": [
  "<div class=\"alert alert-info\">\n",
- "To use local expert/anti-expert models with TMaRCo, we need to have them in a local storage, accessible to TMaRCo, as previously.\n",
+ "To use local expert/anti-expert models with TMaRCo, we need to have them in a local storage that is accessible to TMaRCo, as previously.\n",
+ "\n",
  "However, we don't need to initialize them separately, and can pass the directory directly.\n",
  "</div>\n",
  "If we want to use local models with `TMaRCo` (in this case the same default `gminus`/`gplus`):\n"
- ],
- "id": "c113208c527c342e"
+ ]
  },
  {
- "metadata": {},
  "cell_type": "code",
- "outputs": [],
  "execution_count": null,
+ "id": "dfa288dcb60102c",
+ "metadata": {},
+ "outputs": [],
  "source": [
  "snapshot_download(repo_id=\"trustyai/gminus\", local_dir=\"models/gminus\")\n",
  "snapshot_download(repo_id=\"trustyai/gplus\", local_dir=\"models/gplus\")\n",
  "\n",
  "tmarco.load_models([\"models/gminus\", \"models/gplus\"])"
- ],
- "id": "dfa288dcb60102c"
+ ]
  },
  {
  "cell_type": "code",
@@ -450,21 +455,23 @@
  ]
  },
  {
- "metadata": {},
  "cell_type": "markdown",
- "source": "As noted previously, to use local models, simply pass the initialized tokenizer and base model to the constructor, and the local path as the expert/anti-expert:",
- "id": "b0738c324227f57"
+ "id": "b0738c324227f57",
+ "metadata": {},
+ "source": [
+ "As noted previously, to use local models, simply pass the initialized tokenizer and base model to the constructor, and the local path as the expert/anti-expert:"
+ ]
  },
  {
- "metadata": {},
  "cell_type": "code",
- "outputs": [],
  "execution_count": null,
+ "id": "b929e21a97ea914e",
+ "metadata": {},
+ "outputs": [],
  "source": [
  "tmarco = TMaRCo(tokenizer=tokenizer, base_model=base)\n",
  "tmarco.load_models([\"models/gminus\", \"models/gplus\"])"
- ],
- "id": "b929e21a97ea914e"
+ ]
  },
  {
  "cell_type": "markdown",
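
Apart from the two reworded notes and the trimmed trailing space, the changes in this diff look mechanical: cell keys end up in alphabetical order (`cell_type`, `execution_count`, `id`, `metadata`, `outputs`, `source`) and single-string `source` values become lists of lines. This is the shape Jupyter tooling usually writes on save. The following is a minimal standard-library sketch of that normalization for one cell. The function name `normalize_cell` is hypothetical, and this is an illustration only, not necessarily how the author produced the commit.

```python
import json


def normalize_cell(cell: dict) -> dict:
    """Return a copy of a notebook cell with `source` as a list of lines
    and top-level keys in alphabetical order (hypothetical helper)."""
    out = dict(cell)
    src = out.get("source", [])
    if isinstance(src, str):
        # keepends=True preserves each trailing "\n", matching the list form in the diff
        out["source"] = src.splitlines(keepends=True)
    # dicts preserve insertion order, so rebuilding from sorted items fixes key order
    return dict(sorted(out.items()))


# The "before" shape of the first markdown cell touched by this commit
cell = {
    "metadata": {},
    "cell_type": "markdown",
    "source": "## Initializing TMaRCo",
    "id": "1eb7719e30054304",
}

normalized = normalize_cell(cell)
print(json.dumps(normalized, indent=1))
```

Applied to the old form of the `## Initializing TMaRCo` cell, this should reproduce the key order and list-valued `source` shown on the `+` side of the first hunk.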
