|
248 | 248 | "metadata": {},
|
249 | 249 | "source": [
|
250 | 250 | "The above step will create a TileDB array in your working directory. For information about the structure of a dense\n",
|
251 |     | - "TileDB array in terms of files on disk please take a look [here](https://docs.tiledb.com/main/basic-concepts/data-format).\n",
    | 251 | + "TileDB array in terms of files on disk please take a look [here](https://docs.tiledb.com/main/concepts/data-format).\n",
252 | 252 | "Let's open our TileDB array model and check metadata. Metadata that are of type list, dict or tuple have been JSON\n",
|
253 | 253 | "serialized while saving, i.e., we need json.loads to deserialize them."
|
254 | 254 | ]
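
The cell in this hunk says that metadata of type list, dict or tuple are JSON serialized on save and need `json.loads` on read. A minimal sketch of that round trip, using only the standard library: the `raw_meta` dictionary and its keys are hypothetical stand-ins for an array's metadata, not the notebook's actual values.

```python
import json

# Hypothetical metadata as it might look after saving: scalars are stored
# as-is, while containers (list, dict, tuple) are stored as JSON strings.
raw_meta = {
    "epochs": 5,
    "input_shape": json.dumps((1, 28, 28)),
    "optimizer_config": json.dumps({"lr": 0.01, "momentum": 0.9}),
}


def deserialize(meta):
    """Decode every string value that parses as JSON; keep the rest unchanged."""
    decoded = {}
    for key, value in meta.items():
        if isinstance(value, str):
            try:
                decoded[key] = json.loads(value)
            except json.JSONDecodeError:
                # A plain, non-JSON string stays a string.
                decoded[key] = value
        else:
            decoded[key] = value
    return decoded


decoded = deserialize(raw_meta)
print(decoded["input_shape"])       # a list: JSON has no tuple type
print(decoded["optimizer_config"])
```

Note that a tuple comes back as a list after the round trip, so code that relies on tuple semantics has to convert it back explicitly.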
|
|
315 | 315 | "metadata": {},
|
316 | 316 | "source": [
|
317 | 317 | "For the case of PyTorch models, internally, we save model's state_dict and optimizer's state_dict,\n",
|
318 |     | - "as [variable sized attributes)](https://docs.tiledb.com/main/solutions/tiledb-embedded/api-usage/writing-arrays/var-length-attributes)\n",
    | 318 | + "as [variable sized attributes)](https://docs.tiledb.com/main/how-to/arrays/writing-arrays/var-length-attributes)\n",
319 | 319 | "(pickled), i.e., we can open the TileDB and get only the state_dict of the model or optimizer,\n",
|
320 | 320 | "without bringing the whole model in memory. For example, we can load model's and optimizer's state_dict\n",
|
321 | 321 | "for model tiledb-pytorch-mnist-1 as follows."
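
The idea in this hunk, that each `state_dict` is pickled into its own attribute so one can be read without the other, can be sketched without TileDB or PyTorch. Everything below is an assumption made for illustration: a plain dictionary stands in for the array, and the attribute names and state contents are hypothetical.

```python
import pickle

# Hypothetical stand-ins for a model's and an optimizer's state_dict.
model_state = {"fc.weight": [[0.1, 0.2], [0.3, 0.4]], "fc.bias": [0.0, 0.0]}
optimizer_state = {"state": {}, "param_groups": [{"lr": 0.01}]}

# Each state is pickled to a variable-length byte string and kept under its
# own name, mimicking one variable-sized attribute per state_dict.
stored = {
    "model_state_dict": pickle.dumps(model_state),
    "optimizer_state_dict": pickle.dumps(optimizer_state),
}


def load_attribute(store, name):
    """Unpickle a single attribute, leaving the others untouched."""
    return pickle.loads(store[name])


# Only the optimizer state is deserialized here; the model bytes stay as-is.
restored = load_attribute(stored, "optimizer_state_dict")
print(restored["param_groups"][0]["lr"])
```

As with any pickle-based format, only unpickle data from a source you trust.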
|
|
382 | 382 | "metadata": {},
|
383 | 383 | "source": [
|
384 | 384 | "What is really nice with saving models as TileDB array, is native versioning based on fragments as described\n",
|
385 |     | - "[here](https://docs.tiledb.com/main/basic-concepts/data-format#immutable-fragments). We can load a model, retrain it\n",
    | 385 | + "[here](https://docs.tiledb.com/main/concepts/data-format#immutable-fragments). We can load a model, retrain it\n",
386 | 386 | "with new data and update the already existing TileDB model array with new model parameters and metadata. All information, old\n",
|
387 | 387 | "and new will be there and accessible. This is extremely useful when you retrain with new data or trying different architectures for the same\n",
|
388 | 388 | "problem, and you want to keep track of all your experiments without having to store different model instances. In our case,\n",
|
|
452 | 452 | "metadata": {},
|
453 | 453 | "source": [
|
454 | 454 | "Finally, a very interesting and useful, for machine learning models, TileDB feature that is described\n",
|
455 |     | - "[here](https://docs.tiledb.com/main/basic-concepts/data-format#groups) and [here](https://docs.tiledb.com/main/solutions/tiledb-embedded/api-usage/object-management#creating-tiledb-groups)\n",
    | 455 | + "[here](https://docs.tiledb.com/main/concepts/data-format#groups) and [here](https://docs.tiledb.com/main/how-to/object-management#creating-tiledb-groups)\n",
456 | 456 | "are groups. Assuming we want to solve the MNIST problem, and we want to try several architectures. We can save each architecture\n",
|
457 | 457 | "as a separate TileDB array with native versioning each time it is re-trained, and then organise all models that solve the same problem (MNIST)\n",
|
458 | 458 | "as a TileDB array group with any kind of hierarchy. Let's firstly define a new model architecture."
|
|