Origin/jonfritz patch 4 docs (#673)
* Spelling mistakes

* API doc updates

* Update index.rst

Update API in ToC

* Update index.rst

Update image typo

* Update index.rst

Update link typo

* Update APIs.rst

Update name

* Update aryn-sdk.rst

Update link
jonfritz authored Aug 13, 2024
1 parent b28d1a2 commit 91cd341
Showing 43 changed files with 190 additions and 62 deletions.
15 changes: 0 additions & 15 deletions docs/source/APIs/data_preparation.rst

This file was deleted.

32 changes: 0 additions & 32 deletions docs/source/APIs/transforms.rst

This file was deleted.

10 changes: 10 additions & 0 deletions docs/source/aryn_cloud/APIs.rst
@@ -0,0 +1,10 @@
Aryn Partitioning Service APIs
==============================

This is the API reference for the Aryn-SDK, which is used to interact with the Aryn Partitioning Service.

.. toctree::
   :maxdepth: 1

   ./APIs/aryn-sdk.rst

@@ -6,4 +6,4 @@ Aryn SDK
.. toctree::
   :maxdepth: 2

-   /APIs/aryn-sdk/partition.rst
+   ./aryn-sdk/partition.rst
2 files renamed without changes.
17 changes: 4 additions & 13 deletions docs/source/index.rst
@@ -8,7 +8,7 @@ Sycamore is a document processing engine covered under the Apache v2.0 license.

Sycamore uses LLM-powered transforms, and you can choose the model to leverage. It can handle complex documents with embedded tables, figures, graphs, and other infographics. For ETL use cases, Sycamore reliably generates vector embeddings with the model of your choice, and loads vector databases and search engines like Pinecone, OpenSearch, Weaviate, Elasticsearch, and more.

-.. image:: images/ArynArchitecture_APS%2BSycamorev2.png
+.. image:: images/ArynArchitecture_APS+Sycamorev2.png

**Key Features**

@@ -57,7 +57,7 @@ Next, you can:
..
-   You can specify additional options (e.g. table extraction), and a list of these options is :doc:`here </aryn_cloud/aryn_partitioning_service.html#specifying-options>`_
+   You can specify additional options (e.g. table extraction), and a list of these options is :doc: `here </aryn_cloud/aryn_partitioning_service.html#specifying-options>`

|
@@ -91,6 +91,7 @@ More Resources
   :hidden:

   /aryn_cloud/aryn_partitioning_service.rst
+   /aryn_cloud/APIs.rst

.. toctree::
   :caption: Sycamore
@@ -103,14 +104,4 @@ More Resources
   /sycamore/transforms.rst
   /sycamore/connectors.rst
   /sycamore/tutorials.rst
-
-
-.. toctree::
-   :caption: APIs
-   :maxdepth: 2
-   :hidden:
-
-   /APIs/data_preparation.rst
-   /APIs/conversation_memory.rst
-   /APIs/transforms.rst
-   /APIs/aryn-sdk.rst
+   /sycamore/APIs.rst
17 changes: 17 additions & 0 deletions docs/source/sycamore/APIs.rst
@@ -0,0 +1,17 @@
Sycamore APIs
=============

This is the API reference for Sycamore, and it contains the functions you can use when writing Sycamore scripts to process data. If you are interested in contributing new transforms to the Sycamore project, please visit the Low-Level Transforms section in the API docs.

.. toctree::
   :maxdepth: 1

   ./APIs/config.rst
   ./APIs/context.rst
   ./APIs/docset.rst
   ./APIs/docsetreader.rst
   ./APIs/docsetwriter.rst
   ./APIs/document.rst
   ./APIs/functions.rst
   ./APIs/node.rst
   ./APIs/low_level_transforms.rst
7 files renamed without changes.
63 changes: 63 additions & 0 deletions docs/source/sycamore/APIs/gen
@@ -0,0 +1,63 @@
#!/usr/bin/python3

"""
Auto-generate RST files from Python source.
Usage: ./gen
"""

import os
import sys
import ast


srcRoot = "../../../lib/sycamore/sycamore"
docRoot = "."


def shouldEmit(node):
    if not isinstance(node, ast.ClassDef):
        return False
    if ast.get_docstring(node):
        return True
    for base in node.bases:
        # Dotted bases such as abc.ABC are ast.Attribute nodes with no .id
        if getattr(base, "id", None) == "ABC":
            return False  # skip abstract base classes
    return True


def doFile(name, dir, ent):
    with open(f"{dir}/{ent}") as fp:
        top = ast.parse(fp.read())

    ary = []
    base = ent[:-3]
    for node in top.body:  # iterate module-level nodes only
        if shouldEmit(node):
            ary.append(f"sycamore.{name}.{base}.{node.name}")

    if ary:
        with open(f"{docRoot}/{name}/{base}.rst", "w") as fp:
            title = base.replace("_", " ").title()
            line = "=" * len(title)
            fp.write(f"{title}\n{line}\n\n")
            for sym in sorted(ary):
                fp.write(f".. autoclass:: {sym}\n :members:\n :show-inheritance:\n")
        print(f" /APIs/{name}/{base}.rst")


def doDir(name):
    dir = f"{srcRoot}/{name}"
    for ent in sorted(os.listdir(dir)):
        if not ent.endswith(".py"):
            continue
        doFile(name, dir, ent)


def main():
    doDir("transforms")
    return 0


if __name__ == "__main__":
    sys.exit(main())
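
Note on the class filter in the script above: it can be exercised without touching the source tree. The following is a minimal sketch, not part of the commit. It runs the same docstring and base-class checks against an in-memory sample module. The names SAMPLE, should_emit, and emitted_classes are hypothetical, and treating a dotted abc.ABC base as abstract is an assumption about the intended behavior.

```python
import ast

# Hypothetical sample module used only for this illustration.
SAMPLE = '''
from abc import ABC
import abc

class Documented:
    """Has a docstring."""

class Plain:
    pass

class AbstractThing(ABC):
    pass

class DottedAbstract(abc.ABC):
    pass

def helper():
    pass
'''


def should_emit(node):
    # Only class definitions are candidates for documentation.
    if not isinstance(node, ast.ClassDef):
        return False
    # A docstring qualifies the class, even if it is abstract.
    if ast.get_docstring(node):
        return True
    for base in node.bases:
        # ast.Name exposes .id; ast.Attribute (abc.ABC) exposes .attr.
        # Handling .attr is an assumption beyond the original script.
        name = getattr(base, "id", None) or getattr(base, "attr", None)
        if name == "ABC":
            return False
    return True


def emitted_classes(source):
    # Walk module-level nodes only, as the generator script does.
    tree = ast.parse(source)
    return sorted(n.name for n in tree.body if should_emit(n))


print(emitted_classes(SAMPLE))  # ['Documented', 'Plain']
```

Because the docstring check runs first, a class that derives from ABC is still emitted when it carries a docstring; only undocumented abstract classes are skipped.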
31 changes: 31 additions & 0 deletions docs/source/sycamore/APIs/low_level_transforms.rst
@@ -0,0 +1,31 @@
.. _Ref-low_level_Transforms:

Low-Level Transforms (for Sycamore development)
===============================================

.. note::
   Users of Sycamore won't need to interact with these classes and should instead use the classes in the top-level API docs. These transform classes are primarily of interest to developers looking to extend Sycamore or contribute to the project.

.. toctree::
   :maxdepth: 2

   ./low_level_transforms/augment_text.rst
   ./low_level_transforms/basics.rst
   ./low_level_transforms/bbox_merge.rst
   ./low_level_transforms/embed.rst
   ./low_level_transforms/explode.rst
   ./low_level_transforms/extract_entity.rst
   ./low_level_transforms/extract_schema.rst
   ./low_level_transforms/extract_table.rst
   ./low_level_transforms/map.rst
   ./low_level_transforms/mark_misc.rst
   ./low_level_transforms/merge_elements.rst
   ./low_level_transforms/partition.rst
   ./low_level_transforms/query.rst
   ./low_level_transforms/random_sample.rst
   ./low_level_transforms/regex_replace.rst
   ./low_level_transforms/sketcher.rst
   ./low_level_transforms/split_elements.rst
   ./low_level_transforms/spread_properties.rst
   ./low_level_transforms/summarize.rst
   ./low_level_transforms/summarize_images.rst
12 files renamed without changes.
@@ -0,0 +1,63 @@
#!/usr/bin/python3

"""
Auto-generate RST files from Python source.
Usage: ./gen
"""

import os
import sys
import ast


srcRoot = "../../../lib/sycamore/sycamore"
docRoot = "."


def shouldEmit(node):
    if not isinstance(node, ast.ClassDef):
        return False
    if ast.get_docstring(node):
        return True
    for base in node.bases:
        # Dotted bases such as abc.ABC are ast.Attribute nodes with no .id
        if getattr(base, "id", None) == "ABC":
            return False  # skip abstract base classes
    return True


def doFile(name, dir, ent):
    with open(f"{dir}/{ent}") as fp:
        top = ast.parse(fp.read())

    ary = []
    base = ent[:-3]
    for node in top.body:  # iterate module-level nodes only
        if shouldEmit(node):
            ary.append(f"sycamore.{name}.{base}.{node.name}")

    if ary:
        with open(f"{docRoot}/{name}/{base}.rst", "w") as fp:
            title = base.replace("_", " ").title()
            line = "=" * len(title)
            fp.write(f"{title}\n{line}\n\n")
            for sym in sorted(ary):
                fp.write(f".. autoclass:: {sym}\n :members:\n :show-inheritance:\n")
        print(f" /APIs/{name}/{base}.rst")


def doDir(name):
    dir = f"{srcRoot}/{name}"
    for ent in sorted(os.listdir(dir)):
        if not ent.endswith(".py"):
            continue
        doFile(name, dir, ent)


def main():
    doDir("transforms")
    return 0


if __name__ == "__main__":
    sys.exit(main())
2 changes: 1 addition & 1 deletion docs/source/sycamore/tutorials.rst
@@ -1,7 +1,7 @@
Tutorials
=============

-Learn how to write Sycamore scrips
+Learn how to write Sycamore scripts
--------------------------------------

Now that you've learned about Sycamore concepts, transforms, and connectors, let's put it all together with some tutorials showing how to write Sycamore processing jobs.