Scanpy-pl #226

slobentanzer · 2024-12-10T10:44:03Z

This PR makes the minimal additions needed to add the scanpy `pl` module to the API calling submodule.

It does not contain all parameters yet.

Includes:

- submodule in the API submodule
- entry in `__init__.py` `__all__` to make it discoverable
- test (mocked)
- benchmark case
- benchmark adjustment to allow testing the submodule
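For readers unfamiliar with the "test (mocked)" item above, here is a minimal sketch of what such a test could look like. It is not the test from this PR: `PlotQueryBuilder` is a hypothetical stand-in, the returned keys are illustrative, and the LLM is replaced by a `unittest.mock.MagicMock` so that no API key or network access is needed.

```python
from unittest.mock import MagicMock


class PlotQueryBuilder:
    """Hypothetical stand-in for a query builder; not the class from this PR."""

    def __init__(self, llm):
        self.llm = llm

    def parameterise_query(self, question: str) -> dict:
        # Delegate to the LLM, which is assumed to return a dict naming
        # the plotting method and its arguments.
        return self.llm.invoke(question)


def run_mocked_test() -> dict:
    # Replace the LLM with a mock so the test runs offline.
    mock_llm = MagicMock()
    mock_llm.invoke.return_value = {
        "method_name": "sc.pl.scatter",
        "x": "n_genes",
        "y": "n_counts",
    }
    builder = PlotQueryBuilder(mock_llm)
    question = "Plot n_genes against n_counts."
    result = builder.parameterise_query(question)
    # The mock records its calls, so we can check the question was forwarded.
    mock_llm.invoke.assert_called_once_with(question)
    return result


result = run_mocked_test()
```

The point of the mock is that the test exercises the builder's plumbing only; whether a real model picks sensible parameters is left to the benchmark case.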


* add scanpy_pl module with initial fields
* add mocked test for module
* add module to API agent __init__.py
* add benchmark case
* add conditional for module benchmark
* downgrade httpx due to conflict

  0.28 removed the proxy keyword, but openai is not aware
* add back default `question_uuid` field into pydantic class
* add scatter pydantic class
* add sc.pl.pca
* add pca benchmark case
* distinguish web api and python api benchmark
* change case to scatter
* add tsne class
* add tsne case
* fix typing
* add generic formatter (#233)
  * add formatter functions for REST and Python
  * make discoverable on module level
  * add required field
  * test the formatting functions
  * `scanpy` to `sc` to fit common usage
  * adjust benchmark to use the formatter

---------

Co-authored-by: daniele-lucarelli <[email protected]>

* add the tools `tl` modules to API agent __init__.py * add scanpy_tl module with general description * Change pymilvus dependency in the pyproject.toml from the fixed version 2.2.8 to superior or equal to 2.2.8. Indeed, it appears that the grpcio 1.53.0 external dependency of pymilvus version 2.2.8 is not compatible with Windows OS 11 and Python version 2.12.3, whatever it is the wheel or source version. Running pytest does not yield any errors, beyond raising deprecated warnings * api agent for scnapy tl using the generate_pydantic_class_from_module method. Currently scanpy is imported when ScanpyTLQueryBuilder.parametrise_query is called. * generic method to generate pydantic classes for functions in a module. Only includes functions which dont start with "_" * working progress on QueryBuilder and its unit tests * Scanpy-pl (#226) * add scanpy_pl module with initial fields * add mocked test for module * add module to API agent __init__.py * add benchmark case * add conditional for module benchmark * downgrade httpx due to conflict 0.28 removed the proxy keyword, but openai is not aware * add back default `question_uuid` field into pydantic class * add scatter pydantic class * add sc.pl.pca * add pca benchmark case * distinguish web api and python api benchmark * change case to scatter * add tsne class * add tsne case * fix typing * add generic formatter (#233) * add formatter functions for REST and Python * make discoverable on module level * add required field * test the formatting functions * `scanpy` to `sc` to fit common usage * adjust benchmark to use the formatter --------- Co-authored-by: daniele-lucarelli <[email protected]> * Anndata api integration (#229) * pushed starter anndata file * removed the tester * Aim of the anndata api module * Draft of the AnnDataIOParameters * added a prompt * updated the prompt * started to implement the AnndataIOQueryBuilder * added test for anndata api * pushed pydantic reader classes * Updated the anndata tool with 
integrated test: -> returns dict with method & args Co-authored-by: Anis Ismail <[email protected]> * added query builder * added querybuilder for anndata and its test * updated query builder * added exclude none * feat(BaseAPIModel): Add reusable base class for structured outputs • Introduced BaseAPIModel, a reusable base class to streamline the creation of Pydantic models for structured outputs. • The class includes: • uuid: An optional field (str | None) for unique identification of model instances. • method_name: A required field (str) to specify the associated function or method, ensuring consistency across models. • Configured with arbitrary_types_allowed to support flexible extensions. • Designed for use in structured output generation. This addition lays the groundwork for standardized, maintainable, and consistent API models. * update query builder to remove create_runnable * Updated the pydatic classes with the BaseAPIModel * Updated the system prompt in the runnable of the AnnDataIOQueryBuilder * fix in import of pydanticparser * added test for query builder parameterise_query * removed comments + redundant script --------- Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: Anis Ismail <[email protected]> * switch scanpy pl to langchain bind_tools * Fixed the prompt issue in the `AnnDataIOQueryBuilder`, but now no system prompt is updated for the anndata query * add in the benchmark a call to scanpy.pp to carry on a PCA with a given number of latent dimensions * fix schema issue with fixed length tuples replace with any length type (...) 
* remove nested list in benchmark * remove unnecessary variable * remove dual httpx definition * update ABC to return list from `parameterise_query` * add umap pydantic class * migrate legacy query builder and fetcher to work with list of pydantic classes adjusts the ABC, the individual legacy classes (builder and fetcher), and the tests * add draw_graph pydantic class * assume list of classes as return now has empty list in parameters * add draw_graph to tool list * return variable instead of call * add spatial pydantic class * add anndata benchmark * add anndata benchmark test case * change from langchain pydantic to original pydantic * Added mock test for ScanpyTLQueryBuilder (without module specification) * remove irrelevant imports in scanpy.tl module * move method_name to title problem: plotting functions also have a parameter called title * add test script to gitignore * delete duplicated regex definition * Resolve merge conflicts (#239) * fixed tests for both scanpypl and anndata * changed the format_as_python_call to intake BaseAPIModel as input argument * Refactor AnnData API models: - fixed the serializing issue by importing Field from pydantic. - instead of Add defaults for optional fields, ensure consistency and clean serialization. * fixed bug in Anndata IO test * fixed bugs in merged + commented documentparser dependant imports --------- Co-authored-by: noahbruderer <[email protected]> * first version of a function to build pydantic classes for all functions (#234) * first version of a funciton to build pydantic classes for all functions in a module. Does not consider defaults * update technicalities --------- Co-authored-by: slobentanzer <[email protected]> * add pl reduced args * Generic function pydantic classes (#241) * first version of a funciton to build pydantic classes for all functions in a module. Does not consider defaults * update technicalities * inlcude getting default values for parameters in pydantic class. 
Set all types to Any to avoid issues. * resolved some formatting issues * typing to modern style, formatting --------- Co-authored-by: slobentanzer <[email protected]> * revert title to method_name * include BaseAPIModel * renamed anndata module to anndata_agent due to conflicts * Add the scanpy tool `tl` modules to API agent, next and closing steps. (#238) * Solve dependencies issues by allowing earlier versions of scanpy to be tested (indeed, scanpy version 1.11 is not compliant with the mowgli package, while also developped in collaboration with saez lab) + update the dependencies of the pyproject.toml file. * Restructure query builder for sc.tl module and updated the mock test * Added tool_choice argument * add unit test for automated pydantic class generation and add example docstrings int he automated pydantic function. * Add poetry dependency to docstring-parser and relocate scnapy from required to optional dependencies. * remove Typing in unit tests + check that all tests are running * Parametrise query test from anndata * unused import * add tool package, cleanup * run poetry lock * cleanup (prompt, docstrings, syntax) --------- Co-authored-by: Valeriia Dragan <[email protected]> Co-authored-by: Lera <[email protected]> Co-authored-by: slobentanzer <[email protected]> * changing how pydantic classes are defined manually, alinging with automatic apporach * resolve import issue * Fix_benchmark_sc_plot (#244) * Add aspecific questions for pl * fix syntax * fix benchmark cases * force adata=adata * add reduced pydantic model tests * add reduced class specifically * refactor to dict of prompts * add umap, draw_graph, and spatial * resolve conflicts __init__ * resolve conflicts __init__ * add reduced query builder for `pl` --------- Co-authored-by: daniele-lucarelli <[email protected]> Co-authored-by: slobentanzer <[email protected]> * add reduced builder class * add file ignores for test and benchmark * Modify the benchmark to test querybuilders for scanpy tool 
operations (#253) * Solve dependencies issues by allowing earlier versions of scanpy to be tested (indeed, scanpy version 1.11 is not compliant with the mowgli package, while also developped in collaboration with saez lab) + update the dependencies of the pyproject.toml file. * Restructure query builder for sc.tl module and updated the mock test * Added tool_choice argument * add unit test for automated pydantic class generation and add example docstrings int he automated pydantic function. * Add poetry dependency to docstring-parser and relocate scnapy from required to optional dependencies. * remove Typing in unit tests + check that all tests are running * Parametrise query test from anndata * unused import * add tool package, cleanup * run poetry lock * cleanup (prompt, docstrings, syntax) * Add in the yaml benchmark data the queries question. * Add in the benchmark the call to process the tool module of scanpy. * remove irrelevant web api calls from scanpy * re-add tl query builder * format * add missing import * set method_call to empty string to catch empty list return * fix syntax --------- Co-authored-by: Valeriia Dragan <[email protected]> Co-authored-by: Lera <[email protected]> Co-authored-by: slobentanzer <[email protected]> * Create pydantic classes for the scanpy pp (#256) * add pydantic classes for all funcs * add query builder * fixed runnable * add pydantic classes for all funcs * add query builder * fixed runnable invoke * add Union instead of | * create mock test * add pydantic classes for all funcs * add query builder * fixed runnable * add pydantic classes for all funcs * add query builder * fixed runnable invoke * add Union instead of | * create mock test * edited invoke * Revert "edited invoke" This reverts commit d4b6b8d. 
* clear imports * added quotes to test_api_agent.py * altered Pydantic classes * corrected copy param * used BaseTools * fix imports, remove non-reduced builder * fix types * uncomment test --------- Co-authored-by: slobentanzer <[email protected]> * Anndata concatentation + mapping integration (#257) * fixed tests for both scanpypl and anndata * changed the format_as_python_call to intake BaseAPIModel as input argument * Refactor AnnData API models: - fixed the serializing issue by importing Field from pydantic. - instead of Add defaults for optional fields, ensure consistency and clean serialization. * fixed bug in Anndata IO test * fixed the pydantic Optional syntax * updated the method_name to title in the `BaseAPIModel` * added tools choice argument * In `create_runnable` updated bind_tools by setting tool_choice="required" Co-authored-by: Anis Ismail <[email protected]> * Added the ConcatenateAnnData BaseAPIModel class. Now the anndata is able to return code lines to concatenate anndata objs Co-authored-by: Anis Ismail <[email protected]> * Changed 'title' field back to 'method_name' * fixed ':' * Added Map functionality Created MapAnnData class. 
* Added support for MapAnnData formatting * -Updated the system prompt, human prompt test assertions in the ScanpyTlQueryBuilder and ConcatenateAnnData test -Minor fixes in the ScanpyTlQueryBuilder test * Fixed the TestScanpyTlQueryBuilder test * Added create runnable test for Anndata * added use cases to the benchmark * Merging changes from biohackthon3 and anndata classes remain pydantic forms for now * Merge changes from biohackathon3; anndata changed to pydantic classes for now * fracture test regex * add openai models, run 3 times * deactivate non-scanpy * run most benchmarks * fixed bugs in testing and added extra leiden clustering yaml * create api calling specific analysis / vis --------- Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: Lera <[email protected]> Co-authored-by: slobentanzer <[email protected]> * figure params * refactor hooks to be more modular * some ruff complaints * docstrings * fix test (content alignment) * docstring * feat: protect chat attributes and improve error handling - Add property decorators to protect chat and ca_chat attributes in Conversation ABC - Make user parameter optional in all conversation classes - Add clear error messages when chat attributes accessed before initialization - Reset chat attributes on authentication failure - Add tests for new chat attribute behavior * fix CI * downgrade poetry * remove debug * skip flaky test for now * update poetry * name autogeneration of model and parameterisation adequately rename modules, classes, and adjust the tests * rename abc.py because it shadowed Python ABC * refactor API agent into `base`, `python`, and `web` components * update docs * update ruff config * run pre-commit * doc format * run pre-commit --------- Co-authored-by: bastienchassagnol <[email protected]> Co-authored-by: mengerj <[email protected]> Co-authored-by: daniele-lucarelli <[email protected]> Co-authored-by: noahbruderer <[email protected]> 
Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: noahbruderer <[email protected]> Co-authored-by: Lera <[email protected]> Co-authored-by: chassagnol <[email protected]> Co-authored-by: Anis Ismail <[email protected]> Co-authored-by: mengerj <[email protected]> Co-authored-by: Valeriia Dragan <[email protected]> Co-authored-by: Daniele Lucarelli <[email protected]> Co-authored-by: Kvitoslava <[email protected]>

slobentanzer added 5 commits December 10, 2024 11:41

add scanpy_pl module with initial fields

c2e682a

add mocked test for module

c38fed7

add module to API agent __init__.py

af9d7b0

add benchmark case

22d6aa5

add conditional for module benchmark

9e596b7

slobentanzer temporarily deployed to Test CI December 10, 2024 10:44 — with GitHub Actions Inactive

slobentanzer changed the base branch from main to biohackathon3 December 10, 2024 11:19

slobentanzer added 3 commits December 10, 2024 13:32

downgrade httpx due to conflict

8d0e0d1

0.28 removed the proxy keyword, but openai is not aware
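The commit note above implies that httpx releases from 0.28 onward are incompatible with the openai client in use at the time. As a rough illustration only, a version guard for that condition might look like the sketch below. The helper names are hypothetical, and the assumption that every release at or above 0.28 is affected comes from the note, not from testing.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.27.2' into (0, 27, 2), stopping at any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def httpx_needs_downgrade(installed: str) -> bool:
    # Assumption taken from the commit note: 0.28 and later are affected.
    return parse_version(installed) >= (0, 28)
```

In a real project the pin would normally live in `pyproject.toml` rather than in code; the function only makes the version boundary explicit.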

add back default `question_uuid` field into pydantic class

07683d2

add scatter pydantic class

95a22f9

slobentanzer linked an issue Dec 10, 2024 that may be closed by this pull request

Create API module for calling the scanpy pl module #221

Closed

daniele-lucarelli and others added 9 commits December 11, 2024 10:30

add sc.pl.pca

5a576c3

add pca benchmark case

fe36772

distinguish web api and python api benchmark

4240478

change case to scatter

1f453ea

add tsne class

2ca0cd4

add tsne case

a8712b7

Merge branch 'scanpy-pl' of https://github.com/biocypher/biochatter into scanpy-pl

e63930d

fix typing

fe547cb

add generic formatter (#233)

6c099bd

* add formatter functions for REST and Python
* make discoverable on module level
* add required field
* test the formatting functions
* `scanpy` to `sc` to fit common usage
* adjust benchmark to use the formatter
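To make the formatter commit more concrete, here is a minimal sketch of what formatter functions for Python and REST calls could do. It is not the code from #233: the name `format_as_python_call` appears in a later commit message in this thread, but the signature and body here are guesses. `format_as_rest_call` and the example URL are hypothetical, and passing `adata` as a bare variable name is an assumption.

```python
from urllib.parse import urlencode


def format_as_python_call(method_name: str, params: dict) -> str:
    """Render a method name and parameters as a Python call string."""
    # Shorten `scanpy.` to `sc.` to match the common import alias.
    if method_name.startswith("scanpy."):
        method_name = "sc." + method_name[len("scanpy."):]
    args = []
    for key, value in params.items():
        if value is None:
            continue  # skip unset parameters
        if key == "adata":
            # Assumption: the data object is referenced by variable name.
            args.append("adata=adata")
        else:
            args.append(f"{key}={value!r}")
    return f"{method_name}({', '.join(args)})"


def format_as_rest_call(base_url: str, params: dict) -> str:
    """Render parameters as a query string appended to a base URL."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{query}" if query else base_url


python_call = format_as_python_call(
    "scanpy.pl.scatter",
    {"adata": "adata", "x": "n_genes", "y": "n_counts", "color": None},
)
rest_call = format_as_rest_call(
    "https://example.org/api",  # hypothetical endpoint
    {"gene": "TP53", "limit": 5, "page": None},
)
```

Keeping both formatters behind the same "parameters in, string out" shape is what lets a benchmark compare web API and Python API cases with one code path.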

slobentanzer merged commit 46e666b into biohackathon3 Dec 11, 2024

slobentanzer deleted the scanpy-pl branch December 11, 2024 13:23
