Jack/initial eval framework #251

Closed
wants to merge 205 commits into from


@jackaldenryan jackaldenryan commented Jan 10, 2025

The structure of this PR is as follows:

  • The snippetization notebook loads the LongMemEval (LME) dataset and does a one-time reformatting of it into "snippets". It saves the resulting data, which is very large, into a newly created, git-ignored folder.
  • The mini-dataset creation notebook filters and formats a specific "mini-dataset", which might be, for instance, 10 snippets. It also calls the utility function for ingesting and labeling, since the mini-dataset itself includes GPT-4o labels. The mini-dataset has two forms: the internal eval CSV, used for the actual CLI test, and the CSV used for human labeling, which will ultimately be imported into a Google Sheet.
  • The eval_extract_nodes file is for the CLI test. It is currently outdated and will eventually be updated to use the same ingestion and labeling utility function.
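
The two mini-dataset forms described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the snippet fields (`snippet_id`, `text`), the `label_snippet` stand-in for the ingestion and labeling utility, and the column names (`model_label`, `human_label`) are all assumptions, and the real notebooks may lay out both CSVs differently.

```python
import csv
import io

# Hypothetical snippet records; the real LME-derived schema is not shown in this PR.
SNIPPETS = [
    {"snippet_id": f"s{i}", "text": f"example message {i}"} for i in range(25)
]


def label_snippet(snippet):
    """Stand-in for the ingestion and labeling utility (assumed; the PR uses GPT-4o labels)."""
    return {**snippet, "model_label": f"label-for-{snippet['snippet_id']}"}


def build_mini_dataset(snippets, size=10):
    """Take the first `size` snippets and attach a label to each."""
    return [label_snippet(s) for s in snippets[:size]]


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text, ignoring keys not listed in `fieldnames`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


mini = build_mini_dataset(SNIPPETS, size=10)

# Internal eval CSV: the rows an automated CLI test would compare against.
eval_csv = to_csv(mini, ["snippet_id", "text", "model_label"])

# Human-labeling CSV: the same rows plus an empty column for a reviewer to fill in.
human_rows = [{**row, "human_label": ""} for row in mini]
human_csv = to_csv(human_rows, ["snippet_id", "text", "model_label", "human_label"])
```

Deriving both CSVs from the same list of labeled rows keeps the human-labeling sheet and the internal eval file aligned on snippet IDs, which is what makes a later comparison of model labels against human labels straightforward.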

paul-paliychuk and others added 30 commits August 13, 2024 14:59
chore: Add readme, gitignore and poetry files
* chore: Initial draft of stubs

* updates

* chore: Add comments and mock implementation of the add_episode method

* chore: Add success and error callbacks

* stub updates

---------

Co-authored-by: prestonrasmussen <[email protected]>
rename and add indices
* chore: Initial draft of stubs

* chore: Add comments and mock implementation of the add_episode method

* chore: Add success and error callbacks

* chore: Add success and error callbacks

* refactor: Fix conflicts with the latest merge
* chore: Fix minor issues with episodic edge building + cleanup

* feat: Port podcast runner

* feat: Port podcast runner
* search updates

* add search_utils

* updates

* graph maintenance updates

* revert extract_new_nodes

* revert extract_new_edges

* parallelize node searching

* add edge fulltext search

* search optimizations
* fix: Address graph disconnect

* chore: Remove valid_to and valid_from setting in extract edges step (will be handled during invalidation step)
* feat: Initial version of temporal invalidation + tests

* fix: dont run int tests on CI

* fix: dont run int tests on CI

* fix: dont run int tests on CI

* fix: time of day issue

* fix: running non int tests in ci

* fix: running non int tests in ci

* fix: running non int tests in ci

* fix: running non int tests in ci

* fix: running non int tests in ci

* fix: running non int tests in ci

* fix: running non int tests in ci

* revert: Tests structural changes

* chore: Remove idea file

* chore: Get rid of NodesWithEdges class and define a triplet type instead
* benchmark logging

* load schema updates

* add extract bulk nodes and edges

* updated bulk calls

* compression updates

* bulk updates

* bulk logic first pass

* updated bulk process

* debug

* remove exact names first

* cleaned up prompt

* fix bad merge

* update

* fix merge issues
* search updates

* test updates

* add opinionated search

* update
* Makefile and format

* fix podcast stuff

* refactor: update import statement for transcript_parser in podcast_runner.py

* format and linting

* chore: Update import statements and remove unused code in maintenance module
* ruff action

* chore: Update Python version to 3.10 in lint.yml workflow

* fix lint and formatting

* cleanup
* search updates

* add helper function

* make format

* updates
* wip

* wip

* wip

* fix: Linter errors

* fix formatting

* chore: fix ruff

* fix: Duplication

---------

Co-authored-by: Daniel Chalef <[email protected]>
* chore: eenable mypy

* chore: Update MyPy command in typecheck.yml workflow

* fix caching. makefile lint improvements

* chore: Fix sed command in typecheck.yml workflow

* chore: Update sed command in typecheck.yml workflow

* chore: Update Python version to 3.10 in typecheck.yml workflow

* remove pretty

* pipefail
* wip

* wip

* wip

* fix: Linter errors

* fix formatting

* chore: fix ruff

* fix: Duplication

* chore: Fix unit tests for temporal invalidation

* attempt to fix unit tests

* fix: format

---------

Co-authored-by: Daniel Chalef <[email protected]>
* typing.Any and friends

* message

* chore: Import Message model in llm_client

* fix: 💄 mypy errors

* clean up mypy stuff

* mypy

* format

* mypy

* mypy

* mypy

---------

Co-authored-by: paulpaliychuk <[email protected]>
Co-authored-by: prestonrasmussen <[email protected]>
* improve deduping issue

* fix comment

* commit format

* default embeddings

* update
* feat: Add real world dates extraction

* fix: Linter

* fix: 💄 mypy errors

* chore: handle invalid dates returned by the llm

* chore: Polish prompt

* reformat

* style: 💄 reformat
* Add Apache License 2.0 boilerplate to all Python files

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).

* format

* format

* chore: Add Ellipsis configuration file
Add a new GitHub Actions workflow file to handle the CLA Assistant functionality. Also, include a CONTRIBUTING.md file with guidelines for contributing to the project.
* chore: Update the context for date extraction + bug fixes

* chore: Remove logs
prasmussen15 and others added 19 commits November 14, 2024 12:18
* add fulltext search limit

* format

* update

* update

* update tests

* remove unused imports

* format

* mypy
* update edge fulltext search

* bump version
* add delete nodes by group_id

* remove unused imports

* bump version
* add pagination to subgraphs

* update pagination

* update LiteralString import

* cleanup

* cleanup

* update embedding dims
* update episode override

* remove unused import
* add unicode escape

* bump version
* implement so

* bug fixes and typing

* inject schema for non-openai clients

* correct datetime format

* remove List keyword

* Refactor node_operations.py to use updated prompt_library functions

* update example
Refactor OpenAIClient to handle retries and improve error handling
* default to no pagination

* remove unused imports
* update lucene escaping

* update unit test
* ensure utc timezones

* fix: dep cycle

---------

Co-authored-by: paulpaliychuk <[email protected]>

github-actions bot commented Jan 10, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@jackaldenryan jackaldenryan force-pushed the jack/initial-eval-framework branch from 6f3def2 to f0c1bd9 Compare January 10, 2025 18:53
@jackaldenryan jackaldenryan deleted the jack/initial-eval-framework branch January 10, 2025 18:58
@github-actions github-actions bot locked and limited conversation to collaborators Jan 10, 2025
@jackaldenryan jackaldenryan restored the jack/initial-eval-framework branch January 10, 2025 18:59
@jackaldenryan jackaldenryan deleted the jack/initial-eval-framework branch January 10, 2025 19:01

6 participants