Feature/revive integration tests #1343

Merged
merged 30 commits into from
Oct 5, 2024
Changes from 6 commits
134 changes: 21 additions & 113 deletions .github/workflows/integration-test-workflow-debian.yml
@@ -1,9 +1,8 @@
name: R2R CLI Integration Test (Debian GNU/Linux 12 (bookworm) amd64)
name: R2R CLI Integration and Regression Test

on:
push:
branches:
- '**'
branches: ['**']
workflow_dispatch:

jobs:
@@ -16,29 +15,28 @@ jobs:
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
TELEMETRY_ENABLED: false
POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
POSTGRES_DBNAME: ${{ secrets.POSTGRES_DBNAME }}
POSTGRES_HOST: ${{ secrets.POSTGRES_HOST }}
POSTGRES_PORT: ${{ secrets.POSTGRES_PORT }}
R2R_PROJECT_NAME: ${{ secrets.R2R_PROJECT_NAME }}

POSTGRES_HOST: localhost
POSTGRES_DBNAME: postgres
POSTGRES_PORT: 5432
POSTGRES_PASSWORD: postgres
POSTGRES_USER: postgres
steps:
- uses: actions/checkout@v4
- name: Install and configure PostgreSQL
run: |
sudo apt-get update
sudo apt-get install -y postgresql
sudo systemctl start postgresql.service
sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'postgres';"

- name: Set up Python
uses: actions/setup-python@v4
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: '3.x'

- name: Install Poetry
- name: Install Poetry and dependencies
run: |
curl -sSL https://install.python-poetry.org | python3 -

- name: Install dependencies
working-directory: ./py
run: |
poetry install -E core -E ingestion-bundle
cd py && poetry install -E core -E ingestion-bundle

- name: Start R2R server
working-directory: ./py
@@ -50,99 +48,9 @@ jobs:
- name: Run integration tests
working-directory: ./py
run: |
echo "R2R Version"
poetry run r2r version

- name: Walkthrough
working-directory: ./py
run: |
echo "Ingest Data"
poetry run r2r ingest-sample-files

echo "Get Documents Overview"
poetry run r2r documents-overview

echo "Get Document Chunks"
poetry run r2r document-chunks --document-id=9fbe403b-c11c-5aae-8ade-ef22980c3ad1

echo "Delete Documents"
poetry run r2r delete --filter=document_id:eq:9fbe403b-c11c-5aae-8ade-ef22980c3ad1

echo "Update Document"
poetry run r2r update-files core/examples/data/aristotle_v2.txt --document-ids=9fbe403b-c11c-5aae-8ade-ef22980c3ad1

echo "Vector Search"
poetry run r2r search --query="What was Uber's profit in 2020?"

echo "Hybrid Search"
r2r search --query="What was Uber's profit in 2020?" --use-hybrid-search

echo "Basic RAG"
poetry run r2r rag --query="What was Uber's profit in 2020?"

echo "RAG with Hybrid Search"
poetry run r2r rag --query="Who is Jon Snow?" --use-hybrid-search

echo "Streaming RAG"
poetry run r2r rag --query="who was aristotle" --use-hybrid-search --stream

echo "User Registration"
curl -X POST http://localhost:7272/v2/register \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"password": "password123"
}'

echo "User Login"
curl -X POST http://localhost:7272/v2/login \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "[email protected]&password=password123"

echo "Users Overview"
poetry run r2r users-overview

echo "Logging"
poetry run r2r logs

echo "Analytics"
poetry run r2r analytics --filters '{"search_latencies": "search_latency"}' --analysis-types '{"search_latencies": ["basic_statistics", "search_latency"]}'

- name: GraphRAG
working-directory: ./py
run: |
echo "Create Knowledge Graph"
poetry run r2r create-graph --document-ids=9fbe403b-c11c-5aae-8ade-ef22980c3ad1

echo "Inspect Knowledge Graph"
poetry run r2r inspect-knowledge-graph

echo "Graph Enrichment"
poetry run r2r enrich-graph

echo "Local Search"
r2r search --query="Who is Aristotle?" --use-kg-search --kg-search-type=local

echo "Global Search"
r2r search --query="What were Aristotles key contributions to philosophy?" --use-kg-search --kg-search-type=global --max-llm-queries-for-global-search=100

echo "RAG"
r2r rag --query="What are the key contributions of Aristotle to modern society?" --use-kg-search --kg-search-type=global --max-llm-queries-for-global-search=100






- name: Advanced RAG
working-directory: ./py
run: |
echo "HyDE"
poetry run r2r rag --query="who was aristotle" --use-hybrid-search --stream --search-strategy=hyde

echo "Rag-Fusion"
r2r rag --query="Explain the theory of relativity" --use-hybrid-search --stream --search-strategy=rag_fusion
python tests/integration/runner.py test_ingest_sample_files_cli
python tests/integration/runner.py test_document_ingestion_cli

- name: Stop R2R server
run: |
pkill -f "r2r serve"
if: always()
run: pkill -f "r2r serve"
78 changes: 78 additions & 0 deletions py/tests/integration/runner.py
@@ -0,0 +1,78 @@
# File: tests/integration/runner.py

import json
import subprocess
import sys


def run_command(command):
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True
Contributor


Using shell=True can be a security risk if the command includes unsanitized input. Consider using a list of arguments instead, e.g., subprocess.run(['poetry', 'run', 'r2r', 'ingest-sample-files'], capture_output=True, text=True).

    )
    if result.returncode != 0:
        print(f"Command failed: {command}")
        print(f"Error: {result.stderr}")
        sys.exit(1)
    return result.stdout
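
Following the review comment on `shell=True`, here is a minimal sketch of how `run_command` might look if it took a list of arguments instead of a shell string. This is an illustration under assumptions, not the code merged in this PR: the parameter name `args` is hypothetical, and callers would need to pass lists such as `["poetry", "run", "r2r", "ingest-sample-files"]`.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments, without invoking a shell."""
    # With a list and the default shell=False, each element is passed to the
    # program as-is, so quotes or apostrophes in an argument need no escaping.
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Command failed: {' '.join(args)}")
        print(f"Error: {result.stderr}")
        sys.exit(1)
    return result.stdout
```

A call such as `run_command(["poetry", "run", "r2r", "documents-overview"])` would then behave like the string form, minus the shell parsing step.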


def test_ingest_sample_files_cli():
    print("Testing: Ingest sample files")
    run_command("poetry run r2r ingest-sample-files")
    print("Ingestion successful")


def test_document_ingestion_cli():
    print("Testing: Document ingestion")
    output = run_command("poetry run r2r documents-overview")
    documents = json.loads(output)

    expected_document = {
        "id": "9fbe403b-c11c-5aae-8ade-ef22980c3ad1",
        "title": "aristotle.txt",
        "type": "txt",
        "ingestion_status": "success",
        "kg_extraction_status": "success",
        "version": "v0",
        "metadata": {"title": "aristotle.txt", "version": "v0"},
    }

    if not any(
        all(doc.get(k) == v for k, v in expected_document.items())
        for doc in documents
    ):
        print("Document ingestion test failed")
        print(f"Expected document not found in output: {output}")
        sys.exit(1)
    print("Document ingestion test passed")
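
The `any(all(...))` check above is a subset match: a document passes if every expected key is present with the expected value, and extra keys are ignored. A small sketch of the same logic pulled into a helper may make that easier to read. The helper name `find_matching_document` and the sample data are hypothetical and are not real output from `r2r documents-overview`.

```python
def find_matching_document(documents, expected):
    """Return the first document containing every expected key/value pair."""
    for doc in documents:
        # Extra keys in doc are ignored; only the expected fields must match.
        if all(doc.get(key) == value for key, value in expected.items()):
            return doc
    return None


# Illustrative data only.
documents = [
    {"id": "doc-1", "title": "a.txt", "ingestion_status": "pending"},
    {"id": "doc-2", "title": "b.txt", "ingestion_status": "success", "extra": 1},
]

match = find_matching_document(
    documents, {"title": "b.txt", "ingestion_status": "success"}
)
print(match["id"] if match else "no match")
```

One thing to keep in mind with `doc.get(key) == value`: a missing key and a key whose value is `None` compare the same, so an expected value of `None` would also match documents that lack the field entirely.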


def test_vector_search_cli():
    print("Testing: Vector search")
    # The query is wrapped in double quotes so that the apostrophe in
    # "Uber's" does not terminate the shell string early.
    output = run_command(
        'poetry run r2r search --query="What was Uber\'s profit in 2020?"'
    )
    results = json.loads(output)
    if not results.get("results"):
        print("Vector search test failed: No results returned")
        sys.exit(1)
    print("Vector search test passed")
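
A query that contains an apostrophe cannot simply be wrapped in single quotes inside a shell command string, because the apostrophe ends the quoted section early. The sketch below uses the standard-library `shlex` module, which follows POSIX-style quoting rules, to show the failure and one possible way around it. The variable names are illustrative.

```python
import shlex

query = "What was Uber's profit in 2020?"

# Wrapping the query in single quotes: the apostrophe in "Uber's" closes the
# quote early, and the final quote is left unterminated.
broken = f"poetry run r2r search --query='{query}'"
try:
    shlex.split(broken)
    parsed_ok = True
except ValueError:
    parsed_ok = False

# Building the command from a list and letting shlex quote each argument
# keeps the query intact as a single argument.
args = ["poetry", "run", "r2r", "search", f"--query={query}"]
safe = shlex.join(args)
round_trip = shlex.split(safe)

print(parsed_ok, round_trip == args)
```

If the command is passed to `subprocess.run` as a list, no quoting step is needed at all; `shlex.join` is mainly useful when a single string is still required, for example for logging.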


def test_rag_query_cli():
    print("Testing: RAG query")
    output = run_command("poetry run r2r rag --query='Who was Aristotle?'")
    response = json.loads(output)
    if not response.get("answer"):
        print("RAG query test failed: No answer returned")
        sys.exit(1)
    print("RAG query test passed")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Please specify a test function to run")
        sys.exit(1)

    test_function = sys.argv[1]
    globals()[test_function]()
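
Dispatching through `globals()` calls whatever module-level name is passed on the command line and raises a bare `KeyError` for an unknown name. One possible alternative is an explicit registry, sketched below. The names `TESTS` and `main` and the placeholder test function are hypothetical; in the runner the registry would list the real `test_*_cli` functions.

```python
def test_example():
    # Placeholder standing in for one of the runner's test functions.
    print("example test ran")


# Only names listed here can be invoked from the command line.
TESTS = {
    "test_example": test_example,
}


def main(argv):
    """Look up the requested test by name and run it; return an exit code."""
    if len(argv) < 2:
        print("Please specify a test function to run")
        return 1
    name = argv[1]
    test = TESTS.get(name)
    if test is None:
        print(f"Unknown test: {name}. Available: {', '.join(sorted(TESTS))}")
        return 1
    test()
    return 0
```

Under the usual `__main__` guard this would be called as `sys.exit(main(sys.argv))`, so a mistyped test name in the workflow fails with a readable message instead of a traceback.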