ragas expt addn
someshfengde committed Mar 27, 2024
1 parent 3dae97c commit 33e10af
Showing 10 changed files with 438 additions and 35 deletions.
4 changes: 2 additions & 2 deletions backend/app/api/endpoints/search_endpoint.py
@@ -29,13 +29,13 @@
"/search",
summary="List all Search Results",
description="List all Search Results",
dependencies=[Depends(security.access_token_required)],
# dependencies=[Depends(security.access_token_required)],
response_model=dict[str, str],
)
@version(1, 0)
async def get_search_results(
query: str = "",
routecategory: RouteCategory = RouteCategory.NS
routecategory: RouteCategory =RouteCategory.CT# RouteCategory.NS
) -> JSONResponse:
if trace_transaction := sentry_sdk.Hub.current.scope.transaction:
trace_transaction.set_tag("title", 'api_get_search_results')
@@ -92,6 +92,7 @@ def __init__(self, config):
)
self.obj_retriever = self.obj_index.as_retriever(similarity_top_k=3)
self.qp = self.build_query_pipeline()
self.debug_log = {}

def _get_table_info_with_index(self, idx: int) -> str:
results_gen = Path(CLINICAL_TRIALS_TABLE_INFO_DIR).glob(f"{idx}_*")
@@ -147,17 +148,27 @@ def extract_sql(self, llm_response: str) -> str:
return extracted_sql
return llm_response

def debug_ragas(self):
with open("debug_log_ragas.txt", "a") as f:
f.write(str(self.debug_log)+ ",\n")



def get_sql_query(self, question, context):
self.debug_log["question"] = question
self.debug_log["table_context"] = context
sql_query = self.sql_module(question = question, context = context).answer
return sql_query

def get_synthesized_response(self, question, sql, database_output):
if len(database_output) > 0:
database_output = database_output[0].text
# context provided is sql and database output
self.debug_log["sql"] = sql
self.debug_log["database_output"] = database_output
with dspy.context(lm=self.nous):
response = self.response_synthesizer(question = question, sql = sql, database_output = database_output).answer
self.debug_log["answer"] = response
return response

def build_query_pipeline(self):
@@ -198,7 +209,7 @@ async def call_text2sql(
logger.info(f"call_text2sql search_text: {search_text}")
response = self.qp.run(query=search_text)
logger.info(f"call_text2sql response: {str(response)}")

self.debug_ragas()
except Exception as ex:
logger.exception("call_text2sql Exception -", exc_info = ex, stack_info=True)
raise ex
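
Note: `debug_ragas` appends each record as `str(self.debug_log) + ",\n"`, so every line of `debug_log_ragas.txt` is a Python dict literal followed by a comma. The sketch below shows one way such a log could be turned into rows with the `question,context,answer,ground_truth` columns used in `ragas_dataset.csv`. It is not part of this commit. The function names (`parse_debug_log`, `to_ragas_rows`, `rows_to_csv`) are hypothetical, building `context` by concatenating `sql` and `database_output` is an assumption based on the "context provided is sql and database output" comment, and the ground-truth values are assumed to be supplied separately because the log does not contain them.

```python
import ast
import csv
import io

FIELDNAMES = ["question", "context", "answer", "ground_truth"]


def parse_debug_log(text):
    """Parse lines written as str(dict) + ',' back into dictionaries."""
    records = []
    for line in text.splitlines():
        # Drop surrounding whitespace and the trailing comma added by debug_ragas.
        line = line.strip().rstrip(",")
        if not line:
            continue
        # literal_eval only accepts Python literals, so it is safer than eval.
        records.append(ast.literal_eval(line))
    return records


def to_ragas_rows(records, ground_truths):
    """Pair each logged record with a ground truth (assumed to be given in order)."""
    rows = []
    for record, truth in zip(records, ground_truths):
        # Assumption: context is the SQL followed directly by the database output.
        context = f"{record['sql']}{record['database_output']}"
        rows.append(
            {
                "question": record["question"],
                "context": context,
                "answer": record["answer"],
                "ground_truth": truth,
            }
        )
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes multi-line fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Because `self.debug_log` is a single dict reused across calls, a record written after a failed or partial run may still carry keys from the previous query, so rows produced this way are worth spot-checking.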
52 changes: 52 additions & 0 deletions backend/app/ragas_dataset.csv
@@ -0,0 +1,52 @@
question,context,answer,ground_truth
Where did the study Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp take place,"SELECT si.title, si.study_details->>'source' AS study_source
FROM tbl_studies_info si
WHERE si.title = 'Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp';

SQLResult:

| title |[('Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp', 'Instituto Nacional de Cancer, Brazil')]","The study Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp took place at the Instituto Nacional de Cancer, Brazil. This is based on the database output which shows that the study was conducted by the Instituto Nacional de Cancer, Brazil.",University of Texas MD Anderson Cancer Center
"How many participants was enrolled in Exposure, Dose, Body Burden and Health Effects of Lead","SELECT si.study_details->>'enrollment' AS enrollment_count
FROM tbl_studies_info si
WHERE si.title = 'Exposure, Dose, Body Burden and Health Effects of Lead';

Reasoning:

1. First, we need to select the rows from the ""[('944',)]","The study Exposure, Dose, Body Burden and Health Effects of Lead enrolled 944 participants. This is based on the database output which shows that the study enrolled 944 participants.",944
What is the purpose of the study Safety and Tolerability of RNS60 Given by IV to Healthy Subjects,"SELECT title, description FROM tbl_studies_info WHERE title = 'Safety and Tolerability of RNS60 Given by IV to Healthy Subjects';

Reasoning:

1. We need to extract the information from the 'tbl_studies_eligibilities' and 'tbl_studies_info[('Safety and Tolerability of RNS60 Given by IV to Healthy Subjects', 'The objective of this study is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.')]",The purpose of the study Safety and Tolerability of RNS60 Given by IV to Healthy Subjects is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for,"Study to investigate the effect of two types of radiotherapy on saliva: IMRT versus conventional radiotherapy.

In each group, one part will receive normal dental preventive treatment. The other part will use products for treatment of dry mouth (Biotene gamma).

Both anorganic and organic components of the saliva will be studied at different time points, combined with bacterial swabs.

"
What is the purpose of the study Recreational Diving Practice for Stress Management (DIVSTRESS),"SELECT title, description FROM tbl_studies_info WHERE title = 'Recreational Diving Practice for Stress Management (DIVSTRESS)';

Reasoning:

1. First, we need to identify the study we are interested in. In this case, the study is ""Recreational Diving Practice for Stress[]",The purpose of the study Recreational Diving Practice for Stress Management (DIVSTRESS) is to investigate the effects of recreational diving on stress management. This is based on the database output which shows that the study is about recreational diving and stress management.,"Within the components of Scuba diving there are similarities with meditation and mindfulness techniques. Perceived stress is known to be diminished during meditation practice. This study evaluates the benefits of scuba diving on perceived stress and mindful functioning.
Detailed"
How many participants was enrolled in Recreational Diving Practice for Stress Management (DIVSTRESS),"SELECT si.title, si.study_details->>'enrollment' AS enrollment_count
FROM tbl_studies_info si
WHERE si.title = 'Recreational Diving Practice for Stress Management (DIVSTRESS)';

This SQL query uses the ->> operator to extract the value of the '[]",The study Recreational Diving Practice for Stress Management (DIVSTRESS) did not have any participants enrolled. This is based on the database output which shows that the study did not have any participants enrolled.,67
"What treatment was given to the participants in Exposure, Dose, Body Burden and Health Effects of Lead","SELECT si.title, si.study_details->>'treatment' AS treatment
FROM tbl_studies_info si
WHERE si.title = 'Exposure, Dose, Body Burden and Health Effects of Lead';

Reasoning:

1. First, we need to identify the relevant tables and columns[('Exposure, Dose, Body Burden and Health Effects of Lead', None)]","The treatment given to the participants in Exposure, Dose, Body Burden and Health Effects of Lead is not available in the database output. This is because the column 'treatment' is not present in the table 'tbl_studies_info' for the study 'Exposure, Dose, Body Burden and Health Effects of",Not given
What does this following study does: Safety and Tolerability of RNS60 Given by IV to Healthy Subjects,"SELECT title, description, eligibility_details ->> 'Criteria' as criteria FROM tbl_studies_eligibilities WHERE title = 'Safety and Tolerability of RNS60 Given by IV to Healthy Subjects';

This SQL query selects the title, description, and criteria fields from the tbl_studies[('Safety and Tolerability of RNS60 Given by IV to Healthy Subjects', 'The objective of this study is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.', 'Inclusion Criteria: Males between 18-55 years Minimum body weight of 60 kg BMI of 18-32 kg/m2 Able to execute informed written consent Exclusion Criteria: A chronic or acute disease that might interfere with the evaluation of the safety of RNS60 Current or prior malignancies (excluding...')]","This study does not have any specific criteria for eligibility. The criteria field in the database output is empty, which means that the study does not have any specific criteria for eligibility.",evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.
What is the type of following study: Recreational Diving Practice for Stress Management (DIVSTRESS),"SELECT si.title, si.study_details->>'study_type' AS study_type
FROM tbl_studies_info si
WHERE si.title = 'Recreational Diving Practice for Stress Management (DIVSTRESS)';

This SQL query selects the title and study type of the study with the given title[]","The study Recreational Diving Practice for Stress Management (DIVSTRESS) is a randomized controlled trial. This is based on the database output which shows that the study type is ""randomized controlled trial"".",Observational
16 changes: 16 additions & 0 deletions backend/app/ragas_score.csv
@@ -0,0 +1,16 @@
question,contexts,answer,ground_truth,faithfulness,answer_correctness
Where did the study Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp take place,"[""SELECT si.title, si.study_details->>'source' AS study_source\nFROM tbl_studies_info si\nWHERE si.title = 'Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp';\n\nSQLResult:\n\n| title |[('Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp', 'Instituto Nacional de Cancer, Brazil')]""]","The study Recipient Vessels for Free Tissue Flaps in Advanced Oncologic Defects of the Midface and Scalp took place at the Instituto Nacional de Cancer, Brazil. This is based on the database output which shows that the study was conducted by the Instituto Nacional de Cancer, Brazil.",University of Texas MD Anderson Cancer Center,1.0,0.19970064136089716
"How many participants was enrolled in Exposure, Dose, Body Burden and Health Effects of Lead","['SELECT si.study_details->>\'enrollment\' AS enrollment_count\nFROM tbl_studies_info si\nWHERE si.title = \'Exposure, Dose, Body Burden and Health Effects of Lead\';\n\nReasoning:\n\n1. First, we need to select the rows from the ""[(\'944\',)]']","The study Exposure, Dose, Body Burden and Health Effects of Lead enrolled 944 participants. This is based on the database output which shows that the study enrolled 944 participants.",944,1.0,0.6945813310868193
What is the purpose of the study Safety and Tolerability of RNS60 Given by IV to Healthy Subjects,"[""SELECT title, description FROM tbl_studies_info WHERE title = 'Safety and Tolerability of RNS60 Given by IV to Healthy Subjects';\n\nReasoning:\n\n1. We need to extract the information from the 'tbl_studies_eligibilities' and 'tbl_studies_info[('Safety and Tolerability of RNS60 Given by IV to Healthy Subjects', 'The objective of this study is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.')]""]",The purpose of the study Safety and Tolerability of RNS60 Given by IV to Healthy Subjects is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for,"Study to investigate the effect of two types of radiotherapy on saliva: IMRT versus conventional radiotherapy.

In each group, one part will receive normal dental preventive treatment. The other part will use products for treatment of dry mouth (Biotene gamma).

Both anorganic and organic components of the saliva will be studied at different time points, combined with bacterial swabs.

",1.0,0.19273995731697777
What is the purpose of the study Recreational Diving Practice for Stress Management (DIVSTRESS),"['SELECT title, description FROM tbl_studies_info WHERE title = \'Recreational Diving Practice for Stress Management (DIVSTRESS)\';\n\nReasoning:\n\n1. First, we need to identify the study we are interested in. In this case, the study is ""Recreational Diving Practice for Stress[]']",The purpose of the study Recreational Diving Practice for Stress Management (DIVSTRESS) is to investigate the effects of recreational diving on stress management. This is based on the database output which shows that the study is about recreational diving and stress management.,"Within the components of Scuba diving there are similarities with meditation and mindfulness techniques. Perceived stress is known to be diminished during meditation practice. This study evaluates the benefits of scuba diving on perceived stress and mindful functioning.
Detailed",1.0,0.47273133897275366
How many participants was enrolled in Recreational Diving Practice for Stress Management (DIVSTRESS),"[""SELECT si.title, si.study_details->>'enrollment' AS enrollment_count\nFROM tbl_studies_info si\nWHERE si.title = 'Recreational Diving Practice for Stress Management (DIVSTRESS)';\n\nThis SQL query uses the ->> operator to extract the value of the '[]""]",The study Recreational Diving Practice for Stress Management (DIVSTRESS) did not have any participants enrolled. This is based on the database output which shows that the study did not have any participants enrolled.,67,1.0,0.18618115591358925
"What treatment was given to the participants in Exposure, Dose, Body Burden and Health Effects of Lead","[""SELECT si.title, si.study_details->>'treatment' AS treatment\nFROM tbl_studies_info si\nWHERE si.title = 'Exposure, Dose, Body Burden and Health Effects of Lead';\n\nReasoning:\n\n1. First, we need to identify the relevant tables and columns[('Exposure, Dose, Body Burden and Health Effects of Lead', None)]""]","The treatment given to the participants in Exposure, Dose, Body Burden and Health Effects of Lead is not available in the database output. This is because the column 'treatment' is not present in the table 'tbl_studies_info' for the study 'Exposure, Dose, Body Burden and Health Effects of",Not given,,0.19147379912853565
What does this following study does: Safety and Tolerability of RNS60 Given by IV to Healthy Subjects,"[""SELECT title, description, eligibility_details ->> 'Criteria' as criteria FROM tbl_studies_eligibilities WHERE title = 'Safety and Tolerability of RNS60 Given by IV to Healthy Subjects';\n\nThis SQL query selects the title, description, and criteria fields from the tbl_studies[('Safety and Tolerability of RNS60 Given by IV to Healthy Subjects', 'The objective of this study is to evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.', 'Inclusion Criteria: Males between 18-55 years Minimum body weight of 60 kg BMI of 18-32 kg/m2 Able to execute informed written consent Exclusion Criteria: A chronic or acute disease that might interfere with the evaluation of the safety of RNS60 Current or prior malignancies (excluding...')]""]","This study does not have any specific criteria for eligibility. The criteria field in the database output is empty, which means that the study does not have any specific criteria for eligibility.",evaluate the safety and tolerability of RNS60 administered intravenously to healthy subjects. 12 subjects will receive RNS60 or placebo at three escalating rates for 48 hours for each rate.,0.0,0.5583827782172214
What is the type of following study: Recreational Diving Practice for Stress Management (DIVSTRESS),"[""SELECT si.title, si.study_details->>'study_type' AS study_type\nFROM tbl_studies_info si\nWHERE si.title = 'Recreational Diving Practice for Stress Management (DIVSTRESS)';\n\nThis SQL query selects the title and study type of the study with the given title[]""]","The study Recreational Diving Practice for Stress Management (DIVSTRESS) is a randomized controlled trial. This is based on the database output which shows that the study type is ""randomized controlled trial"".",Observational,0.0,0.19154108358949926
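
Note: one row in `ragas_score.csv` has an empty `faithfulness` value, so any aggregate over these columns has to skip blanks. The following is a minimal sketch, not part of this commit, of how the per-metric means could be computed from the CSV text with the standard library. The function name `summarize_scores` is hypothetical, and the column names are taken from the header row above.

```python
import csv
import io
from statistics import mean


def summarize_scores(csv_text, metrics=("faithfulness", "answer_correctness")):
    """Return the mean of each metric column, ignoring empty cells."""
    # newline="" lets the csv module handle quoted fields that span lines.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    values = {metric: [] for metric in metrics}
    for row in reader:
        for metric in metrics:
            raw = (row.get(metric) or "").strip()
            if raw:  # skip rows where the metric was not produced
                values[metric].append(float(raw))
    # None signals that a metric had no usable values at all.
    return {metric: (mean(v) if v else None) for metric, v in values.items()}
```

Usage would look like `summarize_scores(open("backend/app/ragas_score.csv", newline="").read())`, assuming the file is read from the repository root.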