-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log requests and responses to/from OpenAI's API #25
Conversation
@yonromai 🔥 🙏 I was curious about that specific example in #24 (comment)
But feel free to ignore this request if it's complicated to get that. |
No, just had to nuke that puppy from the cache. (In case useful: link to prompt). Here's completion 1: The record describes the term "louping ill". The description indicates that this is a specific viral infectious disease that has defined symptoms and a specific mode of transmission. This disease appears to have a distinct clinical profile. Therefore, I would categorize it as a high precision term. High precision terms tend to represent specific, well-defined conditions with distinguishing clinical characteristics.
<END_OF_COT>
id|precision
EFO:0007348|high and completion 2: The disease term given is "louping ill", a viral infectious disease found in sheep and rarely humans. It is caused by the Louping ill virus, which is transmitted by the sheep tick, Ixodes ricinus. The infection can result in symptoms such as lethargy, muscle pains, fever, and focal neurological signs. Due to its specificity in terms of causing organism, transmission vector and symptoms, alongside its limited host range (primarily affecting sheep and very rarely humans), it represents a more definite and specific group of infected individuals. Thereby, it meets the criteria of a high precision term.
<END_OF_COT>
id|precision
EFO:0007348|high Obvious note: The outcome is potentially different than last execution since non-deterministic |
@yonromai nice 🙏 ! Man, world post-GPT is going to be very hard to debug ... 🤣Now is this dramatic change from all |
😭 I wonder if it's a sign that we should lower the model temperature |
Important please don't let my comments distract you from the #13. Feel free to ignore my comments/requests or respond in a week or two :) @yonromai actually looking at the notebook where this example came from in #24, it was the original prompt that classified it as all
It would be cool if this was sorted by the distance from the true label, such that we could focus on the problematic examples. |
Sounds good, I'd be quick to check but probably wise to postpone it until after the workshop - I'll merge this PR for now. |
This PR adds persistence on disk of all actual requests and responses to/from OpenAI's API by default. (I don't know why I didn't implement this earlier - it's been bothering me for a while)
@ravwojdyla
I fetched 50 more samples with the CoT prompt and used the new logging code to exact the prompt and response choices (#1, #2)
Note: If you want the explanations for some specific nodes (e.g. based on worst errors), I can nuke the cache and rerun the tagging for all 500 nodes so we have the CoT explanation for all of them.
(cc: @eric-czech @dhimmel )