Missing references when using run_storm_wiki_gpt_with_VectorRM.py #301

Open
danilotpnta-elsevier opened this issue Jan 12, 2025 · 2 comments

Comments

@danilotpnta-elsevier

Describe the bug
Hi, when running run_storm_wiki_gpt_with_VectorRM.py with the offline Qdrant knowledge base, the references in the generated article are missing.

To Reproduce
Report following things

  1. Topic:
  • Brain Waves
  2. Output generated:
  • conversation_log.json
  • direct_gen_outline.txt
  • llm_call_history.jsonl
  • raw_search_results.json
  • run_config.json
  • storm_gen_article_polished.txt
  • storm_gen_article.txt
  • storm_gen_outline.txt
  • url_to_info.json

This is the last paragraph generated by STORM in storm_gen_article_polished.txt

## Challenges and Considerations

Despite the promising advancements, there are challenges associated with EEG research. The technique lacks spatial resolution, making it difficult to pinpoint specific brain regions involved in particular tasks or sensory events[5]. This limitation has led to a preference among some neurologists for graphic records that provide clearer localization of brain activity. Nevertheless, the integration of EEG with other neuroimaging techniques may offer a more comprehensive understanding of brain processes in the future[10].

And this is an extract of what I see in the url_to_info.json

{"url_to_unified_index": {"S0309174016300705_AgriBio": 2, "B9780081009154000026": 4, "B9780128196281000146": 3, "B978012512504850010X": 6, "B9780323952255000134": 5, "B9781845690052500206": 7, "S2590157524004309": 1, "S1466856418314449": 9, "B9780323482530000131": 8, "B9780123851031000117": 10}, "url_to_info": {"S0309174016300705_AgriBio": {"url": "S0309174016300705_AgriBio", "description": "Default description", "snippets": [". Alpha waves (8 to 13 Hz) are associated with relaxed wakefulness in adult humans and increase further with pharmacologically-induced unconsciousness. An isoelectric EEG is flat, it shows no activity. Depending on the circumstances, an isoelectric EEG is not always irreversible, but in general, we consider that the brain is dead.", "An electroencephalogram (EEG) allows visualizing the electrical activity of the brain and can be established using electrodes placed on the scalp or skull. Brain waves may have very different frequencies ranging from 0.1 to more than 100 Hz ( Pirrotta, 2011 ). There are several classes of brain wave frequencies. Fast frequencies correspond to beta (13 to 25) and gamma (25 to 60 Hz) waves. They are associated with a high state of vigilance, or cognitive activity"], "title": "Brain Waves", "meta": {"query": "brain wave therapy for mental health treatment"}, "citation_uuid": -1}, "B9780081009154000026": {"url": "B9780081009154000026", "description": "Default description", "snippets": [". They also note that the EEG waves found during day 18\u201320 of incubation are similar to those noted in chick sleeping patterns, further providing evidence of a sleep-like stage prior to hatch. However, the authors also note that this is not a clear area of science", ". The brain waves, measured via electroencephalograph waves, are initiated as early as day 13\u201314 of incubation, then go through a progressive developmental series in the embryo. 
Erratic spikes appear by day 15, and by the 18th day of incubation, EEG waves similar to slow/fast sleep waves appear. By the 19th\u201320th day, the EEG waves become similar to the waves noted in the hatchling during sleep"], "title": "Brain Waves", "meta": {"query": "brain wave patterns and mental health disorders"}, "citation_uuid": -1}, "B9780128196281000146": {"url": "B9780128196281000146", "description": "Default description", "snippets": ["., 2011; Myers et al., 2014; Iemi et al., 2019 ). While moving from theta to alpha waves is assumed to bridge the gap between the conscious and subconscious mind in humans, beta and gamma waves appear to increase conscious focus, problem solving, supporting learning and memory processes ( Palva and Palva, 2007 ). It was shown that brain oscillations can be entrained to rhythmic sensory stimulations, affecting perception and task performance ( Henry et al", ". However, it is not known, to what extent sensory neurons contribute to respective brain waves. In conclusion, in mammalian brains it is still not possible to determine which molecular properties of single oscillator neurons underlie the different brain waves. Insect neurons such as hawkmoth ORNs or central brain neurons in bees ( Popov and Szyszka, 2020 ) or cockroaches ( Rojas et al"], "title": "Brain Waves", "meta": {"query": "delta theta alpha beta gamma brain waves explained"}, "citation_uuid": -1}, "B978012512504850010X":

Any idea why the references are not generated correctly? In the article snippet above I can see a [5] and a [10]. However, the final article does not have any references generated at the end.

Thanks for the help guys! Let me know which output you need to see to debug this further.

@danilotpnta-elsevier
Author

[Update]: I followed the guide on run-storm-with-your-own-corpus. However, the article still lacks a properly formatted references section at the end. Below is the output generated using the vector_store.zip provided:

@Yucheng-Jiang
Collaborator

However, the article still lacks a properly formatted references section at the end

I took a look at your log files, and the generated result seems to be as expected: the article doesn't come with a reference section (which would be nice to have, of course).

Here are some quick pointers to resolve this on your side in the meantime; we might implement this feature in the future:

  • Take a look at url_to_info.json. The url_to_unified_index field in that file maps each URL to its reference index.
  • Then use the URL as a key to retrieve the detailed information from the url_to_info field, and format the reference section as you see fit.
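
The two steps above could be sketched roughly as follows. This is only a minimal illustration, not something STORM provides: the function names build_reference_section and append_references, the "# References" heading, and the "[n] title. url" line format are all assumptions chosen for the example. It relies only on the url_to_unified_index and url_to_info fields (and the "title" key) visible in the extract earlier in this thread.

```python
import json
import re


def build_reference_section(article_text, url_to_info_data):
    """Return one reference line per citation number found in article_text."""
    # url_to_unified_index maps source key -> citation number; invert it.
    index_to_url = {
        idx: url
        for url, idx in url_to_info_data["url_to_unified_index"].items()
    }
    # Collect the citation numbers that actually appear in the text, e.g. [5].
    cited = sorted({int(n) for n in re.findall(r"\[(\d+)\]", article_text)})

    lines = []
    for idx in cited:
        url = index_to_url.get(idx)
        if url is None:
            # The article cites a number that the index does not know about.
            lines.append(f"[{idx}] (no entry in url_to_unified_index)")
            continue
        info = url_to_info_data["url_to_info"].get(url, {})
        title = info.get("title", "Untitled")
        # Hypothetical line format; adapt it to whatever style you need.
        lines.append(f"[{idx}] {title}. {url}")
    return "\n".join(lines)


def append_references(article_path, url_to_info_path):
    """Read the article and url_to_info.json, return the article plus references."""
    with open(url_to_info_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(article_path, encoding="utf-8") as f:
        article = f.read()
    references = build_reference_section(article, data)
    return article.rstrip() + "\n\n# References\n\n" + references + "\n"


# Small in-memory demo; the sentence is made up, the keys come from the extract.
sample_data = {
    "url_to_unified_index": {
        "S0309174016300705_AgriBio": 2,
        "B9780081009154000026": 4,
    },
    "url_to_info": {
        "S0309174016300705_AgriBio": {"title": "Brain Waves"},
        "B9780081009154000026": {"title": "Brain Waves"},
    },
}
print(build_reference_section("Alpha waves span 8 to 13 Hz[2][4].", sample_data))
```

To use it on real output, something like append_references("storm_gen_article_polished.txt", "url_to_info.json") should work, with the paths adjusted to your own output directory. Since every entry in the extract has the title "Brain Waves" and a default description, you may want to include a snippet or your own metadata in each line to tell the sources apart.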

Hope it helps!
