Currently, the dictionary is re-generated every time an ontology (submission) is processed. This process takes over an hour because it retrieves a huge data structure from Redis in a single call:
https://github.com/ncbo/ncbo_annotator/blob/master/lib/ncbo_annotator.rb#L122
There is room for optimization here. Possible avenues to pursue:
Incremental dictionary file population
We may not need to rebuild the dictionary file for the entire system on every ontology parse. Updating it incrementally instead may drastically improve performance; a sketch of this idea follows.
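For illustration, here is a minimal sketch of incremental population. The Redis key, file path, and entry format are assumptions for this sketch, not ncbo_annotator's actual layout; only the entries produced by the current submission are written, rather than regenerating the whole dictionary.

require 'redis'

redis       = Redis.new
dict_holder = 'annotator:dictionary'   # assumed Redis hash key
dict_file   = 'dictionary.txt'         # assumed on-disk dictionary file

# Entries produced by parsing a single submission (illustrative data).
new_entries = {
  'class:1' => 'melanoma',
  'class:2' => 'neoplasm'
}

# Update only this submission's entries in Redis...
redis.pipelined do |pipe|
  new_entries.each { |id, label| pipe.hset(dict_holder, id, label) }
end

# ...and append them to the dictionary file instead of rewriting it.
File.open(dict_file, 'a') do |f|
  new_entries.each { |id, label| f.puts("#{id}\t#{label}") }
end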
Retrieve data from Redis in an iterative way:
Instead of using all = redis.hgetall(dict_holder), it's possible to iterate over the data structure with the SCAN family of commands (HSCAN for hashes):
cursor = "0"
loop do
  # HSCAN returns the next cursor plus a batch of [field, value] pairs
  cursor, key_values = redis.hscan(dict_holder, cursor, count: 1000)
  key_values.each do |key, value|
    # process this batch of dictionary entries here
  end
  @logger.info(cursor) # log progress
  break if cursor == "0" # HSCAN signals a complete pass with a "0" cursor
end
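With this approach, memory use is bounded by the batch size rather than by the full dictionary, since HGETALL builds the entire hash into a single reply. Note that count is only a hint to Redis about how much work to do per call, and the loop must continue until the returned cursor is "0" to guarantee a full pass over the hash.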