Hi @wammy19, thanks for the interesting questions! Regarding the IDs: Qdrant was intended to be used as an indexing engine, so in most applications it should be given the existing IDs you already have in your data storage. As you correctly noticed, the original ID may be a string, which Qdrant currently supports only as payload, so you would still need an integer ID. In the future, we are planning to add support for string IDs as well: https://github.com/qdrant/qdrant/projects/1#card-52881607 What I can propose in your case depends on the number of vectors you are planning to index. Qdrant uses [...] According to the approximation equation [...] Regarding [...]
-
Hi again.
I'm wondering what, in your opinion, is the best way to index a constant stream of data?
I've experimented with using the API to index one document at a time, but I'm not too confident this is robust, as I have to keep track of the ID on the client side. I need to make sure no document gets overwritten, and there doesn't seem to be any auto-increment of the ID with this method. Is there a way to have the ID auto-increment on Qdrant's side? Alternatively, Elasticsearch allows strings as IDs, meaning you can use things such as NanoID to generate random IDs that have a low probability of being regenerated.
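As a possible stopgap for the client-side bookkeeping, here is a minimal sketch that derives a stable integer ID from whatever string key already exists in the source data, so no counter has to be tracked. It assumes that IDs fitting in an unsigned 64-bit integer are acceptable, which I have not verified; `string_to_point_id` and the example key are hypothetical names, and this is not a Qdrant feature. Collisions are very unlikely but not impossible, so the original string is kept alongside as payload.

```python
import hashlib


def string_to_point_id(key: str) -> int:
    """Map a string key to a deterministic integer in [0, 2**64).

    Assumption: unsigned 64-bit integers are valid point IDs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Keep only the first 8 bytes so the result fits in 64 bits.
    return int.from_bytes(digest[:8], byteorder="big")


# Hypothetical string ID coming from the existing data store.
doc_key = "article-2021-11-03-a"
point_id = string_to_point_id(doc_key)

# Store the original string as payload so it can be recovered from a hit.
payload = {"original_id": doc_key}
print(point_id, payload)
```

Because the mapping is deterministic, re-sending the same document produces the same ID rather than a new one, which may or may not be the behaviour wanted for a stream.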
Code:
I was also looking at the Python client's upload_collection() method, but this seems intended for the initial upload of a collection. This method is also throwing an error:
ValueError: not enough values to unpack (expected 2, got 1)
The error is coming from line 133 in qdrant_client.py:
num_vectors, _dim = vectors.shape
The error makes sense, as I'm passing in an np.array with a shape of (768,) (768 because I'm using the BERT model for encoding), but I can't seem to get the data into the shape the client wants. If you could provide any advice here that would be helpful, but as I mentioned before, this method probably isn't intended for indexing one document at a time.
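The unpacking failure can be reproduced with NumPy alone, which suggests that the line quoted above expects a 2-D array of shape (num_vectors, dim). A minimal sketch, assuming that is the only requirement; the zero vector is a stand-in for a real BERT embedding, and I have not checked whether upload_collection() accepts the result in every client version:

```python
import numpy as np

# Stand-in for a single BERT embedding; shape is (768,).
embedding = np.zeros(768, dtype=np.float32)

# Unpacking a 1-D shape into two names fails, matching the reported error.
try:
    num_vectors, dim = embedding.shape
    unpack_failed = False
except ValueError:
    unpack_failed = True

# Add a leading batch axis so the array holds one row of 768 values.
batch = embedding.reshape(1, -1)
num_vectors, dim = batch.shape

# Several embeddings can be stacked into one 2-D array the same way.
stacked = np.stack([embedding, embedding])
print(batch.shape, stacked.shape)
```

`embedding[np.newaxis, :]` would give the same (1, 768) shape as the reshape call.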
Code:
Any response is much appreciated :)