[BUG] Index corruption while using Vector #2724

@hez2010

Version
6.0.0-prerelease.73

Describe the bug
I'm using the vector search feature in LiteDB 6.0, but the vector index sometimes becomes corrupted, especially when records containing embeddings are inserted concurrently. Once that happens, dropping the collection throws:

 LiteDB.LiteException: invalid segment position in index footer: {0}/{1}
         at LiteDB.Constants.ENSURE(Boolean conditional, String format, Object[] args)
         at LiteDB.Engine.BasePage.Get(Byte index)
         at LiteDB.Engine.VectorIndexPage.GetNode(Byte index)
         at LiteDB.Engine.VectorIndexService.GetNode(PageAddress address)
         at LiteDB.Engine.VectorIndexService.ClearTree(VectorIndexMetadata metadata)
         at LiteDB.Engine.VectorIndexService.Drop(VectorIndexMetadata metadata)
         at LiteDB.Engine.Snapshot.DropCollection(Action safePoint)
         at LiteDB.Engine.LiteEngine.<>c__DisplayClass1_0.<DropCollection>b__0(TransactionService transaction)
         at LiteDB.Engine.LiteEngine.AutoTransaction[T](Func`2 fn)
         at LiteDB.Engine.LiteEngine.DropCollection(String name)
         at LiteDB.SharedEngine.<>c__DisplayClass21_0.<DropCollection>b__0()
         at LiteDB.SharedEngine.QueryDatabase[T](Func`1 Query)
         at LiteDB.SharedEngine.DropCollection(String name)
         at LiteDB.LiteDatabase.DropCollection(String name)

where LiteDB.Engine.BasePage.Get is:

	public BufferSlice Get(byte index)
	{
		Constants.ENSURE(ItemsCount > 0, "should have items in this page");
		Constants.ENSURE(HighestIndex != byte.MaxValue, "should have at least 1 index in this page");
		Constants.ENSURE(index <= HighestIndex, "get only index below highest index");
		int offset = CalcPositionAddr(index);
		int offset2 = CalcLengthAddr(index);
		ushort num = _buffer.ReadUInt16(offset);
		ushort num2 = _buffer.ReadUInt16(offset2);
		Constants.ENSURE(IsValidPos(num), "invalid segment position in index footer: {0}/{1}", this, index);
		Constants.ENSURE(IsValidLen(num2), "invalid segment length in index footer: {0}/{1}", this, index);
		return _buffer.Slice(num, num2);
	}

the context is:

Image

Apparently the vector index was corrupted: both num and num2 are 0 here.

Code to Reproduce
I don't have an exact repro, but our application does something like this:

var collection = db.GetCollection<Record>("records");
// Insert many records with embeddings in parallel
Parallel.For(0, 1000, i =>
{
    var record = new Record { Embedding = ... };
    collection.Upsert(record);
});
// Use the collection for vector search
var result = collection.Query().TopK(...).ToArray();
// Finally, drop the collection (DropCollection is called on the database, as in the stack trace)
db.DropCollection("records");

Expected behavior
No exception should be thrown.
