Update SearchClient to use webPageUrl instead of static fileName for webpages (fixes #491) (#521)

## Motivation and Context (Why the change? What's the scenario?)

When webpages are provided as facts, the "filename" is currently always the
static value "content.url". This provides no value when asking the LLM to
include sources directly in the response (e.g. to have per-paragraph sources).

Update SearchClient to use webPageUrl instead of static fileName for
webpages.

## High level description (Approach, Design)

When creating the facts, the webpage URL is used in place of the static
"content.url" file name.
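
The substitution can be sketched in isolation as follows. This is a minimal illustration, not the code in `SearchClient.cs`: the `FactFormatter` class and the `ResolveSourceName` and `BuildFact` helpers are hypothetical names introduced here, and only the comparison against "content.url" and the fact format string are taken from the diff.

```csharp
using System;

// Hypothetical helper class, for illustration only (not part of the commit).
public static class FactFormatter
{
    // Static file name that, per the description above, is recorded for web pages.
    private const string WebPagePlaceholder = "content.url";

    // Returns the web page URL when the file name is the static placeholder,
    // otherwise returns the file name unchanged.
    public static string ResolveSourceName(string fileName, string webPageUrl)
    {
        return fileName == WebPagePlaceholder ? webPageUrl : fileName;
    }

    // Builds the fact text using the same format string as the diff.
    public static string BuildFact(string fileName, string webPageUrl, float relevance, string partitionText)
    {
        string source = ResolveSourceName(fileName, webPageUrl);
        return $"==== [File:{source};Relevance:{relevance:P1}]:\n{partitionText}\n";
    }
}
```

With this sketch, a fact built from a web page carries the page URL in its `File:` field, while facts from uploaded documents keep their original file name. If the URL lookup could ever come back empty (an assumption, the diff does not show this), a guard that falls back to the file name might be worth adding.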

Co-authored-by: Michael Keller <[email protected]>
chaelli and michaelkellerviu authored May 27, 2024
1 parent 9732e74 commit 17d73e0
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion service/Core/Search/SearchClient.cs
@@ -238,6 +238,8 @@ public async Task<MemoryAnswer> AskAsync(

string fileName = memory.GetFileName(this._log);

+ string webPageUrl = memory.GetWebPageUrl(index);
+
var partitionText = memory.GetPartitionText(this._log).Trim();
if (string.IsNullOrEmpty(partitionText))
{
@@ -248,7 +250,7 @@ public async Task<MemoryAnswer> AskAsync(
factsAvailableCount++;

// TODO: add file age in days, to push relevance of newer documents
- var fact = $"==== [File:{fileName};Relevance:{relevance:P1}]:\n{partitionText}\n";
+ var fact = $"==== [File:{(fileName == "content.url" ? webPageUrl : fileName)};Relevance:{relevance:P1}]:\n{partitionText}\n";

// Use the partition/chunk only if there's room for it
var size = this._textGenerator.CountTokens(fact);