This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Syntactic indexing with heuristics #58727

Closed
4 of 20 tasks
varungandhi-src opened this issue Dec 4, 2023 · 1 comment
Labels
Epic feature Tracking issues for a feature graph/backend Related to Go code in the backend graph/frontend Related to client-side code (web frontend etc.) team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) tracking

Comments

@varungandhi-src
Contributor

varungandhi-src commented Dec 4, 2023

Subpart of https://github.com/sourcegraph/sourcegraph/issues/58005

For this quarter:

Stretch goals (these will most likely be pushed to the next quarter)

  • Update worker with policies & scheduling for batch indexing jobs
  • New GraphQL resolver or fields to request batch indexing
  • Database schema changes (+ associated migration) for tracking heuristics-based indexes
  • Changes to frontend code to prefer heuristics-based results before falling back to text-based search
  • Observability
  • Analytics
  • User-facing documentation

Once the feature works end-to-end (which should hopefully be sometime next quarter), we can start rolling out support for the most popular languages first: Java, TypeScript, JavaScript, Python, Go.

@varungandhi-src varungandhi-src added team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) feature Tracking issues for a feature graph/backend Related to Go code in the backend graph/frontend Related to client-side code (web frontend etc.) labels Dec 4, 2023
@keynmol keynmol changed the title Batch indexing with heuristics Syntactic indexing with heuristics Feb 7, 2024
@varungandhi-src
Contributor Author

varungandhi-src commented Mar 11, 2024

Mar 11 2024 project update:

  • Status: We're making incremental progress, but the project is at risk of not being completed this quarter. A lot of work remains (deployments, new GraphQL API design), we continue to get support requests, and the team's capacity will be reduced in April (parental leave).
  • Velocity: Onboarding has been harder than I initially estimated, because of the complexity of the existing backend code (some of it related to scaling concerns) and unfamiliar coding patterns with lots of indirection.
  • Big change in implementation direction for one sub-part: We've decided to change the implementation strategy for symbol matching/lookup to filtering results from searcher/Zoekt with syntactic data. About 2 weeks ago, we learned that the existing code for storing SCIP symbols is not well suited to the kind of fuzzy matching we want to do, and that storing all the symbols directly would require too much storage to be practical.
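
To illustrate the new direction in the last bullet, here is a minimal sketch of filtering text-search results with syntactic data. It is not the actual implementation: `TextMatch`, `SyntacticOccurrence`, and `filter_matches` are hypothetical names, and the sketch assumes that both the text search and the syntactic index report ranges as (path, line, start column, end column) and that a match is kept only when its range coincides exactly with an occurrence of the wanted symbol.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TextMatch:
    """A raw text-search hit (hypothetical shape, not a real searcher/Zoekt type)."""
    path: str
    line: int   # 0-based line number
    start: int  # 0-based start column, inclusive
    end: int    # end column, exclusive


@dataclass(frozen=True)
class SyntacticOccurrence:
    """An identifier occurrence from a syntactic index (hypothetical shape)."""
    path: str
    line: int
    start: int
    end: int
    symbol: str


def filter_matches(
    matches: Iterable[TextMatch],
    occurrences: Iterable[SyntacticOccurrence],
    symbol: str,
) -> List[TextMatch]:
    """Keep only the text matches whose range is an occurrence of `symbol`."""
    wanted = {
        (o.path, o.line, o.start, o.end)
        for o in occurrences
        if o.symbol == symbol
    }
    return [m for m in matches if (m.path, m.line, m.start, m.end) in wanted]
```

With this shape, a text hit inside a comment or string literal is dropped because the syntactic index has no occurrence at that range, and a hit on a same-named identifier that resolves to a different symbol is dropped because its `symbol` differs. Only the occurrences for the files that appear in the search results are needed, which is what avoids storing every symbol up front.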

Some other challenges that were either unexpected or took longer than expected:

  • There was an unexpected (minor) security issue we discovered when looking at some existing code; fixing that took some time.
  • The Linear trial has taken up more of Varun's time than planned due to bugs in Linear's issue-importing functionality.

@mmanela mmanela added the Migrated label May 6, 2024 — with Linear
@mmanela mmanela changed the title Syntactic indexing with heuristics Syntactic indexing with heuristics [Converted to "Syntactic indexing with heuristics" project] May 8, 2024
@mmanela mmanela changed the title Syntactic indexing with heuristics [Converted to "Syntactic indexing with heuristics" project] Syntactic indexing with heuristics May 8, 2024
@eseliger eseliger removed the Migrated label May 21, 2024
@mmanela mmanela closed this as not planned Aug 15, 2024

5 participants