fix(edit): Improve edit tool reliability and error messages #13868
Major improvements to address issue google-gemini#1028:

1. **Replace sync fs operations with async** (editCorrector.ts)
   - Changed fs.statSync to fs.promises.stat
   - Prevents blocking on Windows and other platforms
   - Adds proper error handling for stat failures

2. **Enhanced error messages with actionable guidance**
   - Detailed list of common failure causes
   - Shows preview of searched string
   - Provides specific remediation steps

3. **Add similarity-based content hints**
   - New findSimilarContent() method finds close matches
   - Helps model locate intended edit location
   - Shows context around similar content (up to 3 lines)
   - Uses character-level similarity scoring

4. **Improved LLM correction flow**
   - Better fallback when external file modifications detected
   - More robust error handling in stat operations

These changes significantly improve the edit tool's reliability by:
- Reducing false negatives through better matching
- Providing actionable feedback when edits fail
- Helping the model self-correct with similarity hints
- Eliminating blocking sync operations

Fixes google-gemini#1028
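Item 1 above (swapping `fs.statSync` for `fs.promises.stat` with error handling) can be sketched roughly as follows. The helper name `getMtimeMs` and its null-on-failure contract are assumptions made for illustration; they are not taken from editCorrector.ts.

```typescript
import { promises as fs } from 'fs';

// Hypothetical helper (not from the PR): returns the file's last-modified
// time in milliseconds, or null when the file cannot be stat'ed.
async function getMtimeMs(filePath: string): Promise<number | null> {
  try {
    // Non-blocking: the event loop keeps running while the OS answers,
    // unlike fs.statSync, which blocks until the call returns.
    const stats = await fs.stat(filePath);
    return stats.mtimeMs;
  } catch {
    // Missing files (ENOENT), permission errors, etc. end up here, so the
    // caller can choose a fallback instead of the whole edit flow throwing.
    return null;
  }
}
```

Returning `null` rather than rethrowing is one possible design; a caller that needs to distinguish a missing file from a permission problem would inspect the error code instead.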
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes
Hello @JuanCS-Dev, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the reliability and user experience of the edit tool by addressing issues that led to failed edits. It introduces asynchronous file system operations to avoid blocking the event loop, provides detailed and actionable error messages to guide users and the model, and adds similarity-based hints that help locate the intended edit location when an exact match is not found. These changes aim to reduce false negatives and improve the model's ability to self-correct.
Code Review
This pull request introduces significant improvements to the edit tool's reliability and error feedback, which is a great step forward. The move to asynchronous file operations and the addition of more detailed error messages with similarity hints will definitely enhance the user experience and model performance. However, I've identified a critical issue in the new similarity calculation logic that would prevent it from working as intended. My review includes a detailed comment and a code suggestion to address this.
private calculateSimilarity(str1: string, str2: string): number {
  if (str1 === str2) return 1.0;
  if (str1.length === 0 || str2.length === 0) return 0;

  // Normalize whitespace for comparison
  const s1 = str1.replace(/\s+/g, ' ').toLowerCase();
  const s2 = str2.replace(/\s+/g, ' ').toLowerCase();

  if (s1 === s2) return 0.95; // High score but not perfect (whitespace differs)

  // Count matching substrings
  const minLen = Math.min(s1.length, s2.length);
  const maxLen = Math.max(s1.length, s2.length);
  let matches = 0;

  for (let i = 0; i < minLen; i++) {
    if (s1[i] === s2[i]) matches++;
  }

  return matches / maxLen;
}
The current implementation of calculateSimilarity only compares the two strings position by position, starting from index 0. This is not a robust measure of similarity and will fail to identify similar lines if the common text doesn't start at the same offset in both. For example, calculateSimilarity("foo bar", "another foo bar") would return only about 0.13 (2 of 15 positions happen to coincide), even though the first string appears verbatim inside the second.
The comment // Count matching substrings is also misleading, as the code only counts characters that match at identical positions.
This is a significant issue as it undermines the new similarity hint feature. I've suggested a more robust implementation using the Longest Common Substring algorithm, which will correctly identify similarities anywhere in the strings.
private calculateSimilarity(str1: string, str2: string): number {
  if (str1 === str2) return 1.0;
  if (str1.length === 0 || str2.length === 0) return 0;

  // Normalize whitespace for comparison
  const s1 = str1.replace(/\s+/g, ' ').toLowerCase();
  const s2 = str2.replace(/\s+/g, ' ').toLowerCase();

  if (s1 === s2) return 0.95; // High score but not perfect (whitespace differs)

  // The original logic only compared prefixes. This implementation uses
  // dynamic programming to find the longest common substring, which is a more
  // robust measure of similarity.
  const m = Array(s1.length + 1)
    .fill(0)
    .map(() => Array(s2.length + 1).fill(0));
  let longest = 0;
  for (let i = 1; i <= s1.length; i++) {
    for (let j = 1; j <= s2.length; j++) {
      if (s1[i - 1] === s2[j - 1]) {
        m[i][j] = m[i - 1][j - 1] + 1;
        if (m[i][j] > longest) {
          longest = m[i][j];
        }
      }
    }
  }

  // Normalize by the average length of the two strings for a score between 0 and 1.
  return (2 * longest) / (s1.length + s2.length);
}

Addresses review comment from gemini-code-assist bot.
Previous implementation only compared characters at identical positions, which would fail
for cases like calculateSimilarity('foo bar', 'another foo bar'), scoring them near 0.
New implementation uses dynamic programming to find the longest common
substring anywhere in the strings, providing more robust similarity detection.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
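To make the difference between the two scoring approaches concrete, here is a small standalone sketch that runs both on the example from the review. The function names `positionalScore` and `lcsScore` are illustrative only, the whitespace and case normalization is left out for brevity, and the longest-common-substring version keeps two rows instead of the full table as a memory-saving variation on the suggested code.

```typescript
// Position-by-position scoring, mirroring the loop in the original
// calculateSimilarity (normalization omitted).
function positionalScore(s1: string, s2: string): number {
  const minLen = Math.min(s1.length, s2.length);
  const maxLen = Math.max(s1.length, s2.length);
  if (maxLen === 0) return 0;
  let matches = 0;
  for (let i = 0; i < minLen; i++) {
    if (s1[i] === s2[i]) matches++;
  }
  return matches / maxLen;
}

// Longest-common-substring scoring, as in the suggested replacement,
// but keeping only the previous DP row.
function lcsScore(s1: string, s2: string): number {
  if (s1.length === 0 || s2.length === 0) return 0;
  let prev: number[] = new Array<number>(s2.length + 1).fill(0);
  let longest = 0;
  for (let i = 1; i <= s1.length; i++) {
    const curr: number[] = new Array<number>(s2.length + 1).fill(0);
    for (let j = 1; j <= s2.length; j++) {
      if (s1[i - 1] === s2[j - 1]) {
        // Extend the common run that ended at (i - 1, j - 1).
        curr[j] = prev[j - 1] + 1;
        longest = Math.max(longest, curr[j]);
      }
    }
    prev = curr;
  }
  // Twice the run length over the combined length gives a score in [0, 1].
  return (2 * longest) / (s1.length + s2.length);
}

console.log(positionalScore('foo bar', 'another foo bar').toFixed(3)); // "0.133"
console.log(lcsScore('foo bar', 'another foo bar').toFixed(3)); // "0.636"
```

The positional score is low because only two characters happen to line up by index, while the substring score reflects that all seven characters of `'foo bar'` occur contiguously in the longer string.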
462100f to 43251cc
Fixes #1028
Description
Major improvements to the edit tool to address the widespread reliability issues reported by users (74👍 reactions). This PR tackles the core problems causing "Failed to edit, could not find the string to replace" errors.
Root Causes Identified
fs.statSync() causing performance issues, especially on Windows
Changes
1. Replace Sync Operations with Async (editCorrector.ts)
fs.statSync(filePath) - blocking operation
await fs.promises.stat(filePath) - non-blocking
2. Enhanced Error Messages with Actionable Guidance (edit.ts)
Before:
After:
3. Add Similarity-Based Content Hints
New findSimilarContent() method provides intelligent fallback:
Example output when edit fails:
4. Improved LLM Correction Flow
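As a rough illustration of item 3, a similarity hint could be produced along these lines. This is a hypothetical sketch, not the PR's findSimilarContent(): the names `findSimilarLineSketch` and `SimilarHint`, the 0.6 threshold, and the choice to score only the first non-empty line of the search string are all assumptions. It does follow the two properties the description states: character-level scoring and at most three lines of context.

```typescript
// Hypothetical shape of a hint; not taken from the PR.
interface SimilarHint {
  lineNumber: number; // 1-based line of the best match
  score: number;      // similarity in [0, 1]
  context: string[];  // up to 3 lines around the match
}

// Character-level score based on the longest common substring.
function lcsSimilarity(a: string, b: string): number {
  if (a.length === 0 || b.length === 0) return 0;
  let prev: number[] = new Array<number>(b.length + 1).fill(0);
  let longest = 0;
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = new Array<number>(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) {
        curr[j] = prev[j - 1] + 1;
        longest = Math.max(longest, curr[j]);
      }
    }
    prev = curr;
  }
  return (2 * longest) / (a.length + b.length);
}

// Scans the file line by line and returns the closest match to the first
// non-empty line of the search string, or null if nothing clears the threshold.
function findSimilarLineSketch(
  fileContent: string,
  searchString: string,
  threshold = 0.6, // assumed cutoff, chosen for illustration
): SimilarHint | null {
  const target = searchString
    .split('\n')
    .map((l) => l.trim())
    .find((l) => l.length > 0);
  if (target === undefined) return null;

  const lines = fileContent.split('\n');
  let best: SimilarHint | null = null;
  for (let idx = 0; idx < lines.length; idx++) {
    const score = lcsSimilarity(target, lines[idx].trim());
    if (score >= threshold && (best === null || score > best.score)) {
      best = {
        lineNumber: idx + 1,
        score,
        // The line before, the match itself, and the line after.
        context: lines.slice(Math.max(0, idx - 1), idx + 2),
      };
    }
  }
  return best;
}
```

For example, searching `'function add(a, b) {\n  return a + b;\n}\n'` for `'return a+b;'` would point at line 2, since the two strings share the run `return a` despite the spacing difference.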
Testing
npm run build
Impact Assessment
Benefits
Risks
Related Issues
This directly addresses the most upvoted bug (#1028 - 74 reactions) affecting edit tool reliability.
Checklist
replace(edit) tool #1028)