Open
Description
Gemini 2.5 Computer Use Preview produces inaccurate click_at coordinates after a number of iterations. The output_text reasoning cites valid coordinates within the normalized 0-1000 range, but the click_at action in the function_call often carries coordinates above this maximum, which leads to incorrect clicks. The issue seems to appear after approximately 15 iterations. I am attaching the LLM log (output_text and function_call) from a recent instance: the reasoning targets the 'X' at (y=190, x=720), but the call is issued with y=139, x=1922.
Please let me know if you need any additional information for your investigation.
{
  "timestamp": "2025-10-17T08:17:47.122Z",
  "success": true,
  "id": "gcua_1760689067122_nhvkogrhj",
  "model": "gemini-2.5-computer-use-preview-10-2025",
  "output": [
    {
      "type": "output_text",
      "text": "Pressing 'escape' also did not close the modal. This is strange.\nThe modal \"Apply to Explore Group\" is preventing me from interacting with the page.\nMaybe I need to click the \"Dismiss\" button at the bottom right (y=940, x=720) if there is one, or try clicking outside the modal area again, very far from it. Or maybe the 'X' (y=190, x=720) again.\nIt looks like the 'X' is the only close option. I'll try it one more time, very carefully aiming for its center."
    },
    {
      "type": "computer_call",
      "call_id": "call_1760689067122_movb2b9yu",
      "action": {
        "type": "click_at",
        "y": 139,
        "x": 1922
      }
    }
  ],
  "usage": {
    "input_tokens": 9779,
    "output_tokens": 157
  },
  "finishReason": "STOP"
}
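For anyone hitting the same problem, a client-side guard can at least catch these calls before they are executed. The following is only a minimal sketch, assuming the 0-1000 normalized range described in this report. The function name `check_click` and the 1440x900 screen size are hypothetical and not part of any Gemini SDK.

```python
NORMALIZED_MAX = 1000  # upper bound of the normalized range, as described in this report


def check_click(action, screen_width, screen_height):
    """Return pixel (x, y) for a click_at action, or raise ValueError if out of range."""
    x, y = action["x"], action["y"]
    for name, value in (("x", x), ("y", y)):
        if not 0 <= value <= NORMALIZED_MAX:
            raise ValueError(f"{name}={value} is outside 0-{NORMALIZED_MAX}")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (x * screen_width // NORMALIZED_MAX,
            y * screen_height // NORMALIZED_MAX)


# The action from the log above; the screen size is an assumed example value.
logged_action = {"type": "click_at", "y": 139, "x": 1922}
try:
    check_click(logged_action, 1440, 900)
except ValueError as err:
    print(f"rejected: {err}")
```

Rejecting the call (and, for example, asking the model to retry) does not fix the underlying behavior, but it avoids clicking at a position the model's own reasoning never mentioned.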