ADDED WAIT OR NON OPERATION #229


Open: Koolkatze wants to merge 37 commits into base: main

@Koolkatze commented Feb 9, 2025

What does this PR do?

When a website or the screen doesn't load quickly (on a Raspberry Pi, for example), the SOC can now wait for some time for the page to load, without the previous bug that stopped the program from running.

Fixes #

operate.py now includes a wait (or none) operation.

Added the wait operation to prompts.py.

Also added a double click operation to operate.py and prompts.py.

Added Claude 3.7 and Qwen-VL to the list of usable multimodal models.

Multiplied the pyautogui click coordinates by 1.25 horizontally and by 1.50 vertically to adapt to a Windows screen at 1920x1080 resolution with 125% scaling.
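The coordinate adjustment described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `scale_click` and its signature are hypothetical, and the only values taken from the description are the 1.25 and 1.50 factors. The result would then be handed to a click call such as `pyautogui.click(x, y)`.

```python
# Hypothetical helper; the factors come from the PR description,
# everything else is an assumption made for illustration.
X_FACTOR = 1.25  # horizontal multiplier
Y_FACTOR = 1.50  # vertical multiplier


def scale_click(x, y, x_factor=X_FACTOR, y_factor=Y_FACTOR):
    """Return click coordinates adjusted by per-axis factors."""
    # Round to whole pixels, since screen coordinates are integers.
    return round(x * x_factor), round(y * y_factor)


if __name__ == "__main__":
    print(scale_click(100, 100))
```

Keeping the factors as parameters rather than hard-coded numbers would make it easier to adapt to other resolutions or scale settings.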

Added wait function
I uploaded working code that waits if the screen isn't loaded yet, and it works perfectly.
@joshbickett (Contributor)

@Koolkatze I think a wait command would be good, but if I understand correctly, this PR doesn't change the prompt, so the AI will never fire a wait operation. I think it would require that update.

@Koolkatze (Author)

Models in general don't follow the prompt perfectly, so it can happen (and it does happen) that the model invents a new operation called "wait" or "none", fails to continue its way through the code, and crashes. It would actually be good to describe the operation in the prompt, so that the model uses the exact word for it and doesn't invent a new "waiting" or "null" operation instead (which would also end in a crash). I will update the prompt as soon as I can so the model uses the operation with the correct words.
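As a rough illustration of the failure mode described here, one defensive option is to map the operation names a model tends to invent onto a single known name before dispatching. The alias table and the function name `normalize_operation` are hypothetical and are not taken from operate.py; the aliases are simply the words mentioned in this comment.

```python
# Hypothetical alias table: names a model might invent, mapped to the
# one operation name the dispatcher is assumed to understand.
WAIT_ALIASES = {"wait", "waiting", "none", "null"}


def normalize_operation(name):
    """Map invented wait-like names to "wait"; leave other names as-is."""
    cleaned = str(name).strip().lower()
    if cleaned in WAIT_ALIASES:
        return "wait"
    return cleaned


if __name__ == "__main__":
    for raw in ("Waiting", "null", "click"):
        print(raw, "->", normalize_operation(raw))
```

Describing the operation in the prompt remains the primary fix; a mapping like this would only soften the crash when the model still drifts from the exact word.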

Added the wait operation to the prompt so that, if the page isn't loaded yet, the model uses the operate.py "wait" operation and waits for 5 seconds.
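A minimal sketch of what handling such a wait operation could look like is shown below. It assumes the operation arrives as a dictionary with an optional `seconds` field, which may be a string; that shape, the function name `run_wait`, and the fallback behavior are assumptions made for illustration. Only the 5-second default comes from the commit message above.

```python
import time

DEFAULT_WAIT_SECONDS = 5  # default taken from the commit message


def run_wait(operation, sleep=time.sleep):
    """Pause for the requested number of seconds and return that number.

    `operation` is assumed to be a dict such as
    {"operation": "wait", "seconds": "5"}; this shape is hypothetical.
    """
    raw = operation.get("seconds", DEFAULT_WAIT_SECONDS)
    try:
        # Models often emit numbers as strings, so convert explicitly.
        seconds = int(raw)
    except (TypeError, ValueError):
        seconds = DEFAULT_WAIT_SECONDS
    seconds = max(seconds, 0)  # never pass a negative duration to sleep
    sleep(seconds)
    return seconds


if __name__ == "__main__":
    print(run_wait({"operation": "wait", "seconds": "1"}))
```

The `sleep` parameter is injectable so that the function can be exercised without actually pausing.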
@Koolkatze (Author)

Just added the wait operation to all variations of the prompt.

Corrected and finished adding wait operation to prompt.
Made the prompt more coherent
new model (Claude 3.7) explanation
implementing Claude 3.7 and Qwen-VL API KEY request
few adjustments
added Claude 3.7 and Qwen-VL call functions
No changes needed for Claude 3.7 model prompt selection; implemented Qwen-VL model prompt selection instead
no changes needed for the Claude 3.7 screenshot function; added a compressed screenshot function for Qwen-VL instead
indentation correction
indentation fix
added coherence to the code
changed string to integer in wait operation seconds
changed call_claude_37 function to correctly handle requests
added a scaling factor multiplier and divider, added a double click operation, and left a couple of ideas behind, commented out in case somebody needs them.
added reliable Claude 3.7 usability
added double click functionality
You can place the mouse over an icon and press Enter to take a screenshot of the icon's area; the PNG will be saved to a folder for later use in guiding the SOC through icons.
Koolkatze added 10 commits March 1, 2025 21:17
Small spelling correction
larger scroll integer so it scrolls faster
now the scroll scrolls faster
had to redo some lines for the scroll to work well
sorry, better now
changes to the scroll prompt to better suit our needs
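For context on the scroll commits above: `pyautogui.scroll` takes an integer number of "clicks", where a positive value scrolls up and a negative value scrolls down, so a larger magnitude covers more of the page per call. The helper below is a hypothetical sketch of turning a direction into that signed integer. The name `scroll_clicks` and the default amount of 10 are assumptions, not values from this PR.

```python
def scroll_clicks(direction, amount=10):
    """Return the signed click count for a scroll call.

    Positive means up and negative means down, matching the sign
    convention of pyautogui.scroll. The default amount is illustrative.
    """
    if direction == "up":
        return abs(amount)
    if direction == "down":
        return -abs(amount)
    raise ValueError(f"unknown scroll direction: {direction!r}")


if __name__ == "__main__":
    print(scroll_clicks("down", 10))
```

The returned value could then be passed to `pyautogui.scroll(...)`; how far one click moves the page depends on the platform.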
2 participants