-
Notifications
You must be signed in to change notification settings - Fork 638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prompt used for os environment benchmark? #725
Comments
The prompt used for AndroidWorld can be found in Qwen2.5-VL/cookbooks/mobile_agent.ipynb. Notably, to fit the action space in AndroidWorld evaluation, we add a new action
|
Hi @cjfcsjt Thank you for raising this question! The following script is my
Code still untidy a little bit, clean it if you need. Feel free to explore the code and let me know if you have any questions or need further clarification. I hope this helps with your benchmarking efforts! Best regards, |
@Timothyxxx @LukeForeverYoung
|
For your question 2, when benchmarking OSWorld from prompting qwen2.5-vl, the thought is included in the response, for example:
|
For tasks in mobile scenarios, you could ask Qwen2.5-VL to think and provide summaries. Here is an example prompt:
We leave the summaries in conversation histories by default. It should work well if you organize the summaries into the prompts, similar to mainstream UI-Agent frameworks. |
What is the prompt used for the os environment benchmark? (e.g., Android world, OSworld)
The text was updated successfully, but these errors were encountered: