Commit 3e87f45

quanru and yuyutaotao authored
Docs/1.1 (#1723)
* docs(site): 1.1 changelog
* fix(docs): yaml script examples for Android automation
* docs(core): update changelog
* chore(docs): remove outdated v1.0 section from changelog

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
1 parent e697b43 commit 3e87f45

6 files changed, +100 -11 lines changed

apps/site/docs/en/automate-with-scripts-in-yaml.mdx

Lines changed: 2 additions & 4 deletions
@@ -264,13 +264,11 @@ android:
 tasks:
   - name: Launch Settings app
     flow:
-      - launch:
-          uri: com.android.settings
+      - launch: com.android.settings

   - name: Open webpage
     flow:
-      - launch:
-          uri: https://www.example.com
+      - launch: https://www.example.com
 ```

 ### The `ios` part
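
Note on the hunk above: it replaces the nested `launch` / `uri` form with a single string. As a minimal sketch of how an older script step could be migrated, the TypeScript below accepts either form and returns the string shorthand. `toLaunchShorthand` and `describeLaunchTarget` are hypothetical names, not Midscene APIs, and the URL-versus-package heuristic is an assumption made purely for illustration.

```typescript
// Hypothetical helper types; Midscene does not export these.
type LegacyLaunch = { uri: string };
type LaunchStep = string | LegacyLaunch;

// Convert either form of a `launch` step to the string shorthand
// shown in the updated YAML examples.
function toLaunchShorthand(step: LaunchStep): string {
  const target = typeof step === 'string' ? step : step.uri;
  const trimmed = target.trim();
  if (trimmed.length === 0) {
    throw new Error('launch target must not be empty');
  }
  return trimmed;
}

// Assumed heuristic: a value with a scheme such as `https://` is a URL,
// anything else is treated as an app package name.
function describeLaunchTarget(target: string): 'url' | 'package' {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(target) ? 'url' : 'package';
}
```

For example, `toLaunchShorthand({ uri: 'com.android.settings' })` yields the string that the new `launch: com.android.settings` line carries directly.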

apps/site/docs/en/changelog.mdx

Lines changed: 45 additions & 0 deletions
@@ -1,5 +1,50 @@
 # Changelog

+## v1.1 - `aiAct` deep thinking and extensible MCP SDK
+
+v1.1 optimizes model planning capabilities and MCP extensibility, making automation more stable in complex scenarios while providing more flexible solutions for enterprise MCP service deployments.
+
+### `aiAct` can enable deep thinking (deepThink)
+
+When deep thinking is enabled in `aiAct`, the model will interpret intent more thoroughly and optimize its planning results. This is suited for complex forms, multi-step flows, and similar scenarios. It improves accuracy but increases planning latency.
+
+Currently supported: Qwen3-vl on Alibaba Cloud and Doubao-vision on Volcano Engine. See [Model strategy](./model-strategy) for details.
+
+Example usage:
+
+```typescript
+await agent.aiAct('If the UI shows an "Add shipping address" button, expand the existing "Shipping address" list and select the last item', { deepThink: true });
+```
+
+### MCP extension and SDK exposure
+
+Developers can use the MCP SDK exposed by Midscene to flexibly deploy a public MCP service. This capability applies to Agent instances on any platform.
+
+Typical application scenarios:
+- Run MCP in enterprise intranet to control private device pools
+- Package Midscene capabilities as internal microservices for multiple teams
+- Extend custom automation toolchains
+
+See documentation: [MCP Services](./mcp)
+
+### Chrome extension improvements
+- Fixed potential event loss during recording, improving recording stability
+- Optimized coordinate passing in `describeElement` for better element description accuracy
+
+### CLI and configuration enhancements
+- **File parameter support**: Fixed CLI issue where `--files` parameter wasn't properly handled when `--config` was specified; now they can be flexibly combined
+- **Dynamic configuration**: Fixed Playground not reading the `MIDSCENE_REPLANNING_CYCLE_LIMIT` environment variable properly
+
+### iOS Agent compatibility improvements
+- Optimized `getWindowSize` method to automatically fall back to legacy endpoint when newer API is unavailable, improving compatibility with WebDriverAgent versions
+
+### Report and Playground improvements
+- Fixed issue where report wasn't properly initialized before accessing screen properties
+- Fixed abnormal behavior of stop function in Playground
+- Improved error handling during video export to avoid crashes caused by frame cancel
+
+Thanks to contributors: @FriedRiceNoodles
+
 ## v1.0 - Midscene v1.0 is here!

 Midscene v1.0 is here! Try it out today and see how it can help you automate your workflows.
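
Note on the `getWindowSize` entry in the hunk above: it describes trying a newer API and falling back to a legacy endpoint. A minimal sketch of that pattern follows, assuming two interchangeable async fetchers. `WindowSize`, `SizeFetcher`, and `getWindowSizeWithFallback` are hypothetical names, and this is not the iOS Agent's actual implementation.

```typescript
// Hypothetical shapes used only for this sketch.
interface WindowSize {
  width: number;
  height: number;
}
type SizeFetcher = () => Promise<WindowSize>;

// Try the newer endpoint first; if it rejects, try the legacy one.
// If both fail, surface both errors so neither failure is hidden.
async function getWindowSizeWithFallback(
  primary: SizeFetcher,
  legacy: SizeFetcher,
): Promise<WindowSize> {
  try {
    return await primary();
  } catch (primaryError) {
    try {
      return await legacy();
    } catch (legacyError) {
      throw new Error(
        `both endpoints failed: ${String(primaryError)}; ${String(legacyError)}`,
      );
    }
  }
}
```

The fetchers are passed in as parameters so the ordering logic can be exercised without a device; in real use each would wrap one WebDriverAgent request.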

apps/site/docs/en/mcp.mdx

Lines changed: 3 additions & 1 deletion
@@ -121,7 +121,9 @@ Add the Midscene Android MCP server (`@midscene/android-mcp`) in your MCP client

 ## Implement your own MCP

-If you want to integrate Midscene tools into your own MCP service, you can use the `mcpKitForAgent` function to get tool definitions without starting a full MCP server.
+If you want to integrate Midscene tools into your own MCP service, you can use the `mcpKitForAgent` function to get tool definitions and expose your own MCP service as needed.
+
+The tools provided by `mcpKitForAgent` include screenshots and every Action in the Action Space.

 ### Using mcpKitForAgent
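
Note on the hunk above: it says the kit yields tool definitions (a screenshot tool plus one per Action) that you then expose through your own service. The sketch below illustrates that shape only. It does not call the real `mcpKitForAgent`, whose signature is not shown in this commit; `ToolDefinition`, `buildToolkit`, `ToolRegistry`, and the tool names are all hypothetical stand-ins.

```typescript
// Hypothetical tool shape; the real definitions may differ.
interface ToolDefinition {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => Promise<string>;
}

// Stand-in for a kit: one screenshot tool plus one tool per action name.
function buildToolkit(actionNames: string[]): ToolDefinition[] {
  const screenshot: ToolDefinition = {
    name: 'take_screenshot',
    description: 'Capture the current screen',
    handler: async () => 'screenshot-placeholder',
  };
  const actions = actionNames.map((action): ToolDefinition => ({
    name: action,
    description: `Run the ${action} action`,
    handler: async (args) => `${action} called with ${JSON.stringify(args)}`,
  }));
  return [screenshot, ...actions];
}

// Minimal in-memory registry playing the role of "your own MCP service".
class ToolRegistry {
  private readonly tools = new Map<string, ToolDefinition>();

  register(tool: ToolDefinition): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  list(): string[] {
    return Array.from(this.tools.keys());
  }

  async call(name: string, args: Record<string, unknown> = {}): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) {
      throw new Error(`unknown tool: ${name}`);
    }
    return tool.handler(args);
  }
}

// Register every tool from the stand-in kit with the registry.
const registry = new ToolRegistry();
for (const tool of buildToolkit(['Tap', 'Input'])) {
  registry.register(tool);
}
```

Keeping the registry separate from the kit mirrors the split the doc change describes: the kit supplies definitions, and the hosting service decides how to expose them.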

apps/site/docs/zh/automate-with-scripts-in-yaml.mdx

Lines changed: 2 additions & 4 deletions
@@ -266,13 +266,11 @@ android:
 tasks:
   - name: Launch Settings app
     flow:
-      - launch:
-          uri: com.android.settings
+      - launch: com.android.settings

   - name: Open webpage
     flow:
-      - launch:
-          uri: https://www.example.com
+      - launch: https://www.example.com
 ```

 ### The `ios` part

apps/site/docs/zh/changelog.mdx

Lines changed: 45 additions & 0 deletions
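@@ -1,4 +1,49 @@
 # Changelog
+
+## v1.1 - `aiAct` deep thinking and extensible MCP SDK
+
+v1.1 optimizes model planning capabilities and MCP extensibility, making automation more stable in complex scenarios while providing more flexible solutions for enterprise MCP service deployments.
+
+### `aiAct` can enable deep thinking (deepThink)
+
+When deep thinking is enabled in `aiAct`, the model understands user intent more thoroughly and optimizes its planning results. This is suited for complex forms, multi-step flows, and similar scenarios. It improves accuracy but also increases planning latency.
+
+Currently supported: Qwen3-vl on Alibaba Cloud and Doubao-vision on Volcano Engine. See [Model strategy](./model-strategy) for details.
+
+Example usage:
+
+```typescript
+await agent.aiAct('If the UI shows an "Add shipping address" button, expand the existing "Shipping address" list and select the last item', { deepThink: true });
+```
+
+### MCP extension and SDK exposure
+
+Developers can use the MCP SDK exposed by Midscene to flexibly deploy their own public MCP service. This capability applies to Agent instances on any platform.
+
+Typical application scenarios:
+- Run MCP in an enterprise intranet to control private device pools
+- Package Midscene capabilities as internal microservices for multiple teams
+- Extend custom automation toolchains
+
+See documentation: [MCP Services](./mcp)
+
+### Chrome extension improvements
+- Fixed potential event loss during recording, improving recording stability
+- Optimized coordinate passing in `describeElement` for better element description accuracy
+
+### CLI and configuration enhancements
+- **File parameter support**: Fixed a CLI issue where the `--files` parameter wasn't properly handled when `--config` was also specified; the two can now be combined flexibly
+- **Dynamic configuration**: Fixed Playground not reading the `MIDSCENE_REPLANNING_CYCLE_LIMIT` environment variable properly
+
+### iOS Agent compatibility improvements
+- Optimized the `getWindowSize` method to automatically fall back to the legacy endpoint when the newer API is unavailable, improving compatibility with WebDriverAgent versions
+
+### Report and Playground improvements
+- Fixed an issue where the report wasn't properly initialized before accessing screen properties
+- Fixed abnormal behavior of the stop function in Playground
+- Improved error handling during video export to avoid crashes caused by frame cancel
+
+Thanks to contributors: @FriedRiceNoodles

 ## v1.0 - Midscene v1.0 is officially released!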

apps/site/docs/zh/mcp.mdx

Lines changed: 3 additions & 2 deletions
@@ -119,10 +119,11 @@ open report_file_name.html
 }
 ```

-
 ## Implement your own MCP

-If you want to integrate Midscene tools into your own MCP service, you can use the `mcpKitForAgent` function to get tool definitions without starting a full MCP server.
+If you want to integrate Midscene tools into your own MCP service, you can use the `mcpKitForAgent` function to get tool definitions and then expose your own MCP service as needed.
+
+The tools provided by `mcpKitForAgent` include screenshots and every Action in the Action Space.

 ### Using mcpKitForAgent
