|  | 
|  | 1 | +--- | 
|  | 2 | +title: "MaxCompute" | 
|  | 3 | +weight: 7 | 
|  | 4 | +type: docs | 
|  | 5 | +aliases: | 
|  | 6 | +  - /connectors/maxcompute | 
|  | 7 | +--- | 
|  | 8 | + | 
|  | 9 | +<!-- | 
|  | 10 | +Licensed to the Apache Software Foundation (ASF) under one | 
|  | 11 | +or more contributor license agreements.  See the NOTICE file | 
|  | 12 | +distributed with this work for additional information | 
|  | 13 | +regarding copyright ownership.  The ASF licenses this file | 
|  | 14 | +to you under the Apache License, Version 2.0 (the | 
|  | 15 | +"License"); you may not use this file except in compliance | 
|  | 16 | +with the License.  You may obtain a copy of the License at | 
|  | 17 | +
 | 
|  | 18 | +  http://www.apache.org/licenses/LICENSE-2.0 | 
|  | 19 | +
 | 
|  | 20 | +Unless required by applicable law or agreed to in writing, | 
|  | 21 | +software distributed under the License is distributed on an | 
|  | 22 | +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | 
|  | 23 | +KIND, either express or implied.  See the License for the | 
|  | 24 | +specific language governing permissions and limitations | 
|  | 25 | +under the License. | 
|  | 26 | +--> | 
|  | 27 | + | 
|  | 28 | +# MaxCompute Connector | 
|  | 29 | + | 
|  | 30 | +MaxCompute Pipeline 连接器可以用作 Pipeline 的 *Data Sink*,将数据写入[MaxCompute](https://www.aliyun.com/product/odps)。 | 
|  | 31 | +本文档介绍如何设置 MaxCompute Pipeline 连接器。 | 
|  | 32 | + | 
|  | 33 | +## 连接器的功能 | 
|  | 34 | + | 
|  | 35 | +* 自动建表 | 
|  | 36 | +* 表结构变更同步 | 
|  | 37 | +* 数据实时同步 | 
|  | 38 | + | 
|  | 39 | +## 示例 | 
|  | 40 | + | 
|  | 41 | +从 MySQL 读取数据同步到 MaxCompute 的 Pipeline 可以定义如下: | 
|  | 42 | + | 
|  | 43 | +```yaml | 
|  | 44 | +source: | 
|  | 45 | +  type: mysql | 
|  | 46 | +  name: MySQL Source | 
|  | 47 | +  hostname: 127.0.0.1 | 
|  | 48 | +  port: 3306 | 
|  | 49 | +  username: admin | 
|  | 50 | +  password: pass | 
|  | 51 | +  tables: adb.\.*, bdb.user_table_[0-9]+, [app|web].order_\.* | 
|  | 52 | +  server-id: 5401-5404 | 
|  | 53 | + | 
|  | 54 | +sink: | 
|  | 55 | +  type: maxcompute | 
|  | 56 | +  name: MaxCompute Sink | 
|  | 57 | +  accessId: ak | 
|  | 58 | +  accessKey: sk | 
|  | 59 | +  endpoint: endpoint | 
|  | 60 | +  project: flink_cdc | 
|  | 61 | +  bucketSize: 8 | 
|  | 62 | + | 
|  | 63 | +pipeline: | 
|  | 64 | +  name: MySQL to MaxCompute Pipeline | 
|  | 65 | +  parallelism: 2 | 
|  | 66 | +``` | 
|  | 67 | +
 | 
|  | 68 | +## 连接器配置项 | 
|  | 69 | +
 | 
|  | 70 | +<div class="highlight"> | 
|  | 71 | +<table class="colwidths-auto docutils"> | 
|  | 72 | +   <thead> | 
|  | 73 | +      <tr> | 
|  | 74 | +        <th class="text-left" style="width: 25%">Option</th> | 
|  | 75 | +        <th class="text-left" style="width: 8%">Required</th> | 
|  | 76 | +        <th class="text-left" style="width: 7%">Default</th> | 
|  | 77 | +        <th class="text-left" style="width: 10%">Type</th> | 
|  | 78 | +        <th class="text-left" style="width: 50%">Description</th> | 
|  | 79 | +      </tr> | 
|  | 80 | +    </thead> | 
|  | 81 | +    <tbody> | 
|  | 82 | +    <tr> | 
|  | 83 | +      <td>type</td> | 
|  | 84 | +      <td>required</td> | 
|  | 85 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 86 | +      <td>String</td> | 
|  | 87 | +      <td>指定要使用的连接器, 这里需要设置成 <code>'maxcompute'</code>.</td> | 
|  | 88 | +    </tr> | 
|  | 89 | +    <tr> | 
|  | 90 | +      <td>name</td> | 
|  | 91 | +      <td>optional</td> | 
|  | 92 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 93 | +      <td>String</td> | 
|  | 94 | +      <td>Sink 的名称.</td> | 
|  | 95 | +    </tr> | 
|  | 96 | +    <tr> | 
|  | 97 | +      <td>accessId</td> | 
|  | 98 | +      <td>required</td> | 
|  | 99 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 100 | +      <td>String</td> | 
|  | 101 | +      <td>阿里云账号或RAM用户的AccessKey ID。您可以进入<a href="https://ram.console.aliyun.com/manage/ak"> | 
|  | 102 | +            AccessKey管理页面</a> 获取AccessKey ID。</td> | 
|  | 103 | +    </tr> | 
|  | 104 | +    <tr> | 
|  | 105 | +      <td>accessKey</td> | 
|  | 106 | +      <td>required</td> | 
|  | 107 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 108 | +      <td>String</td> | 
|  | 109 | +      <td>AccessKey ID对应的AccessKey Secret。您可以进入<a href="https://ram.console.aliyun.com/manage/ak"> | 
|  | 110 | +            AccessKey管理页面</a> 获取AccessKey Secret。</td> | 
|  | 111 | +    </tr> | 
|  | 112 | +    <tr> | 
|  | 113 | +      <td>endpoint</td> | 
|  | 114 | +      <td>required</td> | 
|  | 115 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 116 | +      <td>String</td> | 
|  | 117 | +      <td>MaxCompute服务的连接地址。您需要根据创建MaxCompute项目时选择的地域以及网络连接方式配置Endpoint。各地域及网络对应的Endpoint值,请参见<a href="https://help.aliyun.com/zh/maxcompute/user-guide/endpoints"> | 
|  | 118 | +           Endpoint</a>。</td> | 
|  | 119 | +    </tr> | 
|  | 120 | +    <tr> | 
|  | 121 | +      <td>project</td> | 
|  | 122 | +      <td>required</td> | 
|  | 123 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 124 | +      <td>String</td> | 
|  | 125 | +      <td>MaxCompute项目名称。您可以登录<a href="https://maxcompute.console.aliyun.com/"> | 
|  | 126 | +           MaxCompute控制台</a>,在 工作区 > 项目管理 页面获取MaxCompute项目名称。</td> | 
|  | 127 | +    </tr> | 
|  | 128 | +    <tr> | 
|  | 129 | +      <td>tunnelEndpoint</td> | 
|  | 130 | +      <td>optional</td> | 
|  | 131 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 132 | +      <td>String</td> | 
|  | 133 | +      <td>MaxCompute Tunnel服务的连接地址,通常这项配置可以根据指定的project所在的region进行自动路由。仅在使用代理等特殊网络环境下使用该配置。</td> | 
|  | 134 | +    </tr> | 
|  | 135 | +    <tr> | 
|  | 136 | +      <td>quotaName</td> | 
|  | 137 | +      <td>optional</td> | 
|  | 138 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 139 | +      <td>String</td> | 
|  | 140 | +      <td>MaxCompute 数据传输使用的独享资源组名称,如不指定该配置,则使用共享资源组。详情可以参考<a href="https://help.aliyun.com/zh/maxcompute/user-guide/purchase-and-use-exclusive-resource-groups-for-dts"> | 
|  | 141 | +           使用 Maxcompute 独享资源组</a></td> | 
|  | 142 | +    </tr> | 
|  | 143 | +    <tr> | 
|  | 144 | +      <td>stsToken</td> | 
|  | 145 | +      <td>optional</td> | 
|  | 146 | +      <td style="word-wrap: break-word;">(none)</td> | 
|  | 147 | +      <td>String</td> | 
|  | 148 | +      <td>当使用RAM角色颁发的短时有效的访问令牌(STS Token)进行鉴权时,需要指定该参数。</td> | 
|  | 149 | +    </tr> | 
|  | 150 | +    <tr> | 
|  | 151 | +      <td>bucketsNum</td> | 
|  | 152 | +      <td>optional</td> | 
|  | 153 | +      <td style="word-wrap: break-word;">16</td> | 
|  | 154 | +      <td>Integer</td> | 
|  | 155 | +      <td>自动创建 MaxCompute Delta 表时使用的桶数。使用方式可以参考 <a href="https://help.aliyun.com/zh/maxcompute/user-guide/transaction-table2-0-overview"> | 
|  | 156 | +           Delta Table 概述</a></td> | 
|  | 157 | +    </tr> | 
|  | 158 | +    <tr> | 
|  | 159 | +      <td>compressAlgorithm</td> | 
|  | 160 | +      <td>optional</td> | 
|  | 161 | +      <td style="word-wrap: break-word;">zlib</td> | 
|  | 162 | +      <td>String</td> | 
|  | 163 | +      <td>写入MaxCompute时使用的数据压缩算法,当前支持<code>raw</code>(不进行压缩),<code>zlib</code>和<code>snappy</code>。</td> | 
|  | 164 | +    </tr> | 
|  | 165 | +    <tr> | 
|  | 166 | +      <td>totalBatchSize</td> | 
|  | 167 | +      <td>optional</td> | 
|  | 168 | +      <td style="word-wrap: break-word;">64MB</td> | 
|  | 169 | +      <td>String</td> | 
|  | 170 | +      <td>内存中缓冲的数据量大小,单位为分区级(非分区表单位为表级),不同分区(表)的缓冲区相互独立,达到阈值后数据写入到MaxCompute。</td> | 
|  | 171 | +    </tr> | 
|  | 172 | +    <tr> | 
|  | 173 | +      <td>bucketBatchSize</td> | 
|  | 174 | +      <td>optional</td> | 
|  | 175 | +      <td style="word-wrap: break-word;">4MB</td> | 
|  | 176 | +      <td>String</td> | 
|  | 177 | +      <td>内存中缓冲的数据量大小,单位为桶级,仅写入 Delta 表时生效。不同数据桶的缓冲区相互独立,达到阈值后将该桶数据写入到MaxCompute。</td> | 
|  | 178 | +    </tr> | 
|  | 179 | +    <tr> | 
|  | 180 | +      <td>numCommitThreads</td> | 
|  | 181 | +      <td>optional</td> | 
|  | 182 | +      <td style="word-wrap: break-word;">16</td> | 
|  | 183 | +      <td>Integer</td> | 
|  | 184 | +      <td>checkpoint阶段,能够同时处理的分区(表)数量。</td> | 
|  | 185 | +    </tr> | 
|  | 186 | +    <tr> | 
|  | 187 | +      <td>numFlushConcurrent</td> | 
|  | 188 | +      <td>optional</td> | 
|  | 189 | +      <td style="word-wrap: break-word;">4</td> | 
|  | 190 | +      <td>Integer</td> | 
|  | 191 | +      <td>写入数据到MaxCompute时,能够同时写入的桶数量。仅写入 Delta 表时生效。</td> | 
|  | 192 | +    </tr> | 
|  | 193 | +    </tbody> | 
|  | 194 | +</table>     | 
|  | 195 | +</div> | 
|  | 196 | +
 | 
|  | 197 | +## 使用说明 | 
|  | 198 | +
 | 
|  | 199 | +* 链接器 支持自动建表,将MaxCompute表与源表的位置关系、数据类型进行自动映射(参见下文映射表),当源表有主键时,自动创建 | 
|  | 200 | +  MaxCompute Delta 表,否则创建普通 MaxCompute 表(Append表) | 
|  | 201 | +* 当写入普通 MaxCompute 表(Append表)时,会忽略`delete`操作,`update`操作会被视为`insert`操作 | 
|  | 202 | +* 目前仅支持 at-least-once,Delta 表由于主键特性能够实现幂等写 | 
|  | 203 | +* 对于表结构变更同步 | 
|  | 204 | +    * 新增列只能添加到最后一列 | 
|  | 205 | +    * 修改列类型,只能修改为兼容的类型。兼容表可以参考[ALTER TABLE](https://help.aliyun.com/zh/maxcompute/user-guide/alter-table) | 
|  | 206 | + | 
|  | 207 | +## 表位置映射 | 
|  | 208 | + | 
|  | 209 | +链接器自动建表时,使用如下映射关系,将源表的位置信息映射到MaxCompute表的位置。注意,当MaxCompute项目不支持Schema模型时,每个同步任务仅能同步一个Mysql | 
|  | 210 | +Database。(其他Datasource同理,链接器会忽略TableId.namespace信息) | 
|  | 211 | + | 
|  | 212 | +<div class="wy-table-responsive"> | 
|  | 213 | +<table class="colwidths-auto docutils"> | 
|  | 214 | +    <thead> | 
|  | 215 | +      <tr> | 
|  | 216 | +        <th class="text-left">Flink CDC 中抽象</th> | 
|  | 217 | +        <th class="text-left">MaxCompute 位置</th> | 
|  | 218 | +        <th class="text-left" style="width:60%;">Mysql 位置</th> | 
|  | 219 | +      </tr> | 
|  | 220 | +    </thead> | 
|  | 221 | +    <tbody> | 
|  | 222 | +    <tr> | 
|  | 223 | +      <td>配置文件中project</td> | 
|  | 224 | +      <td>project</td> | 
|  | 225 | +      <td>(none)</td> | 
|  | 226 | +    </tr> | 
|  | 227 | +    <tr> | 
|  | 228 | +      <td>TableId.namespace</td> | 
|  | 229 | +      <td>schema(仅当MaxCompute项目支持Schema模型时,如不支持,将忽略该配置)</td> | 
|  | 230 | +      <td>database</td> | 
|  | 231 | +    </tr> | 
|  | 232 | +    <tr> | 
|  | 233 | +      <td>TableId.tableName</td> | 
|  | 234 | +      <td>table</td> | 
|  | 235 | +      <td>table</td> | 
|  | 236 | +    </tr> | 
|  | 237 | +    </tbody> | 
|  | 238 | +</table> | 
|  | 239 | +</div> | 
|  | 240 | + | 
|  | 241 | +## 数据类型映射 | 
|  | 242 | + | 
|  | 243 | +<div class="wy-table-responsive"> | 
|  | 244 | +<table class="colwidths-auto docutils"> | 
|  | 245 | +    <thead> | 
|  | 246 | +      <tr> | 
|  | 247 | +        <th class="text-left">Flink Type</th> | 
|  | 248 | +        <th class="text-left">MaxCompute Type</th> | 
|  | 249 | +      </tr> | 
|  | 250 | +    </thead> | 
|  | 251 | +    <tbody> | 
|  | 252 | +    <tr> | 
|  | 253 | +      <td>CHAR/VARCHAR</td> | 
|  | 254 | +      <td>STRING</td> | 
|  | 255 | +    </tr> | 
|  | 256 | +    <tr> | 
|  | 257 | +      <td>BOOLEAN</td> | 
|  | 258 | +      <td>BOOLEAN</td> | 
|  | 259 | +    </tr> | 
|  | 260 | +    <tr> | 
|  | 261 | +      <td>BINARY/VARBINARY</td> | 
|  | 262 | +      <td>BINARY</td> | 
|  | 263 | +    </tr> | 
|  | 264 | +    <tr> | 
|  | 265 | +      <td>DECIMAL</td> | 
|  | 266 | +      <td>DECIMAL</td> | 
|  | 267 | +    </tr> | 
|  | 268 | +    <tr> | 
|  | 269 | +      <td>TINYINT</td> | 
|  | 270 | +      <td>TINYINT</td> | 
|  | 271 | +    </tr> | 
|  | 272 | +    <tr> | 
|  | 273 | +      <td>SMALLINT</td> | 
|  | 274 | +      <td>SMALLINT</td> | 
|  | 275 | +    </tr> | 
|  | 276 | +    <tr> | 
|  | 277 | +      <td>INTEGER</td> | 
|  | 278 | +      <td>INTEGER</td> | 
|  | 279 | +    </tr> | 
|  | 280 | +    <tr> | 
|  | 281 | +      <td>BIGINT</td> | 
|  | 282 | +      <td>BIGINT</td> | 
|  | 283 | +    </tr> | 
|  | 284 | +    <tr> | 
|  | 285 | +      <td>FLOAT</td> | 
|  | 286 | +      <td>FLOAT</td> | 
|  | 287 | +    </tr> | 
|  | 288 | +    <tr> | 
|  | 289 | +      <td>DOUBLE</td> | 
|  | 290 | +      <td>DOUBLE</td> | 
|  | 291 | +    </tr> | 
|  | 292 | +    <tr> | 
|  | 293 | +      <td>TIME_WITHOUT_TIME_ZONE</td> | 
|  | 294 | +      <td>STRING</td> | 
|  | 295 | +    </tr> | 
|  | 296 | +    <tr> | 
|  | 297 | +      <td>DATE</td> | 
|  | 298 | +      <td>DATE</td> | 
|  | 299 | +    </tr> | 
|  | 300 | +    <tr> | 
|  | 301 | +      <td>TIMESTAMP_WITHOUT_TIME_ZONE</td> | 
|  | 302 | +      <td>TIMESTAMP_NTZ</td> | 
|  | 303 | +    </tr> | 
|  | 304 | +    <tr> | 
|  | 305 | +      <td>TIMESTAMP_WITH_LOCAL_TIME_ZONE</td> | 
|  | 306 | +      <td>TIMESTAMP</td> | 
|  | 307 | +    </tr> | 
|  | 308 | +    <tr> | 
|  | 309 | +      <td>TIMESTAMP_WITH_TIME_ZONE</td> | 
|  | 310 | +      <td>TIMESTAMP</td> | 
|  | 311 | +    </tr> | 
|  | 312 | +    <tr> | 
|  | 313 | +      <td>ARRAY</td> | 
|  | 314 | +      <td>ARRAY</td> | 
|  | 315 | +    </tr> | 
|  | 316 | +    <tr> | 
|  | 317 | +      <td>MAP</td> | 
|  | 318 | +      <td>MAP</td> | 
|  | 319 | +    </tr> | 
|  | 320 | +    <tr> | 
|  | 321 | +      <td>ROW</td> | 
|  | 322 | +      <td>STRUCT</td> | 
|  | 323 | +    </tr> | 
|  | 324 | +    </tbody> | 
|  | 325 | +</table> | 
|  | 326 | +</div> | 
|  | 327 | + | 
|  | 328 | +{{< top >}} | 
0 commit comments