docs/en/guides/54-query/04-external-function.md
Lines changed: 99 additions & 27 deletions
@@ -1,6 +1,6 @@
1
1
---
2
-
title: 'External Functions in Databend Cloud'
3
-
sidebar_label: 'External Function'
2
+
title: "External Functions in Databend Cloud"
3
+
sidebar_label: "External Function"
4
4
---
5
5
6
6
External functions in Databend allow you to define custom operations for processing data using external servers written in programming languages like Python. These functions enable you to extend Databend's capabilities by integrating custom logic, leveraging external libraries, and handling complex processing tasks. Key features of external functions include:
@@ -14,7 +14,7 @@ External functions in Databend allow you to define custom operations for process
14
14
The following table lists the supported languages and the required libraries for creating external functions in Databend:
|`input_types`| A list of strings specifying the input data types (e.g., `["INT", "VARCHAR"]`). |
75
+
|`result_type`| A string specifying the return value type (e.g., `"INT"`). |
76
+
|`name`| (Optional) Custom name for the function. If not provided, the original function name is used. |
77
+
|`io_threads`| Number of I/O threads used per data chunk for I/O-bound functions. |
78
+
|`skip_null`| If set to `True`, NULL values are not passed to the function, and the corresponding return value is set to NULL. Default is `False`. |
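
To make the `skip_null` behaviour described above concrete, here is a minimal pure-Python sketch. It does not use the actual UDF library: `apply_udf` is a hypothetical helper that only mimics how a row-wise function might be applied to a batch of rows, and the sample rows are made up for illustration.

```python
import math
from typing import Any, Callable, List, Optional, Tuple


def apply_udf(
    func: Callable[..., Any],
    rows: List[Tuple[Any, ...]],
    skip_null: bool = False,
) -> List[Optional[Any]]:
    """Hypothetical helper (not part of any library): call func once per row.

    With skip_null=True, a row containing None is never passed to func and
    its result is None, mirroring the parameter description in the table.
    """
    results: List[Optional[Any]] = []
    for row in rows:
        if skip_null and any(value is None for value in row):
            results.append(None)  # NULL in, NULL out; func is not called
        else:
            results.append(func(*row))
    return results


def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two integers."""
    return math.gcd(a, b)


# Made-up sample batch; the second row contains a NULL.
rows = [(48, 18), (None, 5), (7, 21)]
print(apply_udf(gcd, rows, skip_null=True))  # [6, None, 7]
```

With `skip_null=False` the same batch would pass `None` straight into `gcd`, so the function itself would have to handle NULL inputs.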
79
79
80
80
**Data Type Mappings Between Databend and Python:**
81
81
82
-
| Databend Type | Python Type|
83
-
|-----------------------|----------------------|
84
-
| BOOLEAN |`bool`|
85
-
| TINYINT (UNSIGNED) |`int`|
86
-
| SMALLINT (UNSIGNED) |`int`|
87
-
| INT (UNSIGNED) |`int`|
88
-
| BIGINT (UNSIGNED) |`int`|
89
-
| FLOAT |`float`|
90
-
| DOUBLE |`float`|
91
-
| DECIMAL |`decimal.Decimal`|
92
-
| DATE |`datetime.date`|
93
-
| TIMESTAMP |`datetime.datetime`|
94
-
| VARCHAR |`str`|
95
-
| VARIANT |`any`|
96
-
| MAP(K,V) |`dict`|
97
-
| ARRAY(T) |`list[T]`|
98
-
| TUPLE(T,...) |`tuple(T,...)`|
82
+
| Databend Type | Python Type |
83
+
|-------------------|-------------------|
84
+
| BOOLEAN |`bool`|
85
+
| TINYINT (UNSIGNED) |`int`|
86
+
| SMALLINT (UNSIGNED) |`int`|
87
+
| INT (UNSIGNED) |`int`|
88
+
| BIGINT (UNSIGNED) |`int`|
89
+
| FLOAT |`float`|
90
+
| DOUBLE |`float`|
91
+
| DECIMAL |`decimal.Decimal`|
92
+
| DATE |`datetime.date`|
93
+
| TIMESTAMP |`datetime.datetime`|
94
+
| VARCHAR |`str`|
95
+
| VARIANT |`any`|
96
+
| MAP(K,V) |`dict`|
97
+
| ARRAY(T) |`list[T]`|
98
+
| TUPLE(T,...) |`tuple(T,...)`|
99
99
100
100
### 3. Run the External Server
101
101
@@ -141,6 +141,78 @@ You can now use the external function `gcd` in your SQL queries:
141
141
SELECT gcd(48, 18); -- Returns 6
142
142
```
143
143
144
+
## Load Balancing External Functions
145
+
146
+
When deploying multiple external function servers, you can load-balance requests by function name. Databend includes an `X-DATABEND-FUNCTION` header in each UDF request; it contains the name of the function being called, so a reverse proxy can use it to route requests to different backend servers.
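
The routing idea can be sketched in a few lines of Python. This only illustrates header-based dispatch and is not a replacement for a proxy configuration. The backend addresses, the `ROUTES` table, the `heavy_udf` function name, and the exact-match lookup are all assumptions made for the example.

```python
from typing import Mapping

# Hypothetical mapping from function name to backend address.
ROUTES = {
    "gcd": "http://10.0.0.1:8815",
    "heavy_udf": "http://10.0.0.2:8815",
}

# Hypothetical fallback for functions without a dedicated backend.
DEFAULT_BACKEND = "http://10.0.0.9:8815"


def pick_backend(headers: Mapping[str, str]) -> str:
    """Choose a backend from the X-DATABEND-FUNCTION request header."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    function_name = normalized.get("x-databend-function", "")
    # Assumption: the header value matches the registered name exactly.
    return ROUTES.get(function_name, DEFAULT_BACKEND)


print(pick_backend({"X-DATABEND-FUNCTION": "gcd"}))  # http://10.0.0.1:8815
```

Requests for functions that have no dedicated entry fall through to the default backend, which is the usual catch-all behaviour in a name-based routing table.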
147
+
148
+
### Using Nginx for Function-Based Routing
149
+
150
+
Here's an example of how to configure Nginx to route different UDF requests to specific backend servers:
151
+
152
+
```nginx
153
+
# Define upstream servers for different UDF functions
When registering your functions in Databend, use the Nginx server's domain:
207
+
208
+
```sql
209
+
CREATE FUNCTION gcd (INT, INT)
210
+
RETURNS INT
211
+
LANGUAGE PYTHON
212
+
HANDLER = 'gcd'
213
+
ADDRESS = 'https://udf.example.com';
214
+
```
215
+
144
216
## Conclusion
145
217
146
218
External functions in Databend Cloud provide a powerful way to extend the functionality of your data processing pipelines by integrating custom code written in languages like Python. By following the steps outlined above, you can create and use external functions to handle complex processing tasks, leverage external libraries, and implement advanced logic.