Commit 5b9829e

updates (#853)
1 parent bbac82c commit 5b9829e

2 files changed: +128 -4 lines changed

docs/en/developer/00-drivers/01-python.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ title: Python
 
 Databend offers the following Python packages enabling you to develop Python applications that interact with Databend:
 
-- [databend-py (**Recommendation**)](https://github.com/databendcloud/databend-py): Provides a direct interface to the Databend database. It allows you to perform standard Databend operations such as user login, database and table creation, data insertion/loading, and querying.
+- [databend-py (**Recommended**)](https://github.com/databendcloud/databend-py): Provides a direct interface to the Databend database. It allows you to perform standard Databend operations such as user login, database and table creation, data insertion/loading, and querying.
 - [databend-sqlalchemy](https://github.com/databendcloud/databend-sqlalchemy): Provides a SQL toolkit and [Object-Relational Mapping](https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping) to interface with the Databend database. [SQLAlchemy](https://www.sqlalchemy.org/) is a popular SQL toolkit and ORM for Python, and databend-SQLAlchemy is a dialect for SQLAlchemy that allows you to use SQLAlchemy to interact with Databend.
 
 Both packages require Python version 3.5 or higher. To check your Python version, run `python --version` in your command prompt. To install the latest `databend-py` or `databend-sqlalchemy` package:

docs/en/guides/54-query/03-udf.md

Lines changed: 127 additions & 3 deletions
@@ -3,7 +3,7 @@ title: User-Defined Function
 ---
 import IndexOverviewList from '@site/src/components/IndexOverviewList';
 
-User-Defined Functions (UDFs) offer enhanced flexibility by supporting both anonymous lambda expressions and predefined handlers (JavaScript & WebAssembly) for defining UDFs. These features allow users to create custom operations tailored to their specific data processing needs. Databend UDFs are categorized into the following types:
+User-Defined Functions (UDFs) offer enhanced flexibility by supporting both anonymous lambda expressions and predefined handlers (Python, JavaScript & WebAssembly) for defining UDFs. These features allow users to create custom operations tailored to their specific data processing needs. Databend UDFs are categorized into the following types:
 
 - [Lambda UDFs](#lambda-udf)
 - [Embedded UDFs](#embedded-udfs)
@@ -42,16 +42,140 @@ SELECT get_v1(data), get_v2(data) FROM json_table;
 
 Embedded UDFs allow you to embed code written in the following programming languages within SQL:
 
+- [Python](#python)
 - [JavaScript](#javascript)
 - [WebAssembly](#webassembly)
 
 :::note
-If your program content is large, you can compress it and then pass it to the stage. See the [Usage Examples](#usage-examples-2) for WebAssembly.
+If your program content is large, you can compress it and then pass it to a stage. See the [Usage Examples](#usage-examples-2) for WebAssembly.
 :::
 
+### Python
+
+A Python UDF allows you to invoke Python code from a SQL query via Databend's built-in handler, enabling seamless integration of Python logic within your SQL queries.
+
+:::note
+The Python UDF must use only Python's standard library; third-party imports are not allowed.
+:::
+
+#### Data Type Mappings
+
+See [Data Type Mappings](/developer/drivers/python#data-type-mappings) in the Developer Guide.
+
+#### Usage Examples
+
+This example defines a Python UDF for sentiment analysis, creates a table, inserts sample data, and performs sentiment analysis on the text data.
+
+1. Define a Python UDF named `sentiment_analysis`.
+
+```sql
+-- Create the sentiment analysis function
+CREATE OR REPLACE FUNCTION sentiment_analysis(STRING) RETURNS STRING
+LANGUAGE python HANDLER = 'sentiment_analysis'
+AS $$
+def remove_stop_words(text, stop_words):
+    """
+    Removes common stop words from the text.
+
+    Args:
+    text (str): The input text.
+    stop_words (set): A set of stop words to remove.
+
+    Returns:
+    str: Text with stop words removed.
+    """
+    return ' '.join([word for word in text.split() if word.lower() not in stop_words])
+
+def calculate_sentiment(text, positive_words, negative_words):
+    """
+    Calculates the sentiment score of the text.
+
+    Args:
+    text (str): The input text.
+    positive_words (set): A set of positive words.
+    negative_words (set): A set of negative words.
+
+    Returns:
+    int: Sentiment score.
+    """
+    words = text.split()
+    score = sum(1 for word in words if word in positive_words) - sum(1 for word in words if word in negative_words)
+    return score
+
+def get_sentiment_label(score):
+    """
+    Determines the sentiment label based on the sentiment score.
+
+    Args:
+    score (int): The sentiment score.
+
+    Returns:
+    str: Sentiment label ('Positive', 'Negative', 'Neutral').
+    """
+    if score > 0:
+        return 'Positive'
+    elif score < 0:
+        return 'Negative'
+    else:
+        return 'Neutral'
+
+def sentiment_analysis(text):
+    """
+    Analyzes the sentiment of the input text.
+
+    Args:
+    text (str): The input text.
+
+    Returns:
+    str: Sentiment analysis result including the score and label.
+    """
+    stop_words = set(["a", "an", "the", "and", "or", "but", "if", "then", "so"])
+    positive_words = set(["good", "happy", "joy", "excellent", "positive", "love"])
+    negative_words = set(["bad", "sad", "pain", "terrible", "negative", "hate"])
+
+    clean_text = remove_stop_words(text, stop_words)
+    sentiment_score = calculate_sentiment(clean_text, positive_words, negative_words)
+    sentiment_label = get_sentiment_label(sentiment_score)
+
+    return f'Sentiment Score: {sentiment_score}; Sentiment Label: {sentiment_label}'
+$$;
+```
+
+2. Perform sentiment analysis on the text data using the `sentiment_analysis` function.
+
+```sql
+CREATE OR REPLACE TABLE texts (
+    original_text STRING
+);
+
+-- Insert sample data
+INSERT INTO texts (original_text)
+VALUES
+    ('The quick brown fox feels happy and joyful'),
+    ('A hard journey, but it was painful and sad'),
+    ('Uncertain outcomes leave everyone unsure and hesitant'),
+    ('The movie was excellent and everyone loved it'),
+    ('A terrible experience that made me feel bad');
+
+
+SELECT
+    original_text,
+    sentiment_analysis(original_text) AS processed_text
+FROM
+    texts;
+
+| original_text                                         | processed_text                                 |
+|-------------------------------------------------------|------------------------------------------------|
+| The quick brown fox feels happy and joyful            | Sentiment Score: 1; Sentiment Label: Positive  |
+| A hard journey, but it was painful and sad            | Sentiment Score: -1; Sentiment Label: Negative |
+| Uncertain outcomes leave everyone unsure and hesitant | Sentiment Score: 0; Sentiment Label: Neutral   |
+| The movie was excellent and everyone loved it         | Sentiment Score: 1; Sentiment Label: Positive  |
+| A terrible experience that made me feel bad           | Sentiment Score: -2; Sentiment Label: Negative |
+```
+
 ### JavaScript
 
-A JavaScript UDF allows you to invoke JavaScript code from a SQL query via Databend's built-in handler, enabling seamless integration of JavaScript logic within your SQL queries.
+A JavaScript UDF allows you to invoke JavaScript code from a SQL query via Databend's built-in handler, enabling seamless integration of JavaScript logic within your SQL queries.
 
 #### Data Type Mappings
 
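
The result table in the added Python example can be sanity-checked outside Databend. The following is a minimal local sketch in plain Python, not the registered UDF itself: it condenses the handler's three helpers into a single function and keeps the same word lists. The upper-case constant names and the `samples` list are assumptions introduced here for illustration. Matching is exact and case-sensitive on whitespace-split tokens, so words such as `joyful`, `painful`, and `loved` are not counted, which is why the first row scores 1 rather than 2.

```python
# Local re-implementation of the sentiment_analysis handler logic
# (standard library only, matching the restriction stated in the docs).
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "if", "then", "so"}
POSITIVE_WORDS = {"good", "happy", "joy", "excellent", "positive", "love"}
NEGATIVE_WORDS = {"bad", "sad", "pain", "terrible", "negative", "hate"}


def sentiment_analysis(text: str) -> str:
    # Drop stop words (compared in lower case, as in the handler).
    words = [w for w in text.split() if w.lower() not in STOP_WORDS]
    # Score = exact positive matches minus exact negative matches.
    score = sum(1 for w in words if w in POSITIVE_WORDS) - sum(
        1 for w in words if w in NEGATIVE_WORDS
    )
    if score > 0:
        label = "Positive"
    elif score < 0:
        label = "Negative"
    else:
        label = "Neutral"
    return f"Sentiment Score: {score}; Sentiment Label: {label}"


# Hypothetical sample list: the same rows the docs insert into `texts`.
samples = [
    "The quick brown fox feels happy and joyful",
    "A hard journey, but it was painful and sad",
    "Uncertain outcomes leave everyone unsure and hesitant",
    "The movie was excellent and everyone loved it",
    "A terrible experience that made me feel bad",
]

for sample in samples:
    print(f"{sample} -> {sentiment_analysis(sample)}")
```

Running the sketch on the five sample rows should reproduce the `processed_text` column shown in the diff, which suggests the documented output is consistent with the handler code.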
