Commit a5345be

Merge branch 'feat-website-adjust' of github.com:datafuselabs/databend-docs into feat-website-adjust
* 'feat-website-adjust' of github.com:datafuselabs/databend-docs:
  Update aggregate-histogram.md
  Update aggregate-histogram.md
  fix: links
2 parents f17d953 + d718389 commit a5345be

1 file changed: +52 -39 lines changed

Diff for: docs/en/sql-reference/20-sql-functions/07-aggregate-functions/aggregate-histogram.md

@@ -5,43 +5,37 @@ import FunctionDescription from '@site/src/components/FunctionDescription';

 <FunctionDescription description="Introduced or updated: v1.2.377"/>

-Computes the distribution of the data. It uses an "equal height" bucketing strategy to generate the histogram. The result of the function returns an empty or Json string.
+Generates a data distribution histogram using an "equal height" bucketing strategy.

 ## Syntax

 ```sql
 HISTOGRAM(<expr>)
-HISTOGRAM(<expr> [, max_num_buckets])
-```
-
-`max_num_buckets` means the maximum number of buckets that can be used, by default it is 128.

-For example:
-```sql
-select histogram(c_id) from histagg;
-┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
-│ histogram(c_id) │
-│ Nullable(String) │
-├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
-│ [{"lower":"1","upper":"1","ndv":1,"count":6,"pre_sum":0},{"lower":"2","upper":"2","ndv":1,"count":6,"pre_sum":6}] │
-└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
+-- The following two forms are equivalent:
+HISTOGRAM(<max_num_buckets>)(<expr>)
+HISTOGRAM(<expr> [, <max_num_buckets>])
 ```
-:::
-
-## Arguments

-| Arguments | Description |
-|-------------------|--------------------------------------------------------------------------------------------|
-| `<expr>` | The data type of `<expr>` should be sortable. |
-| `max_num_buckets` | Optional constant positive integer, the maximum number of buckets that can be used. |
+| Parameter | Description |
+|-------------------|-------------------------------------------------------------------------------------|
+| `expr` | The data type of `expr` should be sortable. |
+| `max_num_buckets` | Optional positive integer specifying the maximum number of buckets. Default is 128. |

 ## Return Type

-the Nullable String type
+Returns either an empty string or a JSON object with the following structure:

-## Example
+- **buckets**: List of buckets with detailed information:
+  - **lower**: Lower bound of the bucket.
+  - **upper**: Upper bound of the bucket.
+  - **count**: Number of elements in the bucket.
+  - **pre_sum**: Cumulative count of elements up to the current bucket.
+  - **ndv**: Number of distinct values in the bucket.

-**Create a Table and Insert Sample Data**
+## Examples
+
+This example shows how the HISTOGRAM function analyzes the distribution of `c_int` values in the `histagg` table, returning bucket boundaries, distinct value counts, element counts, and cumulative counts:

 ```sql
 CREATE TABLE histagg (
@@ -58,24 +52,17 @@ INSERT INTO histagg VALUES
 (2, 21, 22, 23),
 (2, 31, 32, 33),
 (2, 10, 20, 30);
-```

-**Query Demo 1**
-```sql
 SELECT HISTOGRAM(c_int) FROM histagg;
-```

-**Result**
-```sql
 ┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
 │ histogram(c_int) │
-│ Nullable(String) │
 ├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
 │ [{"lower":"13","upper":"13","ndv":1,"count":1,"pre_sum":0},{"lower":"23","upper":"23","ndv":1,"count":1,"pre_sum":1},{"lower":"30","upper":"30","ndv":1,"count":2,"pre_sum":2},{"lower":"33","upper":"33","ndv":1,"count":2,"pre_sum":4}] │
 └───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 ```

-Query result description:
+The result is returned as a JSON array:

 ```json
 [
@@ -110,11 +97,37 @@ Query result description:
 ]
 ```

-Fields description:
+This example shows how `HISTOGRAM(2)` groups c_int values into two buckets:

-- buckets:All buckets
-- lower:Upper bound of the bucket
-- upper:Lower bound of the bucket
-- count:The number of elements contained in the bucket
-- pre_sum:The total number of elements in the front bucket
-- ndv:The number of distinct values in the bucket
+```sql
+SELECT HISTOGRAM(2)(c_int) FROM histagg;
+-- Or
+SELECT HISTOGRAM(c_int, 2) FROM histagg;
+
+┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
+│ histogram(2)(c_int) │
+├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
+│ [{"lower":"13","upper":"30","ndv":3,"count":4,"pre_sum":0},{"lower":"33","upper":"33","ndv":1,"count":2,"pre_sum":4}] │
+└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
+```
+
+The result is returned as a JSON array:
+
+```json
+[
+  {
+    "lower": "13",
+    "upper": "30",
+    "ndv": 3,
+    "count": 4,
+    "pre_sum": 0
+  },
+  {
+    "lower": "33",
+    "upper": "33",
+    "ndv": 1,
+    "count": 2,
+    "pre_sum": 4
+  }
+]
+```
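
The `pre_sum` values in the example output can be checked against the per-bucket `count` values. The Python sketch below is not part of this commit. It parses the result string from the `HISTOGRAM(2)(c_int)` example and assumes, based only on the sample output, that `pre_sum` is the number of elements in all earlier buckets, excluding the current one. The helper names `parse_histogram` and `pre_sums_are_cumulative` are hypothetical.

```python
import json

# Result string copied from the HISTOGRAM(2)(c_int) example in the diff.
raw = (
    '[{"lower":"13","upper":"30","ndv":3,"count":4,"pre_sum":0},'
    '{"lower":"33","upper":"33","ndv":1,"count":2,"pre_sum":4}]'
)


def parse_histogram(result):
    """Parse a HISTOGRAM result string into a list of bucket dicts.

    The docs say the function may return an empty string, which is
    treated here as "no buckets".
    """
    if not result:
        return []
    return json.loads(result)


def pre_sums_are_cumulative(buckets):
    """Return True if each pre_sum equals the summed count of earlier buckets.

    Assumption (from the sample output only): pre_sum excludes the
    current bucket, so the first bucket has pre_sum 0.
    """
    running = 0
    for bucket in buckets:
        if bucket["pre_sum"] != running:
            return False
        running += bucket["count"]
    return True


buckets = parse_histogram(raw)
total_rows = sum(b["count"] for b in buckets)
print(len(buckets), total_rows, pre_sums_are_cumulative(buckets))
```

The same check passes for the default `HISTOGRAM(c_int)` output, where the counts 1, 1, 2, 2 pair with `pre_sum` values 0, 1, 2, 4.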
