Commit f227b91

fix(query): cte recursive check (#17427)
* fix(query): cte check
* fix(query): cte check
* fix(query): cte check
* update
* fix(query): group checker subquery
* fix(query): group checker subquery
* fix(query): group checker subquery
* fix(query): fix tests
1 parent a8f694f commit f227b91

13 files changed: +87 -48 lines changed

.github/workflows/links.yml

Lines changed: 7 additions & 2 deletions
@@ -24,8 +24,13 @@ jobs:
         id: lychee
         uses: lycheeverse/[email protected]
         with:
-          args: "--base . --cache --max-cache-age 1d . --exclude 'https://github.com/databendlabs/databend/issues/' --exclude 'https?://twitter\\.com(?:/.*$)?$'"
-
+          args: >-
+            --base .
+            --cache
+            --max-cache-age 1d .
+            --exclude 'https://github\.com/datafuselabs/databend/issues/.*'
+            --exclude 'https?://github\.com/databendlabs/databend/issues/.*'
+            --exclude 'https?://twitter\.com(?:/.*$)?$'
       - name: Save lychee cache
         uses: actions/cache/save@v3
         if: always()

README.md

Lines changed: 9 additions & 9 deletions
@@ -7,7 +7,7 @@
 <a href="https://docs.databend.com/guides/cloud">Databend Serverless Cloud (beta)</a> |
 <a href="https://docs.databend.com/">Documentation</a> |
 <a href="https://benchmark.clickhouse.com/">Benchmarking</a> |
-<a href="https://github.com/datafuselabs/databend/issues/11868">Roadmap (v1.3)</a>
+<a href="https://github.com/databendlabs/databend/issues/11868">Roadmap (v1.3)</a>
 
 </h4>
 
@@ -22,7 +22,7 @@
 
 <br>
 
-<a href="https://github.com/datafuselabs/databend/actions/workflows/release.yml">
+<a href="https://github.com/databendlabs/databend/actions/workflows/release.yml">
 <img src="https://img.shields.io/github/actions/workflow/status/datafuselabs/databend/release.yml?branch=main" alt="CI Status" />
 </a>
 
@@ -35,11 +35,11 @@
 </div>
 </div>
 
-<img src="https://github.com/datafuselabs/databend/assets/172204/9997d8bc-6462-4dbd-90e3-527cf50a709c" alt="databend" />
+<img src="https://github.com/databendlabs/databend/assets/172204/9997d8bc-6462-4dbd-90e3-527cf50a709c" alt="databend" />
 
 ## 🐋 Introduction
 
-**Databend**, built in Rust, is an open-source cloud data warehouse that serves as a cost-effective [alternative to Snowflake](https://github.com/datafuselabs/databend/issues/13059). With its focus on fast query execution and data ingestion, it's designed for complex analysis of the world's largest datasets.
+**Databend**, built in Rust, is an open-source cloud data warehouse that serves as a cost-effective [alternative to Snowflake](https://github.com/databendlabs/databend/issues/13059). With its focus on fast query execution and data ingestion, it's designed for complex analysis of the world's largest datasets.
 
 **Production-Proven Scale:**
 - 🤝 **Enterprise Adoption**: Trusted by over **50 organizations** processing more than **100 million queries daily**
@@ -53,15 +53,15 @@
 
 </div>
 
-![Databend vs. Snowflake](https://github.com/datafuselabs/wizard/assets/172204/d796acf0-0a66-4b1d-8754-cd2cd1de04c7)
+![Databend vs. Snowflake](https://github.com/databendlabs/wizard/assets/172204/d796acf0-0a66-4b1d-8754-cd2cd1de04c7)
 
 <div align="center">
 
 [Data Ingestion Benchmark: Databend Cloud vs. Snowflake](https://docs.databend.com/guides/benchmark/data-ingest)
 
 </div>
 
-![Databend vs. Snowflake](https://github.com/datafuselabs/databend/assets/172204/c61d7a40-f6fe-4fb9-83e8-06ea9599aeb4)
+![Databend vs. Snowflake](https://github.com/databendlabs/databend/assets/172204/c61d7a40-f6fe-4fb9-83e8-06ea9599aeb4)
 
 
 ## 🚀 Why Databend
@@ -90,7 +90,7 @@
 
 ## 📐 Architecture
 
-![Databend Architecture](https://github.com/datafuselabs/databend/assets/172204/68b1adc6-0ec1-41d4-9e1d-37b80ce0e5ef)
+![Databend Architecture](https://github.com/databendlabs/databend/assets/172204/68b1adc6-0ec1-41d4-9e1d-37b80ce0e5ef)
 
 ## 🚀 Try Databend
 
@@ -280,15 +280,15 @@ Here are some resources to help you get started:
 For guidance on using Databend, we recommend starting with the official documentation. If you need further assistance, explore the following community channels:
 
 - [Slack](https://link.databend.com/join-slack) (For live discussion with the Community)
-- [GitHub](https://github.com/datafuselabs/databend) (Feature/Bug reports, Contributions)
+- [GitHub](https://github.com/databendlabs/databend) (Feature/Bug reports, Contributions)
 - [Twitter](https://twitter.com/DatabendLabs/) (Get the news fast)
 - [I'm feeling lucky](https://link.databend.com/i-m-feeling-lucky) (Pick up a good first issue now!)
 
 ## 🛣️ Roadmap
 
 Stay updated with Databend's development journey. Here are our roadmap milestones:
 
-- [Roadmap 2025](https://github.com/datafuselabs/databend/issues/14167)
+- [Roadmap 2025](https://github.com/databendlabs/databend/issues/14167)
 
 ## 📜 License
 

src/query/sql/src/planner/binder/bind_context.rs

Lines changed: 32 additions & 4 deletions
@@ -216,6 +216,12 @@ impl CteContext {
     pub fn set_cte_context(&mut self, cte_context: CteContext) {
         self.cte_map = cte_context.cte_map;
     }
+
+    // Set cte context to current `BindContext`.
+    pub fn set_cte_context_and_name(&mut self, cte_context: CteContext) {
+        self.cte_map = cte_context.cte_map;
+        self.cte_name = cte_context.cte_name;
+    }
 }
 
 #[derive(Clone, Debug)]
@@ -252,9 +258,31 @@ impl BindContext {
         }
     }
 
-    pub fn with_parent(parent: Box<BindContext>) -> Self {
-        BindContext {
-            parent: Some(parent.clone()),
+    pub fn depth(&self) -> usize {
+        if let Some(ref p) = self.parent {
+            return p.depth() + 1;
+        }
+        1
+    }
+
+    pub fn with_opt_parent(parent: Option<&BindContext>) -> Result<Self> {
+        if let Some(p) = parent {
+            Self::with_parent(p.clone())
+        } else {
+            Self::with_parent(Self::new())
+        }
+    }
+
+    pub fn with_parent(parent: BindContext) -> Result<Self> {
+        const MAX_DEPTH: usize = 4096;
+        if parent.depth() >= MAX_DEPTH {
+            return Err(ErrorCode::Internal(
+                "Query binder exceeds the maximum iterations",
+            ));
+        }
+
+        Ok(BindContext {
+            parent: Some(Box::new(parent.clone())),
             columns: vec![],
             bound_internal_columns: BTreeMap::new(),
             aggregate_info: Default::default(),
@@ -273,7 +301,7 @@ impl BindContext {
             expr_context: ExprContext::default(),
             planning_agg_index: false,
             window_definitions: DashMap::new(),
-        }
+        })
     }
 
     /// Create a new BindContext with self's parent as its parent
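
Review note: the `bind_context.rs` change makes `with_parent` fallible by checking the length of the parent chain before nesting another context. The following is a minimal sketch of that guard, not the Databend code: `Ctx` is a hypothetical stand-in that models only the parent link, the error type is a plain `String` instead of `ErrorCode`, and `depth` is written iteratively here whereas the commit uses recursion.

```rust
/// Hypothetical stand-in for a binder context; only the parent link is modeled.
#[derive(Clone, Debug, Default)]
pub struct Ctx {
    parent: Option<Box<Ctx>>,
}

/// Same limit as the commit's `MAX_DEPTH`.
pub const MAX_DEPTH: usize = 4096;

impl Ctx {
    /// A root context with no parent.
    pub fn new() -> Self {
        Ctx { parent: None }
    }

    /// Number of contexts in the chain, counting `self`.
    /// Iterative variant (an assumption of this sketch) to avoid deep call stacks.
    pub fn depth(&self) -> usize {
        let mut depth = 1;
        let mut cur = self;
        while let Some(p) = cur.parent.as_deref() {
            depth += 1;
            cur = p;
        }
        depth
    }

    /// Nest a new context under `parent`, refusing once the chain is too deep.
    pub fn with_parent(parent: Ctx) -> Result<Self, String> {
        if parent.depth() >= MAX_DEPTH {
            return Err("Query binder exceeds the maximum iterations".to_string());
        }
        Ok(Ctx {
            parent: Some(Box::new(parent)),
        })
    }
}
```

Because the constructor now returns a `Result`, every call site in the remaining hunks gains a trailing `?`, which is why most of the other files change by a single line.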

src/query/sql/src/planner/binder/bind_table_reference/bind_subquery.rs

Lines changed: 4 additions & 8 deletions
@@ -32,18 +32,14 @@ impl Binder {
         // If the subquery is a lateral subquery, we need to let it see the columns
         // from the previous queries.
         let (result, mut result_bind_context) = if lateral {
-            let mut new_bind_context = BindContext::with_parent(Box::new(bind_context.clone()));
+            let mut new_bind_context = BindContext::with_parent(bind_context.clone())?;
             self.bind_query(&mut new_bind_context, subquery)?
         } else {
-            let mut new_bind_context = BindContext::with_parent(
-                bind_context
-                    .parent
-                    .clone()
-                    .unwrap_or_else(|| Box::new(BindContext::new())),
-            );
+            let mut new_bind_context =
+                BindContext::with_opt_parent(bind_context.parent.as_ref().map(|c| c.as_ref()))?;
             new_bind_context
                 .cte_context
-                .set_cte_context(bind_context.cte_context.clone());
+                .set_cte_context_and_name(bind_context.cte_context.clone());
             self.bind_query(&mut new_bind_context, subquery)?
         };
 

src/query/sql/src/planner/binder/bind_table_reference/bind_table.rs

Lines changed: 2 additions & 2 deletions
@@ -200,7 +200,7 @@ impl Binder {
                     self.ctx
                         .add_streams_ref(&catalog, &database, &table_name, consume);
                 }
-                let mut new_bind_context = BindContext::with_parent(Box::new(bind_context.clone()));
+                let mut new_bind_context = BindContext::with_parent(bind_context.clone())?;
                 let tokens = tokenize_sql(query.as_str())?;
                 let (stmt, _) = parse_sql(&tokens, self.dialect)?;
                 let Statement::Query(query) = &stmt else {
@@ -246,7 +246,7 @@ impl Binder {
                 let tokens = tokenize_sql(query.as_str())?;
                 let (stmt, _) = parse_sql(&tokens, self.dialect)?;
                 // For view, we need use a new context to bind it.
-                let mut new_bind_context = BindContext::with_parent(Box::new(bind_context.clone()));
+                let mut new_bind_context = BindContext::with_parent(bind_context.clone())?;
                 new_bind_context.view_info = Some((database.clone(), table_name));
                 if let Statement::Query(query) = &stmt {
                     self.metadata.write().add_table(

src/query/sql/src/planner/binder/bind_table_reference/bind_table_function.rs

Lines changed: 1 addition & 1 deletion
@@ -324,7 +324,7 @@ impl Binder {
                 alias,
                 ..
             } => {
-                let mut bind_context = BindContext::with_parent(Box::new(parent_context.clone()));
+                let mut bind_context = BindContext::with_parent(parent_context.clone())?;
                 let func_name = normalize_identifier(name, &self.name_resolution_ctx);
 
                 if BUILTIN_FUNCTIONS

src/query/sql/src/planner/binder/ddl/index.rs

Lines changed: 1 addition & 2 deletions
@@ -166,8 +166,7 @@ impl Binder {
         for (index_id, _, index_meta) in indexes {
             let tokens = tokenize_sql(&index_meta.query)?;
             let (stmt, _) = parse_sql(&tokens, self.dialect)?;
-            let mut new_bind_context =
-                BindContext::with_parent(Box::new(bind_context.clone()));
+            let mut new_bind_context = BindContext::with_parent(bind_context.clone())?;
             new_bind_context.planning_agg_index = true;
             if let Statement::Query(query) = &stmt {
                 let (s_expr, _) = self.bind_query(&mut new_bind_context, query)?;

src/query/sql/src/planner/binder/table.rs

Lines changed: 7 additions & 8 deletions
@@ -96,7 +96,7 @@ impl Binder {
                 }
             }
         }
-        let bind_context = BindContext::with_parent(Box::new(bind_context.clone()));
+        let bind_context = BindContext::with_parent(bind_context.clone())?;
         Ok((
             SExpr::create_leaf(Arc::new(DummyTableScan.into())),
             bind_context,
@@ -162,15 +162,14 @@ impl Binder {
         cte_info: &CteInfo,
     ) -> Result<(SExpr, BindContext)> {
         if let Some(cte_name) = &bind_context.cte_context.cte_name {
-            // `cte_name` exists, which means the current cte is a nested cte
-            // If the `cte_name` is the same as the current cte's name, it means the cte is recursive
             if cte_name == table_name {
-                return Err(ErrorCode::SemanticError(
-                    "The cte is not recursive, but it references itself.".to_string(),
-                )
+                return Err(ErrorCode::SemanticError(format!(
+                    "The cte {table_name} is not recursive, but it references itself.",
+                ))
                 .set_span(span));
             }
         }
+
         let mut new_bind_context = BindContext {
             parent: Some(Box::new(bind_context.clone())),
             bound_internal_columns: BTreeMap::new(),
@@ -236,7 +235,7 @@ impl Binder {
         cte_name: &str,
         alias: &Option<TableAlias>,
     ) -> Result<(SExpr, BindContext)> {
-        let mut new_bind_ctx = BindContext::with_parent(Box::new(bind_context.clone()));
+        let mut new_bind_ctx = BindContext::with_parent(bind_context.clone())?;
         let mut metadata = self.metadata.write();
         let mut columns = cte_info.columns.clone();
         for (index, column_name) in cte_info.columns_alias.iter().enumerate() {
@@ -357,7 +356,7 @@ impl Binder {
         change_type: Option<ChangeType>,
         sample: &Option<SampleConfig>,
     ) -> Result<(SExpr, BindContext)> {
-        let mut bind_context = BindContext::with_parent(Box::new(bind_context.clone()));
+        let mut bind_context = BindContext::with_parent(bind_context.clone())?;
 
         let table = self.metadata.read().table(table_index).clone();
         let table_name = table.name();
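
Review note: read together with the `bind_subquery.rs` hunk, the self-reference check in `table.rs` appears to depend on `cte_name` surviving into the context of a nested subquery, which is what `set_cte_context_and_name` adds over `set_cte_context`. The sketch below illustrates that reading under stated assumptions: `CteCtx`, `child_ctx`, and `check_reference` are hypothetical names, the `carry_name` flag stands in for the choice between the two setters, and errors are plain `String`s.

```rust
/// Hypothetical stand-in for the binder's CTE bookkeeping.
#[derive(Clone, Debug, Default)]
pub struct CteCtx {
    /// Name of the CTE whose body is currently being bound, if any.
    pub cte_name: Option<String>,
}

/// Build the context for a nested, non-lateral subquery.
/// `carry_name = true` models `set_cte_context_and_name`;
/// `carry_name = false` models the older `set_cte_context`, which drops the name.
pub fn child_ctx(parent: &CteCtx, carry_name: bool) -> CteCtx {
    CteCtx {
        cte_name: if carry_name {
            parent.cte_name.clone()
        } else {
            None
        },
    }
}

/// Reject a reference to `table_name` when it names the CTE being bound.
pub fn check_reference(ctx: &CteCtx, table_name: &str) -> Result<(), String> {
    match ctx.cte_name.as_deref() {
        Some(name) if name == table_name => Err(format!(
            "The cte {table_name} is not recursive, but it references itself."
        )),
        _ => Ok(()),
    }
}
```

With the name dropped, a reference to `t4` from inside `(select * from t4)` passes the check; with the name carried over, it is rejected, which matches the `query error` case added to `cte.test`.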

src/query/sql/src/planner/optimizer/decorrelate/flatten_plan.rs

Lines changed: 10 additions & 5 deletions
@@ -71,9 +71,14 @@ impl SubqueryRewriter {
                 let mut metadata = self.metadata.write();
                 // Currently, we don't support left plan's from clause contains subquery.
                 // Such as: select t2.a from (select a + 1 as a from t) as t2 where (select sum(a) from t as t1 where t1.a < t2.a) = 1;
-                let table_index = metadata
-                    .table_index_by_column_indexes(correlated_columns)
-                    .unwrap();
+                let table_index =
+                    match metadata.table_index_by_column_indexes(correlated_columns) {
+                        Some(index) => index,
+                        None => return Err(ErrorCode::SemanticError(
+                            "Join left plan's from clause can't contain subquery to dcorrelated join right plan",
+                        )),
+                    };
+
                 let mut data_types = Vec::with_capacity(correlated_columns.len());
                 let mut scalar_items = vec![];
                 let mut scan_columns = ColumnSet::new();
@@ -231,7 +236,7 @@ impl SubqueryRewriter {
                 self.flatten_expression_scan(plan, scan, correlated_columns)
             }
 
-            _ => Err(ErrorCode::Internal(
+            _ => Err(ErrorCode::SemanticError(
                 "Invalid plan type for flattening subquery",
             )),
         }
@@ -694,7 +699,7 @@ impl SubqueryRewriter {
             .iter()
             .any(|index| correlated_columns.contains(index))
         {
-            return Err(ErrorCode::Internal(
+            return Err(ErrorCode::SemanticError(
                 "correlated columns in window functions not supported",
             ));
         }
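
Review note: the first `flatten_plan.rs` hunk replaces an `.unwrap()` on an `Option` with an early return of a semantic error, so an unsupported query shape is reported to the user instead of panicking. A minimal sketch of the same pattern follows; the lookup `table_index_by_columns`, its ownership map, and the `String` error are assumptions made for illustration, and `ok_or_else` is used here as an equivalent alternative to the commit's `match`.

```rust
use std::collections::HashMap;

/// Hypothetical lookup: the single table that owns every column in `columns`.
/// Returns `None` for an empty list, an unknown column, or mixed owners.
pub fn table_index_by_columns(
    owners: &HashMap<usize, usize>,
    columns: &[usize],
) -> Option<usize> {
    let mut iter = columns.iter();
    let first = *owners.get(iter.next()?)?;
    for column in iter {
        if *owners.get(column)? != first {
            return None;
        }
    }
    Some(first)
}

/// Turn a failed lookup into an error instead of panicking with `unwrap()`.
pub fn resolve_table(
    owners: &HashMap<usize, usize>,
    columns: &[usize],
) -> Result<usize, String> {
    table_index_by_columns(owners, columns)
        .ok_or_else(|| "correlated columns do not map to a single base table".to_string())
}
```

The new case in `subquery.test` (`query error 1065`) exercises this kind of path: the query is expected to fail with an error code rather than crash the binder.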

src/query/sql/src/planner/semantic/type_check.rs

Lines changed: 3 additions & 3 deletions
@@ -3134,11 +3134,11 @@ impl<'a> TypeChecker<'a> {
         );
 
         // Create new `BindContext` with current `bind_context` as its parent, so we can resolve outer columns.
-        let mut bind_context = BindContext::with_parent(Box::new(self.bind_context.clone()));
+        let mut bind_context = BindContext::with_parent(self.bind_context.clone())?;
         let (s_expr, output_context) = binder.bind_query(&mut bind_context, subquery)?;
         self.bind_context
             .cte_context
-            .set_cte_context(output_context.cte_context);
+            .set_cte_context_and_name(output_context.cte_context);
 
         if (typ == SubqueryType::Scalar || typ == SubqueryType::Any)
             && output_context.columns.len() > 1
@@ -4644,7 +4644,7 @@ impl<'a> TypeChecker<'a> {
         expr: &Expr,
         list: &[Expr],
     ) -> Result<Box<(ScalarExpr, DataType)>> {
-        let mut bind_context = BindContext::with_parent(Box::new(self.bind_context.clone()));
+        let mut bind_context = BindContext::with_parent(self.bind_context.clone())?;
         let mut values = Vec::with_capacity(list.len());
         for val in list.iter() {
             values.push(vec![val.clone()])

tests/sqllogictests/suites/query/cte/cte.test

Lines changed: 3 additions & 0 deletions
@@ -483,6 +483,9 @@ CREATE OR REPLACE TABLE sales (
     net_paid DECIMAL(10, 2) NOT NULL
 ) row_per_block=51113;
 
+query error
+WITH t4(x) AS (select x + 3 from (select * from t4) where x < 10) SELECT * FROM t4;
+
 query error
 WITH InitialSales AS (
     SELECT

tests/sqllogictests/suites/query/subquery.test

Lines changed: 6 additions & 2 deletions
@@ -7,6 +7,10 @@ DROP TABLE IF EXISTS c
 statement ok
 DROP TABLE IF EXISTS o
 
+
+query error 1065
+SELECT * FROM (SELECT 1 AS x) AS ss1 LEFT OUTER JOIN (SELECT 2 DIV 228 AS y) AS ss2 ON TRUE, LATERAL (SELECT ss2.y AS z LIMIT 1) AS ss3
+
 statement ok
 CREATE TABLE c (c_id INT NULL, bill VARCHAR NULL)
 
@@ -817,7 +821,7 @@ CREATE TABLE `merge_log` (
     `created_at` TIMESTAMP NULL
 ) ENGINE = FUSE;
 
-query
+query
 SELECT
     (SELECT MAX(created_at)
     FROM merge_log
@@ -851,7 +855,7 @@ statement ok
 insert into test_group values ('2024-01-01', 100),('2024-01-01', 200),('2024-01-02', 400),('2024-01-02', 800);
 
 query TI
-select input_date, sum(value),
+select input_date, sum(value),
     (select sum(value)
     from test_group b
     where b.input_date >= DATE_TRUNC(YEAR, input_date)

tests/suites/0_stateless/05_hints/05_0001_set_var.result

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ storage_read_buffer_size 1048576
 America/Toronto
 America/Toronto
 1
-2022-02-02 03:00:00
-2022-02-02 03:00:00
+2022-02-02 03:00:00.000000
+2022-02-02 03:00:00.000000
 1 13
 Asia/Shanghai
