Open
Labels
- A-interop-arrow: Area: interoperability with other Arrow implementations (such as pyarrow)
- bug: Something isn't working
- needs triage: Awaiting prioritization by a maintainer
- python: Related to Python Polars
Description
Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl
import pyarrow as pa
import pyarrow.dataset as ds
data = pa.table({
    "vector": [[3.1, 4.1], [5.9, 26.5]],
    "item": ["foo", "bar"],
    "price": [10.0, 20.0],
})
pyds = ds.dataset(data)
# This currently fails; with allow_pyarrow_filter=True it passes.
print(pl.scan_pyarrow_dataset(pyds, allow_pyarrow_filter=False).first().collect())
Log output
Traceback (most recent call last):
  File "/home/pace/dev/lance-experiments/polars-bug/simple_repr.py", line 11, in <module>
    print(pl.scan_pyarrow_dataset(pyds, allow_pyarrow_filter=False).first().collect())
  File "/home/pace/miniconda3/envs/lance/lib/python3.10/site-packages/polars/_utils/deprecation.py", line 97, in wrapper
    return function(*args, **kwargs)
  File "/home/pace/miniconda3/envs/lance/lib/python3.10/site-packages/polars/lazyframe/opt_flags.py", line 328, in wrapper
    return function(*args, **kwargs)
  File "/home/pace/miniconda3/envs/lance/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 2422, in collect
    return wrap_df(ldf.collect(engine, callback))
  File "/home/pace/miniconda3/envs/lance/lib/python3.10/site-packages/polars/_utils/scan.py", line 27, in _execute_from_rust
    return function(with_columns, *args)
TypeError: _scan_pyarrow_dataset_impl() got multiple values for argument 'batch_size'
Issue description
This worked correctly in previous versions of Polars (it worked in 1.3.0 and fails in 1.4.1).
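For readers unfamiliar with this error class: "got multiple values for argument" is the generic Python TypeError raised when one parameter receives both a positional and a keyword value. Below is a minimal sketch of that failure mode. The function `scan_impl` and its parameters are hypothetical stand-ins, not the actual Polars internals, and whether the Polars bug arises in exactly this way is an assumption.

```python
from functools import partial


# Hypothetical stand-in for a scan callback; NOT the real Polars signature.
def scan_impl(dataset, with_columns, predicate, n_rows, batch_size=None):
    return (dataset, with_columns, predicate, n_rows, batch_size)


# The callback is pre-bound with batch_size as a keyword argument...
callback = partial(scan_impl, "dataset", batch_size=None)

# ...but the caller then passes four more positional arguments, so the
# fourth one lands in the batch_size slot that the keyword already fills.
try:
    callback(["item"], None, 1, 1024)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the caller and the callback disagree about how many positional arguments are passed, any keyword that was bound in advance can collide in this way.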
Expected behavior
The code should run without error and print the first row of the dataset.
Installed versions
--------Version info---------
Polars: 1.35.2
Index type: UInt32
Platform: Linux-6.8.0-87-generic-x86_64-with-glibc2.39
Python: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
Runtime: rt32
----Optional dependencies----
Azure CLI 2.76.0
adbc_driver_manager <not installed>
altair <not installed>
azure.identity <not installed>
boto3 1.35.58
cloudpickle <not installed>
connectorx <not installed>
deltalake <not installed>
fastexcel <not installed>
fsspec 2024.6.1
gevent <not installed>
google.auth 2.27.0
great_tables <not installed>
matplotlib 3.9.2
numpy 2.2.3
openpyxl <not installed>
pandas 2.2.2
polars_cloud <not installed>
pyarrow 20.0.0
pydantic 2.12.2
pyiceberg <not installed>
sqlalchemy 1.4.54
torch 2.7.1+cu126
xlsx2csv <not installed>
xlsxwriter <not installed>