feat: allow benchmarking against remote files #2297

Open
danking wants to merge 26 commits into develop from dk/tpch-objectstore2

danking (Member) commented Feb 10, 2025

This also makes tracing and the scale factor configurable.

When I tried to use DataFusion to write directly into S3 objects, the write returned successfully but no objects were created.

I think my NamedLocks is a bit janky, but we don't have atomic renames in the cloud.
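
For readers unfamiliar with the pattern, here is a minimal sketch of what a per-name lock registry could look like. It is not the `NamedLocks` from this PR: the struct layout and the `lock_for` and `create_once` names are hypothetical, and it only guards against races inside a single process. The idea is that, without an atomic rename to publish a finished file, writers for the same object name are serialized and each one re-checks for existence after acquiring the lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of per-name locks (an illustration, not this PR's code).
/// Each distinct name maps to its own mutex, so work on different names can
/// proceed concurrently while work on the same name is serialized.
#[derive(Default)]
pub struct NamedLocks {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl NamedLocks {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return the lock associated with `name`, creating it on first use.
    pub fn lock_for(&self, name: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(name.to_string()).or_default().clone()
    }

    /// Run `create` while holding the lock for `name`, unless `exists`
    /// reports that the object is already present. Returns `true` if
    /// `create` was run. The existence check happens under the lock, which
    /// stands in for the check-then-rename step a local filesystem would use.
    pub fn create_once<E, C>(&self, name: &str, exists: E, create: C) -> bool
    where
        E: FnOnce() -> bool,
        C: FnOnce(),
    {
        let lock = self.lock_for(name);
        let _guard = lock.lock().unwrap();
        if exists() {
            return false;
        }
        create();
        true
    }
}
```

A caller would pass the object key as `name`, an existence probe as `exists`, and the upload as `create`. Coordinating across processes or machines would need a different mechanism, which this sketch does not attempt.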

a10y and others added 6 commits February 10, 2025 16:47
fix

fix

works

cleanup toml

remove distraction

remove enable_compression: true

remove commented code

CLAassistant commented Feb 10, 2025

CLA assistant check
All committers have signed the CLA.

danking force-pushed the dk/tpch-objectstore2 branch from 6166566 to e22de86 February 10, 2025 21:03
danking force-pushed the dk/tpch-objectstore2 branch from e22de86 to cc99cae February 10, 2025 21:04
danking marked this pull request as ready for review February 10, 2025 21:22
danking requested a review from a10y February 10, 2025 21:22
danking enabled auto-merge (squash) February 10, 2025 21:22
danking added the "benchmark" label (Run benchmarks on this branch) Feb 10, 2025
github-actions bot removed the "benchmark" label (Run benchmarks on this branch) Feb 10, 2025
Contributor

Benchmarks: random_access

Table of Results
name PR 9bc8f60 base cc1658f ratio (PR/base) unit
random-access/vortex-tokio-local-disk 2.64977e+06 2.61702e+06 1.01251 ns
random-access/vortex-local-fs 3.03388e+06 2.94477e+06 1.03026 ns
random-access/parquet-tokio-local-disk 2.27741e+08 2.21545e+08 1.02797 ns

Contributor

Benchmarks: Clickbench

Table of Results
name PR 9bc8f60 base cc1658f ratio (PR/base) unit
clickbench_q00/parquet 2045049 2.59061e+06 0.789408 ns
clickbench_q01/parquet 63343664 5.92364e+07 1.06934 ns
clickbench_q02/parquet 120700805 1.16711e+08 1.03418 ns
clickbench_q03/parquet 85573553 8.42009e+07 1.0163 ns
clickbench_q04/parquet 658849731 6.39746e+08 1.02986 ns
clickbench_q05/parquet 717737271 6.90136e+08 1.03999 ns
clickbench_q06/parquet 2173301 2.17884e+06 0.997456 ns
clickbench_q07/parquet 63487509 6.37675e+07 0.99561 ns
clickbench_q08/parquet 729894203 7.48423e+08 0.975242 ns
clickbench_q09/parquet 1021250591 1.00776e+09 1.01338 ns
clickbench_q10/parquet 261975393 2.5645e+08 1.02154 ns
clickbench_q11/parquet 307118197 3.01177e+08 1.01973 ns
clickbench_q12/parquet 721413104 7.15447e+08 1.00834 ns
clickbench_q13/parquet 1002336498 9.60124e+08 1.04397 ns
clickbench_q14/parquet 721379408 6.88595e+08 1.04761 ns
clickbench_q15/parquet 751054365 7.17595e+08 1.04663 ns
clickbench_q16/parquet 1515529163 1.49382e+09 1.01453 ns
clickbench_q17/parquet 1360707574 1.37908e+09 0.986675 ns
clickbench_q18/parquet 2894390328 2.94041e+09 0.984349 ns
clickbench_q19/parquet 66127566 6.63815e+07 0.996174 ns
clickbench_q20/parquet 1135876839 1.13586e+09 1.00002 ns
clickbench_q21/parquet 1296126180 1.3052e+09 0.993045 ns
clickbench_q22/parquet 1890678838 1.90636e+09 0.991775 ns
clickbench_q23/parquet 7795941727 7.66922e+09 1.01652 ns
clickbench_q24/parquet 441756350 4.34675e+08 1.01629 ns
clickbench_q25/parquet 397515279 3.87961e+08 1.02463 ns
clickbench_q26/parquet 489750452 4.82974e+08 1.01403 ns
clickbench_q27/parquet 1528920607 1.53305e+09 0.997308 ns
clickbench_q28/parquet 11251947415 1.12568e+10 0.999571 ns
clickbench_q29/parquet 431120881 4.27929e+08 1.00746 ns
clickbench_q30/parquet 670091129 6.71464e+08 0.997955 ns
clickbench_q31/parquet 719642155 6.99992e+08 1.02807 ns
clickbench_q32/parquet 2782644390 2.72873e+09 1.01976 ns
clickbench_q33/parquet 2834572886 2.80403e+09 1.01089 ns
clickbench_q34/parquet 2845000684 2.785e+09 1.02154 ns
clickbench_q35/parquet 867061784 8.43442e+08 1.028 ns
clickbench_q36/parquet 180346211 1.71769e+08 1.04994 ns
clickbench_q37/parquet 84172880 8.23848e+07 1.0217 ns
clickbench_q38/parquet 111193436 1.08483e+08 1.02498 ns
clickbench_q39/parquet 331999978 3.12967e+08 1.06081 ns
clickbench_q40/parquet 53149759 5.02829e+07 1.05701 ns
clickbench_q41/parquet 49773503 4.88934e+07 1.018 ns
clickbench_q42/parquet 70975507 6.65442e+07 1.06659 ns
clickbench_q00/vortex-file-compressed 2104681 2.09687e+06 1.00373 ns
clickbench_q01/vortex-file-compressed 14078006 1.43645e+07 0.980058 ns
clickbench_q02/vortex-file-compressed 137839603 1.60241e+08 0.8602 ns
clickbench_q03/vortex-file-compressed 105007046 1.08821e+08 0.964954 ns
clickbench_q04/vortex-file-compressed 633110788 6.43671e+08 0.983594 ns
clickbench_q05/vortex-file-compressed 682102558 7.07003e+08 0.964781 ns
clickbench_q06/vortex-file-compressed 2187340 2.21329e+06 0.988276 ns
clickbench_q07/vortex-file-compressed 24486273 2.57998e+07 0.949087 ns
clickbench_q08/vortex-file-compressed 761751976 8.16275e+08 0.933205 ns
clickbench_q09/vortex-file-compressed 804426050 8.92021e+08 0.901802 ns
clickbench_q10/vortex-file-compressed 196656651 1.99819e+08 0.984175 ns
clickbench_q11/vortex-file-compressed 217640656 2.21792e+08 0.981282 ns
clickbench_q12/vortex-file-compressed 549943380 6.05717e+08 0.907921 ns
clickbench_q13/vortex-file-compressed 790857068 8.41198e+08 0.940156 ns
clickbench_q14/vortex-file-compressed 547767497 5.70883e+08 0.95951 ns
clickbench_q15/vortex-file-compressed 705538634 7.39013e+08 0.954704 ns
clickbench_q16/vortex-file-compressed 1459150709 1.44482e+09 1.00992 ns
clickbench_q17/vortex-file-compressed 1337929700 1.34193e+09 0.997017 ns
clickbench_q18/vortex-file-compressed 2837096534 2.85653e+09 0.993198 ns
clickbench_q19/vortex-file-compressed 40090768 5.05046e+07 0.793805 ns
clickbench_q20/vortex-file-compressed 711756057 7.13119e+08 0.998089 ns
clickbench_q21/vortex-file-compressed 840387806 8.19881e+08 1.02501 ns
clickbench_q22/vortex-file-compressed 1691021138 1.69601e+09 0.997057 ns
clickbench_q23/vortex-file-compressed 3978781545 3.98552e+09 0.998309 ns
clickbench_q24/vortex-file-compressed 283017088 2.81595e+08 1.00505 ns
clickbench_q25/vortex-file-compressed 254888908 2.62251e+08 0.971927 ns
clickbench_q26/vortex-file-compressed 345571997 3.45204e+08 1.00107 ns
clickbench_q27/vortex-file-compressed 1327040692 1.32837e+09 0.999001 ns
clickbench_q28/vortex-file-compressed 10728089774 1.06937e+10 1.00321 ns
clickbench_q29/vortex-file-compressed 2416538629 2.32283e+09 1.04034 ns
clickbench_q30/vortex-file-compressed 503602338 4.94993e+08 1.01739 ns
clickbench_q31/vortex-file-compressed 570283510 5.43779e+08 1.04874 ns
clickbench_q32/vortex-file-compressed 2838384334 2.7873e+09 1.01833 ns
clickbench_q33/vortex-file-compressed 2371052914 2.27646e+09 1.04155 ns
clickbench_q34/vortex-file-compressed 2354818674 2.27725e+09 1.03406 ns
clickbench_q35/vortex-file-compressed 984897171 9.66615e+08 1.01891 ns
clickbench_q36/vortex-file-compressed 197671285 1.92703e+08 1.02578 ns
clickbench_q37/vortex-file-compressed 122975720 1.15473e+08 1.06497 ns
clickbench_q38/vortex-file-compressed 127170944 1.17036e+08 1.0866 ns
clickbench_q39/vortex-file-compressed 350976558 3.38038e+08 1.03828 ns
clickbench_q40/vortex-file-compressed 37933122 3.90929e+07 0.970333 ns
clickbench_q41/vortex-file-compressed 44341333 4.63027e+07 0.957641 ns
clickbench_q42/vortex-file-compressed 55599297 5.51087e+07 1.0089 ns

Contributor

Benchmarks: compress

Table of Results
name PR 9bc8f60 base cc1658f ratio (PR/base) unit
compress time/wide table cols=10 chunks=1 rows=1000 3.58591e+06 3.55211e+06 1.00952 ns
compress time/wide table cols=10 chunks=1 rows=1000 throughput 0.0335039 0.0338227 0.990574 bytes/ns
parquet_rs-zstd compress time/wide table cols=10 chunks=1 rows=1000 706158 709286 0.99559 ns
parquet_rs-zstd compress time/wide table cols=10 chunks=1 rows=1000 throughput 0.170135 0.169384 1.00443 bytes/ns
decompress time/wide table cols=10 chunks=1 rows=1000 238109 234604 1.01494 ns
decompress time/wide table cols=10 chunks=1 rows=1000 throughput 0.504568 0.512105 0.985282 bytes/ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=1 rows=1000 243590 243145 1.00183 ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=1 rows=1000 throughput 0.493214 0.494116 0.998174 bytes/ns
compress time/wide table cols=100 chunks=1 rows=1000 3.67168e+07 3.64375e+07 1.00766 ns
compress time/wide table cols=100 chunks=1 rows=1000 throughput 0.0327159 0.0329666 0.992395 bytes/ns
parquet_rs-zstd compress time/wide table cols=100 chunks=1 rows=1000 7.93163e+06 7.42265e+06 1.06857 ns
parquet_rs-zstd compress time/wide table cols=100 chunks=1 rows=1000 throughput 0.151447 0.161832 0.935829 bytes/ns
decompress time/wide table cols=100 chunks=1 rows=1000 2.46211e+06 2.45664e+06 1.00223 ns
decompress time/wide table cols=100 chunks=1 rows=1000 throughput 0.487884 0.48897 0.997779 bytes/ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=1 rows=1000 2.49504e+06 2.51005e+06 0.994019 ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=1 rows=1000 throughput 0.481444 0.478564 1.00602 bytes/ns
compress time/wide table cols=1000 chunks=1 rows=1000 3.91352e+08 3.8949e+08 1.00478 ns
compress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.0306937 0.0308404 0.995241 bytes/ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=1 rows=1000 9.50086e+07 9.48116e+07 1.00208 ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.126431 0.126694 0.997927 bytes/ns
decompress time/wide table cols=1000 chunks=1 rows=1000 2.80262e+07 2.72788e+07 1.0274 ns
decompress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.4286 0.440343 0.973332 bytes/ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=1 rows=1000 2.74689e+07 2.74406e+07 1.00103 ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.437295 0.437746 0.998971 bytes/ns
compress time/wide table cols=10 chunks=50 rows=1000 7.20412e+07 7.16159e+07 1.00594 ns
compress time/wide table cols=10 chunks=50 rows=1000 throughput 0.00175543 0.00176585 0.994097 bytes/ns
parquet_rs-zstd compress time/wide table cols=10 chunks=50 rows=1000 987052 995917 0.991099 ns
parquet_rs-zstd compress time/wide table cols=10 chunks=50 rows=1000 throughput 0.128122 0.126981 1.00898 bytes/ns
decompress time/wide table cols=10 chunks=50 rows=1000 1.16384e+06 1.17359e+06 0.991696 ns
decompress time/wide table cols=10 chunks=50 rows=1000 throughput 0.10866 0.107758 1.00837 bytes/ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=50 rows=1000 248226 250406 0.991292 ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=50 rows=1000 throughput 0.509468 0.505031 1.00878 bytes/ns
compress time/wide table cols=100 chunks=50 rows=1000 7.19886e+08 7.20626e+08 0.998972 ns
compress time/wide table cols=100 chunks=50 rows=1000 throughput 0.00175092 0.00174912 1.00103 bytes/ns
parquet_rs-zstd compress time/wide table cols=100 chunks=50 rows=1000 1.17588e+07 1.08376e+07 1.085 ns
parquet_rs-zstd compress time/wide table cols=100 chunks=50 rows=1000 throughput 0.107193 0.116305 0.92166 bytes/ns
decompress time/wide table cols=100 chunks=50 rows=1000 1.1632e+07 1.15722e+07 1.00517 ns
decompress time/wide table cols=100 chunks=50 rows=1000 throughput 0.108362 0.108922 0.994858 bytes/ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=50 rows=1000 2.49487e+06 2.52878e+06 0.986588 ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=50 rows=1000 throughput 0.505222 0.498446 1.01359 bytes/ns
compress time/wide table cols=1000 chunks=50 rows=1000 7.3445e+09 7.37642e+09 0.995673 ns
compress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.00171563 0.00170821 1.00435 bytes/ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=50 rows=1000 1.63417e+08 1.6171e+08 1.01056 ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.0771061 0.0779201 0.989552 bytes/ns
decompress time/wide table cols=1000 chunks=50 rows=1000 1.24199e+08 1.23497e+08 1.00568 ns
decompress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.101454 0.102031 0.994351 bytes/ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=50 rows=1000 2.76119e+07 2.73035e+07 1.01129 ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.456342 0.461496 0.988833 bytes/ns
compress time/taxi 1.5783e+09 1.58963e+09 0.992869 ns
compress time/taxi throughput 0.298303 0.296176 1.00718 bytes/ns
parquet_rs-zstd compress time/taxi 1.81235e+09 1.83092e+09 0.989857 ns
parquet_rs-zstd compress time/taxi throughput 0.259778 0.257143 1.01025 bytes/ns
decompress time/taxi 2.29595e+08 2.28613e+08 1.0043 ns
decompress time/taxi throughput 2.05061 2.05942 0.995721 bytes/ns
parquet_rs-zstd decompress time/taxi 2.94111e+08 2.95912e+08 0.993916 ns
parquet_rs-zstd decompress time/taxi throughput 1.60079 1.59105 1.00612 bytes/ns
compress time/AirlineSentiment 267761 265091 1.01007 ns
compress time/AirlineSentiment throughput 0.00761874 0.00769546 0.990031 bytes/ns
parquet_rs-zstd compress time/AirlineSentiment 46626.7 47587.7 0.979806 ns
parquet_rs-zstd compress time/AirlineSentiment throughput 0.0437518 0.0428683 1.02061 bytes/ns
decompress time/AirlineSentiment 98382.1 98356.4 1.00026 ns
decompress time/AirlineSentiment throughput 0.0207355 0.0207409 0.999738 bytes/ns
parquet_rs-zstd decompress time/AirlineSentiment 27908.8 28362.2 0.984017 ns
parquet_rs-zstd decompress time/AirlineSentiment throughput 0.0730951 0.0719268 1.01624 bytes/ns
compress time/Arade 3.0385e+09 2.98163e+09 1.01907 ns
compress time/Arade throughput 0.259019 0.26396 0.981284 bytes/ns
parquet_rs-zstd compress time/Arade 3.2844e+09 3.2945e+09 0.996933 ns
parquet_rs-zstd compress time/Arade throughput 0.239627 0.238892 1.00308 bytes/ns
decompress time/Arade 5.8912e+08 5.77904e+08 1.01941 ns
decompress time/Arade throughput 1.33594 1.36187 0.98096 bytes/ns
parquet_rs-zstd decompress time/Arade 6.20952e+08 6.17934e+08 1.00488 ns
parquet_rs-zstd decompress time/Arade throughput 1.26746 1.27365 0.99514 bytes/ns
compress time/Bimbo 8.86349e+09 8.64457e+09 1.02532 ns
compress time/Bimbo throughput 0.803448 0.823795 0.975301 bytes/ns
parquet_rs-zstd compress time/Bimbo 2.10786e+10 2.11462e+10 0.996801 ns
parquet_rs-zstd compress time/Bimbo throughput 0.337848 0.336767 1.00321 bytes/ns
decompress time/Bimbo 3.39527e+09 3.43964e+09 0.987099 ns
decompress time/Bimbo throughput 2.09743 2.07037 1.01307 bytes/ns
parquet_rs-zstd decompress time/Bimbo 2.44764e+09 2.43204e+09 1.00642 ns
parquet_rs-zstd decompress time/Bimbo throughput 2.90947 2.92814 0.993625 bytes/ns
compress time/CMSprovider 2.32431e+10 2.27298e+10 1.02259 ns
compress time/CMSprovider throughput 0.221534 0.226538 0.977913 bytes/ns
parquet_rs-zstd compress time/CMSprovider 1.84162e+10 1.83475e+10 1.00375 ns
parquet_rs-zstd compress time/CMSprovider throughput 0.279599 0.280647 0.996267 bytes/ns
decompress time/CMSprovider 7.64724e+09 7.62356e+09 1.00311 ns
decompress time/CMSprovider throughput 0.673335 0.675427 0.996903 bytes/ns
parquet_rs-zstd decompress time/CMSprovider 3.33929e+09 3.30093e+09 1.01162 ns
parquet_rs-zstd decompress time/CMSprovider throughput 1.54199 1.55991 0.988514 bytes/ns
compress time/Euro2016 2.06466e+09 2.03921e+09 1.01248 ns
compress time/Euro2016 throughput 0.190469 0.192847 0.987671 bytes/ns
parquet_rs-zstd compress time/Euro2016 1.70907e+09 1.71627e+09 0.995809 ns
parquet_rs-zstd compress time/Euro2016 throughput 0.230098 0.229134 1.00421 bytes/ns
decompress time/Euro2016 3.05918e+08 3.03125e+08 1.00922 ns
decompress time/Euro2016 throughput 1.28549 1.29734 0.990869 bytes/ns
parquet_rs-zstd decompress time/Euro2016 4.50355e+08 4.48424e+08 1.00431 ns
parquet_rs-zstd decompress time/Euro2016 throughput 0.873211 0.87697 0.995713 bytes/ns
compress time/Food 9.04451e+08 8.80519e+08 1.02718 ns
compress time/Food throughput 0.367869 0.377868 0.97354 bytes/ns
parquet_rs-zstd compress time/Food 1.10016e+09 1.09527e+09 1.00446 ns
parquet_rs-zstd compress time/Food throughput 0.302428 0.303778 0.995556 bytes/ns
decompress time/Food 2.08066e+08 2.05581e+08 1.01209 ns
decompress time/Food throughput 1.59911 1.61843 0.988057 bytes/ns
parquet_rs-zstd decompress time/Food 2.1263e+08 2.08629e+08 1.01918 ns
parquet_rs-zstd decompress time/Food throughput 1.56478 1.59479 0.981182 bytes/ns
compress time/HashTags 3.77655e+09 3.69958e+09 1.02081 ns
compress time/HashTags throughput 0.213025 0.217457 0.979618 bytes/ns
parquet_rs-zstd compress time/HashTags 2.72174e+09 2.72629e+09 0.99833 ns
parquet_rs-zstd compress time/HashTags throughput 0.295583 0.295089 1.00167 bytes/ns
decompress time/HashTags 1.12923e+09 1.12197e+09 1.00646 ns
decompress time/HashTags throughput 0.712434 0.717039 0.993578 bytes/ns
parquet_rs-zstd decompress time/HashTags 7.03457e+08 6.80726e+08 1.03339 ns
parquet_rs-zstd decompress time/HashTags throughput 1.14364 1.18183 0.967687 bytes/ns
compress time/TPC-H l_comment chunked 1.08606e+09 1.06821e+09 1.01672 ns
compress time/TPC-H l_comment chunked throughput 0.229451 0.233287 0.983559 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment chunked 9.57993e+08 9.50081e+08 1.00833 ns
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 0.260126 0.262292 0.991741 bytes/ns
decompress time/TPC-H l_comment chunked 1.09868e+08 1.04206e+08 1.05433 ns
decompress time/TPC-H l_comment chunked throughput 2.26817 2.39141 0.948467 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked 2.45063e+08 2.41175e+08 1.01612 ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 1.01688 1.03327 0.984136 bytes/ns
compress time/TPC-H l_comment canonical 1.07324e+09 1.04202e+09 1.02996 ns
compress time/TPC-H l_comment canonical throughput 0.232193 0.239149 0.970913 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment canonical 9.68258e+08 9.54593e+08 1.01431 ns
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 0.257367 0.261051 0.985888 bytes/ns
decompress time/TPC-H l_comment canonical 1.10489e+08 1.04676e+08 1.05554 ns
decompress time/TPC-H l_comment canonical throughput 2.2554 2.38066 0.947387 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment canonical 2.44619e+08 2.40521e+08 1.01704 ns
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 1.01872 1.03608 0.983246 bytes/ns

3 participants