|
| 1 | +## <span id="query_transform">Query transformation</span> |
| 2 | + |
| 3 | +VOPS is vectorized executor for Postgres. It is implemented as extension and can be used with any version of Postgres |
| 4 | +and doesn't require any changes in Postgres core. Using vector (tile) types VOPS provides columnar format (like Apache Arrow) |
| 5 | +on top of standard Postgres heap. And custom operators for VOPS types implement vector operations, minimizing interpretation overhead. |
| 6 | +Comparing with other vertical solutions for Postgres: cstore_fwd and zedstore, |
| 7 | +VOPS provides more than 10 times improvement of execution speed on some OLAP queries. |
| 8 | + |
| 9 | +VOPS was developed more than three yeas ago and while main paradigm is not changed, |
| 10 | +my vision of how it should be used was significantly changed. |
| 11 | +My original indention was to define VOPS-specific tile types (i.e. vops_int4,...) |
| 12 | +which provides the same operations as native scalar types and thanks to Postgres |
| 13 | +user defined operators allow to write queries for VOPS tables almost in the same was as |
| 14 | +for standard tables. |
| 15 | + |
| 16 | +But very soon it becomes clear that although it is possible to redefine most of operators |
| 17 | +and aggregates, for some of them it is not possible: AND, OR, BETWEEN, COUNT(*). |
| 18 | +And my attempt to replace them with VOPS analogs, i.e. BETWIXT,&,|,... |
| 19 | +require users to learn VOPS "dialect" of SQL. Also there are other cases, |
| 20 | +when VOPS requires changing of standard queries, for example adding explicit type cast for |
| 21 | +string literal. |
| 22 | + |
| 23 | +Another my idea was to use foreign-data-wrappers (FDW) mechanism |
| 24 | +to provide "normal" (row-based) access to VOPS table. vops_fdw should transpose columns into rows. |
| 25 | +Unfortunately limitation of Postgres FDW mechanism, like not supporting parallel query execution, |
| 26 | +devalue any advantages of columnar storage and performance is several times worse than for normal tables. |
| 27 | + |
| 28 | +Finally I come to the conclusion that data should be kept in normal (horizontal) |
| 29 | +table which is most efficient for OLTP operations and importing data. |
| 30 | +And user should create one or more VOPS projections of this table, |
| 31 | +where some subset of columns are "tiles": scalar types are replaced with VOPS types. |
| 32 | +Grouping and ordering of rows in the project allows to store data in the format most efficient for |
| 33 | +some particular subclass of queries. For example Q1 query of TPC-H benchmark performs group by |
| 34 | +linestatus and returnflag, so we can create projection grouped by this columns. |
| 35 | + |
| 36 | +Postgres planner hook tries to substitute query to main table with tranformed query to one of the available |
| 37 | +VOPS projections. If it is possible (requirements for queries which can be executed on projections |
| 38 | +will be listed below), then it transforms query using VOPS operators. |
| 39 | +If it is not possible, then query is executed as usual on the original table. |
| 40 | + |
| 41 | +So user writes query using standard SQL. If there is some existed application, then |
| 42 | +it should not be changed. But if query can be executed on some of the available VOPS projections, |
| 43 | +then it's speed can be increased several times (for example speedup on TPC-H Q1 is about ten times). |
| 44 | + |
| 45 | +Synchronization of main table and VOPS projects is responsibly of user. |
| 46 | +The straightforward solution is to use on-insert trigger to propagate changes to projection. |
| 47 | +But it is actually bad idea: to efficiently pack records in vertical format we need to |
| 48 | +perform bulk inserts. So we need to hold data in some buffer and then perform massive update. |
| 49 | +Fortunately VOPS is useful for OLAP queries and such queries usually do not require most recent data. |
| 50 | +So it is possible to periodically transfer data from main table to projects (for example at night |
| 51 | +when system is idle). If projection is ordered by some monotonic key (i.e timestamp), |
| 52 | +then VOPS is smart enough to append only most recent records which are not parent in the projection. |
| 53 | +If there is no such key, then projections should be recreated from scratch. |
| 54 | + |
| 55 | +As far as main table and projections may not be synchronized, but default |
| 56 | +such implicit query transformation is disabled. |
| 57 | +You should explicitly switch it on using `vops.auto_substitute_projections` parameter. |
| 58 | +It can be done locally for the current session or for the whole Postgres instance. |
| 59 | + |
| 60 | +## <span id="projection_creation">Creation of projections</span> |
| 61 | + |
| 62 | +Now consider creation of partitions more precisely. |
| 63 | + |
| 64 | +In future it may be added to SQL grammar, so that it is possible to write |
| 65 | +`CREATE PROJECTION xxx OF TABLE yyy(column1, column2,...) GROUP BY (column1, column2, ...)`. |
| 66 | +But right now it can be done using `create_projection(projection_name text, source_table regclass, vector_columns text[], scalar_columns text[] default null, order_by text default null)` function. |
| 67 | +First argument of this function specifies name of the projection, second refers to existed Postgres table, `vector_columns` is array of |
| 68 | +column names which should be stores as VOPS tiles, `scalar_columns` is array of grouping columns which type is preserved and |
| 69 | +optional `order_by` parameter specifies name of ordering attribute (explained below). |
| 70 | +The `create_projection(PNAME,...)` functions does the following: |
| 71 | + |
| 72 | +1. Creates projection table with specified name and attributes. |
| 73 | +2. Creates PNAME_refresh() functions which can be used to update projection. |
| 74 | +3. Creates functional BRIN indexes for `first()` and `last()` functions of ordering attribute (if any) |
| 75 | +4. Creates BRIN index on grouping attributes (if any) |
| 76 | +5. Insert information about created projection in `vops_projections` table. This table is used by optimizer to |
| 77 | + automatically substitute table with partition. |
| 78 | + |
| 79 | +The `order_by` attribute is one of the VOPS projection vector columns by which data is sorted. Usually it is some kind of timestamp |
| 80 | +used in *time series* (for example trade date). Presence of such column in projection allows to incrementally update projection. |
| 81 | +Generated `PNAME_refresh()` method calls `populate` method with correspondent values of `predicate` and |
| 82 | +`sort` parameters, selecting from original table only rows with `order_by` column value greater than maximal |
| 83 | +value of this column in the projection. It assumes that `order_by` is unique or at least refresh is done at the moment when there is some gap |
| 84 | +in collected events. In addition to `order_by`, sort list for `populate` includes all scalar (grouping) columns. |
| 85 | +It allows to efficiently group imported data by scalar columns and fill VOPS tiles (vector columns) with data. |
| 86 | + |
| 87 | +When `order_by` attribute is specified, VOPS creates two functional BRIN indexes on `first()` and `last()` |
| 88 | +functions of this attribute. Presence of such indexes allows to efficiently select time slices. If original query contains |
| 89 | +predicate like `(trade_date between '01-01-2017' and '01-01-2018')` then VOPS projection substitution mechanism adds |
| 90 | +`(first(trade_date) >= '01-01-2017' and last(trade_date) >= '01-01-2018')` conjuncts which allow Postgres optimizer to use BRIN |
| 91 | +index to locate affected pages. |
| 92 | + |
| 93 | +In in addition to BRIN indexes for `order_by` attribute, VOPS also creates BRIN index for grouping (scalar) columns. |
| 94 | +Such index allows to efficiently select groups and perform index join. |
| 95 | + |
| 96 | +Presence of scalar columns in VOPS projections allows to used them in index search, grouping or ordering. |
| 97 | +Please notice, that if you are importing data in VOPS projection with scalar columns, |
| 98 | +then input data should be sorted by these columns. And number of duplicated combinations of this columns should be large enough (greater than hundreds). |
| 99 | +Only in this case tiles will be efficiently filled with data. Otherwise you will only loose disk space without any positive effect on performance. |
| 100 | + |
| 101 | + |
| 102 | +Right now projections can be automatically substituted only if: |
| 103 | + |
| 104 | +1. Query doesn't contain joins. |
| 105 | +2. Query performs aggregation of vector (tile) columns. |
| 106 | +3. All other expressions in target list, `ORDER BY` / `GROUP BY` clauses refer only to scalar attributes of projection. |
| 107 | + |
| 108 | +Projection can be removed using `drop_projection(projection_name text)` function. |
| 109 | +It not only drops the correspondent table, but also removes information about it from `vops_partitions` table |
| 110 | +and drops generated refresh function. |
| 111 | + |
| 112 | +## <span id="example">Example of using projections</span> |
| 113 | + |
| 114 | + |
| 115 | +``` |
| 116 | +create extension vops; |
| 117 | +
|
| 118 | +create table lineitem( |
| 119 | + l_orderkey integer, |
| 120 | + l_partkey integer, |
| 121 | + l_suppkey integer, |
| 122 | + l_linenumber integer, |
| 123 | + l_quantity real, |
| 124 | + l_extendedprice real, |
| 125 | + l_discount real, |
| 126 | + l_tax real, |
| 127 | + l_returnflag "char", |
| 128 | + l_linestatus "char", |
| 129 | + l_shipdate date, |
| 130 | + l_commitdate date, |
| 131 | + l_receiptdate date, |
| 132 | + l_shipinstruct char(25), |
| 133 | + l_shipmode char(10), |
| 134 | + l_comment char(44), |
| 135 | + l_dummy char(1)); |
| 136 | +
|
| 137 | +-- Create VOPS projection |
| 138 | +select create_projection('vops_lineitem','lineitem',array['l_shipdate','l_quantity','l_extendedprice','l_discount','l_tax'],array['l_returnflag','l_linestatus']); |
| 139 | +
|
| 140 | +\timing |
| 141 | +
|
| 142 | +-- Load data in main table |
| 143 | +copy lineitem from '/mnt/data/lineitem.tbl' delimiter '|' csv; |
| 144 | +
|
| 145 | +-- Transfer data from main table to projections. |
| 146 | +select vops_lineitem_refresh(); |
| 147 | +
|
| 148 | +-- Allow query substition |
| 149 | +set vops.auto_substitute_projections TO on; |
| 150 | +
|
| 151 | +-- Now let VOPS planner hook to use VOPS projections instead of main table |
| 152 | +select |
| 153 | + l_returnflag, |
| 154 | + l_linestatus, |
| 155 | + sum(l_quantity) as sum_qty, |
| 156 | + sum(l_extendedprice) as sum_base_price, |
| 157 | + sum(l_extendedprice*(1-l_discount)) as sum_disc_price, |
| 158 | + sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, |
| 159 | + avg(l_quantity) as avg_qty, |
| 160 | + avg(l_extendedprice) as avg_price, |
| 161 | + avg(l_discount) as avg_disc, |
| 162 | + count(*) as count_order |
| 163 | +from |
| 164 | + lineitem |
| 165 | +where |
| 166 | + l_shipdate <= '1998-12-01' |
| 167 | +group by |
| 168 | + l_returnflag, |
| 169 | + l_linestatus |
| 170 | +order by |
| 171 | + l_returnflag, |
| 172 | + l_linestatus; |
| 173 | +``` |
| 174 | + |
| 175 | +## <span id="vops_rules">Three rules of using VOPS</span> |
| 176 | + |
| 177 | +Now lets formulate 1-2-3 rules of using VOPS. |
| 178 | + |
| 179 | +1. Consider whether VOPS can help to speed-up your application. VOPS will be useful if |
| 180 | +your applications runs OLAP queries on large volumes of data and these queries mostly do some |
| 181 | +filtering and aggregation (may be with grouping). If queries contain joins, then VOPS can not help you |
| 182 | +unless you perform join on foreign key which was left as scalar in projections. |
| 183 | +But even in the last case automatic substations of query will not work and you have to write query manually. |
| 184 | + |
| 185 | +2. Find out whether your queries perform some particular grouping. Size of each group should be large enough |
| 186 | +(thousands) and number of different grouping combination should be small (because maintaining more than few projections |
| 187 | +will be too expensive). Create projection for each group. |
| 188 | + |
| 189 | +3. Define projection refresh policy. If original table contains some monotonic key (like timestamp or |
| 190 | +auto-generated column), then specify it as ordering field during projection creation. It will allow to incrementally update projection, |
| 191 | +appending only new records. VOPS is mostly oriented on work with append-only tables. There is no way to efficiently handle updates. |
| 192 | +To make transformation from horizontal to vertical format as efficient as possible, you should perform bulk update, |
| 193 | +adding relatively large number of records. |
| 194 | +Adding records to projection one-by-one leads to very bad space utilization, because VOPS is not able to append data to existed tiles. |
| 195 | + |
| 196 | +Once you created projections and populated them with data, you can try to run queries. |
| 197 | +Please use `EXPLAIN` to check whether plan for projection was chosen. |
| 198 | + |
| 199 | +## <span id="vops_pros_and_contras">VOPS pros and contras</span> |
| 200 | + |
| 201 | +Advantages of VOPS: |
| 202 | +1. Significant (several times) increase of execution speed on some OLAP queries. |
| 203 | +VOPS provides speed comparable with Yandex Clickhouse, preserving all power of Postgres relational model. |
| 204 | +2. Can be used with any version of Postgres: no need to install Postgres forks and migrate data to them |
| 205 | +(unlike CitusDB/GreenPlum) |
| 206 | +3. Fully transnational, data can be managed using standard Postgres utilities, like basebackup, pg_dump... |
| 207 | +4. VOPS doesn't affect original table, so presence of VOPS doesn't somehow affect data consistency/durability, |
| 208 | +as well as performance of OLTP operations on original table, for example data insertion speed. |
| 209 | +5. Doesn't require rewriting queries and changing your application (in case of using automatic query substitution). |
| 210 | + |
| 211 | +Certainly VOPS approach has some limitations and disadvantages: |
| 212 | +1. Extra space needed for VOPS projections: maintaining N projections requires N times more storage space. |
| 213 | +2. Original table and projections are not synchronized, so online analytic is not possible |
| 214 | +(OLAP queries are not working with most recent data). |
| 215 | +3. Transferring data from main table to projection may take significant amount of time and right now this operation can not be done in parallel. |
| 216 | +4. Only small subset of OLAP queries can be efficiently handled by VOPS (joins, subqueries, window functions and many other things are not supported). |
| 217 | +5. No columnar specific compression is performed (VOPS relies on standard Postgres compression mechanism which is not so efficient). |
| 218 | +6. Some query plans may return incorrect results (for example VOPS correctly works for hash aggregate, but not for sort aggregate). |
| 219 | + |
| 220 | +I hope that this small document can help you to understand whether VOPS may be useful in your case. |
0 commit comments