Skip to content

Latest commit

 

History

History
424 lines (329 loc) · 16.9 KB

File metadata and controls

424 lines (329 loc) · 16.9 KB

Batching

JDBC batching

JDBC offers support for batching together SQL statements that can be represented as a single PreparedStatement. Implementation wise this generally means that drivers will send the batched operation to the server in one call, which can save on network calls to the database. Hibernate can leverage JDBC batching. The following settings control this behavior.

hibernate.jdbc.batch_size

Controls the maximum number of statements Hibernate will batch together before asking the driver to execute the batch. Zero or a negative number disables this feature.

hibernate.jdbc.batch.builder

Names the implementation class used to manage batching capabilities. It is almost never a good idea to switch from Hibernate’s default implementation. But if you wish to, this setting would name the org.hibernate.engine.jdbc.batch.spi.BatchBuilder implementation to use.

hibernate.order_updates

Forces Hibernate to order SQL updates by the entity type and the primary key value of the items being updated. This allows for more batching to be used. It will also result in fewer transaction deadlocks in highly concurrent systems. Comes with a performance hit, so benchmark before and after to see if this actually helps or hurts your application.

hibernate.order_inserts

Forces Hibernate to order inserts to allow for more batching to be used. Comes with a performance hit, so benchmark before and after to see if this actually helps or hurts your application.

Note

Since version 5.2, Hibernate allows overriding the global JDBC batch size given by the hibernate.jdbc.batch_size configuration property on a per Session basis.

Example 1. Hibernate specific JDBC batch size configuration on a per Session basis
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

Session batching

The following example shows an anti-pattern for batch inserts.

Example 2. Naive way to insert 100 000 entities with Hibernate
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

There are several problems associated with this example:

  1. Hibernate caches all the newly inserted Person instances in the session-level cache, so, when the transaction ends, 100 000 entities are managed by the persistence context. If the maximum memory allocated to the JVM is rather low, this example could fail with an OutOfMemoryException. The Java 1.8 JVM allocated either 1/4 of available RAM or 1Gb, which can easily accommodate 100 000 objects on the heap.

  2. long-running transactions can deplete a connection pool so other transactions don’t get a chance to proceed.

  3. JDBC batching is not enabled by default, so every insert statement requires a database round trip. To enable JDBC batching, set the hibernate.jdbc.batch_size property to an integer between 10 and 50.

Important

Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.

Batch inserts

When you make new objects persistent, employ methods flush() and clear() to the session regularly, to control the size of the first-level cache.

Example 3. Flushing and clearing the Session
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

Session scroll

When you retrieve and update data, flush() and clear() the session regularly. In addition, use method scroll() to take advantage of server-side cursors for queries that return many rows of data.

Example 4. Using scroll()
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]
Important

If left unclosed by the application, Hibernate will automatically close the underlying resources (e.g. ResultSet and PreparedStatement) used internally by the ScrollableResults when the current transaction is ended (either commit or rollback).

However, it is good practice to close the ScrollableResults explicitly.

StatelessSession

StatelessSession is an alternative to Session and provides:

  • a command-oriented API

  • with no associated persistence context.

Thus, a stateless session is a slightly lower-level abstraction that’s closer to the underlying JDBC activity:

  • there’s no first-level cache, and

  • there’s no transactional write-behind or automatic dirty checking.

Instead, persistence operations occur synchronously when a method of StatelessSession is invoked, and entities returned by a stateless session are always detached.

Tip

A stateless session may be used to stream data to and from the database in the form of detached objects. With a stateless session, there’s no need to explicitly manage the size of the first-level cache by explicitly clearing the persistence context.

The StatelessSession API comes with certain limitations:

  • operations performed using a stateless session never cascade to associated instances,

  • lazy loading of associations is not transparent, and is only available via an explicit operation named fetch(), and

  • operations performed via a stateless session bypass Hibernate’s event model and action queue.

Important
Due to the lack of a first-level cache, stateless sessions are vulnerable to data aliasing effects.
Example 5. Using a StatelessSession
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

The Person instances returned by the query are immediately detached. They are never associated with any persistence context.

Note

The insert(), update(), and delete() operations defined by the StatelessSession interface operate directly on database rows. They cause the corresponding SQL operations to be executed immediately. They have different semantics from the save(), saveOrUpdate(), and delete() operations defined by the Session interface.

Hibernate Query Language for DML

DML, or Data Manipulation Language, refers to SQL statements such as INSERT, UPDATE, and DELETE. Hibernate provides methods for bulk SQL-style DML statement execution, in the form of Hibernate Query Language (HQL).

HQL/JPQL for UPDATE and DELETE

Both the Hibernate native Query Language and JPQL (Java Persistence Query Language) provide support for bulk UPDATE and DELETE.

Example 6. Pseudo-syntax for UPDATE and DELETE statements using HQL
UPDATE FROM EntityName e WHERE e.name = ?

DELETE FROM EntityName e WHERE e.name = ?
Note

Although the FROM and WHERE clauses are optional, it is good practice to declare them explicitly.

The FROM clause can only refer to a single entity, which can be aliased. If the entity name is aliased, any property references must be qualified using that alias. If the entity name is not aliased, then it is illegal for any property references to be qualified.

Note

Joins, either implicit or explicit, are prohibited in a bulk HQL query. You can use sub-queries in the WHERE clause, and the sub-queries themselves can contain joins.

Example 7. Executing a JPQL UPDATE, using the Query.executeUpdate()
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]
Example 8. Executing an HQL UPDATE, using the Query.executeUpdate()
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

In keeping with the EJB3 specification, HQL UPDATE statements, by default, do not effect the version or the timestamp property values for the affected entities. You can use a versioned update to force Hibernate to reset the version or timestamp property values, by adding the VERSIONED keyword after the UPDATE keyword.

Example 9. Updating the version of timestamp
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]
Note

If you use the VERSIONED statement, you cannot use custom version types that implement the org.hibernate.usertype.UserVersionType.

This feature is only available in HQL since it’s not standardized by Jakarta Persistence.

Example 10. A JPQL DELETE statement
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]
Example 11. An HQL DELETE statement
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

Method Query.executeUpdate() returns an int value, which indicates the number of entities affected by the operation. This may or may not correlate to the number of rows affected in the database. A JPQL/HQL bulk operation might result in multiple SQL statements being executed, such as for joined-subclass. In the example of joined-subclass, a DELETE against one of the subclasses may actually result in deletes in the tables underlying the join, or further down the inheritance hierarchy.

HQL syntax for INSERT

Example 12. Pseudo-syntax for INSERT-SELECT statements
INSERT INTO EntityName
	properties_list
SELECT select_list
FROM ...

Alternatively one can also declare individual values

Example 13. Pseudo-syntax for INSERT-VALUES statements
INSERT INTO EntityName
	properties_list
VALUES values_list

The properties_list is analogous to the column specification in the SQL INSERT statement. Note that INSERT statements are inherently non-polymorphic, so it is not possible to use an EntityName which is abstract or refer to subclass properties.

The SELECT statement can be any valid HQL select query, but the return types must match the types expected by the INSERT. Hibernate verifies the return types during query compilation, instead of expecting the database to check it. Problems might result from Hibernate types which are equivalent, rather than equal. One such example is a mismatch between a property defined as an org.hibernate.type.StandardBasicTypes.DATE and a property defined as an org.hibernate.type.StandardBasicTypes.TIMESTAMP, even though the database may not make a distinction, or may be capable of handling the conversion.

If id property is not specified in the properties_list, Hibernate generates a value automatically. Automatic generation is only available if you use ID generators which operate on the database. Otherwise, Hibernate throws an exception during parsing. Available in-database generators implement org.hibernate.id.PostInsertIdentifierGenerator.

For properties mapped as either version or timestamp, the insert statement gives you two options. You can either specify the property in the properties_list, in which case its value is taken from the corresponding select expressions or omit it from the properties_list, in which case the seed value defined by the org.hibernate.type.descriptor.java.VersionJavaType is used.

Example 14. HQL INSERT statement
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/batch/BatchTests.java[role=include]

This section is only a brief overview of HQL. For more information, see Hibernate Query Language.

Bulk mutation strategies

When a bulk mutation involves multiple tables, Hibernate has to issue individual DML statements to the respective tables. Since the mutation itself could have an effect on the conditions used in the statement, it’s generally not possible to simply execute parts of the DML statement against the respective tables. Instead, Hibernate has to temporarily remember which rows will be affected, and execute the DML statements based on these rows.

Usually, Hibernate will make use of local or global temporary tables to remember the primary keys of the rows. For some databases, currently only PostgreSQL and DB2, a more advanced strategy (CteMutationStrategy) is used, which makes use of DML in CTE support to execute the whole operation in one SQL statement.

The chosen strategy, unless overridden through the hibernate.query.mutation_strategy setting, is based on the Dialect support through org.hibernate.dialect.Dialect.getFallbackSqmMutationStrategy.

Class diagram

Considering we have the following entities:

Entity class diagram

The Person entity is the base class of this entity inheritance model, and is mapped as follows:

Example 15. Bulk mutation base class entity
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/bulkid/AbstractMutationStrategyCompositeIdTest.java[role=include]

Both the Doctor and Engineer entity classes extend the Person base class:

Example 16. Bulk mutation subclass entities
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/bulkid/AbstractMutationStrategyIdTest.java[role=include]
Inheritance tree bulk processing

Now, when you try to execute a bulk entity delete query:

Example 17. Bulk mutation delete query example
link:../../../../../../../hibernate-core/src/test/java/org/hibernate/orm/test/bulkid/AbstractMutationStrategyCompositeIdTest.java[role=include]
link:extras/batch-bulk-hql-temp-table-delete-query-example.sql[role=include]

HT_Person is a temporary table that Hibernate creates to hold all the entity identifiers that are to be updated or deleted by the bulk operation. The temporary table can be either global or local, depending on the underlying database capabilities.

Non-temporary table bulk mutation strategies

The strategies outlined above depend on the creation of temporary tables, which Hibernate creates on startup if they don’t already exist. At present this process is not integrated in the schema management tooling, and this requires that the user have the required permissions to alter the database schema.

If the Hibernate session user lacks these permissions, you will need to either:

  • alter your schema through a different user with more permissions, to add a global temporary table named HTE_<root entity table name>, which contains all columns of all tables involved in the entity hierarchy.
    This will allow insert, update and delete in HQL for multi-table entities.

  • OR configure Hibernate ORM to use the (badly-performing) inline strategy (for mutations only!):

<property name="hibernate.query.mutation_strategy"
          value="org.hibernate.query.sqm.mutation.internal.inline.InlineMutationStrategy"
/>
Important

We strongly recommend the use of the first option, i.e. manually adding the temporary tables, because the inline strategy is set to be removed in a future release. Also, there is no equivalent strategy for inserts.

Additionally, automatic creation of temporary tables should be deactivated. This is done by setting the

hibernate.query.mutation_strategy.global_temporary.create_tables and hibernate.query.mutation_strategy.global_temporary.drop_tables

or

hibernate.query.mutation_strategy.persistent.create_tables and hibernate.query.mutation_strategy.persistent.drop_tables

properties (depending on the default strategy for the dialect) to false

With the inline strategy, when running the previous test case, Hibernate generates the following SQL statements:

Example 18. InlineIdsInClauseBulkIdStrategy delete entity query example
link:extras/batch-bulk-hql-InlineIdsInClauseBulkIdStrategy-delete-query-example.sql[role=include]

So, the entity identifiers are selected first and used for each particular update or delete statement.