Elasticsearch Object Mapping

Spring Data Elasticsearch Object Mapping is the process that maps a Java object - the domain entity - into the JSON representation that is stored in Elasticsearch and back.

Earlier versions of Spring Data Elasticsearch used a Jackson-based conversion; Spring Data Elasticsearch 3.2.x introduced the Meta Model Object Mapping. As of version 4.0, only the Meta Model Object Mapping is used: the Jackson-based mapper is no longer available, and the MappingElasticsearchConverter is used instead.

The main reasons for the removal of the Jackson based mapper are:

  • Custom mappings of fields needed to be done with annotations like @JsonFormat or @JsonInclude. This often caused problems when the same object was used in different JSON based datastores or sent over a JSON based API.

  • Custom field types and formats also need to be stored into the Elasticsearch index mappings. The Jackson based annotations did not fully provide all the information that is necessary to represent the types of Elasticsearch.

  • Fields must be mapped not only when converting from and to entities, but also in query arguments, returned data, and other places.

Using the MappingElasticsearchConverter now covers all these cases.

Meta Model Object Mapping

The Metamodel-based approach uses domain type information for reading from and writing to Elasticsearch. This allows registering Converter instances for specific domain type mappings.

Mapping Annotation Overview

The MappingElasticsearchConverter uses metadata to drive the mapping of objects to documents. The metadata is taken from the entity’s properties which can be annotated.

The following annotations are available:

  • @Document: Applied at the class level to indicate this class is a candidate for mapping to the database. The most important attributes are:

    • indexName: the name of the index to store this entity in. This can contain a SpEL template expression like "log-#{T(java.time.LocalDate).now().toString()}"

    • createIndex: flag whether to create an index on repository bootstrapping. Default value is true. See [elasticsearch.repositories.autocreation]

    • versionType: Configuration of version management. Default value is EXTERNAL.

  • @Id: Applied at the field level to mark the field used for identity purpose.

  • @Transient: By default, all fields are mapped to the document when it is stored or retrieved; this annotation excludes the field.

  • @PersistenceConstructor: Marks a given constructor - even a package protected one - to use when instantiating the object from the database. Constructor arguments are mapped by name to the key values in the retrieved Document.

  • @Field: Applied at the field level; defines properties of the field. Most of the attributes map to the respective Elasticsearch Mapping definitions (the following list is not complete, check the annotation Javadoc for a complete reference):

    • name: The name of the field as it will be represented in the Elasticsearch document, if not set, the Java field name is used.

    • type: The field type, can be one of Text, Keyword, Long, Integer, Short, Byte, Double, Float, Half_Float, Scaled_Float, Date, Date_Nanos, Boolean, Binary, Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, Ip_Range, Object, Nested, Ip, TokenCount, Percolator, Flattened, Search_As_You_Type. See Elasticsearch Mapping Types. If the field type is not specified, it defaults to FieldType.Auto. This means that no mapping entry is written for the property and that Elasticsearch will add a mapping entry dynamically when the first data for this property is stored (check the Elasticsearch documentation for dynamic mapping rules).

    • format: One or more built-in date formats, see the next section Date format mapping.

    • pattern: One or more custom date formats, see the next section Date format mapping.

    • store: Flag whether the original field value should be stored in Elasticsearch, default value is false.

    • analyzer, searchAnalyzer, normalizer: Used to specify custom analyzers and a custom normalizer.

  • @GeoPoint: Marks a field as geo_point datatype. Can be omitted if the field is an instance of the GeoPoint class.

  • @ValueConverter: Defines a class to be used to convert the given property. Unlike a registered Spring Converter, this only converts the annotated property and not every property of the given type.

The mapping metadata infrastructure is defined in a separate spring-data-commons project that is technology agnostic.

Date format mapping

Properties that derive from TemporalAccessor or are of type java.util.Date must either have a @Field annotation of type FieldType.Date, or a custom converter must be registered for this type. This section describes the use of FieldType.Date.

Two attributes of the @Field annotation define which date format information is written to the mapping (also see Elasticsearch Built In Formats and Elasticsearch Custom Date Formats).

The format attribute is used to define at least one of the predefined formats. If it is not defined, then the default value of date_optional_time and epoch_millis is used.

The pattern attribute can be used to add additional custom format strings. If you want to use only custom date formats, you must set the format property to empty {}.

The following table shows the different attributes and the mapping created from their values:

annotation                                                                         | format string in Elasticsearch mapping
@Field(type=FieldType.Date)                                                        | "date_optional_time||epoch_millis"
@Field(type=FieldType.Date, format=DateFormat.basic_date)                          | "basic_date"
@Field(type=FieldType.Date, format={DateFormat.basic_date, DateFormat.basic_time}) | "basic_date||basic_time"
@Field(type=FieldType.Date, pattern="dd.MM.uuuu")                                  | "date_optional_time||epoch_millis||dd.MM.uuuu"
@Field(type=FieldType.Date, format={}, pattern="dd.MM.uuuu")                       | "dd.MM.uuuu"

Note
If you are using a custom date format, you need to use uuuu for the year instead of yyyy. This is due to a change in Elasticsearch 7.
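
The rows of the table follow a simple joining rule: built-in format names come first, followed by any custom patterns, all separated by ||. The following is a minimal sketch of that rule in plain Java. The class DateFormatSketch and its methods are hypothetical names used only for illustration; they are not part of Spring Data Elasticsearch and do not claim to mirror its internal implementation. The second method shows a uuuu-style pattern being applied with java.time.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
class DateFormatSketch {

    // Joins built-in format names and custom patterns with "||",
    // built-in formats first, as shown in the table.
    static String mappingFormat(List<String> formats, List<String> patterns) {
        List<String> parts = new ArrayList<>(formats);
        parts.addAll(patterns);
        return String.join("||", parts);
    }

    // Formats a date with a custom pattern that uses uuuu for the year.
    static String formatWithPattern(LocalDate date, String pattern) {
        return DateTimeFormatter.ofPattern(pattern).format(date);
    }
}
```

Under these assumptions, passing the default formats date_optional_time and epoch_millis together with the pattern dd.MM.uuuu produces the string from the fourth row, while passing an empty format list produces only the custom pattern.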

Range types

When a field is annotated with a type of one of Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, or Ip_Range the field must be an instance of a class that will be mapped to an Elasticsearch range, for example:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private ValidAge validAge;

    // getter and setter
}

class ValidAge {
    @Field(name="gte")
    private Integer from;

    @Field(name="lte")
    private Integer to;

    // getter and setter
}

As an alternative, Spring Data Elasticsearch provides a Range<T> class, so that the previous example can be written as:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private Range<Integer> validAge;

    // getter and setter
}

Supported classes for the type <T> are Integer, Long, Float, Double, Date and classes that implement the TemporalAccessor interface.
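
To make the shape of a range value concrete, the sketch below builds the nested object that corresponds to the gte and lte field names used in the ValidAge class. It is plain Java with no Spring dependency; RangeSketch and toRangeMap are hypothetical names, and the code only illustrates the document structure, not how the converter produces it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
class RangeSketch {

    // Builds a range object using the bound names gte (lower) and lte (upper).
    // A null bound is left out of the map.
    static Map<String, Object> toRangeMap(Integer from, Integer to) {
        Map<String, Object> range = new LinkedHashMap<>();
        if (from != null) {
            range.put("gte", from);
        }
        if (to != null) {
            range.put("lte", to);
        }
        return range;
    }
}
```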

Mapped field names

Without further configuration, Spring Data Elasticsearch will use the property name of an object as the field name in Elasticsearch. This can be changed for an individual field by using the @Field annotation on that property.

It is also possible to define a FieldNamingStrategy in the configuration of the client ([elasticsearch.clients]). If, for example, a SnakeCaseFieldNamingStrategy is configured, the property sampleProperty of the object would be mapped to sample_property in Elasticsearch. A FieldNamingStrategy applies to all entities; it can be overridden by setting a specific name with @Field on a property.
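
The effect of a snake-case naming strategy can be approximated with a few lines of plain Java. This is only a sketch: SnakeCaseSketch and toSnakeCase are hypothetical names, the regular expression is an assumption, and the real SnakeCaseFieldNamingStrategy may treat edge cases such as consecutive upper-case letters differently.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not the Spring Data implementation.
class SnakeCaseSketch {

    // Inserts an underscore between a lower-case letter or digit and the
    // upper-case letter that follows it, then lower-cases the whole name.
    static String toSnakeCase(String propertyName) {
        return propertyName
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toLowerCase(Locale.ROOT);
    }
}
```

With this sketch, toSnakeCase("sampleProperty") yields sample_property, and a name without upper-case letters such as firstname is returned unchanged.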

Mapping Rules

Type Hints

Mapping uses type hints embedded in the document sent to the server to allow generic type mapping. Those type hints are represented as _class attributes within the document and are written for each aggregate root.

Example 1. Type Hints
public class Person {              (1)

  @Id String id;
  String firstname;
  String lastname;
}
{
  "_class" : "com.example.Person", (1)
  "id" : "cb7bef",
  "firstname" : "Sarah",
  "lastname" : "Connor"
}
  1. By default, the domain type's class name is used for the type hint.

Type hints can be configured to hold custom information. Use the @TypeAlias annotation to do so.

Note
Make sure to add types with @TypeAlias to the initial entity set (AbstractElasticsearchConfiguration#getInitialEntitySet) so that entity information is already available when data is first read from the store.
Example 2. Type Hints with Alias
@TypeAlias("human")                (1)
public class Person {

  @Id String id;
  // ...
}
{
  "_class" : "human",              (1)
  "id" : ...
}
  1. The configured alias is used when writing the entity.

Note
Type hints will not be written for nested Objects unless the property's type is Object, an interface, or the actual value type does not match the property's declaration.
Disabling Type Hints

It may be necessary to disable writing of type hints when the index that should be used already exists without having the type hints defined in its mapping and with the mapping mode set to strict. In this case, writing the type hint will produce an error, as the field cannot be added automatically.

Type hints can be disabled for the whole application by overriding the method writeTypeHints() in a configuration class derived from AbstractElasticsearchConfiguration (see [elasticsearch.clients]).

As an alternative, they can be disabled for a single index with the @Document annotation:

@Document(indexName = "index", writeTypeHint = WriteTypeHint.FALSE)
Warning
We strongly advise against disabling type hints. Only do this if you are forced to. With type hints disabled, polymorphic data may not be retrieved correctly from Elasticsearch, or document retrieval may fail completely.
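
The risk described in the warning can be illustrated with a small, library-free sketch. The types Pet, Dog and Cat, the aliases dog and cat, and the class TypeHintSketch are all hypothetical; the code only mimics the idea of a _class hint selecting the concrete type on read and is not how MappingElasticsearchConverter is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical domain types for illustration only.
interface Pet {
    String name();
}

final class Dog implements Pet {
    private final String name;
    Dog(String name) { this.name = name; }
    public String name() { return name; }
}

final class Cat implements Pet {
    private final String name;
    Cat(String name) { this.name = name; }
    public String name() { return name; }
}

class TypeHintSketch {

    // Maps a type hint (here: an alias) to a function that builds the matching type.
    private static final Map<String, Function<Map<String, Object>, Pet>> READERS = new LinkedHashMap<>();

    static {
        READERS.put("dog", doc -> new Dog((String) doc.get("name")));
        READERS.put("cat", doc -> new Cat((String) doc.get("name")));
    }

    // Writes the document together with its _class type hint.
    static Map<String, Object> write(Pet pet) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_class", pet instanceof Dog ? "dog" : "cat");
        doc.put("name", pet.name());
        return doc;
    }

    // Reads a document; without a known _class entry the concrete type cannot be chosen.
    static Pet read(Map<String, Object> doc) {
        Function<Map<String, Object>, Pet> reader = READERS.get(doc.get("_class"));
        if (reader == null) {
            throw new IllegalStateException("No usable type hint, cannot choose a concrete Pet type");
        }
        return reader.apply(doc);
    }
}
```

In this sketch a document written for a Cat is read back as a Cat as long as the _class entry is present; once the entry is removed, the declared type Pet alone is not enough to pick an implementation and the read fails.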

Geospatial Types

Geospatial types like Point and GeoPoint are converted into lat/lon pairs.

Example 3. Geospatial types
public class Address {

  String city, street;
  Point location;
}
{
  "city" : "Los Angeles",
  "street" : "2800 East Observatory Road",
  "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
}

GeoJson Types

Spring Data Elasticsearch supports the GeoJson types by providing an interface GeoJson and implementations for the different geometries. They are mapped to Elasticsearch documents according to the GeoJson specification. The corresponding properties of the entity are specified as geo_shape in the index mappings when the index mappings are written (check the Elasticsearch documentation as well).

Example 4. GeoJson types
public class Address {

  String city, street;
  GeoJsonPoint location;
}
{
  "city": "Los Angeles",
  "street": "2800 East Observatory Road",
  "location": {
    "type": "Point",
    "coordinates": [-118.3026284, 34.118347]
  }
}
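
Comparing Example 3 and Example 4 shows that the two representations order the coordinates differently: the lat/lon object names each value, while a GeoJson Point lists longitude first and latitude second. The following plain-Java sketch builds both shapes from the same values. GeoSketch and its methods are hypothetical names and only illustrate the resulting document structure.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
class GeoSketch {

    // lat/lon object form, as in the Geospatial types example.
    static Map<String, Object> toLatLon(double lat, double lon) {
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("lat", lat);
        target.put("lon", lon);
        return target;
    }

    // GeoJson Point form: coordinates are ordered longitude first, then latitude.
    static Map<String, Object> toGeoJsonPoint(double lat, double lon) {
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("type", "Point");
        target.put("coordinates", Arrays.asList(lon, lat));
        return target;
    }
}
```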

The following GeoJson types are implemented:

  • GeoJsonPoint

  • GeoJsonMultiPoint

  • GeoJsonLineString

  • GeoJsonMultiLineString

  • GeoJsonPolygon

  • GeoJsonMultiPolygon

  • GeoJsonGeometryCollection

Collections

Values inside Collections follow the same mapping rules as aggregate roots when it comes to type hints and Custom Conversions.

Example 5. Collections
public class Person {

  // ...

  List<Person> friends;

}
{
  // ...

  "friends" : [ { "firstname" : "Kyle", "lastname" : "Reese" } ]
}

Maps

Values inside Maps follow the same mapping rules as aggregate roots when it comes to type hints and Custom Conversions. However, the Map key needs to be a String to be processed by Elasticsearch.

Example 6. Maps
public class Person {

  // ...

  Map<String, Address> knownLocations;

}
{
  // ...

  "knownLocations" : {
    "arrivedAt" : {
       "city" : "Los Angeles",
       "street" : "2800 East Observatory Road",
       "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
     }
  }
}

Custom Conversions

Looking at the Configuration from the previous section, ElasticsearchCustomConversions allows registering specific rules for mapping domain and simple types.

Example 7. Meta Model Object Mapping Configuration
@Configuration
public class Config extends AbstractElasticsearchConfiguration {

  @Override
  public RestHighLevelClient elasticsearchClient() {
    return RestClients.create(ClientConfiguration.create("localhost:9200")).rest();
  }

  @Bean
  @Override
  public ElasticsearchCustomConversions elasticsearchCustomConversions() {
    return new ElasticsearchCustomConversions(
      Arrays.asList(new AddressToMap(), new MapToAddress()));       (1)
  }

  @WritingConverter                                                 (2)
  static class AddressToMap implements Converter<Address, Map<String, Object>> {

    @Override
    public Map<String, Object> convert(Address source) {

      LinkedHashMap<String, Object> target = new LinkedHashMap<>();
      target.put("ciudad", source.getCity());
      // ...

      return target;
    }
  }

  @ReadingConverter                                                 (3)
  static class MapToAddress implements Converter<Map<String, Object>, Address> {

    @Override
    public Address convert(Map<String, Object> source) {

      // ...
      return address;
    }
  }
}
{
  "ciudad" : "Los Angeles",
  "calle" : "2800 East Observatory Road",
  "localidad" : { "lat" : 34.118347, "lon" : -118.3026284 }
}
  1. Add Converter implementations.

  2. Set up the Converter used for writing DomainType to Elasticsearch.

  3. Set up the Converter used for reading DomainType from the search result.
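
The writing and reading converters form a pair that should round-trip a value. The sketch below shows that idea without any Spring dependency, using java.util.function.Function in place of Spring's Converter interface. SimpleAddress and ConversionSketch are hypothetical, reduced to two properties, and the field names ciudad and calle are taken from the JSON in Example 7.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified address type with only two properties.
final class SimpleAddress {
    final String city;
    final String street;

    SimpleAddress(String city, String street) {
        this.city = city;
        this.street = street;
    }
}

// Hypothetical stand-ins for a writing and a reading converter.
class ConversionSketch {

    // Writing direction: domain object to a document map with custom field names.
    static final Function<SimpleAddress, Map<String, Object>> ADDRESS_TO_MAP = source -> {
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("ciudad", source.city);
        target.put("calle", source.street);
        return target;
    };

    // Reading direction: document map back to the domain object.
    static final Function<Map<String, Object>, SimpleAddress> MAP_TO_ADDRESS = source ->
            new SimpleAddress((String) source.get("ciudad"), (String) source.get("calle"));
}
```

Keeping both directions next to each other makes it easier to see that every key written by one converter is read by the other.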