Client Improvements Plan
This is a list of improvements that could form the next major release of the InfluxDB Java client.
It is a best practice in the Java world to represent data as objects. The current library can map query results to Java beans:
@Measurement(name = "cpu")
public class Cpu {
    @Column(name = "time")
    private Instant time;
    @Column(name = "host", tag = true)
    private String hostname;
    @Column(name = "region", tag = true)
    private String region;
    @Column(name = "idle")
    private Double idle;
    @Column(name = "happydevop")
    private Boolean happydevop;
    @Column(name = "uptimesecs")
    private Long uptimeSecs;
    // getters (and setters if you need)
}
QueryResult queryResult = influxDB.query(new Query("SELECT * FROM cpu", dbName));
InfluxDBResultMapper resultMapper = new InfluxDBResultMapper(); // thread-safe - can be reused
List<Cpu> cpuList = resultMapper.toPOJO(queryResult, Cpu.class);
- The API isn't very nice: a separate InfluxDBResultMapper should not be needed.
- It is not efficient: the whole result is always instantiated as Java beans at once.
The query could instead return the mapped objects directly as an iterator:
Iterator<Cpu> cpus = influxDB.query(new Query("SELECT * FROM cpu", dbName, Cpu.class));
This mechanism also works nicely with chunking.
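The idea of mapping rows only when the caller asks for them can be sketched without the library. This is a minimal illustration, not the client's API: `CpuSample`, `LazyRowMapper` and `LazyMappingDemo` are hypothetical names, and the in-memory rows stand in for rows arriving from a (possibly chunked) query response.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical bean standing in for the annotated Cpu class.
final class CpuSample {
    final Instant time;
    final String host;
    final double idle;

    CpuSample(Instant time, String host, double idle) {
        this.time = time;
        this.host = host;
        this.idle = idle;
    }
}

// Hypothetical helper: wraps a row iterator and maps a row only when next() is called.
final class LazyRowMapper<T> implements Iterator<T> {
    private final Iterator<List<Object>> rows;
    private final Function<List<Object>, T> mapper;
    private int mapped = 0; // number of beans created so far

    LazyRowMapper(Iterator<List<Object>> rows, Function<List<Object>, T> mapper) {
        this.rows = rows;
        this.mapper = mapper;
    }

    @Override
    public boolean hasNext() {
        return rows.hasNext();
    }

    @Override
    public T next() {
        T bean = mapper.apply(rows.next()); // NoSuchElementException propagates from rows
        mapped++;
        return bean;
    }

    int mappedCount() {
        return mapped;
    }
}

final class LazyMappingDemo {
    // Builds a mapper over two in-memory rows shaped like [time, host, idle].
    static LazyRowMapper<CpuSample> sample() {
        List<List<Object>> rows = Arrays.asList(
                Arrays.<Object>asList("2018-01-01T00:00:00Z", "server01", 97.5),
                Arrays.<Object>asList("2018-01-01T00:00:10Z", "server02", 42.0));
        return new LazyRowMapper<>(rows.iterator(), row -> new CpuSample(
                Instant.parse((String) row.get(0)),
                (String) row.get(1),
                ((Number) row.get(2)).doubleValue()));
    }
}
```

Because no bean is created before `next()` is called, memory use is bounded by what the caller keeps, not by the size of the result.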
To write a data point you need to serialize it as follows:
Point point = Point.measurement("disk")
    .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
    .addField("used", 80L)
    .addField("free", 1L)
    .build();
influxDB.write(dbName, rpName, point);
There is no way to use the previously defined and annotated Cpu class to write a data point.
@Measurement(name = "cpu")
public class Cpu {
    public Cpu(Instant time, String host, String region, Double idle) {
        this.time = time;
        this.hostname = host;
        this.region = region;
        this.idle = idle;
    }
    ....
}
....
// illustrative values; the arguments must match the constructor above
Cpu cpu = new Cpu(Instant.now(), "server01", "us-west", 80.0);
influxDB.write(dbName, rpName, cpu);
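One way such a write could work internally is to read the annotations by reflection and emit InfluxDB line protocol. The sketch below is only an illustration under stated assumptions: the `Measurement` and `Column` annotations are local stand-ins declared here so the example compiles on its own, `PojoLineWriter` is a hypothetical helper, the `Instant` column is assumed to be the timestamp, and escaping of special characters is left out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

// Local stand-ins for the library's annotations so the sketch compiles on its own.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Measurement {
    String name();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column {
    String name();
    boolean tag() default false;
}

@Measurement(name = "cpu")
class Cpu {
    @Column(name = "time") private final Instant time;
    @Column(name = "host", tag = true) private final String hostname;
    @Column(name = "region", tag = true) private final String region;
    @Column(name = "idle") private final Double idle;

    Cpu(Instant time, String hostname, String region, Double idle) {
        this.time = time;
        this.hostname = hostname;
        this.region = region;
        this.idle = idle;
    }
}

// Hypothetical helper: turns an annotated object into one line of line protocol.
// Escaping of commas, spaces and quotes is omitted to keep the sketch short.
final class PojoLineWriter {
    static String toLine(Object pojo) {
        Measurement measurement = pojo.getClass().getAnnotation(Measurement.class);
        if (measurement == null) {
            throw new IllegalArgumentException("class is not annotated with @Measurement");
        }
        Map<String, String> tags = new TreeMap<>();   // sorted by key
        Map<String, String> fields = new TreeMap<>();
        Instant time = null;
        for (Field field : pojo.getClass().getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column == null) continue;
            Object value;
            try {
                field.setAccessible(true);
                value = field.get(pojo);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null) continue;              // unset columns are skipped
            if (value instanceof Instant) {
                time = (Instant) value;               // assumption: the Instant column is the timestamp
            } else if (column.tag()) {
                tags.put(column.name(), value.toString());
            } else {
                fields.put(column.name(), formatField(value));
            }
        }
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("at least one field is required");
        }
        StringBuilder line = new StringBuilder(measurement.name());
        tags.forEach((key, value) -> line.append(',').append(key).append('=').append(value));
        String separator = " ";
        for (Map.Entry<String, String> field : fields.entrySet()) {
            line.append(separator).append(field.getKey()).append('=').append(field.getValue());
            separator = ",";
        }
        if (time != null) {
            // timestamp in nanoseconds since the epoch
            line.append(' ').append(time.getEpochSecond() * 1_000_000_000L + time.getNano());
        }
        return line.toString();
    }

    // Integers get the "i" suffix, other numbers and booleans are written as-is, the rest is quoted.
    private static String formatField(Object value) {
        if (value instanceof Long || value instanceof Integer
                || value instanceof Short || value instanceof Byte) {
            return value + "i";
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        return "\"" + value + "\"";
    }
}
```

The type of each field follows from the Java type of the annotated member, so a `Double` member can never be written as an integer by accident.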
Point point = Point.measurement("disk")
    .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
    .addField("used", 80)
    .addField("free", 1.0)
    .build();
- With the current approach of writing data points into InfluxDB the user has to be quite careful: the first write into a measurement defines the data types of its fields. In the example above the 'used' field should have been a float, but if submitted as is it will be an integer. The measurement will have to be dropped and recreated to fix this.
- A tag structure is usually defined for a measurement (some fields/tags are required, etc.). With this approach it is easy to forget a tag or a field.
- It is easy to make a typo when defining a field, a tag or even a measurement name.
Allow the user to define a schema for the measurement, so that each write has to comply with the schema. The JavaScript client has solved this:
https://node-influx.github.io/class/src/index.js%7EInfluxDB.html
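A schema check of this kind could look roughly like the following. It is a sketch under assumptions, not a proposal for the final API: `MeasurementSchema` and its `validate` method are hypothetical, and tags and fields are passed as plain maps to keep the example independent of the library.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical schema: the exact set of tags plus the expected Java type of each field.
final class MeasurementSchema {
    private final String measurement;
    private final Set<String> tags;
    private final Map<String, Class<?>> fieldTypes;

    MeasurementSchema(String measurement, Set<String> tags, Map<String, Class<?>> fieldTypes) {
        this.measurement = measurement;
        this.tags = Collections.unmodifiableSet(new HashSet<>(tags));
        this.fieldTypes = Collections.unmodifiableMap(new HashMap<>(fieldTypes));
    }

    // Throws IllegalArgumentException when the point does not comply with the schema.
    void validate(Map<String, String> pointTags, Map<String, ?> pointFields) {
        // Catches both forgotten tags and misspelled tag names.
        if (!pointTags.keySet().equals(tags)) {
            throw new IllegalArgumentException(
                    measurement + ": expected tags " + tags + " but got " + pointTags.keySet());
        }
        if (pointFields.isEmpty()) {
            throw new IllegalArgumentException(measurement + ": at least one field is required");
        }
        for (Map.Entry<String, ?> field : pointFields.entrySet()) {
            Class<?> expected = fieldTypes.get(field.getKey());
            if (expected == null) {
                // Catches typos in field names.
                throw new IllegalArgumentException(
                        measurement + ": unknown field '" + field.getKey() + "'");
            }
            if (!expected.isInstance(field.getValue())) {
                // Catches an integer submitted where a float is expected, and vice versa.
                throw new IllegalArgumentException(
                        measurement + ": field '" + field.getKey() + "' must be "
                                + expected.getSimpleName() + " but was " + field.getValue());
            }
        }
    }
}
```

With a schema that declares `used` as `Double`, a write carrying the integer `80` is rejected on the client before it can fix the wrong field type on the server.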
There is a request to provide an async API for the client:
https://github.com/influxdata/influxdb-java/issues/386
This is also related to the error handling problem, since errors have to be signalled correctly in the async scenario. We don't want to introduce a new API that would have to change once the error handling problem described below is fixed.
Currently asynchronous processing is available only for certain use cases. For example, when writing data points you have to explicitly enable batching to get async behavior.
Provide asynchronous (callback-based) methods for the missing use cases.
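A callback-based wrapper around a blocking call could be sketched as follows. `AsyncCalls.callAsync` is a hypothetical helper, not part of the client; it uses only `java.util.concurrent.CompletableFuture` from the JDK and reports each outcome to exactly one of the two callbacks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper that turns a blocking call into a callback-based one.
final class AsyncCalls {
    // Runs blockingCall on the executor, then invokes onSuccess or onFailure.
    static <T> CompletableFuture<T> callAsync(Supplier<T> blockingCall, Executor executor,
                                              Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
        return CompletableFuture.supplyAsync(blockingCall, executor)
                .whenComplete((result, error) -> {
                    if (error != null) {
                        onFailure.accept(unwrap(error));
                    } else {
                        onSuccess.accept(result);
                    }
                });
    }

    // supplyAsync wraps exceptions in CompletionException; hand the original cause to the callback.
    private static Throwable unwrap(Throwable error) {
        return (error instanceof CompletionException && error.getCause() != null)
                ? error.getCause()
                : error;
    }
}
```

Returning the future as well as invoking the callbacks lets callers choose between the two styles; whether the real API should expose both is an open design question.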
The error handling of the library is very basic: it only detects errors based on a non-2xx HTTP status code returned by InfluxDB. The error information contained in the response is not analysed any further.
There is already a request to distinguish between errors caused by the client (HTTP status 4xx) and server failures (HTTP status 5xx):
https://github.com/influxdata/influxdb-java/issues/375
As a solution we would implement a hierarchy of exception classes (for example InfluxDBClientException and InfluxDBServerException, both inheriting from the existing InfluxDBException).
For 4xx status codes we would also parse the JSON body of the response and use the error message sent by InfluxDB as the message of the InfluxDBException.
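The proposed hierarchy might be sketched like this. The `InfluxDBException` class below is a local stand-in for the library's existing exception so that the example compiles alone, `InfluxDBErrors.fromResponse` is a hypothetical factory, and `errorMessage` is assumed to have been extracted from the JSON response body already.

```java
// Local stand-in for the library's existing InfluxDBException.
class InfluxDBException extends RuntimeException {
    InfluxDBException(String message) {
        super(message);
    }
}

// The request was invalid (HTTP 4xx); retrying the same request will not help.
class InfluxDBClientException extends InfluxDBException {
    InfluxDBClientException(String message) {
        super(message);
    }
}

// The server failed (HTTP 5xx); the request may succeed when retried later.
class InfluxDBServerException extends InfluxDBException {
    InfluxDBServerException(String message) {
        super(message);
    }
}

// Hypothetical factory that picks the exception type from the HTTP status code.
final class InfluxDBErrors {
    static InfluxDBException fromResponse(int httpStatus, String errorMessage) {
        if (httpStatus >= 400 && httpStatus < 500) {
            return new InfluxDBClientException(errorMessage);
        }
        if (httpStatus >= 500 && httpStatus < 600) {
            return new InfluxDBServerException(errorMessage);
        }
        return new InfluxDBException(errorMessage);
    }
}
```

Existing code that catches `InfluxDBException` keeps working, while new code can catch the more specific types to decide whether a retry makes sense.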
The current client does not expect partial writes. The client should receive error information only for the data points that failed to be written. This applies when batching mode is used.
As mentioned in the previous section, we would provide an additional method on the exception object that returns information about the data points that failed to be written.
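An exception that carries the rejected points could look like the sketch below. `PartialWriteException`, `getFailedPoints` and `PartialWriteDemo` are hypothetical names; points are represented as line-protocol strings because Java exception classes cannot be generic, and the rejection rule in the demo writer is invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical exception: reports which points of a batch were not written.
class PartialWriteException extends RuntimeException {
    private final List<String> failedPoints;

    PartialWriteException(String message, List<String> failedPoints) {
        super(message);
        this.failedPoints = Collections.unmodifiableList(new ArrayList<>(failedPoints));
    }

    // Only the points that failed; successfully written points are not included.
    List<String> getFailedPoints() {
        return failedPoints;
    }
}

final class PartialWriteDemo {
    // Stand-in for a batch write: rejects every line that has no field section.
    static void write(List<String> lines) {
        List<String> failed = new ArrayList<>();
        for (String line : lines) {
            if (!line.contains(" ")) {
                failed.add(line);
            }
        }
        if (!failed.isEmpty()) {
            throw new PartialWriteException(
                    failed.size() + " of " + lines.size() + " points were not written", failed);
        }
    }
}
```

A caller can catch the exception and retry or log only the entries returned by `getFailedPoints()`, without resubmitting the whole batch.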
The problem of setting the consistency level was resolved recently:
https://github.com/influxdata/influxdb-java/pull/385
However, there is still a problem of propagating the detailed error information to the client.
There was a request to handle errors during batch writes, and it has been fulfilled. However, the solution using the BiConsumer interface is not very nice and should be reworked so that it can transfer the error information mentioned above.
The user might also want to be notified not only about errors but also when the points were successfully written into InfluxDB.
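A dedicated listener with separate success and failure callbacks is one possible replacement for the BiConsumer. In this sketch `BatchListener` and `BatchDispatcher` are hypothetical names, the point type is left generic, and the actual write is passed in as a `Consumer` so the example does not depend on the library.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical replacement for the BiConsumer-based error handler.
interface BatchListener<P> {
    // Called after the whole batch has been written.
    void onSuccess(List<P> writtenPoints);

    // Called when the write failed; cause can carry the detailed error information.
    void onFailure(List<P> failedPoints, Throwable cause);
}

final class BatchDispatcher {
    // Writes one batch and notifies the listener exactly once.
    static <P> void flush(List<P> batch, Consumer<List<P>> writer, BatchListener<P> listener) {
        try {
            writer.accept(batch);
        } catch (RuntimeException e) {
            listener.onFailure(batch, e);
            return;
        }
        // Outside the try block, so an exception thrown by the listener is not reported as a write failure.
        listener.onSuccess(batch);
    }
}
```

Named callbacks make the contract explicit and leave room to add further notifications later without changing the meaning of existing parameters.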
Related information:
https://github.com/influxdata/influxdb-java/pull/319
https://github.com/influxdata/influxdb-java/issues/381
The current interface does not force the user to handle or catch errors that happen when evaluating chunked query responses.
The documentation even shows an example where error handling is completely missing.
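One way to make error handling unavoidable is to require an error handler next to the chunk handler. The following is a sketch under assumptions, not the client's API: `Chunk` and `ChunkedQueries.consume` are hypothetical, and a chunk is modelled as either a list of rows or an error message.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical chunk of a chunked response: carries either rows or an error message.
final class Chunk {
    final List<String> rows;
    final String error; // null when the chunk is valid

    private Chunk(List<String> rows, String error) {
        this.rows = rows;
        this.error = error;
    }

    static Chunk of(List<String> rows) {
        return new Chunk(rows, null);
    }

    static Chunk failed(String error) {
        return new Chunk(null, error);
    }
}

final class ChunkedQueries {
    // Both handlers are mandatory, so the caller cannot silently ignore errors.
    // Returns the number of chunks delivered to onChunk; processing stops at the first error.
    static int consume(Iterable<Chunk> chunks,
                       Consumer<List<String>> onChunk,
                       Consumer<String> onError) {
        Objects.requireNonNull(onChunk, "onChunk");
        Objects.requireNonNull(onError, "onError");
        int delivered = 0;
        for (Chunk chunk : chunks) {
            if (chunk.error != null) {
                onError.accept(chunk.error);
                break;
            }
            onChunk.accept(chunk.rows);
            delivered++;
        }
        return delivered;
    }
}
```

Stopping at the first error is a design choice of this sketch; an API could equally continue and report every failed chunk.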
All the improvements above will have an impact on the API, especially on the write methods. Therefore we should avoid having too many of them, and we want them to behave predictably.
We would still keep the current API available for backward compatibility, and perhaps deprecate some of the existing methods if we see that they are no longer necessary.
We would also fix the following issue: