Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IGNITE-6141 JDBC: add basic support for BLOB and CLOB types #11492

Merged
merged 16 commits into from
Sep 11, 2024

Conversation

skorotkov
Copy link
Contributor

@skorotkov skorotkov commented Aug 22, 2024

Add simple support for BLOB/CLOB to thin JDBC driver and CLOB support to client JDBC driver.

Note, this implementation doesn't support the creation of LOBs via streams.


Thank you for submitting the pull request to the Apache Ignite.

In order to streamline the review of the contribution
we ask you to ensure the following steps have been taken:

The Contribution Checklist

  • There is a single JIRA ticket related to the pull request.
  • The web-link to the pull request is attached to the JIRA ticket.
  • The JIRA ticket has the Patch Available state.
  • The pull request body describes changes that have been made.
    The description explains WHAT and WHY was made instead of HOW.
  • The pull request title is treated as the final commit message.
    The following pattern must be used: IGNITE-XXXX Change summary where XXXX - number of JIRA issue.
  • A reviewer has been mentioned through the JIRA comments
    (see the Maintainers list)
  • The pull request has been checked by the Teamcity Bot and
    the green visa attached to the JIRA ticket (see TC.Bot: Check PR)

Notes

If you need any help, please email [email protected] or ask anу advice on http://asf.slack.com #ignite channel.

@skorotkov skorotkov marked this pull request as ready for review August 30, 2024 09:51
@skorotkov skorotkov changed the title IGNITE-6141 JDBC thin: support BLOB and CLOB types IGNITE-6141 JDBC: support BLOB and CLOB types Sep 3, 2024
@skorotkov skorotkov changed the title IGNITE-6141 JDBC: support BLOB and CLOB types IGNITE-6141 JDBC: add basic support for BLOB and CLOB types Sep 3, 2024
@@ -1090,14 +1091,14 @@ private List<? extends FieldsQueryCursor<List<?>>> executeDml(
if (roEx != null) {
throw new IgniteSQLException(
"Failed to execute DML statement. Cluster in read-only mode [stmt=" + qryDesc.sql() +
", params=" + Arrays.deepToString(qryParams.arguments()) + "]",
", params=" + S.toString(QueryParameters.class, qryParams) + "]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one of the query parameter is huge (say large CLOB/BLOB) usage of Arrays.deepToString does two "bad" things which increases heap memory usage:

  • creates full string representation of the large argument
  • once an exception is logged this large string is passed to logging framework

This can cause unneeded and unexpected OOM error just because of logging. The actual error is not logged in such case. Say I get the following error stack during testing:

[2024-08-29T22:03:37,692][ERROR][client-connector-#149%jdbc_thin_blob_test.JdbcThinLobTest.test_jdbc_thin_lob.blob_size.1073741824.clob_size.1024.server_heap.12.client_heap.8.ignite_version.dev%] ....
java.lang.OutOfMemoryError: null
at java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:214) ~[?:?]
at java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:206) ~[?:?]
at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:173) ~[?:?]
at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:775) ~[?:?]
at java.base/java.lang.StringBuilder.append(StringBuilder.java:252) ~[?:?]
at java.base/java.util.Arrays.toString(Arrays.java:4988) ~[?:?]
at java.base/java.util.Arrays.deepToString(Arrays.java:5189) ~[?:?]
at java.base/java.util.Arrays.deepToString(Arrays.java:5161) ~[?:?]
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeDml(IgniteH2Indexing.java:1100) 

On the other hand the S.toString handles huge strings and huge byte arrays with care. It skips most of the data and replaces it with elipsis. Like [1, 2, ... and 1073741724 more] for byte arrays and "aaa ... and 1041353 skipped ... aaa" for strings.

At the similar context SQL arguments are already logged this carefull way. See for example the JdbcQueryExecuteRequest::toString. If such a request is added into the log string the huge arguments are stripped.


/**
* Query parameters which vary between requests having the same execution plan. Essentially, these are the arguments
* of original {@link org.apache.ignite.cache.query.SqlFieldsQuery} which are not part of {@link QueryDescriptor}.
*/
public class QueryParameters {
/** Arguments. */
@GridToStringInclude(sensitive = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only field that will be printed out?
Don't we need other fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. It worth adding all parameters as well.

Looks like In current code the QueryParameters are not logged as a whole. Only as the qryParams.arguments().

But adding other parameters should not break anything and would add more info for errors analisys.

Wil add.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -54,7 +54,7 @@ public JdbcBlob(byte[] arr) {
@Override public byte[] getBytes(long pos, int len) throws SQLException {
ensureNotClosed();

if (pos < 1 || arr.length - pos < 0 || len < 0)
if (pos < 1 || (arr.length - pos < 0 && arr.length > 0) || len < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you, please, clarify - why do we need new clause arr.length > 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed to support the empty Blobs (blobs backed by an empty array).

Such blobs are returned from the java.sql.Connection::createBlob() which is a standard way to get blob from arbitrary JDBC driver. It also may be created via the concrete constructor as well like in the unittest below:

        blob = new JdbcBlob(new byte[0]);
        assertEquals(0, blob.getBytes(1, 0).length);

This is a correct code according to java.sql.Blob contract.


The corrsponding testcase was added into the file modules/clients/src/test/java/org/apache/ignite/internal/jdbc2/JdbcBlobTest.java in this PR.

@Override public String getSubString(long pos, int len) throws SQLException {
ensureNotClosed();

if (pos < 1 || len < 0 || pos - 1 + len > chars.length())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we cast pos to standard value as a first step to overcome API inconsistency and make code more readable:

pos = pos - 1;

       if (pos < 0 || len < 0 || pos + len > chars.length())
            throw new SQLException("Invalid argument. Position should be greater than 0. Length should not be " +
                "negative. Position + length should be less than CLOB size [pos=" + (pos+1) + ", length=" + len + ']');

        return chars.substring((int)pos, (int)pos + len);

/**
* CLOB implementation for Ignite JDBC driver.
*/
public class JdbcClob implements Clob {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix all pos argument somehow to make value 0 based and consistent with String and any other API.
For example, make copy of all methods with pos, like:

public String getSubString(int pos, int len) {
    return getSubStringInternal(pos - 1, len);
}

private getSubstringInternal(int pos, int len) {
    // 0 index based implementation.
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we migrate to the char[] based implementation?
It's not clear from the API, but looks like Reader getCharacterStream() or getAsciiStream() expects to see changes in underlying object if they happens after getCharacterStream() invocation.

Copy link
Contributor Author

@skorotkov skorotkov Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not worth doing so for current basic and naive Clob support.

Say postgres jdbc driver just doesn't support any methods to change the Clob object at all. The mysql jdbc driver does almost the same as we do here - it creates streams around the underlying string. Oracle possibly does something else but it has a real Clob support on protocol and storage level, which ignite doesn't. In ignite Clob for now is just a thin wrapper around the Strings.

General JDBC application wouldn't create a stream to read data from Clob from ResutSet and change it at the same time. Especially considering the "JDBC Thin Driver is not thread safe".

As far as the char[] implementation it would increase the memory usage in the "read" case. It would require the creation of extra char[] array in the ResultSet::getClob. With current implementation just a string reference is passed and no extra memory is needed (especially if the get*Stream() APIs are used by the application).


clob.free();

try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use assertThrows here and below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/**
* Internal setString implementation with zero-based position parameter.
*/
private int setStringInternal(long zeroBasedPos, String str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Can we have int parameter instead of long? This will reduce cast count.

/**
* Internal setString implementation with zero-based position parameter.
*/
private int setStringInternal(long zeroBasedPos, String str, int off, int len) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Can we have int parameter instead of long? This will reduce cast count.

encodeNextChunk();
}

int encoded_chunk_size = Math.min(len - i, buf.length - bufPos);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encoded_chunk_size -> encodedChunkSize

@Test
public void testGetAsciiStreamForNonAsciiDataReadByByte() throws Exception {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 10000; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need 10000 copies of the string?
Let's help CI and keep 10 copies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to cover all variants in the JdbcClob.Utf8EncodedStringInputStream.encodeNextChunk() implementaion.

I reduce this to 3277 in one of two testcases (the minimum which produces the malformed surrogate).

And change to 10 in another one.

Comment on lines 595 to 600
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;

cnt++;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;
cnt++;
}
assertTrue(rs.next());
assertEquals(2, rs.getInt("id"));
assertFalse(rs.next());

Comment on lines 620 to 629
int cnt = 0;

while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;

cnt++;
}

assertEquals(1, cnt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;
cnt++;
}
assertEquals(1, cnt);
assertTrue(rs.next());
assertEquals(1, rs.getInt("id"));
assertFalse(rs.next());

Comment on lines 685 to 699
int cnt = 0;

while (rs.next()) {
if (cnt == 0) {
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});

blob = rs.getBlob(23);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
}

cnt++;
}

assert cnt == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0) {
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
blob = rs.getBlob(23);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
}
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
blob = rs.getBlob(23);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
assertFalse(rs.next());

Comment on lines 709 to 723
int cnt = 0;

while (rs.next()) {
if (cnt == 0) {
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));

clob = rs.getClob(24);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
}

cnt++;
}

assert cnt == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0) {
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
clob = rs.getClob(24);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
}
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
clob = rs.getClob(24);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
assertFalse(rs.next());

Comment on lines 622 to 629
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;

cnt++;
}

assertEquals(1, cnt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;
cnt++;
}
assertEquals(1, cnt);
assertTrue(rs.next());
assertFalse(rs.next());

Comment on lines 876 to 885
int cnt = 0;

while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;

cnt++;
}

assertEquals(1, cnt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;
cnt++;
}
assertEquals(1, cnt);
assertTrue(rs.next());
assertEquals(1, rs.getInt("id");
assertFalse(rs.next());

Comment on lines 891 to 900
cnt = 0;

while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;

cnt++;
}

assert cnt == 1;
Copy link
Contributor

@nizhikov nizhikov Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cnt = 0;
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
assertEquals(2, rs.getInt("id");
assertFalse(rs.next());

Comment on lines 918 to 927
int cnt = 0;

while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;

cnt++;
}

assertEquals(1, cnt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 1;
cnt++;
}
assertEquals(1, cnt);
assertTrue(rs.next());
assertEquals(1, rs.getInt("id");
assertFalse(rs.next());```

Comment on lines 933 to 942
cnt = 0;

while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;

cnt++;
}

assert cnt == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cnt = 0;
while (rs.next()) {
if (cnt == 0)
assert rs.getInt("id") == 2;
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
assertEquals(2, rs.getInt("id");
assertFalse(rs.next());

Comment on lines 611 to 625
int cnt = 0;

while (rs.next()) {
if (cnt == 0) {
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});

blob = rs.getBlob(16);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
}

cnt++;
}

assert cnt == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0) {
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
blob = rs.getBlob(16);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
}
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
Blob blob = rs.getBlob("blobVal");
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
blob = rs.getBlob(16);
Assert.assertArrayEquals(blob.getBytes(1, (int)blob.length()), new byte[] {1});
assertFalse(rs.next());

Comment on lines 635 to 649
int cnt = 0;

while (rs.next()) {
if (cnt == 0) {
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));

clob = rs.getClob(17);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
}

cnt++;
}

assert cnt == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int cnt = 0;
while (rs.next()) {
if (cnt == 0) {
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
clob = rs.getClob(17);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
}
cnt++;
}
assert cnt == 1;
assertTrue(rs.next());
Clob clob = rs.getClob("clobVal");
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
clob = rs.getClob(17);
Assert.assertEquals("str", clob.getSubString(1, (int)clob.length()));
assertFalse(rs.next());

Copy link

@nizhikov nizhikov merged commit 95afef0 into apache:master Sep 11, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants