How could I persist object metadata along with the object into S3 using the connector #109
Comments
Hi Felix,
You can store the object in S3 in Avro format; later you can extract the schema from the Avro object stored in the S3 bucket and create a table in Athena using the schema you obtained to query the data.
regards,
Abhishek Sahani
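As a sketch of that approach: once the Avro schema is known, an Athena table can be declared over the connector's output path. This is a hypothetical fragment; the table, column, bucket, and topic names are illustrative, and depending on your Athena version you may also need to supply the schema via `avro.schema.literal` or `avro.schema.url` in `TBLPROPERTIES`.

```sql
-- Hypothetical Athena DDL for records the S3 connector wrote as Avro.
-- Columns must match the schema extracted from the Avro files.
CREATE EXTERNAL TABLE my_topic_records (
  appid  string,
  username string,
  amount double
)
STORED AS AVRO
LOCATION 's3://my-bucket/topics/my-topic/';
```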
On Fri, Aug 2, 2019 at 8:27 PM FelixKJose wrote:
I have a requirement to persist the object metadata along with the object, so that later we could use it in Amazon Athena for queries, and also let applications pull only the metadata instead of the entire object. Is there any support in the connector for persisting the metadata (which the AWS S3 SDK supports)?
I have seen great provisions to dynamically create the S3 object key by deriving it from object fields, but I couldn't find a way to derive the metadata and persist it along with the object.
Thank you Abhishek. S3 does provide a way to retrieve just the metadata, using AmazonS3.getObjectMetadata(bucket, key).
Can someone please give me an answer for this?
If your question is about the S3 connector, that repo is here: https://github.com/confluentinc/kafka-connect-storage-cloud. It's not clear what metadata you would expect a Kafka connector to add other than what it generically knows about (topic name, partition, and offset). It seems the only metadata that is added is the SSE algorithm: https://github.com/confluentinc/kafka-connect-storage-cloud/blob/master/kafka-connect-s3/src/main/java/io/confluent/connect/s3/storage/S3OutputStream.java#L180-L193
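For reference, that one piece of object metadata is driven by the connector's server-side encryption setting. A minimal sketch of the relevant sink config, assuming an illustrative bucket name:

```properties
# kafka-connect-s3 sink fragment (bucket name is illustrative).
# s3.ssea.name is the only setting that ends up as S3 object
# metadata today, as the SSE algorithm.
connector.class=io.confluent.connect.s3.S3SinkConnector
s3.bucket.name=my-bucket
s3.ssea.name=AES256
```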
Yes, I was asking whether I could put some more custom metadata along with the SSEAlgorithm, for example appId, user name, etc. Could a Kafka publisher publish some metadata along with the message, so that metadata can be stored along with the S3 object? Object metadata reference from AWS S3:
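A sketch of what such custom user metadata could look like. The connector does not expose this today, so the metadata keys (appId, user) are purely illustrative; the AWS SDK for Java v1 calls that would attach them are shown only in comments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the custom user metadata described above.
public class MetadataExample {

    // Build the user-metadata map that would accompany the object.
    // S3 stores each entry under an "x-amz-meta-" prefixed header,
    // which the Java SDK adds automatically.
    public static Map<String, String> buildUserMetadata(String appId, String user) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("appId", appId);
        meta.put("user", user);
        return meta;
    }

    // With the AWS SDK for Java v1, this map would be attached like:
    //
    //   ObjectMetadata om = new ObjectMetadata();
    //   buildUserMetadata("billing-app", "felix").forEach(om::addUserMetadata);
    //   s3.putObject(new PutObjectRequest(bucket, key, stream, om));
    //
    // and read back without downloading the object body via:
    //
    //   s3.getObjectMetadata(bucket, key).getUserMetadata();
}
```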
Sure, it could, but it currently does not allow that to be configurable, and that should be an issue for a different repo: https://github.com/confluentinc/kafka-connect-storage-cloud