Currently, DIL supports many structured formats, such as CSV, JSON, and Avro, as well as many compression formats. Unstructured text is supported only through FileDumpExtractor, which dumps its output to HDFS. With FileDumpExtractor, the output cannot be passed to any converter. DIL should support a Text Extractor that can extract output in any format and pass it to a converter for further ETL, rather than pushing it directly to HDFS. This is useful when we want to fetch the output of a URL and then apply custom parsing to get the required output.
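To illustrate the requested flow, here is a minimal Python sketch of an extractor that hands raw text to a converter instead of writing it to storage. It is only a conceptual model: `text_extractor`, `key_value_converter`, and `run_pipeline` are hypothetical names, not DIL or Gobblin APIs, and the `key=value` parsing is just one assumed example of "custom parsing".

```python
from typing import Callable, Iterable, Iterator, List

# All names below are hypothetical; none of them are DIL or Gobblin APIs.


def text_extractor(payload: bytes, encoding: str = "utf-8") -> Iterator[str]:
    """Yield the raw payload line by line instead of dumping it to HDFS."""
    for line in payload.decode(encoding).splitlines():
        yield line


def key_value_converter(lines: Iterable[str]) -> Iterator[dict]:
    """Assumed example of custom parsing: turn 'key=value' lines into records."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines that are not key=value pairs.
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        yield {"key": key.strip(), "value": value.strip()}


def run_pipeline(
    payload: bytes,
    converter: Callable[[Iterable[str]], Iterator[dict]],
) -> List[dict]:
    """Chain the extractor into a converter, as the feature request describes."""
    return list(converter(text_extractor(payload)))


# Stand-in for the body of a URL response.
sample = b"status = ok\n\nnot a pair\ncount=3\n"
records = run_pipeline(sample, key_value_converter)
```

The point of the sketch is the hand-off: the extractor emits unstructured text as a stream, and any converter that accepts that stream can be plugged in for further ETL.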