8 files changed: +55 −3 lines changed

Changed paths include:
- java/com/marklogic/flux/impl/importdata
- resources/delimited-files
@@ -32,10 +32,10 @@ are followed by a list of options common to every Flux command.
 You can specify a command name without entering its full name, as long as you enter a sufficient number of characters
 such that Flux can uniquely identify the command name.
 
-For example, instead of entering `import-aggregate-xml-files`, you can enter `import-ag` as it is the only command in
-Flux with that sequence of letters:
+For example, instead of entering `import-parquet-files`, you can enter `import-p` as it is the only command in
+Flux beginning with that sequence of letters:
 
-    ./bin/flux import-ag --path path/to/data etc...
+    ./bin/flux import-p --path path/to/data etc...
 
 If Flux cannot uniquely identify the command name, it will print an error and list the command names that match what
 you entered.
@@ -86,6 +86,12 @@ it may be important to query for documents that have a particular field with a v
 The `import-avro-files` command supports aggregating related rows together to produce hierarchical documents. See
 [Aggregating rows](../aggregating-rows.md) for more information.
 
+## Reading compressed files
+
+Flux will automatically read files compressed with GZIP when they have a filename ending in `.gz`; you do not need to
+specify a compression option. As noted in the "Advanced options" section below, you can use `-Pcompression=` to
+explicitly specify a compression algorithm if Flux is not able to read your compressed files automatically.
+
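As a sketch of the explicit option described above (assuming `gzip` is an accepted codec name, as the test assertion message elsewhere in this change suggests; `path/to/data` is a placeholder as in the other examples):

```shell
./bin/flux import-avro-files --path path/to/data -Pcompression=gzip etc...
```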
 ## Advanced options
 
 The `import-avro-files` command reuses Spark's support for reading Avro files. You can include any of
@@ -106,6 +106,12 @@ the content can be correctly translated to UTF-8 when written to MarkLogic - e.g
 The `import-delimited-files` command supports aggregating related rows together to produce hierarchical documents. See
 [Aggregating rows](../aggregating-rows.md) for more information.
 
+## Reading compressed files
+
+Flux will automatically read files compressed with GZIP when they have a filename ending in `.gz`; you do not need to
+specify a compression option. As noted in the "Advanced options" section below, you can use `-Pcompression=` to
+explicitly specify a compression algorithm if Flux is not able to read your compressed files automatically.
+
 ## Advanced options
 
 The `import-delimited-files` command reuses Spark's support for reading delimited text data. You can include any of
@@ -83,6 +83,12 @@ the content can be correctly translated to UTF-8 when written to MarkLogic:
     etc...
 ```
 
+## Reading compressed files
+
+Flux will automatically read files compressed with GZIP when they have a filename ending in `.gz`; you do not need to
+specify a compression option. As noted in the "Advanced options" section below, you can use `-Pcompression=` to
+explicitly specify a compression algorithm if Flux is not able to read your compressed files automatically.
+
 ## Advanced options
 
 The `import-aggregate-json-files` command reuses Spark's support for reading JSON files. You can include any of
@@ -86,6 +86,12 @@ it may be important to query for documents that have a particular field with a v
 The `import-orc-files` command supports aggregating related rows together to produce hierarchical documents. See
 [Aggregating rows](../aggregating-rows.md) for more information.
 
+## Reading compressed files
+
+Flux will automatically read files compressed with GZIP when they have a filename ending in `.gz`; you do not need to
+specify a compression option. As noted in the "Advanced options" section below, you can use `-Pcompression=` to
+explicitly specify a compression algorithm if Flux is not able to read your compressed files automatically.
+
 ## Advanced options
 
 The `import-orc-files` command reuses Spark's support for reading ORC files. You can include any of
@@ -86,6 +86,12 @@ it may be important to query for documents that have a particular field with a v
 The `import-parquet-files` command supports aggregating related rows together to produce hierarchical documents. See
 [Aggregating rows](../aggregating-rows.md) for more information.
 
+## Reading compressed files
+
+Flux will automatically read files compressed with GZIP when they have a filename ending in `.gz`; you do not need to
+specify a compression option. As noted in the "Advanced options" section below, you can use `-Pcompression=` to
+explicitly specify a compression algorithm if Flux is not able to read your compressed files automatically.
+
 ## Advanced options
 
 The `import-parquet-files` command reuses Spark's support for reading Parquet files. You can include any of
@@ -124,6 +124,28 @@ void jsonLines() {
     verifyDoc("/delimited/lastName-3.json", "firstName-3", "lastName-3");
 }
 
+@Test
+void gzippedJsonLines() {
+    run(
+        "import-aggregate-json-files",
+        "--path", "src/test/resources/delimited-files/line-delimited-json.txt.gz",
+        "--json-lines",
+        "--connection-string", makeConnectionString(),
+        "--permissions", DEFAULT_PERMISSIONS,
+        "--collections", "delimited-json-test",
+        "--uri-template", "/delimited/{lastName}.json"
+    );
+
+    assertCollectionSize(
+        "Spark data sources will automatically handle .gz files without -Pcompression=gzip being specified.",
+        "delimited-json-test", 3
+    );
+    verifyDoc("/delimited/lastName-1.json", "firstName-1", "lastName-1");
+    verifyDoc("/delimited/lastName-2.json", "firstName-2", "lastName-2");
+    verifyDoc("/delimited/lastName-3.json", "firstName-3", "lastName-3");
+}
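As an aside on the fixture this test reads: a gzipped JSON-lines file can be produced with the JDK alone. This is a sketch, not part of the change; the three rows mirror what the assertions above expect, and the file is written to a temporary path rather than to `src/test/resources`:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipFixtureSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical content mirroring the three rows the test expects.
        String jsonLines =
            "{\"firstName\":\"firstName-1\",\"lastName\":\"lastName-1\"}\n" +
            "{\"firstName\":\"firstName-2\",\"lastName\":\"lastName-2\"}\n" +
            "{\"firstName\":\"firstName-3\",\"lastName\":\"lastName-3\"}\n";

        // Write the lines through a GZIP stream so the file gets a valid gzip header.
        Path gzFile = Files.createTempFile("line-delimited-json", ".txt.gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gzFile))) {
            out.write(jsonLines.getBytes(StandardCharsets.UTF_8));
        }

        // Read it back to confirm the file decompresses to the expected three lines.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            System.out.println("lines=" + reader.lines().count());
        }
    }
}
```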
+
+
 @Test
 void jsonRootName() {
     run(