@@ -74,13 +74,13 @@ You can link against this library in your program at the following coordinates:
 </tr>
 <tr>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.11<br>version: 2.6.11</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.11<br>version: 2.7.0</pre>
 </td>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.12<br>version: 2.6.11</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.12<br>version: 2.7.0</pre>
 </td>
 <td>
-<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.13<br>version: 2.6.11</pre>
+<pre>groupId: za.co.absa.cobrix<br>artifactId: spark-cobol_2.13<br>version: 2.7.0</pre>
 </td>
 </tr>
 </table>
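For readers wiring these coordinates into a build, a minimal sbt sketch (sbt itself is an assumption here, not part of this diff; `%%` appends the Scala binary version suffix, so on Scala 2.12 it resolves to `spark-cobol_2.12`):

```scala
// build.sbt -- illustrative only; adjust the version and Scala suffix as needed
libraryDependencies += "za.co.absa.cobrix" %% "spark-cobol" % "2.7.0"
```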
@@ -91,17 +91,17 @@ This package can be added to Spark using the `--packages` command line option. F
 
 ### Spark compiled with Scala 2.11
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.11:2.6.11
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.11:2.7.0
 ```
 
 ### Spark compiled with Scala 2.12
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.6.11
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.7.0
 ```
 
 ### Spark compiled with Scala 2.13
 ```
-$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.13:2.6.11
+$SPARK_HOME/bin/spark-shell --packages za.co.absa.cobrix:spark-cobol_2.13:2.7.0
 ```
 
 ## Usage
@@ -238,17 +238,17 @@ to decode various binary formats.
 
 The jars that you need to get are:
 
-* spark-cobol_2.12-2.6.11.jar
-* cobol-parser_2.12-2.6.11.jar
+* spark-cobol_2.12-2.7.0.jar
+* cobol-parser_2.12-2.7.0.jar
 * scodec-core_2.12-1.10.3.jar
 * scodec-bits_2.12-1.1.4.jar
 * antlr4-runtime-4.8.jar
 
 After that, you can specify these jars on the `spark-shell` command line. Here is an example:
 ```
-$ spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.6.11
+$ spark-shell --packages za.co.absa.cobrix:spark-cobol_2.12:2.7.0
 or
-$ spark-shell --master yarn --deploy-mode client --driver-cores 4 --driver-memory 4G --jars spark-cobol_2.12-2.6.11.jar,cobol-parser_2.12-2.6.11.jar,scodec-core_2.12-1.10.3.jar,scodec-bits_2.12-1.1.4.jar,antlr4-runtime-4.8.jar
+$ spark-shell --master yarn --deploy-mode client --driver-cores 4 --driver-memory 4G --jars spark-cobol_2.12-2.7.0.jar,cobol-parser_2.12-2.7.0.jar,scodec-core_2.12-1.10.3.jar,scodec-bits_2.12-1.1.4.jar,antlr4-runtime-4.8.jar
 
 Setting default log level to "WARN".
 To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
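The same dependency can also be supplied when building a Spark session programmatically instead of through `spark-shell`. A minimal sketch using Spark's standard `spark.jars.packages` configuration key (the application name is arbitrary):

```scala
import org.apache.spark.sql.SparkSession

// Resolves spark-cobol and its transitive dependencies from Maven Central,
// equivalent to the --packages flag shown above. Must be set before the
// session (and its underlying SparkContext) is created.
val spark = SparkSession.builder()
  .appName("cobrix-example") // arbitrary name
  .config("spark.jars.packages", "za.co.absa.cobrix:spark-cobol_2.12:2.7.0")
  .getOrCreate()
```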
@@ -315,11 +315,11 @@ Creating an uber jar for Cobrix is very easy. Steps to build:
 
 You can collect the uber jar of `spark-cobol` either in
 `spark-cobol/target/scala-2.11/` or in `spark-cobol/target/scala-2.12/` depending on the Scala version you used.
-The fat jar will have the '-bundle' suffix. You can also download pre-built bundles from https://github.com/AbsaOSS/cobrix/releases/tag/v2.6.11
+The fat jar will have the '-bundle' suffix. You can also download pre-built bundles from https://github.com/AbsaOSS/cobrix/releases/tag/v2.7.0
 
 Then, run `spark-shell` or `spark-submit` adding the fat jar as the option.
 ```sh
-$ spark-shell --jars spark-cobol_2.12_3.3-2.7.0-SNAPSHOT-bundle.jar
+$ spark-shell --jars spark-cobol_2.12_3.3-2.7.1-SNAPSHOT-bundle.jar
 ```
 
 > <b>A note for building and running tests on Windows</b>
@@ -1751,6 +1751,33 @@ at org.apache.hadoop.io.nativeio.NativeIO$POSIX.getStat(NativeIO.java:608)
 A: Update hadoop dll to version 3.2.2 or newer.
 
 ## Changelog
+- #### 2.7.0 released 8 April 2024.
+  - [#666](https://github.com/AbsaOSS/cobrix/issues/666) Added support for record length value mapping.
+    ```scala
+    .option("record_format", "F")
+    .option("record_length_field", "FIELD_STR")
+    .option("record_length_map", """{"SEG1":100,"SEG2":200}""")
+    ```
+  - [#669](https://github.com/AbsaOSS/cobrix/issues/669) Allow 'V' to be at the end of scaled PICs.
+    ```cobol
+    10  SCALED-DECIMAL-FIELD  PIC S9PPPV DISPLAY.
+    ```
+  - [#672](https://github.com/AbsaOSS/cobrix/issues/672) Add the ability to parse copybooks with options normally passed to the `spark-cobol` Spark data source.
+    ```scala
+    // Same options that you use for spark.read.format("cobol").option()
+    val options = Map("schema_retention_policy" -> "keep_original")
+
+    val cobolSchema = CobolSchema.fromSparkOptions(Seq(copybook), options)
+    val sparkSchema = cobolSchema.getSparkSchema.toString()
+
+    println(sparkSchema)
+    ```
+  - [#674](https://github.com/AbsaOSS/cobrix/issues/674) Extended the usage of indexes for variable record length files with a record length field.
+    ```scala
+    .option("record_length_field", "RECORD-LENGTH")
+    .option("enable_indexes", "true") // true by default so can be omitted
+    ```
+
 - #### 2.6.11 released 8 April 2024.
   - [#659](https://github.com/AbsaOSS/cobrix/issues/659) Fixed record length option when record id generation is turned on.
@@ -1810,6 +1837,9 @@ A: Update hadoop dll to version 3.2.2 or newer.
   - [#521](https://github.com/AbsaOSS/cobrix/issues/521) Fixed index generation and improved performance of variable
     block length files processing (record_format='VB').
 
+<details><summary>Older versions</summary>
+<p>
+
 - #### 2.5.1 released 24 August 2022.
   - [#510](https://github.com/AbsaOSS/cobrix/issues/510) Fixed dropping of FILLER fields in Spark schema if the FILLER has OCCURS of GROUPS.
 
@@ -1823,9 +1853,6 @@ A: Update hadoop dll to version 3.2.2 or newer.
   - [#501](https://github.com/AbsaOSS/cobrix/issues/501) Fixed decimal field null detection when 'improved_null_detection' is turned on.
   - [#502](https://github.com/AbsaOSS/cobrix/issues/502) Fixed parsing of scaled decimals that have a pattern similar to `SVP9(5)`.
 
-<details><summary>Older versions</summary>
-<p>
-
 - #### 2.4.10 released 8 April 2022.
   - [#481](https://github.com/AbsaOSS/cobrix/issues/481) ASCII control characters are now ignored instead of being replaced with spaces.
     A new string trimming policy (`keep_all`) allows keeping all control characters in strings (including `0x00`).