Commit eeaddfc

Author: Abduqodiri Qurbonzoda
Add NumberHexFormat.minLength option to the HexFormat proposal (#384)
1 parent 4039274 commit eeaddfc

1 file changed

+55
-21
lines changed

proposals/stdlib/hex-format.md

+55-21
@@ -5,7 +5,7 @@
55
* **Status**: Implemented in Kotlin 1.9.0
66
* **Prototype**: Implemented
77
* **Target issue**: [KT-57762](https://youtrack.jetbrains.com/issue/KT-57762/)
8-
* **Discussion**: TBD
8+
* **Discussion**: [KEEP-362](https://github.com/Kotlin/KEEP/issues/362)
99

1010
## Summary
1111

@@ -63,6 +63,7 @@ For formatting a numeric value:
6363
* The prefix of the hex representation
6464
* The suffix of the hex representation
6565
* Whether to remove leading zeros in the hex representation
66+
* The minimum number of hexadecimal digits to be used in the hex representation
6667

6768
For formatting `ByteArray`:
6869
* Whether upper case or lower case hexadecimal digits should be used
@@ -116,26 +117,33 @@ public class HexFormat internal constructor(
116117
public class NumberHexFormat internal constructor(
117118
val prefix: String,
118119
val suffix: String,
119-
val removeLeadingZeros: Boolean
120+
val removeLeadingZeros: Boolean,
121+
val minLength: Int
120122
) {
121123
122124
public class Builder internal constructor() {
123125
var prefix: String = ""
124126
var suffix: String = ""
125127
var removeLeadingZeros: Boolean = false
128+
var minLength: Int = 1
126129
}
127130
}
128131
}
129132
```
130133

131134
`BytesHexFormat` and `NumberHexFormat` classes hold format options for `ByteArray` and numeric values, correspondingly.
132-
`upperCase` option, which is common to both `ByteArray` and numeric values, is stored in `HexFormat`.
135+
The `upperCase` option, which is common to both `ByteArray` and numeric values, is stored in `HexFormat`.
133136

134137
It's not possible to instantiate a `HexFormat` or its builder directly. The following function is provided instead:
135138
```
136139
public inline fun HexFormat(builderAction: HexFormat.Builder.() -> Unit): HexFormat
137140
```
138141

142+
Additionally, two predefined `HexFormat` instances are provided for convenience:
143+
* `HexFormat.Default` - the hexadecimal format with all options set to their default values.
144+
* `HexFormat.UpperCase` - the hexadecimal format with all options set to their default values,
145+
except for the `upperCase` option, which is set to `true`.
146+
139147
### Formatting
140148

141149
For formatting, the following extension functions are proposed:
@@ -145,7 +153,7 @@ public fun ByteArray.toHexString(format: HexFormat = HexFormat.Default): String
145153
146154
public fun ByteArray.toHexString(
147155
startIndex: Int = 0,
148-
endIndex: Int = size,
156+
endIndex: Int = this.size,
149157
format: HexFormat = HexFormat.Default
150158
): String
151159
@@ -154,27 +162,58 @@ public fun ByteArray.toHexString(
154162
public fun N.toHexString(format: HexFormat = HexFormat.Default): String
155163
```
156164

165+
When formatting a byte array, one can assume the following steps:
166+
1. The bytes are split into lines with `bytesPerLine` bytes in each line,
167+
except for the last line, which may have fewer bytes.
168+
2. Each line is split into groups with `bytesPerGroup` bytes in each group,
169+
except for the last group in a line, which may have fewer bytes.
170+
3. All bytes are converted to their two-digit hexadecimal representation,
171+
each prefixed by `bytePrefix` and suffixed by `byteSuffix`.
172+
The `upperCase` option determines the case (`A-F` or `a-f`) of the hexadecimal digits.
173+
4. Adjacent formatted bytes within each group are separated by `byteSeparator`.
174+
5. Adjacent groups within each line are separated by `groupSeparator`.
175+
6. Adjacent lines are separated by the line feed (LF) character `'\n'`.
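The six steps above can be sketched as nested chunk-and-join operations. This is only an illustration, not the standard library implementation: `formatBytesHex` is a hypothetical helper, its parameter names mirror the `BytesHexFormat` options mentioned in the text, and the default values shown here are assumptions.

```kotlin
// Hypothetical helper illustrating the described steps; not the stdlib implementation.
// Default values below are assumptions made for this sketch.
fun formatBytesHex(
    bytes: ByteArray,
    upperCase: Boolean = false,
    bytesPerLine: Int = Int.MAX_VALUE,
    bytesPerGroup: Int = Int.MAX_VALUE,
    groupSeparator: String = "  ",
    byteSeparator: String = "",
    bytePrefix: String = "",
    byteSuffix: String = ""
): String {
    require(bytesPerLine > 0) { "bytesPerLine must be positive" }
    require(bytesPerGroup > 0) { "bytesPerGroup must be positive" }
    val digits = if (upperCase) "0123456789ABCDEF" else "0123456789abcdef"

    return bytes.asList()
        .chunked(bytesPerLine)                          // step 1: split into lines
        .joinToString("\n") { line ->                   // step 6: LF between lines
            line.chunked(bytesPerGroup)                 // step 2: split into groups
                .joinToString(groupSeparator) { group -> // step 5: separator between groups
                    group.joinToString(byteSeparator) { b -> // step 4: separator between bytes
                        val v = b.toInt() and 0xFF
                        // step 3: two hex digits wrapped in bytePrefix/byteSuffix
                        bytePrefix + digits[v ushr 4] + digits[v and 0xF] + byteSuffix
                    }
                }
        }
}
```

For example, five bytes formatted with `bytesPerLine = 4`, `bytesPerGroup = 2`, `groupSeparator = " "` and `byteSeparator = ":"` produce two lines, the second containing the single remaining byte.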
176+
177+
When formatting a numeric value, the result consists of a `prefix` string,
178+
the hex representation of the numeric value, and a `suffix` string.
179+
The hex representation of a value is calculated by mapping each four-bit chunk of its binary representation
180+
to the corresponding hexadecimal digit, starting with the most significant bits.
181+
The `upperCase` option determines the case of the hexadecimal digits (`A-F` or `a-f`).
182+
If the `removeLeadingZeros` option is `true` and the hex representation is longer than `minLength`,
183+
leading zeros are removed, but only down to `minLength` digits. However, if `minLength` exceeds the length of the
184+
hex representation, `removeLeadingZeros` is ignored, and zeros are added to the start of the representation to
185+
achieve the specified `minLength`.
186+
157187
### Parsing
158188

159-
It is critical to be able to parse the results of the formatting functions above.
189+
It is critical to be able to parse the results of the formatting functions mentioned above.
160190
For parsing, the following extension functions are proposed:
161191
```
162-
// Parses a byte array
192+
// Parses a byte array using the options from HexFormat.bytes
163193
public fun String.hexToByteArray(format: HexFormat = HexFormat.Default): ByteArray
164194
165-
// Parses a numeric value
166-
// N is Byte, Short, Int, Long, and their unsigned counterparts
195+
// Parses a numeric value using the options from HexFormat.number
196+
// N represents Byte, Short, Int, Long, and their unsigned counterparts
167197
public fun String.hexToN(format: HexFormat = HexFormat.Default): N
168198
```
169199

170-
## Contracts
171-
172-
* When formatting a `ByteArray`, the LF character is used to separate lines.
173-
* When parsing a `ByteArray`, any of the char sequences CRLF (`"\r\n"`), LF (`"\n"`) and CR (`"\r"`) are considered a valid line separator.
174-
* Parsing is performed in a case-insensitive manner.
175-
* `NumberHexFormat.removeLeadingZeros` is ignored when parsing.
176-
* Assigning a non-positive value to `BytesHexFormat.Builder.bytesPerLine/bytesPerGroup` is prohibited.
177-
In this case `IllegalArgumentException` is thrown.
200+
When parsing, the input string must conform to the structure defined by the specified format options.
201+
However, parsing is somewhat lenient:
202+
* For byte arrays:
203+
* Parsing is performed in a case-insensitive manner for both the hexadecimal digits and the format elements
204+
(prefix, suffix, separators) defined in the `HexFormat.bytes` property.
205+
* Any of the char sequences CRLF (`"\r\n"`), LF (`"\n"`) and CR (`"\r"`) is considered a valid line separator.
206+
* For numeric values:
207+
* Parsing is performed in a case-insensitive manner for both the hexadecimal digits and the format elements
208+
(prefix, suffix) defined in the `HexFormat.number` property.
209+
* The `removeLeadingZeros` and `minLength` options are ignored.
210+
However, the input string must contain at least one hexadecimal digit between the `prefix` and `suffix`.
211+
If the number of hexadecimal digits exceeds the capacity of the type being parsed, based on its bit size,
212+
the excess leading digits must be zeros.
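The numeric parsing rules above can be illustrated with a small sketch for `Int`. `parseHexToInt` is a hypothetical helper, not the proposed `hexToInt` implementation. It shows the case-insensitive matching of `prefix` and `suffix`, the requirement of at least one digit, and the rule that digits beyond the capacity of the type must be zeros.

```kotlin
// Hypothetical helper illustrating the lenient parsing rules for Int; not the stdlib implementation.
fun parseHexToInt(input: String, prefix: String = "", suffix: String = ""): Int {
    // Prefix and suffix are matched case-insensitively and must not overlap.
    require(
        input.length >= prefix.length + suffix.length &&
            input.startsWith(prefix, ignoreCase = true) &&
            input.endsWith(suffix, ignoreCase = true)
    ) { "Input does not match the expected prefix/suffix" }

    val digits = input.substring(prefix.length, input.length - suffix.length)
    require(digits.isNotEmpty()) { "At least one hexadecimal digit is required" }

    // An Int holds at most 8 hex digits; any excess leading digits must be zeros.
    val maxDigits = Int.SIZE_BITS / 4
    val excess = digits.length - maxDigits
    if (excess > 0) {
        require(digits.take(excess).all { it == '0' }) { "Value does not fit into Int" }
    }

    var result = 0
    for (ch in digits.takeLast(maxDigits)) {
        // Digits are matched case-insensitively.
        val d = "0123456789abcdef".indexOf(ch.lowercaseChar())
        require(d >= 0) { "Not a hexadecimal digit: $ch" }
        result = (result shl 4) or d
    }
    return result
}
```

Note that the sketch never consults `removeLeadingZeros` or `minLength`: inputs such as `3a`, `003a` and `000000003a` all parse to the same value.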
213+
214+
### Contracts
215+
* Assigning a non-positive value to `BytesHexFormat.Builder.bytesPerLine/bytesPerGroup`
216+
and `NumberHexFormat.Builder.minLength` is prohibited. In this case `IllegalArgumentException` is thrown.
178217
* Assigning a string containing LF or CR character to `BytesHexFormat.Builder.byteSeparator/bytePrefix/byteSuffix`
179218
and `NumberHexFormat.Builder.prefix/suffix` is prohibited. In this case `IllegalArgumentException` is thrown.
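One way such contracts could be enforced is through validating property setters. The class below, `NumberFormatBuilderSketch`, is a hypothetical stand-in for `NumberHexFormat.Builder` that models only the validation. The error messages are made up, and the actual implementation may differ.

```kotlin
// Hypothetical stand-in for NumberHexFormat.Builder; only the validation is modeled.
class NumberFormatBuilderSketch {
    var prefix: String = ""
        set(value) {
            requireNoLineBreaks(value, "prefix")
            field = value
        }

    var suffix: String = ""
        set(value) {
            requireNoLineBreaks(value, "suffix")
            field = value
        }

    var minLength: Int = 1
        set(value) {
            // require throws IllegalArgumentException when the condition is false.
            require(value > 0) { "Non-positive values are prohibited for minLength, but was $value" }
            field = value
        }
}

// Rejects strings that contain LF or CR characters.
private fun requireNoLineBreaks(value: String, name: String) {
    require('\n' !in value && '\r' !in value) {
        "LF and CR characters are prohibited in $name"
    }
}
```

Because a rejected assignment throws before `field` is updated, the builder keeps its previous value after a failed assignment.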
180219

@@ -303,11 +342,6 @@ Only a subset of Kotlin Standard Library available on all supported platforms is
303342

304343
## Future advancements
305344

306-
* Adding the ability to limit the number of hex digits when formatting numeric values
307-
* `NumberHexFormat.maxLength` could be introduced
308-
* When formatting an `Int`, combination of `maxLength = 6` and `removeLeadingZeros = false` results to exactly 6 least significant hex digits
309-
* Combination of `maxLength = 6` and `removeLeadingZeros = true` returns at most 6 hex (least-significant) digits without leading zeros
310-
* Related request: [KT-60787](https://youtrack.jetbrains.com/issue/KT-60787)
311345
* Overloads for parsing a substring: [KT-58277](https://youtrack.jetbrains.com/issue/KT-58277)
312346
* Overloads for appending format result to an `Appendable`
313347
* `toHexString` might need to be renamed to `hexToString/Appendable` or `hexifyToString/Appendable`, because
