CLDR-19466 Document decimal format test data files and usage in README #5813

Open
younies wants to merge 1 commit into unicode-org:main from younies:decimal-testdata-readme

younies commented Jun 10, 2026

Member

CLDR-19466

  • This PR completes the ticket.

Summary

Addressed PR #5709 feedback by adding a comprehensive README.md to common/testData/decimal/.

Key Details

  • File Scopes: Documented what is in each test file, explicitly clarifying that decimals_modern_locales.tsv covers modern locales minus the core locales in decimals.tsv.
  • TSV Schema: Defined all columns (locale, number_format, format_length, input, expected).
  • Testing & Maintenance: Added instructions for automated test execution (TestDecimalFormat), manual spreadsheet inspection, and data regeneration (GenerateDecimalFormatTestData).
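
For readers consuming these files, the column list above can be read as a five-field row. The following is a minimal sketch in Java, not the code used by `TestDecimalFormat`: the class name `DecimalTestRow` is hypothetical, the column order is assumed to match the order listed above, the sample row in `main` is made up for illustration, and header or comment lines (if the files contain any) are not handled.

```java
import java.util.Optional;

/** One row of a decimal test-data TSV file (hypothetical helper, not part of CLDR). */
public final class DecimalTestRow {
    final String locale;                 // locale identifier
    final String numberFormat;           // decimal, percent, or scientific
    final Optional<String> formatLength; // short or long; empty means non-compact
    final double input;                  // numeric input value
    final String expected;               // expected formatted output

    private DecimalTestRow(String locale, String numberFormat,
            Optional<String> formatLength, double input, String expected) {
        this.locale = locale;
        this.numberFormat = numberFormat;
        this.formatLength = formatLength;
        this.input = input;
        this.expected = expected;
    }

    /** Parses one tab-separated line; assumes the five columns appear in the listed order. */
    static DecimalTestRow parse(String line) {
        // A limit of -1 keeps trailing empty fields instead of silently dropping them.
        String[] f = line.split("\t", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException(
                    "expected 5 tab-separated fields, got " + f.length);
        }
        // A blank format_length means standard (non-compact) formatting.
        String length = f[2].trim();
        Optional<String> formatLength =
                length.isEmpty() ? Optional.empty() : Optional.of(length);
        return new DecimalTestRow(f[0], f[1], formatLength,
                Double.parseDouble(f[3]), f[4]);
    }

    public static void main(String[] args) {
        // Made-up sample row, not copied from the real data files.
        DecimalTestRow row = parse("en\tdecimal\t\t1.2\t1.2");
        System.out.println(row.locale + " " + row.numberFormat
                + " compact=" + row.formatLength.isPresent());
    }
}
```

Keeping `expected` as an unmodified string matters here, since the expected output may contain invisible characters that trimming or normalization would disturb.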

ALLOW_MANY_COMMITS=true

younies force-pushed the decimal-testdata-readme branch from 9bf7848 to d45ee3e on June 10, 2026 13:26
@jira-pull-request-webhook

Notice: the branch changed across the force-push!

  • common/testData/decimal/README.md is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

younies requested review from macchiati and sffc on June 10, 2026 13:27
Comment thread common/testData/decimal/README.md Outdated
@@ -0,0 +1,62 @@
# CLDR Decimal Format Test Data (`common/testData/decimal`)

This directory contains Tab-Separated Values (TSV) files used as a solid blueprint for testing and verifying decimal and compact decimal formats in CLDR, based on ICU4J.

Member

Suggested change
- This directory contains Tab-Separated Values (TSV) files used as a solid blueprint for testing and verifying decimal and compact decimal formats in CLDR, based on ICU4J.
+ This directory contains Tab-Separated Values (TSV) files used for testing decimal and compact decimal formats in CLDR.

This is more accurate and avoids the awkward phrasing "solid blueprint".

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
The test data is organized into three separate files to separate core verification from extended coverage:

1. **`decimals.tsv`**
Contains verification tests for the **core** CLDR locales across standard number formats, compact format lengths, and representative numeric input values.

Member

Suggested change
- Contains verification tests for the **core** CLDR locales across standard number formats, compact format lengths, and representative numeric input values.
+ Contains verification tests for a selected set of numbers and locales that illustrate most features of number formatting, covering standard number formats, compact format lengths, and representative numeric input values.

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
Contains verification tests for the **core** CLDR locales across standard number formats, compact format lengths, and representative numeric input values.

2. **`decimals_modern_locales.tsv`**
Contains verification tests for all **modern** CLDR locales **minus** the core locales covered in `decimals.tsv`.

Member

Suggested change
- Contains verification tests for all **modern** CLDR locales **minus** the core locales covered in `decimals.tsv`.
+ Contains verification tests for all **modern-coverage** CLDR locales **minus** the locales covered in `decimals.tsv`. It also only uses a small set of selected numbers.

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
Contains verification tests for all **modern** CLDR locales **minus** the core locales covered in `decimals.tsv`.

3. **`decimals_extended_numbers.tsv`**
Contains extended numeric test inputs (covering edge cases, large numbers, and small fractions) across standard formats and core locales for comprehensive verification.

Member

Suggested change
- Contains extended numeric test inputs (covering edge cases, large numbers, and small fractions) across standard formats and core locales for comprehensive verification.
+ Contains extended numeric test inputs (covering edge cases, large numbers, and small fractions) across standard formats and selected locales for more comprehensive verification.

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
* **`number_format`**: The number format dimension (`decimal`, `percent`, `scientific`).
* **`format_length`**: The compact format length (`short`, `long`). If empty/blank, it represents standard non-compact formatting.
* **`input`**: The floating-point numeric input value (e.g., `1.2`, `1234565.0`).
* **`expected`**: The literal expected output string, including all correct localized digits, grouping separators, percent signs, and bi-directional control characters (such as `\u200E` / LRM).

Member

Suggested change
- * **`expected`**: The literal expected output string, including all correct localized digits, grouping separators, percent signs, and bi-directional control characters (such as `\u200E` / LRM).
+ * **`expected`**: The expected output string, including all correct localized digits, grouping separators, percent signs, and bi-directional control characters (such as `\u200E` / LRM).

Member Author

done

Comment thread common/testData/decimal/README.md Outdated

### 1. Automated Testing

#### Testing in Java

Member

Drop this whole section

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
mvn test -pl tools/cldr-code -Dtest=TestDecimalFormat
```

#### Testing in Other Languages

Member

Drop this whole section also

Member Author

done

Comment thread common/testData/decimal/README.md Outdated
3. Invokes your target number formatting implementation using the parsed parameters and numeric input.
4. Asserts that the generated output exactly matches the `expected` UTF-8 literal string.

### 2. Manual Verification

Member

Dropped the numbering on these sections

Member Author

done

sffc left a comment

Member

+1 on all of @macchiati's comments

younies commented Jun 10, 2026

Member Author

Thanks @macchiati for the comments; I will follow the same style for the currency testing too.

younies requested review from macchiati and sffc on June 10, 2026 17:58
younies force-pushed the decimal-testdata-readme branch from 9cf8126 to e6663a7 on June 10, 2026 17:59
@jira-pull-request-webhook

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

sffc left a comment

Member

The prose is fine, though in currency data, where we have more dimensions, I think we should be more clear and specific about which sets of options are used where.
