Commit f5f6a7c

Require a shorter test for the (optional) consistent probability sampler (#2319)
1 parent 7149d19 commit f5f6a7c

1 file changed: +42 -52 lines changed

specification/trace/tracestate-probability-sampling.md

Lines changed: 42 additions & 52 deletions
@@ -72,9 +72,9 @@
7272
- [Recommendation: Recognize inconsistent r-values](#recommendation-recognize-inconsistent-r-values)
7373
* [Appendix: Statistical test requirements](#appendix-statistical-test-requirements)
7474
+ [Test procedure: non-powers of two](#test-procedure-non-powers-of-two)
75-
- [Requirement: Pass 15 non-power-of-two statistical tests](#requirement-pass-15-non-power-of-two-statistical-tests)
75+
- [Requirement: Pass 12 non-power-of-two statistical tests](#requirement-pass-12-non-power-of-two-statistical-tests)
7676
+ [Test procedure: exact powers of two](#test-procedure-exact-powers-of-two)
77-
- [Requirement: Pass 5 power-of-two statistical tests](#requirement-pass-5-power-of-two-statistical-tests)
77+
- [Requirement: Pass 3 power-of-two statistical tests](#requirement-pass-3-power-of-two-statistical-tests)
7878
+ [Test implementation](#test-implementation)
7979
- [Appendix](#appendix)
8080
* [Methods for generating R-values](#methods-for-generating-r-values)
@@ -870,7 +870,7 @@ this a strict test for random behavior, we take the following approach:
870870

871871
- Generate a pre-determined list of 20 random seeds
872872
- Use fixed values for significance level (5%) and trials (20)
873-
- Use a population size of one million spans
873+
- Use a population size of 100,000 spans
874874
- For each trial, simulate the population and compute ChiSquared
875875
test statistic
876876
- Locate the first seed value in the ordered list such that the
@@ -895,29 +895,26 @@ In this case there are two degrees of freedom for the Chi-Squared test.
895895
The following table summarizes the test parameters.
896896

897897
| Test case | Sampling probability | Lower, Upper p-value when sampled | Expect<sub>lower</sub> | Expect<sub>upper</sub> | Expect<sub>unsampled</sub> |
898-
| --- | --- | --- | --- | --- | --- |
899-
| 1 | 0.900000 | 0, 1 | 100000 | 800000 | 100000 |
900-
| 2 | 0.600000 | 0, 1 | 400000 | 200000 | 400000 |
901-
| 3 | 0.330000 | 1, 2 | 170000 | 160000 | 670000 |
902-
| 4 | 0.130000 | 2, 3 | 120000 | 10000 | 870000 |
903-
| 5 | 0.100000 | 3, 4 | 25000 | 75000 | 900000 |
904-
| 6 | 0.050000 | 4, 5 | 12500 | 37500 | 950000 |
905-
| 7 | 0.017000 | 5, 6 | 14250 | 2750 | 983000 |
906-
| 8 | 0.010000 | 6, 7 | 5625 | 4375 | 990000 |
907-
| 9 | 0.005000 | 7, 8 | 2812.5 | 2187.5 | 995000 |
908-
| 10 | 0.002900 | 8, 9 | 1006.25 | 1893.75 | 997100 |
909-
| 11 | 0.001000 | 9, 10 | 953.125 | 46.875 | 999000 |
910-
| 12 | 0.000500 | 10, 11 | 476.5625 | 23.4375 | 999500 |
911-
| 13 | 0.000260 | 11, 12 | 228.28125 | 31.71875 | 999740 |
912-
| 14 | 0.000230 | 12, 13 | 14.140625 | 215.859375 | 999770 |
913-
| 15 | 0.000100 | 13, 14 | 22.0703125 | 77.9296875 | 999900 |
898+
|-----------|----------------------|-----------------------------------|------------------------|------------------------|----------------------------|
899+
| 1 | 0.900000 | 0, 1 | 10000 | 80000 | 10000 |
900+
| 2 | 0.600000 | 0, 1 | 40000 | 20000 | 40000 |
901+
| 3 | 0.330000 | 1, 2 | 17000 | 16000 | 67000 |
902+
| 4 | 0.130000 | 2, 3 | 12000 | 1000 | 87000 |
903+
| 5 | 0.100000 | 3, 4 | 2500 | 7500 | 90000 |
904+
| 6 | 0.050000 | 4, 5 | 1250 | 3750 | 95000 |
905+
| 7 | 0.017000 | 5, 6 | 1425 | 275 | 98300 |
906+
| 8 | 0.010000 | 6, 7 | 562.5 | 437.5 | 99000 |
907+
| 9 | 0.005000 | 7, 8 | 281.25 | 218.75 | 99500 |
908+
| 10 | 0.002900 | 8, 9 | 100.625 | 189.375 | 99710 |
909+
| 11 | 0.001000 | 9, 10 | 95.3125 | 4.6875 | 99900 |
910+
| 12 | 0.000500 | 10, 11 | 47.65625 | 2.34375 | 99950 |
914911

915912
The formula for computing Chi-Squared in this case is:
916913

917914
```
918915
ChiSquared = math.Pow(sampled_lowerP - expect_lowerP, 2) / expect_lowerP +
919916
math.Pow(sampled_upperP - expect_upperP, 2) / expect_upperP +
920-
math.Pow(1000000 - sampled_lowerP - sampled_upperP - expect_unsampled, 2) / expect_unsampled
917+
math.Pow(100000 - sampled_lowerP - sampled_upperP - expect_unsampled, 2) / expect_unsampled
921918
```
922919

923920
This should be compared with 0.102587, the value of the Chi-Squared
@@ -926,9 +923,9 @@ For each probability in the table above, the test is required to
926923
demonstrate a seed that produces exactly one ChiSquared value less
927924
than 0.102587.
928925

929-
##### Requirement: Pass 15 non-power-of-two statistical tests
926+
##### Requirement: Pass 12 non-power-of-two statistical tests
930927

931-
For the test with 20 trials and 1 million spans each, the test MUST
928+
For the test with 20 trials and 100,000 spans each, the test MUST
932929
demonstrate a random number generator seed such that the ChiSquared
933930
test statistic is below 0.102587 exactly 1 out of 20 times.
934931

@@ -937,19 +934,17 @@ test statistic is below 0.102587 exactly 1 out of 20 times.
937934
In this case there is one degree of freedom for the Chi-Squared test.
938935
The following table summarizes the test parameters.
939936

940-
| Test case | Sampling probability | P-value when sampled | Expect<sub>sampled</sub> | Expect<sub>unsampled</sub> | |
941-
| --- | --- | --- | --- | --- | |
942-
| 16 | 0x1p-01 (0.500000) | 1 | 500000 | n/a | 500000 |
943-
| 17 | 0x1p-04 (0.062500) | 4 | 62500 | n/a | 937500 |
944-
| 18 | 0x1p-07 (0.007812) | 7 | 7812.5 | n/a | 992187.5 |
945-
| 19 | 0x1p-10 (0.000977) | 10 | 976.5625 | n/a | 999023.4375 |
946-
| 20 | 0x1p-13 (0.000122) | 13 | 122.0703125 | n/a | 999877.9297 |
937+
| Test case | Sampling probability | P-value when sampled | Expect<sub>sampled</sub> | Expect<sub>unsampled</sub> |
938+
|-----------|----------------------|----------------------|--------------------------|----------------------------|
939+
| 13 | 0x1p-01 (0.500000) | 1 | 50000 | 50000 |
940+
| 14 | 0x1p-04 (0.062500) | 4 | 6250 | 93750 |
941+
| 15 | 0x1p-07 (0.007812) | 7 | 781.25 | 99218.75 |
947942

948943
The formula for computing Chi-Squared in this case is:
949944

950945
```
951946
ChiSquared = math.Pow(sampled - expect_sampled, 2) / expect_sampled +
952-
math.Pow(1000000 - sampled - expect_unsampled, 2) / expect_unsampled
947+
math.Pow(100000 - sampled - expect_unsampled, 2) / expect_unsampled
953948
```
954949

955950
This should be compared with 0.003932, the value of the Chi-Squared
@@ -958,51 +953,46 @@ For each probability in the table above, the test is required to
958953
demonstrate a seed that produces exactly one ChiSquared value less
959954
than 0.003932.
960955

961-
##### Requirement: Pass 5 power-of-two statistical tests
956+
##### Requirement: Pass 3 power-of-two statistical tests
962957

963-
For the teset with 20 trials and 1 million spans each, the test MUST
958+
For the test with 20 trials and 100,000 spans each, the test MUST
964959
demonstrate a random number generator seed such that the ChiSquared
965960
test statistic is below 0.003932 exactly 1 out of 20 times.
966961

967962
#### Test implementation
968963

969-
The recommended structure for this test uses a table listing the 20
964+
The recommended structure for this test uses a table listing the 15
970965
probability values, the expected p-values, whether the ChiSquared
971966
statistic has one or two degrees of freedom, and the index into the
972967
predetermined list of seeds.
973968

974969
```
975970
for _, test := range []testCase{
976971
// Non-powers of two
977-
{0.90000, 1, twoDegrees, 5},
978-
{0.60000, 1, twoDegrees, 14},
979-
{0.33000, 2, twoDegrees, 3},
980-
{0.13000, 3, twoDegrees, 2},
972+
{0.90000, 1, twoDegrees, 3},
973+
{0.60000, 1, twoDegrees, 2},
974+
{0.33000, 2, twoDegrees, 2},
975+
{0.13000, 3, twoDegrees, 1},
981976
{0.10000, 4, twoDegrees, 0},
982977
{0.05000, 5, twoDegrees, 0},
983978
{0.01700, 6, twoDegrees, 2},
984-
{0.01000, 7, twoDegrees, 3},
985-
{0.00500, 8, twoDegrees, 1},
986-
{0.00290, 9, twoDegrees, 1},
987-
{0.00100, 10, twoDegrees, 5},
988-
{0.00050, 11, twoDegrees, 1},
989-
{0.00026, 12, twoDegrees, 3},
990-
{0.00023, 13, twoDegrees, 0},
991-
{0.00010, 14, twoDegrees, 2},
979+
{0.01000, 7, twoDegrees, 2},
980+
{0.00500, 8, twoDegrees, 2},
981+
{0.00290, 9, twoDegrees, 4},
982+
{0.00100, 10, twoDegrees, 6},
983+
{0.00050, 11, twoDegrees, 0},
992984
993985
// Powers of two
994986
{0x1p-1, 1, oneDegree, 0},
995-
{0x1p-4, 4, oneDegree, 2},
996-
{0x1p-7, 7, oneDegree, 3},
997-
{0x1p-10, 10, oneDegree, 0},
998-
{0x1p-13, 13, oneDegree, 1},
987+
{0x1p-4, 4, oneDegree, 0},
988+
{0x1p-7, 7, oneDegree, 1},
999989
} {
1000990
```
1001991

1002992
Note that seed indexes in the example above have what appears to be
1003-
the correct distribution. The five 0s, four 1s, four 2s, four 3s, and
1004-
two 5s demonstrate that it is relatively easy to find examples where
1005-
there is exactly one failure. Seed index 14, for probability 0.6 in
993+
the correct distribution. The five 0s, two 1s, five 2s, one 3, and
994+
one 4 demonstrate that it is relatively easy to find examples where
995+
there is exactly one failure. Probability 0.001, with seed index 6 in
1006996
this case, is a reminder that outliers exist. Further significance
1007997
testing of this distribution is not recommended.
1008998
