bug: inconsistent behavior of JsonRelativeFieldPatternBuilder-based masking based on key ordering #28

Open
OmarAlJarrah opened this issue Jan 17, 2025 · 1 comment

Environment Metadata

  • java-version: 21
  • jdk-distribution: amazon-corretto
  • kotlin-flavor: jvm
  • kotlin-version: 2.0.21
  • os: macOS-15.2-arm64

Situation

There are multiple pattern builders available out-of-the-box within the ejmask library, such as:

  • JsonFullValuePatternBuilder
  • JsonRelativeFieldPatternBuilder
  • Others.

When a filter is created with the JsonRelativeFieldPatternBuilder class and registered through the API provided by the EJMaskInitializer class, we expect it to accept one or two keys that together form the relative path to match.

val filter = BaseFilter(
    JsonRelativeFieldPatternBuilder::class.java,
    "field1", "field2"
)

EJMaskInitializer.addFilter(filter)

Once the filter is registered, we expect it to be applied whenever the EJMask.mask(content: String) function is called.

EJMask.mask(jsonString) //  any previously registered filters are applied.

JsonRelativeFieldPatternBuilder-based filters generally work as expected. However, the result changes depending on the order of the other key-value pairs that sit at the same level/depth as the target key-value pair in the JSON string. Let's look at a few examples:

Given the following json string as our sample:

{
    "number": "12345678912345",
    "payments":[
        {
            "type":"visa",
            "number":"12345678912345",
            "name":{
                "first":"Omar",
                "last":"AlJarrah"
            }
        }
    ]
}

When a JsonRelativeFieldPatternBuilder-based filter is registered with ["payments", "number"] as the relative path to mask, calling the EJMask.mask() method works as expected.

val filter = BaseFilter(
    JsonRelativeFieldPatternBuilder::class.java,
    "payments", "number"
)

EJMaskInitializer.addFilter(filter)

val json = """
    {
        "number": "12345678912345",
        "payments":[
            {
                "type":"visa",
                "number":"12345678912345",
                "name":{
                    "first":"Omar",
                    "last":"AlJarrah"
                }
            }
        ]
    }
""".trimIndent()

val masked: String = EJMask.mask(json)
println(masked)

We expect the value at path .number to stay unchanged and the value at path .payments.number to be masked. The printed output meets those expectations:

{
    "number": "12345678912345",
    "payments":[
        {
            "type":"visa",
            "number":"12-xxxx",
            "name":{
                "first":"Omar",
                "last":"AlJarrah"
            }
        }
    ]
}

Now let's reorder the fields so that the .payments.name key comes before the .payments.number key in the JSON body:

val filter = BaseFilter(
    JsonRelativeFieldPatternBuilder::class.java,
    "payments", "number"
)

EJMaskInitializer.addFilter(filter)

// `number` now comes after `name` inside the `payments` element.
val json = """
    {
        "number": "12345678912345",
        "payments":[
            {
                "type":"visa",
                "name":{
                    "first":"Omar",
                    "last":"AlJarrah"
                },
                "number":"12345678912345"
            }
        ]
    }
""".trimIndent()

val masked: String = EJMask.mask(json)
println(masked)

Expected Behavior

We expect the ordering of keys not to affect the final output:

{
    "number": "12345678912345",
    "payments":[
        {
            "type":"visa",
            "name":{
                "first":"Omar",
                "last":"AlJarrah"
            },
            "number":"12-xxxx"
        }
    ]
}

Actual Behavior

The ordering of keys does affect the final output: the target .payments.number value is left unmasked:

{
    "number": "12345678912345",
    "payments":[
        {
            "type":"visa",
            "name":{
                "first":"Omar",
                "last":"AlJarrah"
            },
            "number":"12345678912345"
        }
    ]
}

The Javadoc on PATTERN_TEMPLATE in the JsonRelativeFieldPatternBuilder class reads:

/**
 * <pre>
 * (?ui)               - enable ignore case, unicode
 *  \"user"[^\}]
 *  "name\"   matches   "user" followed any char other than "}", then "name"
 * ([^"]{1,n})       matches any char othen than " , at most n char, at least 0
 */
private static final String PATTERN_TEMPLATE = "(?ui)(\"%s\"[^\\}]*\"%s\"" + SKIP_SPACE_TAB_NEWLINE
        + ":" + SKIP_SPACE_TAB_NEWLINE + "\")([^\"]{1,%d})[^\"]*([\"|]?)";

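We suspect the cause is the [^\}]* segment of the pattern. Because .payments.name is an object, its closing bracket } sits between "payments" and "number" in the reordered JSON, and [^\}]* cannot match across it, so the regular expression never reaches the target key.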
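The regular expression can be exercised on its own with java.util.regex, without going through ejmask. The following is a minimal sketch, not the library's code path: the template is copied from the snippet above, but the value of SKIP_SPACE_TAB_NEWLINE is not shown in this issue, so "\\s*" is assumed as a stand-in, and the class name RelativePatternRepro, the helper matches(), and the length 2 passed for %d are made up for illustration.

```java
import java.util.regex.Pattern;

class RelativePatternRepro {
    // Assumption: stand-in for the library constant, whose real value is not shown here.
    static final String SKIP_SPACE_TAB_NEWLINE = "\\s*";

    // Template as quoted in this issue.
    static final String PATTERN_TEMPLATE = "(?ui)(\"%s\"[^\\}]*\"%s\"" + SKIP_SPACE_TAB_NEWLINE
            + ":" + SKIP_SPACE_TAB_NEWLINE + "\")([^\"]{1,%d})[^\"]*([\"|]?)";

    // Hypothetical helper: true if the payments -> number pattern finds a match in the input.
    static boolean matches(String json) {
        String regex = String.format(PATTERN_TEMPLATE, "payments", "number", 2);
        return Pattern.compile(regex).matcher(json).find();
    }

    public static void main(String[] args) {
        // "number" precedes the nested "name" object: no "}" between the two keys.
        String numberFirst =
                "{\"payments\":[{\"number\":\"12345678912345\",\"name\":{\"first\":\"Omar\"}}]}";
        // "name" precedes "number": the "}" closing "name" sits between the two keys.
        String nameFirst =
                "{\"payments\":[{\"name\":{\"first\":\"Omar\"},\"number\":\"12345678912345\"}]}";

        System.out.println("number first: " + matches(numberFirst)); // prints "number first: true"
        System.out.println("name first:   " + matches(nameFirst));   // prints "name first:   false"
    }
}
```

Under these assumptions the pattern finds the target only when no } appears between "payments" and "number", which is consistent with the masked and unmasked outputs shown above.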

Task

We expect the ordering of keys not to affect the behavior of any JsonRelativeFieldPatternBuilder-based filters.

@OmarAlJarrah
Author

cc @prasanthkv
