Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursion Update for php-json-schema #167

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

loneregister
Copy link

Amazon's Schema system necessitated an update to allow recursion in the RefResolver object beyond 200. This code updates the system to allow an option to be submitted via option->maxNestLevel if recursion greater than 200 is needed. (or less).

Example:
`$options = new \Swaggest\JsonSchema\Context();

$options->maxNestLevel = 500;

$schema = Schema::import($data, $options);`

My first attempt to provide a fix / better feature - be gentle.

Amazon JSON Schema for APPAREL apparently needs more than 400 nesting levels to resolve.  Code updated to make it easy to adjust the hardcoded nesting level of 200.
Update to allow submitting of a max nest level depth variable via the Schema object.
Add ability to submit option to increase max depth of recursion.
Copy link

@llamapreview llamapreview bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto Pull Request Review from LlamaPReview

1. Overview

1.1 Core Changes

  • Primary purpose and scope: The pull request aims to increase the maximum recursion depth for resolving JSON schema references. The default limit of 200 was insufficient for some use cases, specifically Amazon's schema system. The PR introduces a configurable option to adjust this limit.
  • Key components modified: RefResolver.php and Schema.php were modified to accommodate the new configurable limit.
  • Cross-component impacts: The changes affect how the RefResolver class handles nested references and how the Schema class initializes the resolver with a configurable depth.
  • Business value alignment: This change directly addresses a limitation that prevented the library from processing deeply nested schemas, like those used by Amazon. By making the recursion limit configurable, the library becomes more versatile and usable in a wider range of scenarios.

1.2 Technical Architecture

  • System design modifications: The PR introduces a new configuration option (maxNestLevel) to control the recursion depth. This option is passed through the Context object to the RefResolver.
  • Component interaction changes: The Schema class now passes the maxNestLevel option to the RefResolver constructor. The RefResolver uses this value instead of a hardcoded constant.
  • Integration points impact: This change impacts any integration point where JSON schemas with deep nesting are used.
  • Dependency changes and implications: No new dependencies were introduced.

2. Critical Findings

2.1 Must Fix (P0🔴)

Issue: Property Name Mismatch and Missing Context Property

  • Analysis Confidence: High
  • Impact: The configuration option maxNestLevel will not work as intended, rendering the PR's core functionality useless. The application will likely throw errors related to undefined properties. This is a critical, breaking change.
  • Resolution:
    1. Rename $max_deep_nesting to $maxNestLevel in RefResolver.php.
    2. Rename the constructor parameter $max_nest_level to $maxNestLevel in RefResolver.php.
    3. Add a public property $maxNestLevel with a default value (e.g., 200) to the Context class in src/Context.php.

2.2 Should Fix (P1🟡)

Issue: Inconsistent Error Messaging

  • Analysis Confidence: High
  • Impact: When the recursion limit is reached, the original error message provides limited information, making debugging difficult.
  • Suggested Solution: Improve the error message to include both the maximum allowed nesting level and the current nesting level at the point of failure.

Issue: Null Safety in Schema.php

  • Analysis Confidence: High
  • Impact: While the current null coalescing operator (??) provides a fallback, it's better to use a defined constant for clarity and maintainability.
  • Suggested Solution: Define a constant in the Context class for the default maximum nesting level (e.g., DEFAULT_MAX_NEST) and use that constant in the null coalescing operation.

2.3 Consider (P2🟢)

Area: Validation Guardrails

  • Analysis Confidence: Medium
  • Improvement Opportunity: Adding input validation for the maxNestLevel option can prevent misconfigurations and improve the robustness of the library. A very low value might lead to unexpected failures.

Area: Architectural Documentation

  • Analysis Confidence: Medium
  • Improvement Opportunity: Adding PHPDoc blocks to explain the new recursion mechanism and the maxNestLevel option would improve code maintainability and understanding.

2.4 Summary of Action Items

  • Immediate (P0): Fix the property name mismatch and add the missing maxNestLevel property to the Context class.
  • High Priority (P1): Improve the error message and enhance null safety with a defined constant.
  • Medium Priority (P2): Consider adding input validation and architectural documentation.
  • Timeline Suggestions: P0 issues should be addressed before merging. P1 issues should be addressed as soon as possible after the merge. P2 issues can be addressed in follow-up tasks.

3. Technical Analysis

3.1 Code Logic Analysis

📁 src/RefResolver.php - RefResolver constructor

  • Submitted PR Code:
    class RefResolver
    {
        public $resolutionScope = '';
        public $url;
        /** @var null|RefResolver */
        private $rootResolver;
        private int $max_deep_nesting = 200; //Change this via options submitted to ::import if needed

        /**
         * RefResolver constructor.
         * @param mixed $rootData
         */
        public function __construct($rootData = null, int $max_nest_level = 200)
        {
            $this->rootData = $rootData;
            $this->max_deep_nesting = $max_nest_level;
        }
    ...
  • Analysis:
    • The original code had a hardcoded nesting limit of 200. The PR introduces $max_deep_nesting and $max_nest_level to allow configuration. However, the names are inconsistent and do not follow standard PHP naming conventions (camelCase). The constructor parameter is also inconsistently named.
    • Edge cases: None identified in this specific snippet, but the overall change addresses the edge case of deeply nested schemas.
    • Cross-component impact: This change is directly related to the change in Schema.php, which passes the configuration option.
    • Business logic considerations: The change allows the library to handle schemas that previously caused errors due to excessive nesting.
  • LlamaPReview Suggested Improvements:
    class RefResolver
    {
        public $resolutionScope = '';
        public $url;
        /** @var null|RefResolver */
        private $rootResolver;
        private int $maxNestLevel = 200; //Change this via options submitted to ::import if needed // Use camelCase

        /**
         * RefResolver constructor.
         * @param mixed $rootData
         * @param int $maxNestLevel
         */
        public function __construct($rootData = null, int $maxNestLevel = 200) // Use camelCase
        {
            $this->rootData = $rootData;
            $this->maxNestLevel = $maxNestLevel; // Use camelCase
        }
    ...
  • Improvement rationale:
    • Technical benefits: Consistent naming improves code readability and maintainability. It also ensures the configuration option works as expected.
    • Business value: Fixes a critical bug that prevents the intended functionality from working.
    • Risk assessment: Low risk; the change is a simple renaming.

📁 src/RefResolver.php - preProcessReferences

  • Submitted PR Code:
    /**
     * @param mixed $data
     * @param Context $options
     * @param int $nestingLevel
     * @throws Exception
     */
    public function preProcessReferences($data, Context $options, $nestingLevel = 0)
    {
        if ($nestingLevel > $this->max_deep_nesting) { //Updated due to specific recursion depth from Amazon product JSON Schemas - yep 200 was not enough
            throw new Exception('Too deep nesting level. Suggest submitting maxNestLevel via options', Exception::DEEP_NESTING);
        }
    ...
  • Analysis:
    • This code checks the current nesting level against the configured maximum. The original code used a hardcoded value of 200. The PR correctly uses the $max_deep_nesting property (which needs renaming). The error message, however, is not very informative.
    • Edge cases: The edge case is when $nestingLevel exceeds the maximum.
    • Cross-component impact: None.
    • Business logic considerations: This check prevents infinite loops and stack overflow errors caused by excessively deep or circular references.
  • LlamaPReview Suggested Improvements:
    /**
     * @param mixed $data
     * @param Context $options
     * @param int $nestingLevel
     * @throws Exception
     */
    public function preProcessReferences($data, Context $options, $nestingLevel = 0)
    {
        if ($nestingLevel > $this->maxNestLevel) { // Use corrected property name
            throw new Exception(
                "Maximum recursion depth {$this->maxNestLevel} exceeded at level {$nestingLevel}", // Improved error message
                Exception::DEEP_NESTING
            );
        }
    ...
  • Improvement rationale:
    • Technical benefits: The improved error message provides more context for debugging. Using the corrected property name ensures the check works as intended.
    • Business value: Easier debugging improves developer productivity.
    • Risk assessment: Low risk; the change is a minor improvement to an existing check.

📁 src/Schema.php - import

  • Submitted PR Code:
    class Schema
    {
    ...
        public static function import($data, Context $options = null)
        {
        ...
            if ($options === null) {
                $options = new Context();
            }

            $options->import = true;

            if ($options->refResolver === null) {
                $options->refResolver = new RefResolver($data, $options->maxNestLevel ?? 200);
            } else {
                $options->refResolver->setRootData($data);
            }
        ...
  • Analysis:
    • This code initializes the RefResolver. The PR adds the logic to pass the maxNestLevel option from the Context object. However, the Context class does not have a $maxNestLevel property defined. This will result in an undefined property error. The null coalescing operator provides a fallback to 200, but it's better to use a defined constant.
    • Edge cases: None.
    • Cross-component impact: This code directly interacts with the RefResolver and Context classes.
    • Business logic considerations: This is the entry point for setting the recursion limit.
  • LlamaPReview Suggested Improvements:
    class Schema
    {
    ...
        public static function import($data, Context $options = null)
        {
        ...
            if ($options === null) {
                $options = new Context();
            }

            $options->import = true;

            if ($options->refResolver === null) {
                $options->refResolver = new RefResolver($data, $options->maxNestLevel ?? Context::DEFAULT_MAX_NEST); // Use constant
            } else {
                $options->refResolver->setRootData($data);
            }
        ...
    // Add to Context class:
    class Context {
        const DEFAULT_MAX_NEST = 200; // Define the constant
        public $maxNestLevel = self::DEFAULT_MAX_NEST; // Add the property and initialize
        ...
    }
  • Improvement rationale:
    • Technical benefits: Adding the missing property to the Context class fixes the undefined property error. Using a constant for the default value improves code clarity and maintainability.
    • Business value: Fixes a critical bug and improves code quality.
    • Risk assessment: Low risk; the change is straightforward and addresses a clear error.

3.2 Key Quality Aspects

  • System scalability considerations: The change allows for deeper recursion, which could potentially impact performance for extremely large and deeply nested schemas. However, this is a trade-off for increased flexibility.
  • Performance bottlenecks and optimizations: No specific performance bottlenecks were introduced by this PR, but deep recursion always carries a potential performance cost.
  • Testing strategy and coverage: The deep analysis suggests adding integration tests for deep nesting (500 levels) and failure tests for the default limit (200). This is a good recommendation.
  • Documentation needs: The new maxNestLevel option should be documented in the project's documentation (e.g., README or wiki).

4. Overall Evaluation

  • Technical assessment: The PR addresses a significant limitation but introduces critical bugs related to property naming and missing class properties. These bugs must be fixed before merging.
  • Business impact: The PR enables the library to handle a wider range of JSON schemas, specifically those with deep nesting, which is a significant business benefit.
  • Risk evaluation: High risk due to the introduced bugs. The risk is mitigated by the clear identification of the issues and the proposed solutions.
  • Notable positive aspects and good practices: The PR addresses a real-world problem and provides a configurable solution. The use of a Context object for configuration is a good practice.
  • Implementation quality: The implementation quality is currently low due to the bugs, but the overall approach is sound.
  • Final recommendation: Request Changes. The P0 issues (property name mismatch and missing Context property) must be addressed before merging. The P1 issues (improved error message and null safety) should also be addressed, either before merging or immediately after.

💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.

Update P0 Critical Finding
Address P0 Issue by adding $maxNestLevel public property
Address P1 Issue and improve error messaging when nesting level exceeded.
Address P1 Issue and set default for max level to constant DEFAULT_MAX_NEST
Address P1 Issue and set default for maxNestLevel to constant DEFAULT_MAX_NEST
Move constant DEFAULT_MAX_NEST to a static constant from the context object
Update constant references
Update documentation to reflect the new options
@loneregister
Copy link
Author

Updated issues to address P0, P1 and P2 commentary.

@loneregister loneregister marked this pull request as draft March 21, 2025 19:51
@loneregister loneregister marked this pull request as ready for review March 21, 2025 19:51
Copy link

@llamapreview llamapreview bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Status Update

Overview of Changes Since Last Review:

  • The author has addressed the critical issues (P0) related to property naming and missing class properties.
  • Improvements have been made to error messaging and null safety (P1).
  • Additional considerations for validation guardrails and architectural documentation (P2) have been incorporated.

Summary of Addressed Issues:

  • The property name mismatch in RefResolver.php has been resolved.
  • The missing maxNestLevel property has been added to the Context class.
  • Error messaging has been improved to provide more context.
  • Null safety has been enhanced with the use of a defined constant for the default maximum nesting level.

Quick Assessment of New Modifications:

  • The changes look promising and address the critical issues identified in the previous review.
  • New modifications include improved error handling and better documentation.

Detailed Analysis

Deep Dive into Specific Changes:

  1. Property Name Mismatch and Missing Context Property (P0):
    • Previous Issue: The maxNestLevel property was missing from the Context class, and there were inconsistent property names in RefResolver.php.
    • Current Status: The author has added the maxNestLevel property to the Context class and corrected the property names in RefResolver.php.
   // src/Context.php
   class Context extends MagicMap
   {
       const DEFAULT_MAX_NEST = 200;

       public $import = true;
       public $maxNestLevel = self::DEFAULT_MAX_NEST; // Added property
       // ... other properties
   }
   // src/RefResolver.php
   class RefResolver
   {
       private int $maxNestLevel = 200; // Corrected property name

       public function __construct($rootData = null, int $maxNestLevel = 200) // Corrected parameter name
       {
           $this->rootData = $rootData;
           $this->maxNestLevel = $maxNestLevel; // Corrected property assignment
       }
       // ... other methods
   }
  • Impact Assessment: These changes resolve the critical issues and ensure that the maxNestLevel configuration option works as intended.
  1. Improved Error Messaging (P1):
    • Previous Issue: The error message when the recursion limit was reached provided limited information.
    • Current Status: The error message now includes both the maximum allowed nesting level and the current nesting level.
   // src/RefResolver.php
   public function preProcessReferences($data, Context $options, $nestingLevel = 0)
   {
       if ($nestingLevel > $this->maxNestLevel) {
           throw new Exception(
               "Maximum recursion depth {$this->maxNestLevel} exceeded at level {$nestingLevel}",
               Exception::DEEP_NESTING
           );
       }
       // ... other logic
   }
  • Impact Assessment: This improvement aids in debugging by providing more context when the recursion limit is reached.
  1. Null Safety with Defined Constant (P1):
    • Previous Issue: The null coalescing operator (??) was used without a defined constant.
    • Current Status: A constant DEFAULT_MAX_NEST has been defined in the Context class and used in the null coalescing operation.
   // src/Schema.php
   public static function import($data, Context $options = null)
   {
       if ($options === null) {
           $options = new Context();
       }

       $options->import = true;

       if ($options->refResolver === null) {
           $options->refResolver = new RefResolver($data, $options->maxNestLevel ?? Context::DEFAULT_MAX_NEST); // Use constant
       } else {
           $options->refResolver->setRootData($data);
       }
       // ... other logic
   }
  • Impact Assessment: This change improves code clarity and maintainability by using a defined constant for the default value.
  1. Validation Guardrails and Architectural Documentation (P2):
    • Previous Issue: There was a need for input validation and better documentation.
    • Current Status: The author has added PHPDoc blocks to explain the new recursion mechanism and the maxNestLevel option.
   // src/Context.php
   /**
    * @var int property max recursive nesting depth
    */
   public $maxNestLevel = self::DEFAULT_MAX_NEST;
  • Impact Assessment: Adding documentation improves code maintainability and understanding.

Outstanding Concerns

Remaining Issues from Previous Review:

  • All critical issues (P0) have been addressed.
  • The improvements for error messaging and null safety (P1) have been implemented.
  • The considerations for validation guardrails and architectural documentation (P2) have been addressed.

New Issues Identified:

  • None identified in this follow-up review.

Potential Risks and Considerations:

  • Ensure that the new maxNestLevel option is thoroughly tested, including edge cases where the nesting level is very high or very low.
  • Monitor performance implications of allowing deeper recursion, especially for very large and deeply nested schemas.

Recommendations

Specific Suggestions for Improvement:

  • Testing: Add integration tests for deep nesting (e.g., 500 levels) and failure tests for the default limit (200 levels).
  • Documentation: Update the project's documentation (e.g., README or wiki) to include information about the new maxNestLevel option.

Priority Levels for Changes:

  • High Priority (P1): Ensure thorough testing of the maxNestLevel option, including edge cases.
  • Medium Priority (P2): Update project documentation to reflect the new configuration option.

Future Considerations:

  • Continuously monitor the performance and scalability implications of allowing deeper recursion.
  • Gather feedback from users to ensure the new configuration option meets their needs.

Conclusion

The follow-up review indicates that the author has addressed all critical issues and made significant improvements to error handling, null safety, and documentation. The changes are well-implemented and should be merged after ensuring thorough testing and updating the project documentation. Great job on addressing the feedback and improving the code quality!


💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.

@loneregister
Copy link
Author

Is there anything else I need to do here, to have this reviewed, and accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant