Open
Labels
enhancement (Incrementally add new feature)
Description
Version
jena-5.5.0
Feature
Description:
Implement the W3C RDF Dataset Canonicalization (RDFC-1.0) algorithm in Apache Jena, with output in canonical N-Quads format. This enables deterministic serialization of RDF datasets by assigning canonical identifiers to blank nodes.
References:
- RDFC-1.0 Algorithm: https://www.w3.org/TR/rdf-canon/
- Canonical N-Quads Format: https://www.w3.org/TR/rdf12-n-quads/#canonical-quads
- W3C Test Suite: https://github.com/w3c/rdf-tests/tree/main/rdf/rdf12/rdf-n-quads/c14n and https://w3c.github.io/rdf-canon/tests/
 
Tasks:
- Create an NQuadsCanonicalWriter class extending WriterDatasetRIOTBase
- Add an NQUADS_CANONICAL format constant to RDFFormat
- Register the canonical writer factory in RDFWriterRegistry
- Implement RDFC10Canonicalizer with the complete RDFC-1.0 algorithm
- Create HashUtils for SHA-256 hash computations and lexicographic sorting
- Implement CanonicalIssuer for _c14n_N blank node identifier assignment
- Add DatasetProcessor for blank node extraction and dataset processing

- Download the W3C canonicalization test suite and integrate it into jena-arq/testing/rdf12-wg/rdf-n-quads-c14n/
- Update the Scripts_RIOT_c14n.java test factory, following existing RIOT patterns
- Implement RDFCanonicalizationTest for algorithm validation, using https://w3c.github.io/rdf-canon/tests/
- Add writeCanonical() and canonicalizeDataset() methods to RDFDataMgr
- Add --canonical flag support to the riot command-line tool
- Update the documentation and create usage examples
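
To make the HashUtils and CanonicalIssuer tasks more concrete, here is a rough sketch of what those two helpers might look like, using only the JDK. It is not existing Jena code and not the full RDFC-1.0 algorithm: the class name `CanonSketch`, the method names, and the string-based labels are all assumptions for illustration. The issuer prefix `c14n` is taken from my reading of the RDFC-1.0 spec; the exact label format proposed above may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not part of Apache Jena.
public class CanonSketch {

    /** Maps existing blank node labels to "prefix + counter" identifiers. */
    static final class CanonicalIssuer {
        private final String prefix;
        private int counter = 0;
        // LinkedHashMap preserves the order in which identifiers were issued.
        private final Map<String, String> issued = new LinkedHashMap<>();

        CanonicalIssuer(String prefix) {
            this.prefix = prefix;
        }

        /** Returns the identifier for a label, issuing a new one on first use. */
        String issue(String existingLabel) {
            String id = issued.get(existingLabel);
            if (id == null) {
                id = prefix + counter++;
                issued.put(existingLabel, id);
            }
            return id;
        }

        boolean hasIssued(String existingLabel) {
            return issued.containsKey(existingLabel);
        }
    }

    /** Lowercase hex SHA-256 digest of the UTF-8 bytes of the input. */
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant JDK is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        CanonicalIssuer issuer = new CanonicalIssuer("c14n");
        System.out.println(issuer.issue("b0")); // first label seen
        System.out.println(issuer.issue("b1")); // second label seen
        System.out.println(issuer.issue("b0")); // same identifier as the first call
        System.out.println(sha256Hex("abc"));
    }
}
```

In the real implementation the issuer would presumably work on Jena `Node` objects rather than strings, and the hashes would be computed over the serialized quads that mention each blank node, as the spec describes.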
 
Are you interested in contributing a solution yourself?
Yes