
Commit 1cf649d

Commit message: Updated doc
1 parent 73b53a7 commit 1cf649d

3 files changed: +16 −67 lines


README.md

Lines changed: 0 additions & 53 deletions
```diff
@@ -9,24 +9,6 @@
 [Previous entries](#History)
 
 
-## Overview
-
-RDF/SPARQL Workflows on the Command Line made easy. The RDF Processing Toolkit (RPT) integrates several of our tools into a single CLI frontend:
-It features commands for running SPARQL statements on triple- and quad-based data, both streaming and static.
-SPARQL extensions for working with CSV, JSON and XML are included. So is an RML toolkit that allows one to convert RML to SPARQL (or TARQL).
-Ships with Jena's ARQ and TDB SPARQL engines as well as one based on Apache Spark.
-
-RPT is a Java tool which comes with Debian and RPM packaging. It is invoked using `rpt <command>`, where the following commands are supported:
-
-* [integrate](README-SI.md): This command is the most relevant one for day-to-day RDF processing. It features ad-hoc querying, transformation and updating of RDF datasets with support for SPARQL extensions for ingesting CSV, XML and JSON. It also supports `jq`-compatible JSON output that allows for building bash pipes in a breeze.
-* [ngs](README-NGS.md): Processor for named graph streams (ngs), which enables processing of collections of named graphs in a streaming fashion. Process huge datasets without running into memory issues.
-* [sbs](README-SBS.md): Processor for SPARQL binding streams (sbs), which enables processing of SPARQL result sets in a streaming fashion. Most prominently for use in aggregating the output of an `ngs map` operation.
-* [rmltk](https://github.com/Scaseco/r2rml-api-jena/tree/jena-5.0.0#usage-of-the-cli-tool): These are the (sub-)commands of our (R2)RML toolkit. The full documentation is available [here](https://github.com/SmartDataAnalytics/r2rml-api-jena).
-* sansa: These are the (sub-)commands of our Semantic Analysis Stack (SANSA) - a Big Data RDF processing framework. Features parallel execution of RML/SPARQL and TARQL (if the involved sources support it).
-
-
-**Check this [documentation](doc) for the supported SPARQL extensions with many examples**
-
 ## Example Usage
 
 * `integrate` allows one to load multiple RDF files and run multiple queries on them in a single invocation. Furthermore, prefixes from a snapshot of [prefix.cc](https://prefix.cc) are predefined, and we made the SELECT keyword of SPARQL optional in order to make scripting less verbose. The `--jq` flag enables JSON output for interoperability with the conventional `jq` tool.
```
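A minimal sketch of such an invocation (the file name and query are illustrative; the optional-SELECT shorthand and the `--jq` flag are as described in the bullet above):

```shell
# 'data.ttl' is a made-up input file. --jq emits jq-compatible JSON,
# so the result can be post-processed in an ordinary bash pipe.
rpt integrate data.ttl 'SELECT * { ?s ?p ?o }' --jq | jq 'length'
```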
````diff
@@ -87,41 +69,6 @@ The exact definitions can be viewed with `rpt cpcat resource.rq`.
 * [Linked Sparql Queries](https://github.com/AKSW/LSQ) provides tools to RDFize SPARQL query logs and run benchmarks on the resulting RDF. The triples related to a query represent an instance of a sophisticated domain model and are grouped in a named graph. Depending on the input size one can end up with millions of named graphs describing queries, amounting to billions of triples. With ngs one can easily extract complete samples of the queries' models without a related triple being left behind.
 
 
-## Building
-The build requires Maven.
-
-For convenience, this [Makefile](Makefile) defines essential goals for common tasks.
-To build a "jar-with-dependencies", use the `distjar` goal. The path to the created jar bundle is shown when the build finishes.
-In order to build and install a deb or rpm package, use the `deb-rere` or `rpm-rere` goals, respectively.
-
-```
-$ make
-
-make help           # Show these help instructions
-make distjar        # Create only the standalone jar-with-dependencies of rpt
-make rpm-rebuild    # Rebuild the rpm package (minimal build of only required modules)
-make rpm-reinstall  # Reinstall rpm (requires prior build)
-make rpm-rere       # Rebuild and reinstall rpm package
-make deb-rebuild    # Rebuild the deb package (minimal build of only required modules)
-make deb-reinstall  # Reinstall deb (requires prior build)
-make deb-rere       # Rebuild and reinstall deb package
-make docker         # Build Docker image
-make release-bundle # Create files for Github upload
-```
-
-A docker image is available at https://registry.hub.docker.com/r/aksw/rpt
-
-The docker image can be built with a custom tag by setting the property `docker.tag`.
-The default for `docker.tag` is `${docker.tag.prefix}${project.version}`, where `docker.tag.prefix` defaults to the empty string.
-When only setting `docker.tag.prefix` to e.g. `myfork-`, the tag will have the form `myfork-1.2.3-SNAPSHOT`.
-
-```bash
-make docker
-
-# Example for providing a custom docker tag via make:
-make docker ARGS='-D"docker.tag.prefix=experimental-"'
-```
-
 ## License
 The source code of this repo is published under the [Apache License Version 2.0](LICENSE).
 Dependencies may be licensed under different terms. When in doubt please refer to the licenses of the dependencies declared in the `pom.xml` files.
````
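As an aside on the removed Building section above: the `docker.tag` default is plain string concatenation of the two Maven properties, so setting only the prefix yields tags of the form shown there. A sketch with illustrative stand-in values:

```shell
# Illustrative stand-ins for the Maven properties docker.tag.prefix
# and project.version; docker.tag.prefix defaults to the empty string.
docker_tag_prefix="myfork-"
project_version="1.2.3-SNAPSHOT"
docker_tag="${docker_tag_prefix}${project_version}"
echo "$docker_tag"
```

This prints `myfork-1.2.3-SNAPSHOT`, matching the example form in the removed text.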

docs/getting-started/index.md

Lines changed: 3 additions & 4 deletions
```diff
@@ -11,9 +11,9 @@ layout: default
 
 ### Downloads
 
-You can download RPT as a JAR-bundle, Debian package, or RPM package from [RPT's GitHub release page](https://github.com/SmartDataAnalytics/RdfProcessingToolkit/releases).
-
+You can download RPT as self-contained Debian or RPM packages from [RPT's GitHub release page](https://github.com/SmartDataAnalytics/RdfProcessingToolkit/releases).
 
+Note that for running the JAR bundle with the `java` command yourself, you need to add the appropriate `--add-opens` declarations. This is documented on the [Building from Source](getting-started/build.html) page.
 
 
 
```
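The `--add-opens` note in the hunk above can be sketched as follows; the jar filename and the module/package opened here are assumptions for illustration, and the authoritative declarations are on the linked Building from Source page:

```shell
# Hypothetical jar name and --add-opens target; consult the
# "Building from Source" page for the actual declarations.
java --add-opens java.base/java.lang=ALL-UNNAMED \
  -jar rpt-jar-with-dependencies.jar integrate data.ttl
```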
```diff
@@ -24,8 +24,7 @@ The quickest way to start an RPT instance is via docker. The container name is `
 `docker pull aksw/rpt:latest-dev`
 
 
-
-A typical invocation of the container is as follows:
+For example, a typical invocation of the `integrate` command is as follows:
 
 `docker run -i -p'8642:8642' -v "$(pwd):/data" -w /data aksw/rpt integrate --server YOUR_DATA.ttl`
 
```

docs/index.md

Lines changed: 13 additions & 10 deletions
```diff
@@ -6,19 +6,22 @@ nav_order: 10
 
 # RDF Processing Toolkit (RPT)
 
-RPT is a SPARQL-centric command line toolkit for processing RDF data that also comes with an integrated web server.
-The `integrate` command is the most powerful one: It accepts as arguments RDF files and SPARQL query/update statements which are run in a pipeline. The `--server` option starts a web server with SPARQL and GraphQL endpoints over the provided data.
-RPT can also function as a SPARQL proxy using the `remote` engine.
+RDF/SPARQL Workflows on the Command Line made easy. The RDF Processing Toolkit (RPT) integrates several of our tools into a single CLI frontend:
+It features commands for running SPARQL statements on triple- and quad-based data, both streaming and static.
+SPARQL extensions for working with CSV, JSON and XML are included. So is an RML toolkit that allows one to convert RML to SPARQL (or TARQL).
+RPT ships with Jena's ARQ and TDB SPARQL engines as well as one based on Apache Spark.
 
+The [`integrate`](integrate) command is the most versatile one: It accepts as arguments RDF files and SPARQL query/update statements which are run in a pipeline. The `--server` option starts a web server with SPARQL and GraphQL endpoints over the provided data.
+Using `integrate` with the `remote` engine allows RPT to act as a [SPARQL proxy](integrate/#example-4-sparql-proxy).
 
+RPT is a Java tool which comes with Debian and RPM packaging. It is invoked using `rpt <command>`, where the following commands are supported:
 
-## Command Overview
+* [integrate](integrate): This command is the most relevant one for day-to-day RDF processing. It features ad-hoc querying, transformation and updating of RDF datasets with support for SPARQL extensions for ingesting CSV, XML and JSON. It also supports `jq`-compatible JSON output that allows for building bash pipes in a breeze.
+* [ngs](named-graph-streams): Processor for named graph streams (ngs), which enables processing of collections of named graphs in a streaming fashion. Process huge datasets without running into memory issues.
+* [sbs](sparql-binding-streams): Processor for SPARQL binding streams (sbs), which enables processing of SPARQL result sets in a streaming fashion. Most prominently for use in aggregating the output of an `ngs map` operation.
+* [rmltk](https://github.com/Scaseco/r2rml-api-jena/tree/jena-5.0.0#usage-of-the-cli-tool): These are the (sub-)commands of our (R2)RML toolkit. The full documentation is available [here](https://github.com/SmartDataAnalytics/r2rml-api-jena).
+* sansa: These are the (sub-)commands of our Semantic Analysis Stack (SANSA) - a Big Data RDF processing framework. Features parallel execution of RML/SPARQL and TARQL (if the involved sources support it).
 
-* [`integrate`](integrate): Powerful RDF and SPARQL processing with [GraphQL](graphql) support.
 
-* `rmltk`: R2RML and RML toolkit. Converts mappings to SPARQL and executes them.
-
-* `sansa`: Big Data (based on Apache Spark and Apache Hadoop)
-
-
+**Check this [documentation](doc) for the supported SPARQL extensions with many examples**
 
```
