|
| 1 | +--- |
| 2 | +layout: posts |
| 3 | +classes: wide |
| 4 | +title: "CLAMS wrapper for spaCy NLP (v1.2)" |
| 5 | +date: 2024-06-11T12:30:19+00:00 |
| 6 | +--- |
| 7 | +## About this version |
| 8 | + |
| 9 | +- Submitter: [marcverhagen](https://github.com/marcverhagen) |
| 10 | +- Submission Time: 2024-06-11T12:30:19+00:00 |
| 11 | +- Prebuilt Container Image: [ghcr.io/clamsproject/app-spacy-wrapper:v1.2](https://github.com/clamsproject/app-spacy-wrapper/pkgs/container/app-spacy-wrapper/v1.2) |
| 12 | +- Release Notes |
| 13 | + |
| 14 | + > Bumping Python SDK version, bug fixes and documentation updates |
| 15 | + > - Updated to clams-python 1.2.2 |
| 16 | + > - Fixed token length (issue #30) |
| 17 | + > - Fixed problems with the pretokenized parameter (issue #32) |
| 18 | + > - Various documentation fixes. |
| 19 | +
|
| 20 | +## About this app (See raw [metadata.json](metadata.json)) |
| 21 | + |
| 22 | +**Apply spaCy NLP to all text documents in an MMIF file.** |
| 23 | + |
| 24 | +- App ID: [http://apps.clams.ai/spacy-wrapper/v1.2](http://apps.clams.ai/spacy-wrapper/v1.2) |
| 25 | +- App License: Apache 2.0 |
| 26 | +- Source Repository: [https://github.com/clamsproject/app-spacy-wrapper](https://github.com/clamsproject/app-spacy-wrapper) ([source tree of the submitted version](https://github.com/clamsproject/app-spacy-wrapper/tree/v1.2)) |
| 27 | +- Analyzer Version: 3.6 |
| 28 | +- Analyzer License: MIT |
| 29 | + |
| 30 | + |
| 31 | +#### Inputs |
| 32 | +(**Note**: "*" as a property value means that the property is required but can be any value.) |
| 33 | + |
| 34 | +- [http://mmif.clams.ai/vocabulary/TextDocument/v1](http://mmif.clams.ai/vocabulary/TextDocument/v1) (required) |
| 35 | +(of any properties) |
| 36 | + |
| 37 | +- [http://vocab.lappsgrid.org/Token](http://vocab.lappsgrid.org/Token) |
| 38 | +(of any properties) |
| 39 | + |
| 40 | + |
| 41 | + |
| 42 | +#### Configurable Parameters |
| 43 | +(**Note**: _Multivalued_ means the parameter can have one or more values.) |
| 44 | + |
| 45 | +- `pretokenized`: optional, defaults to `false` |
| 46 | + |
| 47 | + - Type: boolean |
| 48 | + - Multivalued: False |
| 49 | + - Choices: **_`false`_**, `true` |
| 50 | + |
| 51 | + |
| 52 | + > Boolean parameter that tells the app to use existing tokenization, if available, when running NLP on text documents. Useful for processing ASR documents, for example. |
| 53 | +- `pretty`: optional, defaults to `false` |
| 54 | + |
| 55 | + - Type: boolean |
| 56 | + - Multivalued: False |
| 57 | + - Choices: **_`false`_**, `true` |
| 58 | + |
| 59 | + |
| 60 | + > The JSON body of the HTTP response will be re-formatted with 2-space indentation. |
| 61 | + |
| 62 | + |
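The `pretokenized` and `pretty` parameters are supplied at request time. As a minimal sketch, assuming the app is served over HTTP and reads its parameters from the URL query string (an assumption here, not something this page states), a request could be prepared as shown. The address, the empty MMIF payload, and the helper name `build_request` are hypothetical, and boolean values are assumed to be spelled `true`/`false` as in the Choices listed for each parameter.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url: str, mmif: dict, pretokenized: bool = False,
                  pretty: bool = False) -> Request:
    """Prepare (but do not send) a POST request carrying an MMIF payload."""
    # Assumption: booleans are passed as lowercase strings in the query string.
    params = {
        "pretokenized": str(pretokenized).lower(),
        "pretty": str(pretty).lower(),
    }
    url = f"{base_url}?{urlencode(params)}"
    body = json.dumps(mmif).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Hypothetical address; replace with wherever the container is listening.
req = build_request("http://localhost:5000/", {"documents": [], "views": []},
                    pretokenized=True)
print(req.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live server.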
| 63 | +#### Outputs |
| 64 | +(**Note**: "*" as a property value means that the property is required but can be any value.) |
| 65 | + |
| 66 | +(**Note**: Not all output annotations are always generated.) |
| 67 | + |
| 68 | +- [http://vocab.lappsgrid.org/Token](http://vocab.lappsgrid.org/Token) |
| 69 | +(of any properties) |
| 70 | + |
| 71 | +- [http://vocab.lappsgrid.org/Token#pos](http://vocab.lappsgrid.org/Token#pos) |
| 72 | +(of any properties) |
| 73 | + |
| 74 | +- [http://vocab.lappsgrid.org/Token#lemma](http://vocab.lappsgrid.org/Token#lemma) |
| 75 | +(of any properties) |
| 76 | + |
| 77 | +- [http://vocab.lappsgrid.org/NounChunk](http://vocab.lappsgrid.org/NounChunk) |
| 78 | +(of any properties) |
| 79 | + |
| 80 | +- [http://vocab.lappsgrid.org/Sentence](http://vocab.lappsgrid.org/Sentence) |
| 81 | +(of any properties) |
| 82 | + |
| 83 | +- [http://vocab.lappsgrid.org/NamedEntity](http://vocab.lappsgrid.org/NamedEntity) |
| 84 | +(of any properties) |
| 85 | + |
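Because not every output annotation type is generated on every run, it can help to tally what a given output actually contains. The sketch below assumes the usual MMIF JSON layout, in which each view holds an `annotations` list whose items carry an `@type` URI. The sample payload and the helper name `count_annotation_types` are hypothetical, made up for illustration only.

```python
from collections import Counter


def count_annotation_types(mmif: dict) -> Counter:
    """Count annotations per @type URI across all views of an MMIF dict."""
    counts = Counter()
    for view in mmif.get("views", []):
        for annotation in view.get("annotations", []):
            counts[annotation.get("@type", "unknown")] += 1
    return counts


# Hypothetical, hand-written payload; real app output carries more fields.
sample = {
    "views": [
        {
            "id": "v_0",
            "annotations": [
                {"@type": "http://vocab.lappsgrid.org/Token", "properties": {}},
                {"@type": "http://vocab.lappsgrid.org/Token", "properties": {}},
                {"@type": "http://vocab.lappsgrid.org/Sentence", "properties": {}},
            ],
        }
    ]
}

for type_uri, n in sorted(count_annotation_types(sample).items()):
    print(n, type_uri)
```

A type from the Outputs list that is missing from the tally simply was not produced for that input.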