PyBioC

PyBioC is a native Python library for reading and writing BioC XML data.

More information about BioC is available on SourceForge.

Installation

Use pip:

pip install git+https://github.com/OntoGene/PyBioC.git

For Python 3, you may need to use pip3 instead of pip.

Usage

Two example programs, test_read+write.py and stemming.py, are shipped in the src/ folder.

test_read+write.py shows the basic reading and writing capability of the library.
stemming.py reads a BioC XML file, tokenizes the passage text with the Python Natural Language Toolkit (NLTK), stems the tokens, and converts the modified PyBioC objects back to valid BioC XML.
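
The read, tokenize, stem, write-back flow described for stemming.py can be sketched without PyBioC or NLTK. The following is only an illustration, not the shipped script: it uses the standard library's xml.etree.ElementTree in place of the PyBioC reader and writer, a regular expression in place of an NLTK tokenizer, and a hypothetical toy_stem function as a crude stand-in for a real NLTK stemmer. Writing the stemmed tokens back into the passage text is also an assumption made for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# A small hand-written BioC-style input (assumed sample data, not from test_input/).
SAMPLE = """<collection>
  <source>example</source>
  <date>20150301</date>
  <key/>
  <document>
    <id>1</id>
    <passage>
      <offset>0</offset>
      <text>Rare diseases affecting patients</text>
    </passage>
  </document>
</collection>"""


def toy_stem(token):
    """Strip a few common suffixes; a crude stand-in for an NLTK stemmer."""
    for suffix in ('ing', 'es', 's'):
        # Only strip when at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def stem_passages(xml_text):
    """Parse BioC-style XML, stem every <text> element, and serialize it again."""
    root = ET.fromstring(xml_text)
    for text_node in root.iter('text'):
        # Split into word tokens and single punctuation characters.
        tokens = re.findall(r'\w+|[^\w\s]', text_node.text or '')
        text_node.text = ' '.join(toy_stem(t.lower()) for t in tokens)
    return ET.tostring(root, encoding='unicode')


stemmed_xml = stem_passages(SAMPLE)
print(stemmed_xml)
```

In the real script, the parsing and serialization steps are handled by PyBioC objects and the stemming by NLTK; only the overall shape of the pipeline carries over from this sketch.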

Example

Generate BioC object for export

from bioc import BioCXMLWriter, BioCCollection, BioCDocument, BioCPassage

writer = BioCXMLWriter()
writer.collection = BioCCollection()
collection = writer.collection
collection.date = '20150301'
collection.source = 'ngy1 corpus'

document = BioCDocument()
document.id = '123456'  # pubmed id

passage = BioCPassage()
passage.put_infon('type', 'paragraph')
passage.offset = '0'
passage.text = 'This is a biomedical sentence about various rare diseases.'
document.add_passage(passage)

collection.add_document(document)

print(writer)
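
To make the structure of the exported data easier to picture, the sketch below builds the same collection, document, and passage with the standard library's xml.etree.ElementTree instead of PyBioC. It assumes the usual BioC element names (collection, source, date, key, document, id, passage, infon, offset, text); the exact output of PyBioC's writer, such as the XML declaration, DOCTYPE, and whitespace, may differ.

```python
import xml.etree.ElementTree as ET


def build_collection():
    """Build a BioC-style element tree mirroring the PyBioC example above."""
    collection = ET.Element('collection')
    ET.SubElement(collection, 'source').text = 'ngy1 corpus'
    ET.SubElement(collection, 'date').text = '20150301'
    ET.SubElement(collection, 'key')  # left empty: the example sets no key

    document = ET.SubElement(collection, 'document')
    ET.SubElement(document, 'id').text = '123456'  # pubmed id

    passage = ET.SubElement(document, 'passage')
    # An infon is a key/value pair: the key is an attribute, the value is the text.
    infon = ET.SubElement(passage, 'infon', {'key': 'type'})
    infon.text = 'paragraph'
    ET.SubElement(passage, 'offset').text = '0'
    ET.SubElement(passage, 'text').text = (
        'This is a biomedical sentence about various rare diseases.'
    )
    return collection


xml_string = ET.tostring(build_collection(), encoding='unicode')
print(xml_string)
```

Each PyBioC object in the example corresponds to one nested element here, which is why the passage is added to the document and the document to the collection.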
