backslash characters in clinicaltrials #453

Closed
micheldumontier opened this issue May 12, 2017 · 4 comments

@micheldumontier
Member

Hi everyone,

I was trying to extend the ClinicalTrials.gov bio2rdf conversion script (see jvsoest@ce09d83) and test it before creating a pull request.

Generation of triples runs fine and creates the clinicaltrials.nq.gz file. However, some URIs in this file contain stray backslash characters at seemingly random locations (see the example below, "clinicaltrials_voca\bulary").

Does anyone know what could cause this issue?

Thanks in advance!

Kind regards,
Johan

Example:
<http://bio2rdf.org/clinicaltrials_vocabulary:Clinical-Study> <http://bio2rdf.org/bio2rdf_vocabulary:namespace> "clinicaltrials_voca\bulary"^^<http://www.w3.org/2001/XMLSchema#string> <http://bio2rdf.org/clinicaltrials_resource:bio2rdf.dataset.clinicaltrials.R4> .
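
To see how widespread the problem is in a generated file, the quoted literals can be scanned for a backslash followed by "b". The following is a minimal Python sketch, not part of bio2rdf; `find_backslash_b` and `scan_file` are hypothetical helper names, and the file is assumed to be gzip-compressed UTF-8 N-Quads. Since `\b` is a legal escape for backspace in N-Quads literals, a parser would probably accept such lines without complaint, which is why a plain text scan is used here.

```python
import gzip
import re

# Matches a double-quoted literal, allowing backslash escapes inside it.
LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Captures the character that follows each backslash, consuming escapes in order,
# so an escaped backslash followed by a plain "b" is not reported.
ESCAPED_CHAR = re.compile(r"\\(.)")


def find_backslash_b(lines):
    """Yield (line_number, literal) for each quoted literal that contains a \\b escape."""
    for number, line in enumerate(lines, start=1):
        for match in LITERAL.finditer(line):
            literal = match.group(1)
            if "b" in ESCAPED_CHAR.findall(literal):
                yield number, literal


def scan_file(path):
    """Scan a gzip-compressed N-Quads file (hypothetical path) and return all hits."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(find_backslash_b(handle))


# The example line from this issue, used as in-memory input.
sample = (
    'http://bio2rdf.org/clinicaltrials_vocabulary:Clinical-Study '
    'http://bio2rdf.org/bio2rdf_vocabulary:namespace '
    '"clinicaltrials_voca\\bulary"^^http://www.w3.org/2001/XMLSchema#string '
    'http://bio2rdf.org/clinicaltrials_resource:bio2rdf.dataset.clinicaltrials.R4 .'
)
hits = list(find_backslash_b([sample]))
print(hits)
```

Calling `scan_file("clinicaltrials.nq.gz")` on the generated file would list every affected line number together with the literal.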

@micheldumontier
Member Author

What platform are you using? Can you share a (smallish) generated file?

@jvsoest

jvsoest commented May 16, 2017

Hi Michel,

I'm trying this on Ubuntu 16.04, with PHP version 7.0.15.
I've attached the conversion for one trial, where it sometimes escapes the "b" character.
clinicaltrials.nq.gz

Maybe it's in the specialEscape function of rdfapi; I'll look into that as well.

Regards,
Johan

@jvsoest

jvsoest commented May 16, 2017

As suspected, the problem was in specialEscape and safeLiteral in rdfapi.php, and changing them did the trick for me (for now). I removed "\b" (the first two characters) from the addcslashes call. I think it handles the b as a plain character to escape, instead of treating \b as a backspace. I'm not sure, though, whether this change creates problems for other bio2rdf scripts.

@micheldumontier
Member Author

OK, I replaced the string definition with a hex list. See micheldumontier/php-lib@252b8ef.
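
The suspected mechanism can be illustrated with a small Python sketch. It does not call PHP or reproduce addcslashes exactly (for instance, it does not turn control characters into C-style escapes); `escape_chars` is a hypothetical helper that prefixes every character found in a character list with a backslash. The assumption, taken from the discussion above, is that the list ended up holding the two characters backslash and "b" rather than the single backspace character (0x08).

```python
def escape_chars(text: str, charlist: str) -> str:
    """Prefix every character of `text` that appears in `charlist` with a backslash."""
    targets = set(charlist)
    return "".join("\\" + ch if ch in targets else ch for ch in text)


# Suspected faulty list: two separate characters, a backslash and the letter b.
literal_list = "\\b"
# Intended list, written as an explicit hex code: the single backspace character.
backspace_list = "\x08"

value = "clinicaltrials_vocabulary"
broken = escape_chars(value, literal_list)   # the letter b gets a backslash in front
fixed = escape_chars(value, backspace_list)  # no backspace present, value unchanged

print(broken)
print(fixed)
```

Under this assumption the first call reproduces the reported "clinicaltrials_voca\bulary" value, while the hex-coded list leaves ordinary letters alone. The sketch escapes every "b", so it does not explain why the escaping in the real output only happens sometimes.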
