"ValueError: Unterminated String" when calling gen_outline.py as a module from a separate Python script, but NOT when running gen_outline.py by itself #32

Open
Michael-B-G opened this issue Sep 1, 2016 · 4 comments

Comments

@Michael-B-G

Running Python 2.7.8, in a PyDev environment.

I have verified that all of the input parameters (sys.argv) are the same in each scenario, yet when I call gen_outline.py's main function from another Python program (first modifying the sys.argv as necessary), I receive the following result:

    gen_outline.main()
  File "C:<my_path>\gen_outline.py", line 90, in main
    outline = make_outline(args.json_file, args.each_line, args.collection)
  File "C:<my_path>\gen_outline.py", line 61, in make_outline
    key_map = gather_key_map(iterator)
  File "C:<my_path>\gen_outline.py", line 37, in gather_key_map
    for d in iterator:
  File "C:<my_path>\gen_outline.py", line 28, in coll_iter
    data = json.load(f)
  File "C:\Python278\lib\json\__init__.py", line 290, in load
    **kw)
  File "C:\Python278\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python278\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python278\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Unterminated string starting at: line 1 column 200693 (char 200692)

At that specific character offset in the JSON is the opening quote of a value on the penultimate "line" of the JSON file. (The file is technically a single line; this is just how my editor displays it with word wrapping.)

The data entry is

"ad_hoc_command_events": "/api/v1/hos
ts/111/ad_hoc_command_events/"

Again, the character in question is the quotation mark just before "/api/v1...etc.". As I said, I don't get this error if I run gen_outline.py by itself. This JSON name/value pair is hardly unique in the document, so I'm not sure why the decoder chokes at this exact place. My only theory is that it starts the value of the last "line" in the JSON document, but since everything is on a single line, that shouldn't matter. I've spent quite a while trying to figure this out, including stepping through with PyDev's debugger, but I'm still not getting anywhere.
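One way to narrow this down is to look at the raw bytes around the reported offset and compare them with the total file size, in both the standalone run and the run driven from the other script. The following is a minimal sketch, not part of gen_outline.py: `context_around` is a hypothetical helper, and it assumes the reported character offset can be treated as a byte offset (true for pure-ASCII JSON).

```python
import os
import tempfile


def context_around(path, offset, width=40):
    """Return (file_size, bytes near `offset`) for the file at `path`."""
    with open(path, "rb") as f:  # binary mode: offsets are plain byte counts
        data = f.read()
    start = max(0, offset - width)
    return len(data), data[start:offset + width]


# Demo on a throwaway file; for the real case, pass the JSON path and 200692.
fd, demo_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(b'{"ad_hoc_command_events": "/api/v1/hos')  # deliberately cut short
size, snippet = context_around(demo_path, 26, width=10)
os.remove(demo_path)
print(size, repr(snippet))
```

If the file size seen from the calling script is only slightly larger than the offset in the error, the file contents at read time probably differ between the two scenarios, even though sys.argv is identical.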

Please let me know if you can help, and if you need any more information on my end to do so.

Thanks in advance,

Michael

@evidens
Owner

evidens commented Sep 2, 2016

Is that line break something you inserted or was that in the file? You might find the JSON parser isn't terribly forgiving about such things.

The problem really does seem to be at the decoder level and not in the outline generator. I would also look for funky characters near the point where it breaks.
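The decoder-level behaviour can be checked in isolation with a small experiment. The sketch below uses only the standard `json` module; `decode_error` is a hypothetical helper and the sample strings are made up for illustration. It feeds the decoder one input that ends in the middle of a string and one that contains a raw line break inside a string, and prints the message each produces.

```python
import json


def decode_error(text):
    """Return the decoder's error message for `text`, or None if it parses."""
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        return str(exc)
    return None


truncated = '{"a": "/api/v1/hos'          # input ends inside a string
line_break = '{"a": "/api/v1/hos\nts/"}'  # raw newline inside a string

print(decode_error(truncated))
print(decode_error(line_break))
```

In CPython's `json` module, the truncated input reports an unterminated string at the position of the opening quote, while the raw line break reports an invalid control character. That makes input that ends early a candidate worth ruling out here, though it does not prove that is what happened.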

If you're positive the JSON isn't the source of the problem I'll try to find time this weekend to diagnose.

@Michael-B-G
Author

It appears to be a line break and that's how my text editor, UltraEdit, displays it--but as far as I can tell (by showing all characters), there are no line breaks in the document at all, and everything is on a single line. I verified this in Notepad++ as well.

I'm honestly not sure what the root of this problem is. Perhaps it's an encoding issue of some kind? But I can't guarantee there's something faulty with gen_outline.py...the problem may in fact lie in Python's native JSON library! Though from my googling, I haven't been able to find anything online that matches exactly what I'm seeing. I suppose what I could do now is test this in a newer version of Python and see if I get the same problem, though I'm guessing I will.
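The encoding theory can be tested without relying on what an editor displays. Here is a rough sketch, assuming the file is expected to be UTF-8; `describe_bytes` is a hypothetical helper, not something from gen_outline.py or the `json` library. It summarizes the properties of the raw bytes that matter to the decoder.

```python
import codecs


def describe_bytes(raw):
    """Summarize properties of raw file bytes relevant to JSON decoding."""
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "size": len(raw),
        "has_utf8_bom": raw.startswith(codecs.BOM_UTF8),
        "valid_utf8": valid_utf8,
        # Counts actual CR/LF bytes, independent of editor word wrapping.
        "newline_bytes": raw.count(b"\n") + raw.count(b"\r"),
        # A complete JSON document normally ends with } or ].
        "ends_with_close": raw.rstrip().endswith((b"}", b"]")),
    }


print(describe_bytes(b'{"a": "b"}'))
```

Reading the file with `open(path, "rb").read()` in both scenarios and comparing the two summaries would show whether the bytes reaching the decoder are really the same.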

Thank you very much for offering to look into this. I sincerely appreciate it.

@Michael-B-G
Author

Oh, never mind--looking at one of the other issues, I see that this tool is Python 2 only, so 2.7.8 effectively IS the newest version I can test with.

@Michael-B-G
Author

Michael-B-G commented Sep 6, 2016

This issue is definitely not unique to your program, as I tried out this tool and got the same result:

https://github.com/vinay20045/json-to-csv
