latex: detect non utf8 encoding and convert file #7584

Open
haraldschilly opened this issue May 27, 2024 · 1 comment

Comments

@haraldschilly
Contributor

haraldschilly commented May 27, 2024

This problem happens for an "IEEE conference template", which is encoded as ISO-8859. But this is a more general issue that could happen with any tex file coming from outside of CoCalc.

  1. Get the template from https://www.ieeesmc2024.org/call-for-paper by scrolling down and clicking on the "LATEX TEMPLATE" button.
  2. Extract it in CoCalc (I did it in a terminal: unzip ieeeconf.zip, then open root.tex).

Observe the broken characters in the source and the errors from line 61 onwards:

Screenshot from 2024-05-27 13-41-31

Switching the engine to "xelatex" in the build/select engine dropdown at least gets rid of the errors:

Screenshot from 2024-05-27 13-42-46


The exact aim of this ticket is still unclear. The expected behavior is certainly that no such broken characters appear. As a first step, we should figure out how to convert the tex file so that these characters are cleaned up (a workaround is below). The actual fix is probably to run file ... on a new file and automatically convert it to UTF-8. I think it's too hard to change the editor itself to switch the encoding on a per-file basis.
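
To make the "detect and convert" idea concrete, here is a minimal Python sketch. It is not CoCalc code and it does not call file; instead it swaps in a simpler heuristic: a strict UTF-8 decode decides whether conversion is needed. The function names and the ISO-8859-1 fallback are assumptions for illustration. Note that ISO-8859-1 decoding never fails, so the fallback is a guess rather than a real detection.

```python
def needs_conversion(data: bytes) -> bool:
    """Return True if the bytes are not valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def to_utf8(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Re-encode data as UTF-8.

    Assumption: anything that is not valid UTF-8 is in `fallback`.
    ISO-8859-1 maps every byte to a character, so this never raises,
    but it can silently mis-decode files in other legacy encodings.
    """
    if not needs_conversion(data):
        return data  # already UTF-8 (plain ASCII included), leave untouched
    return data.decode(fallback).encode("utf-8")
```

Pure ASCII files pass through unchanged, so running this on every newly opened tex file would only rewrite those that actually contain non-UTF-8 bytes.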

@haraldschilly
Contributor Author

haraldschilly commented May 27, 2024

Workaround

$ file root.tex 
root.tex: LaTeX 2e document, ISO-8859 text, with very long lines (902), with CRLF line terminators

This reveals that the file was written on Windows, or something like that. Converting it to UTF-8 fixes the problem:

$ iconv -f ISO-8859-1 -t UTF-8 root.tex -o root-utf8.tex
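
If iconv is not available, the same conversion could be done from Python. This is only a sketch of an equivalent to the command above; the function name convert_to_utf8 and its parameters are hypothetical. It works on raw bytes so that the CRLF line terminators stay as they are, just as with iconv.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "iso-8859-1") -> None:
    """Read `src` in `source_encoding` and write it to `dst` as UTF-8.

    Assumption: the caller already knows the source encoding,
    e.g. from the output of `file root.tex`.
    """
    raw = Path(src).read_bytes()          # bytes mode: no newline translation
    text = raw.decode(source_encoding)    # ISO-8859-1 maps each byte to one character
    Path(dst).write_bytes(text.encode("utf-8"))
```

For the file in this ticket the call would be convert_to_utf8("root.tex", "root-utf8.tex").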

@haraldschilly haraldschilly changed the title latex: IEEE conf template contains broken characters latex: detect non utf8 encoding and convert file May 27, 2024