Incorrect escaping of & without a trailing space #31

Closed
TimG1964 opened this issue Feb 7, 2025 · 0 comments

TimG1964 commented Feb 7, 2025

Maybe this is somehow connected with #17. Or maybe I'm simply misunderstanding something.

I'm using XML.jl to work directly with Excel files. A string like this is a valid cell entry in Excel:

"https://myhouse.sharepoint.com/:i:/r/sites/CorporateServices-COR08PublicAffairsandRelations/Shared%20Documents/COR08%20Public%20Affairs%20and%20Relations/Case%20Studies/Photos/My%20favourite%20society/MAT_5946%20(1).jpg?csf=1&web=1&e=tIywhD"

I can paste it directly into an Excel workbook by hand.

But I cannot use XML.jl to add it to a cell programmatically, because XML.escape() does not escape the two ampersands. The regex in the escape function only matches an ampersand that is followed by whitespace, so these two (followed by "web" and "e") are never matched and the string comes back unchanged.

julia> const escape_chars = ('&' => "&amp;", '<' => "&lt;", '>' => "&gt;", "'" => "&apos;", '"' => "&quot;")
('&' => "&amp;", '<' => "&lt;", '>' => "&gt;", "'" => "&apos;", '"' => "&quot;")

julia> function escape(x::String)
           result = replace(x, r"&(?=\s;)" => "&amp;")
           for (pat, r) in escape_chars[2:end]
                   result = replace(result, pat => r)
           end
           return result
       end
escape (generic function with 1 method)

julia> x = "https://myhouse.sharepoint.com/:i:/r/sites/CorporateServices-COR08PublicAffairsandRelations/Shared%20Documents/COR08%20Public%20Affairs%20and%20Relations/Case%20Studies/Photos/My%20favourite%20society/MAT_5946%20(1).jpg?csf=1&web=1&e=tIywhD"
"https://myhouse.sharepoint.com/:i:/r/sites/CorporateServices-COR08PublicAffairsandRelations/Shared%20Documents/COR08%20Public%20Affairs%20and%20Relations/Case%20Studies/Photos/My%20favourite%20society/MAT_5946%20(1).jpg?csf=1&web=1&e=tIywhD"

julia> escape(x)
"https://myhouse.sharepoint.com/:i:/r/sites/CorporateServices-COR08PublicAffairsandRelations/Shared%20Documents/COR08%20Public%20Affairs%20and%20Relations/Case%20Studies/Photos/My%20favourite%20society/MAT_5946%20(1).jpg?csf=1&web=1&e=tIywhD"


A simple change of the regex to r"&(?!amp;|quot;|apos;|gt;|lt;)" seems to work for me, but I'm not sure if this is a general solution.
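
To make the proposed change concrete, here is a minimal sketch of an escape function built around that negative lookahead. The names `ESCAPE_PAIRS` and `escape_sketch` are hypothetical and chosen for illustration only; this is not the XML.jl source and not necessarily what was eventually merged.

```julia
# Hypothetical sketch, not the XML.jl implementation.
# '&' is handled separately by the regex below, so it is not listed here.
const ESCAPE_PAIRS = ('<' => "&lt;", '>' => "&gt;", '\'' => "&apos;", '"' => "&quot;")

function escape_sketch(x::AbstractString)
    # Escape every '&' that does not already begin one of the five
    # predefined XML entities (negative lookahead).
    result = replace(x, r"&(?!amp;|quot;|apos;|gt;|lt;)" => "&amp;")
    # Escape the remaining special characters one at a time.
    for (pat, rep) in ESCAPE_PAIRS
        result = replace(result, pat => rep)
    end
    return result
end

# The original lookahead needs whitespace right after '&', so it never
# matches inside a URL query string.
@assert !occursin(r"&(?=\s;)", "csf=1&web=1")

url_tail = "MAT_5946%20(1).jpg?csf=1&web=1&e=tIywhD"
println(escape_sketch(url_tail))  # MAT_5946%20(1).jpg?csf=1&amp;web=1&amp;e=tIywhD
```

On the question of generality, two limits of this sketch are worth noting. A numeric character reference such as `&#38;` is not in the lookahead list, so its ampersand gets escaped again. And input that is meant to contain the literal text `&amp;` is left alone, because the function cannot tell it apart from an already-escaped ampersand.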

TimG1964 added a commit to TimG1964/XLSX.jl that referenced this issue Feb 8, 2025
numbers, dates, times and bools.

Also updated to address an issue with escaping in XML.jl
(issue 31 in XML.jl - JuliaComputing/XML.jl#31)

Also addresses a minor bug not previously identified in tests.
TimG1964 added a commit to TimG1964/XML.jl that referenced this issue Feb 8, 2025
TimG1964 added a commit to TimG1964/XML.jl that referenced this issue Feb 8, 2025
joshday added a commit that referenced this issue Feb 11, 2025
Update `escape()` to address issue #31
@joshday joshday closed this as completed Feb 15, 2025
felipenoris pushed a commit to felipenoris/XLSX.jl that referenced this issue Mar 1, 2025
* Remove remaining dependency on ZipFile.jl

* Remove overlooked use of ZipFile

* No need to close zip io any longer

* No need to close zip io any longer

* Now also removed dependency on EzXML.jl except for calls to the
 overloaded findall() and findfirst() functions and a single call
 to EzXML.unlink!().

* Replace EzXML.findall() with a new find_all_nodes()
function that uses the XML.jl API.

* Replaced EzXML.findfirst() with find_all_nodes()[begin]

* Further changes to unlink rows in write.jl

* Yesterday's changes

* Finally got SheetRowStreamIterator to work!

* Further changes bug fixing failed tests

* Force recompile?

* Now passing all tests except `escape`

* Clean up remaining open and close actions

* Final fixes to escape tests

* Remove unnecessary data files

* Tidy-up

* Don't pretty print

* Remove last pretty printing example

* Simplify regex that undoes pretty printing

* Updated to address #281. This now works for text,
numbers, dates, times and bools.

Also updated to address an issue with escaping in XML.jl
(issue 31 in XML.jl - JuliaComputing/XML.jl#31)

Also addresses a minor bug not previously identified in tests.

* Routing abstract numbers to new code to update
cell value only.

* Small tidy-up

* Add two new functions to make it easier to work
with cell formats:
- `setFont` to set the font of a cell
- `getFont` to retrieve the font of a cell

* Drop the `scheme` attribute to encourage font names to work.

* Add some tests for setFont().

* Implement `setFont()` over cell ranges and column ranges

* Small changes to docstrings and comments.

* Add `getBorders()` and `setBorders()` to get and
set cell borders (untested).

* Now with some tests for `getBorders` and `setBorders`

* A few more tests...

* Added `setUniformBorder()` and relevant tests.

* Added setFill and setUniformFill functions

* Added getAlignment and setAlignment functions
(not quite working yet!)

* SetAlignment function now working with tests

* Added a setOutsideBorder function (no tests yet).

* Added a setFormat function (no tests yet)

* Added a few more tests.

* Add `setColumnWidth` function

* Add fix for issue #275

* Add `setRowHeight()` function.

* Removed stray file

* Add a few changes for consistency

* Add yet more tests!

* Add (unimplemented) function to remove rare illegal characters.

* Properly unimplement the illegal characters function!

* Adjust docstring formatting.

* Fix filename typo

* Add formatting functions to API Reference

* Bump Julia compat to 1.7

* Update ci.yml to take out Julia 1.6

* Try removing generic function definitions

* Changed `@Ref` to `@ref` in `cellformats.jl`

* Update cellformats.jl - correct last `@Ref`