Support format 12 CMAP of TrueType font #3738
@@ -898,9 +928,10 @@ var CmapTable = (function(_super) {
i = 0 <= tableCount ? ++i : --i
) {
entry = new CmapEntry(data, this.offset);
if (Object.keys(entry.codeMap).length === 0) continue;
When a TTF contains a CMAP subtable in an unsupported format, an entry with an empty codeMap is created, so I skip it.
this.tables.push(entry);
if (entry.isUnicode) {
if (this.unicode == null) {
if (this.unicode == null || this.unicode.format < entry.format) {
Some code refers to this.tables[0], so when a TTF has both format 4 and format 12 subtables, the format 4 one is used.
To solve this, I sorted entries by format version.
Furthermore, I think many people want to use the newer CMAP format version.
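The two review comments above (skipping entries with an empty codeMap, and preferring the higher format number) can be illustrated with a minimal sketch. The function name `pickUnicodeTable` and the plain-object entries are hypothetical stand-ins for illustration, not the actual CmapTable or CmapEntry classes.

```javascript
// Hypothetical helper: choose the Unicode cmap entry with the highest format.
// Entries are simplified plain objects, not the real CmapEntry instances.
function pickUnicodeTable(entries) {
  let unicode = null;
  for (const entry of entries) {
    // An unsupported format yields an empty codeMap, so skip it.
    if (Object.keys(entry.codeMap).length === 0) continue;
    if (!entry.isUnicode) continue;
    // Prefer the entry with the larger format number (e.g. 12 over 4).
    if (unicode === null || unicode.format < entry.format) {
      unicode = entry;
    }
  }
  return unicode;
}

// Example data, made up for illustration.
const entries = [
  { format: 4, isUnicode: true, codeMap: { 0x41: 36 } },
  { format: 12, isUnicode: true, codeMap: { 0x41: 36, 0x1f600: 1200 } },
  { format: 14, isUnicode: true, codeMap: {} }, // unsupported, empty map
];

const chosen = pickUnicodeTable(entries);
console.log(chosen.format);
```

With this selection rule, a font that ships both format 4 and format 12 subtables resolves to format 12, which is the only one of the two that can map code points above 0xffff.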
if (codePoint > 0xffff) {
i++;
}
Skip the latter half of the surrogate pair.
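As a rough illustration of why the extra i++ is needed: a JavaScript string is indexed by UTF-16 code unit, so a code point above 0xffff occupies two indices. The sketch below assumes the loop reads code points with String.prototype.codePointAt; the function name `codePointsOf` is hypothetical and this is not the code from utf8.js.

```javascript
// Hypothetical helper: collect the code points of a string, stepping over
// the low (second) surrogate whenever a supplementary character is found.
function codePointsOf(text) {
  const result = [];
  for (let i = 0; i < text.length; i++) {
    const codePoint = text.codePointAt(i);
    result.push(codePoint);
    if (codePoint > 0xffff) {
      i++; // the next code unit is the low surrogate of the same character
    }
  }
  return result;
}

// "A", U+1F600, "B": four code units, three code points.
const points = codePointsOf("A\u{1F600}B");
console.log(points.map((p) => p.toString(16)));
```

Without the skip, the low surrogate would be read a second time as a standalone code point and mapped to a missing glyph.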
unicode = map[code];
if (unicode > 0xffff) {
unicode -= 0x10000;
unicode =
((unicode >> 10) + 0xd800).toString(16).padStart(4, "0") +
((unicode % 0x400) + 0xdc00).toString(16).padStart(4, "0");
} else {
unicode = unicode.toString(16).padStart(4, "0");
}
code = (+code).toString(16).padStart(4, "0");
This is an implementation for outputting surrogate pairs.
I consulted the PDF 1.3 specification because the README says to follow it.
Section 3.8.1 says:
"The remainder of the string consists of Unicode character codes, according to the UTF-16 encoding specified in the Unicode standard, version 2.0."
So I output the code as UTF-16 surrogate pairs.
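The arithmetic in the hunk above can be checked in isolation. The following sketch wraps the same computation in a standalone function; the name `toUtf16Hex` is hypothetical and does not appear in the patch.

```javascript
// Hypothetical helper mirroring the arithmetic in the diff: encode a Unicode
// code point as big-endian UTF-16 hex, using a surrogate pair above 0xffff.
function toUtf16Hex(codePoint) {
  if (codePoint > 0xffff) {
    const offset = codePoint - 0x10000;
    const high = (offset >> 10) + 0xd800; // top 10 bits
    const low = (offset % 0x400) + 0xdc00; // bottom 10 bits
    return (
      high.toString(16).padStart(4, "0") + low.toString(16).padStart(4, "0")
    );
  }
  return codePoint.toString(16).padStart(4, "0");
}

console.log(toUtf16Hex(0x41)); // "0041"
console.log(toUtf16Hex(0x1f600)); // "d83dde00"
```

For U+1F600 the offset is 0xf600, which splits into a high part of 0x3d and a low part of 0x200, giving the pair d83d de00.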
Fixes #3737
This change adds support for format 12 CMAP subtables in TrueType fonts and adds tests.
CmapEntry supports format 12. Format 12 is simpler than format 4, so it's not a big change.
utf8.js supports surrogate pairs.
Before

After
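For readers unfamiliar with format 12, the description's point that it is simpler than format 4 can be seen in a minimal parsing sketch. This follows the layout given in the OpenType cmap specification (a 16-byte header followed by sequential map groups of three uint32 values each). It uses a standard DataView rather than the project's own data reader, and the function name `parseCmapFormat12` is hypothetical, so treat it as an illustration and not as the code in this patch.

```javascript
// Hypothetical parser for a format 12 cmap subtable held in a DataView.
// Layout: uint16 format, uint16 reserved, uint32 length, uint32 language,
// uint32 numGroups, then numGroups x (startCharCode, endCharCode, startGlyphID).
function parseCmapFormat12(view, offset = 0) {
  const format = view.getUint16(offset); // DataView reads big-endian by default
  if (format !== 12) {
    throw new Error("not a format 12 subtable: " + format);
  }
  const numGroups = view.getUint32(offset + 12);
  const codeMap = {};
  let pos = offset + 16;
  for (let g = 0; g < numGroups; g++) {
    const startCharCode = view.getUint32(pos);
    const endCharCode = view.getUint32(pos + 4);
    const startGlyphId = view.getUint32(pos + 8);
    // Each group maps a contiguous run of code points to consecutive glyphs.
    for (let code = startCharCode; code <= endCharCode; code++) {
      codeMap[code] = startGlyphId + (code - startCharCode);
    }
    pos += 12;
  }
  return codeMap;
}

// Build a tiny two-group subtable by hand (values made up for illustration).
const view = new DataView(new ArrayBuffer(40));
view.setUint16(0, 12); // format
view.setUint32(4, 40); // length in bytes
view.setUint32(12, 2); // numGroups
[[0x41, 0x43, 10], [0x1f600, 0x1f601, 100]].forEach((group, g) => {
  group.forEach((value, k) => view.setUint32(16 + g * 12 + k * 4, value));
});

const codeMap = parseCmapFormat12(view);
console.log(codeMap[0x43], codeMap[0x1f601]);
```

Unlike format 4, there are no segment deltas or glyph index arrays to resolve, and code points above 0xffff are stored directly as 32-bit values.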
