-
Hey Ian, let me know if this is what you're looking for: Characters:
Accented characters:
(The ç is only used in Catalan, but I think it's not too much of a hassle to keep it as an AltGr version of the c.)
Spanish punctuation symbols:
In Spanish, opening and closing quotation marks (") are different, but here on GitHub they look the same. Most word processors convert them on the fly. The same goes for single quotation marks: here they look like two apostrophes, so I haven't listed them.
Other symbols:
There might also be other characters and symbols that are not on the qwerty-es keyboard but that we could consider including for their usefulness. @NickG13, do you think any characters or symbols are missing?
-
"In Spanish, opening and closing quotation marks (") are different, but here on GitHub they look the same. Most word processors convert them on the fly. The same goes for single quotation marks: here they look like two apostrophes, so I haven't listed them."
So you have the ASCII quotes ' and " (U+0027, U+0022) as well as the typographic quotes ‘ ’ “ ” (U+2018, U+2019, U+201C, U+201D)?
Are ¨ and ´ dead characters, used only for diacritics?
Regarding moving punctuation around, Shai wrote about that:
The gender(?) indicators º and ª seem to be seldom used?
Thanks, Ian
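For reference, here is a minimal sketch that prints the code point and official Unicode name of each character mentioned above, using Python's standard `unicodedata` module. The helper name `describe` is only for illustration and is not something from this thread.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str]:
    """Return (code point in U+XXXX form, official Unicode name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))

# ASCII quotes, typographic quotes, the two spacing diacritics, and the two indicators.
for ch in "'\"‘’“”¨´ºª":
    code_point, name = describe(ch)
    print(code_point, name)
```

In the Unicode names, º and ª are the masculine and feminine ordinal indicators, and ¨ and ´ are the spacing diaeresis and acute accent.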
-
Okay, I have a 3.4GB file, but it contains a ton of Chinese characters. Here is one place where they show up:
So I'm guessing there is a bug in UniLeipzig's extraction process, since
I think the simplest 'cleanup' procedure will be to discard any sentence containing characters that are NOT in the Spanish keyboard list.
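A rough sketch of that cleanup, assuming one sentence per string. The allow-list below is only a placeholder guess (ASCII plus a few Spanish extras), since the actual Spanish keyboard list has not been posted here, and the names `ALLOWED`, `is_clean` and `filter_sentences` are made up for illustration.

```python
import string
import unicodedata

# Placeholder allow-list (an assumption, not the agreed character list):
# printable ASCII plus some Spanish letters and punctuation.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " ")
ALLOWED |= set("áéíóúüñçÁÉÍÓÚÜÑÇ¿¡ºª«»")

def is_clean(sentence: str) -> bool:
    """Return True if every character is in the allow-list.

    The sentence is normalized to NFC first so that a letter stored as
    base + combining accent is compared in its precomposed form.
    """
    normalized = unicodedata.normalize("NFC", sentence)
    return all(ch in ALLOWED for ch in normalized)

def filter_sentences(sentences):
    """Yield only the sentences made up entirely of allowed characters."""
    for sentence in sentences:
        if is_clean(sentence):
            yield sentence

sample = [
    "¿Dónde está la biblioteca?",
    "Esto contiene 中文 caracteres.",
    "El niño comió pingüinos.",
]
kept = list(filter_sentences(sample))
print(kept)
```

Because `filter_sentences` is a generator, it could presumably be fed the lines of the 3.4GB file one at a time instead of loading everything into memory.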
-
Hi
Can someone post the desired character list?
Thanks, Ian