Citation metadata: documentation issue #2723
Comments
Some additional concerns: the variable list at https://quarto.org/docs/reference/metadata/citation.html contains some items that are definitely not CSL variables (as per https://docs.citationstyles.org/en/stable/specification.html), including
I do recognize these might serve some useful purpose, but I don't think it's a good idea to list them among valid CSL variables without any clarification, since none of them will be processed by any of the current citeprocs nor, even if these did process variables other than the official ones at all (which they don’t), by any of the existing CSL styles in the official repository (https://github.com/citation-style-language/styles). If the quarto developers feel these variables could usefully supplement the set of existing CSL variables, it would seem best to start discussing this with the CSL folks.
I am also wondering how exactly the BibLaTeX data are assembled from the
Another hint that pandoc does not seem to be involved when assembling BibLaTeX data either is the fact that a perfectly valid (pandoc) CSL YAML construct inside a
is ignored, with an empty
Clarification of these issues would be desirable, although I assume already that if pandoc is indeed not involved in any of the above tasks, things could be greatly improved if it were.
Thank you, you are correct! I've updated the example. Further answers below...
For now, I've corrected the label. Per the below suggestion, I agree it would be great to offer more than one format, which should address the whole issue, hopefully.
This is a great suggestion - I'll open another issue to track providing a few formats specifically.
Correct! A bug fix is on the way.
We currently use Pandoc to do the rendering from CSL to BibLaTeX; this is the Pandoc behavior.
As the page notes, this data is based upon CSL, but allows the specification of additional fields, typically because they are used by Google Scholar metadata. I think that this is ok - I don't think we should necessarily aspire for the citation field to be strict CSL since we may have needs that go beyond the goals of CSL. Perhaps this is worth clarifying in the headings; the current wording may not be strong enough...
We are using Pandoc to render the BibLaTeX. When rendering this front matter:

---
title: CSL Example
author: Charles Teague
date: last-modified
description: |
  This provides an example of CSL formatting being used to control the output of references for a document. Note that the CSL provides the style for both the generated bibliography for the page, but also for the citation information for the page itself, included in the appendix.
bibliography: example.bib
citation:
  title: "Hello World"
  type: article-journal
  container-title: "Journal of Data Science Software"
  doi: "10.23915/reprodocs.00010"
  url: https://example.com/summarizing-output
  issued: 2022-10-01
---

We generate the following CSL JSON:

{
title: "Hello World",
type: "article-journal",
author: [ { family: "Teague", given: "Charles", literal: "Charles Teague" } ],
language: "en",
"available-date": { "date-parts": [ [ 2022, 10, 6 ] ], literal: "2022-10-06", raw: "2022-10-06" },
issued: { "date-parts": [ [ 2022, 10, 1 ] ], literal: "2022-10-01", raw: "2022-10-01" },
"container-title": "Journal of Data Science Software",
id: "teague2022",
URL: "https://example.com/summarizing-output",
DOI: "10.23915/reprodocs.00010"
}

produces:

@article{teague2022,
author = {Charles Teague},
title = {Hello {World}},
journal = {Journal of Data Science Software},
date = {2022-10-01},
url = {https://example.com/summarizing-output},
doi = {10.23915/reprodocs.00010},
langid = {en}
}

when rendered using:
This looks to be an issue with our CSL parsing, which isn't parsing the author name as anything more than a string right now. I will open an issue to track resolving this. |
I wonder if the fact that 'literal' is present in the CSL name for the author is resulting in Pandoc preferring the literal string rather than the structured name. We currently always bundle literal into the name, thinking that it is useful to provide the original literal input, but perhaps this isn't working out in this case...
Well, as shown above, the CSL JSON snippet could not be processed with my pandoc installation (2.19.2, macOS). I had to change it to:
… which results in the same biblatex code as shown above then. (BTW, the The biblatex snippet, now called
… renders:
The “front matter” shown above (minus the
I was failing to see how pandoc could have produced the latter … … but now I realize quarto is processing the json version, where I’d agree the |
… corresponding to Zotero’s one-field author mode, as opposed to its two-field mode for personal names comprised of “first” and “last”. |
I am not sure to what extent it is possible to parse the unstructured authors’ names that typically appear in pandoc’s YAML metadata block top-level Ideally, these would have to be parsed into either an explicit literal/organizational/corporate name:
or a structured representation of a personal name:
… or, as I reckon this kind of parsing will be next to impossible to do properly if there is no one-field/two-field distinction in the input data to begin with (and even if there is, as in Zotero, certain conventions and hacks need to be relied upon; more on this maybe at a later point …), it seems more likely users will have to be encouraged to enter structured representations of a text’s authors’ names into the |
As to dates, I think including all of If a date can be parsed into year, year-month, or year-month-day (or year-season, all with or without a circa flag), or an interval consisting of two such dates, either The
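To make the date point concrete, here is a minimal Python sketch of turning an ISO 8601 style string into a CSL JSON date. The helper name `to_csl_date` is hypothetical and is not part of quarto or pandoc; the sketch assumes only the three plain forms mentioned in this thread (year, year-month, year-month-day) and falls back to a `literal` for anything else. It ignores seasons, circa flags, and intervals, and does not range-check months or days.

```python
import re

# Matches "YYYY", "YYYY-MM", or "YYYY-MM-DD" (no range checks on month/day).
_ISO_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def to_csl_date(value: str) -> dict:
    """Return a CSL JSON date object for a date string.

    Hypothetical helper: emits only "date-parts" when the string can be
    parsed, and only "literal" when it cannot, never both.
    """
    match = _ISO_DATE.match(value.strip())
    if match is None:
        # Unparseable input (e.g. "3/2022") is kept verbatim as a literal.
        return {"literal": value}
    parts = [int(group) for group in match.groups() if group is not None]
    return {"date-parts": [parts]}


print(to_csl_date("2022-10-01"))  # {'date-parts': [[2022, 10, 1]]}
print(to_csl_date("2022-03"))     # {'date-parts': [[2022, 3]]}
print(to_csl_date("3/2022"))      # {'literal': '3/2022'}
```

Emitting exactly one representation per date keeps the output unambiguous for whichever processor consumes it.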
For the document author (not in the citation key), we do currently parse the name into a structure as you suggest (we use pandoc for this), but as you note it can't be 100% accurate, so we allow the user to also provide that structure if they'd like (meaning they can correct us in those cases where we misparse the field). It would be nice for us to improve the parsing of the citation author / editors (really any CSL name field) as well, perhaps using this same approach. We should also drop the literal from our CSL parsing. For dates, I will take a closer look as well.
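As a rough illustration of what dropping the literal could look like, the Python sketch below post-processes a CSL name object like the one in the generated CSL JSON shown earlier in this thread. The function name `drop_redundant_literal` is hypothetical and this is not quarto's actual implementation; it only assumes that a name carrying `family` or `given` should not also carry `literal`.

```python
def drop_redundant_literal(name: dict) -> dict:
    """Return a copy of a CSL name without 'literal' when structured parts exist.

    Hypothetical helper: a name with only 'literal' (e.g. an organization)
    is returned unchanged.
    """
    if name.get("family") or name.get("given"):
        return {key: value for key, value in name.items() if key != "literal"}
    return dict(name)


author = {"family": "Teague", "given": "Charles", "literal": "Charles Teague"}
print(drop_redundant_literal(author))
# {'family': 'Teague', 'given': 'Charles'}

organization = {"literal": "World Health Organization"}
print(drop_redundant_literal(organization))
# {'literal': 'World Health Organization'}
```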
I just realized that quarto does support structured representation of names – https://quarto.org/docs/journals/authors.html#author-schema –, and I would suggest encouraging users to enter names in this scheme only: Even a name as simple as “Jane Doe” could be a personal name, in which case parsing into separate elements, and rendering as “Doe, Jane” in certain contexts is appropriate, or a corporate name, where it is not. I would restrict the use of
Bugs:
Also, when adding a
Authoritative specs of CSL JSON that describe CSL name formats in detail can be found at https://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html#name-fields and https://github.com/Juris-M/citeproc-js/blob/master/attic/citeproc-doc.rst#names.
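One conservative way to act on the "Jane Doe" ambiguity is sketched below in Python: structured input is passed through as a structured CSL name, while a plain string is kept as a `literal` instead of being split by guesswork. The function `to_csl_name` and the key list are illustrative assumptions, not quarto's author schema or its actual parsing logic; the key names follow the CSL JSON name fields described in the specs linked above.

```python
# Structured parts of a CSL JSON name (assumed subset, for illustration only).
CSL_NAME_KEYS = ("family", "given", "dropping-particle",
                 "non-dropping-particle", "suffix")


def to_csl_name(entry):
    """Build a CSL name from a plain string or a mapping of name parts.

    Hypothetical helper: a plain string is never split, because
    "Jane Doe" could be either a personal or a corporate name.
    """
    if isinstance(entry, str):
        return {"literal": entry}
    name = {key: entry[key] for key in CSL_NAME_KEYS if entry.get(key)}
    if name:
        return name
    if entry.get("literal"):
        return {"literal": entry["literal"]}
    raise ValueError("name entry has neither structured parts nor a literal")


print(to_csl_name("Jane Doe"))
# {'literal': 'Jane Doe'}
print(to_csl_name({"given": "Jane", "family": "Doe"}))
# {'family': 'Doe', 'given': 'Jane'}
```

The trade-off is that unstructured personal names will not be inverted to “Doe, Jane” in sorted bibliographies, which is the price of never misparsing a corporate name.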
Improves author parsing per #2723
No error thrown when providing structured author info. You may now specify structured CSL name, or Quarto will attempt to infer it (not all that well). |
Bug description
https://quarto.org/docs/reference/metadata/citation.html displays this example:
Issues:
- `date` is not a valid CSL variable; use `issued` instead.
- `3/2022` is not a valid CSL YAML date format; only ISO-8601 dates are accepted, e.g., `2022`, `2022-03`, `2022-03-29`, etc.
- `date: 3/2022` above should be replaced by `issued: 2022-03`.

Further suggestions:

- `BibTeX citation:` -- which, strictly speaking, is not true: neither `url`, nor `doi`, nor `langid` exist in BibTeX -- while the BibTeX variant natbib has `url` and `doi`, many others that appear in quarto output, including `langid`, `eventdate`, `urldate`, `annote`, are exclusive to BibLaTeX.
- `editor = {},` should be filtered out to reduce clutter.
- `title` fields should be converted to the corresponding (La)TeX commands. `title: Foo **bar**` should be converted to `title = {Foo \textbf{bar}},`
- `citation:` data from CSL to BibLaTeX, see https://pandoc.org/MANUAL.html#specifying-bibliographic-data.

Checklist