Description
It would be nice to also be able to download the word picture results in a tabular format (CSV, TSV and/or XLSX), which might be more accessible to non-technical users than JSON. This would also be consistent with the download options for the KWIC, statistics and trend diagram results.
What would be a good format for the downloadable table? What about having the following columns:
- dependency relation
- head lemgram or word form
- head part of speech
- dependent lemgram or word form
- dependent part of speech
- LMI value
- absolute frequency
- source corpora
Another option might be to have searched and related lemgrams or word forms instead of head and dependent ones. In that case, the searched lemgram or word form would be the same for all rows in a single table, but I think it might be better to have a uniform format for all rows than to have a separate first row containing only the searched lemgram or word form. However, having head and dependent would correspond more closely to the JSON format.
Each subtable in the word picture would correspond to a group of rows with the same dependency relation. The downloadable table could be sorted by the dependency relation and LMI value, so the order would be the same as in the word picture.
Source corpora could be a space- or comma-separated list of either corpus ids or corpus titles; which would be more useful? The source sentence id might not be very useful for a typical user.
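To make the proposal more concrete, here is a rough Python sketch of the flattening and sorting described above. It is only an illustration: the input keys (`rel`, `head`, `headpos`, `dep`, `deppos`, `lmi`, `freq`, `corpora`) are hypothetical and may not match the actual word picture JSON, the LMI sort is assumed to be descending, and relations are simply sorted alphabetically, which may differ from the subtable order in the word picture.

```python
import csv
import io

# Column order of the proposed table. The key names are hypothetical
# placeholders, not the actual keys of the word picture JSON.
COLUMNS = ["rel", "head", "headpos", "dep", "deppos", "lmi", "freq", "corpora"]


def relations_to_rows(relations):
    """Flatten a list of relation dicts into uniform table rows."""
    rows = []
    for r in relations:
        rows.append({
            "rel": r["rel"],
            "head": r["head"],
            "headpos": r["headpos"],
            "dep": r["dep"],
            "deppos": r["deppos"],
            "lmi": r["lmi"],
            "freq": r["freq"],
            # Space-separated list of distinct corpus ids (assumed format).
            "corpora": " ".join(sorted(set(r.get("corpora", [])))),
        })
    # Group rows by dependency relation, then highest LMI first
    # (descending order is an assumption about the word picture).
    rows.sort(key=lambda row: (row["rel"], -row["lmi"]))
    return rows


def rows_to_tsv(rows):
    """Serialize rows as TSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

With head and dependent columns, every row has the same shape regardless of which side the searched lemgram or word form is on, so no special first row is needed. Switching the delimiter to `","` would give CSV instead of TSV.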