File metadata parsers #555

Closed
mzur opened this issue Feb 23, 2023 · 8 comments · Fixed by #709
@mzur
Member

mzur commented Feb 23, 2023

We had a request asking whether BIIGLE could support more file metadata formats. These formats would require more than the simple mapping between column names that is done currently. Some formats may also be very specific and only required for a single BIIGLE instance. Here is how this could be achieved:

Implement a generic "metadata parser" interface that receives a string as input (i.e. the file content; if a file is uploaded, it is read automatically before the parser is called) and produces the internal BIIGLE metadata array as output. A parser may also throw an exception if there is anything wrong with the file. The first metadata parser classes could be the CsvParser and the IfdoV1Parser, which would basically do what the existing CSV and iFDO imports already do. I'm told that v2.0 of the iFDO standard will be finalized soon, so we could also add an IfdoV2Parser in the future (@tschoeni).
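To make the proposal more concrete, here is a minimal sketch of what such an interface and a CSV implementation might look like. All names (MetadataParser, MetadataParseException, the parse() method) and the shape of the returned array (one associative array per file, keyed by column name) are assumptions for illustration, not BIIGLE's actual API. The required "filename" column is likewise assumed.

```php
<?php

// Hypothetical exception type for malformed metadata files.
class MetadataParseException extends Exception
{
}

// Hypothetical interface: file content in, metadata array out.
interface MetadataParser
{
    /**
     * @param string $content Raw content of the metadata file.
     * @return array One associative array per file (assumed shape).
     * @throws MetadataParseException If the content is malformed.
     */
    public function parse(string $content): array;
}

class CsvParser implements MetadataParser
{
    public function parse(string $content): array
    {
        // Split on any line break and treat the first line as the header.
        $lines = preg_split('/\R/', trim($content));
        $header = str_getcsv(array_shift($lines), ',', '"', '\\');

        if (!in_array('filename', $header, true)) {
            throw new MetadataParseException('Missing "filename" column.');
        }

        $rows = [];
        foreach ($lines as $i => $line) {
            $values = str_getcsv($line, ',', '"', '\\');
            if (count($values) !== count($header)) {
                throw new MetadataParseException('Wrong column count in row '.($i + 1).'.');
            }
            // Map each value to its column name.
            $rows[] = array_combine($header, $values);
        }

        return $rows;
    }
}
```

Because every parser would share the same method, the calling code would not need to know which format it is dealing with.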

All metadata parsers are defined in a config file like this:

// Returned from the config file; the key identifies the format.
return [
    'csv' => [
        'name' => 'Metadata CSV',
        'parser' => CsvParser::class,
    ],
    'ifdov1' => [
        'name' => 'iFDOv1 YAML',
        'parser' => IfdoV1Parser::class,
    ],
    'ifdov2' => [
        'name' => 'iFDOv2 YAML',
        'parser' => IfdoV2Parser::class,
    ],
    // Example of a custom, instance-specific format.
    'myformat' => [
        'name' => 'My Format XLSX',
        'parser' => MyFormatParser::class,
    ],
];

All parsers are offered in the "import" dropdown in the "create new volume" form of BIIGLE.
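As a rough illustration of how the dropdown could be filled from that config, the sketch below maps each format key to its display name. The function name importOptions is hypothetical, and the class names are written as plain strings so the example runs on its own.

```php
<?php

// Hypothetical parser registry, shaped like the proposed config array.
$parsers = [
    'csv' => ['name' => 'Metadata CSV', 'parser' => 'CsvParser'],
    'ifdov1' => ['name' => 'iFDOv1 YAML', 'parser' => 'IfdoV1Parser'],
];

// Build the dropdown options: format key => display name.
// array_map preserves the string keys when given a single array.
function importOptions(array $parsers): array
{
    return array_map(fn (array $entry): string => $entry['name'], $parsers);
}

$options = importOptions($parsers);
```

The form would then render one option per entry and submit the selected key along with the file.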

If there is a custom format that should be supported in a single BIIGLE instance only, the config array above could be extended (see myformat) and a new parser class file injected at build-time.
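A sketch of how a custom parser could be looked up and instantiated from such a config follows. Everything here is hypothetical: the interface, the resolveParser helper, and the semicolon-separated line format that MyFormatParser reads are made up for the example.

```php
<?php

// Hypothetical interface shared by all parsers.
interface MetadataParser
{
    public function parse(string $content): array;
}

// Stand-in for an instance-specific parser; the "filename;area" format is invented.
class MyFormatParser implements MetadataParser
{
    public function parse(string $content): array
    {
        $rows = [];
        foreach (preg_split('/\R/', trim($content)) as $line) {
            // Pad so that a line without a separator still yields two values.
            [$filename, $area] = array_pad(explode(';', $line, 2), 2, null);
            $rows[] = ['filename' => $filename, 'area' => $area];
        }

        return $rows;
    }
}

// Look up the parser class for a format key and instantiate it.
function resolveParser(array $parsers, string $key): MetadataParser
{
    if (!isset($parsers[$key])) {
        throw new InvalidArgumentException("Unknown metadata format: {$key}");
    }

    $class = $parsers[$key]['parser'];
    if (!is_subclass_of($class, MetadataParser::class)) {
        throw new LogicException("{$class} is not a metadata parser.");
    }

    return new $class();
}

// Extending the registry is a matter of adding one entry.
$parsers = [
    'myformat' => ['name' => 'My Format', 'parser' => MyFormatParser::class],
];
$rows = resolveParser($parsers, 'myformat')->parse("a.jpg;2.5");
```

With this kind of lookup, the core code never references MyFormatParser directly, which is what would allow the class file to be added per instance.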

The parsers (except maybe the native CSV parser) could all work the way the iFDO import currently works. When an import file is selected in the new volume form, the file is uploaded to an API endpoint that returns the parsed information in the BIIGLE CSV format. This information is then added to the respective form field. Another special case is the support for an iFDO file upload (in addition to storing the metadata in the database). This is probably not required for other metadata imports.
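The endpoint flow described above could be sketched roughly as follows. This is not the existing iFDO endpoint: the function names, the response shape, and the 422 status for a failed parse are assumptions, and a JSON decoder stands in for a real metadata parser.

```php
<?php

// Hypothetical helper: serialize parsed rows as CSV text with a header line.
function rowsToCsv(array $rows): string
{
    if ($rows === []) {
        return '';
    }

    $handle = fopen('php://temp', 'r+');
    fputcsv($handle, array_keys($rows[0]), ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($handle, array_values($row), ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);

    return $csv;
}

// Hypothetical endpoint logic: $parse turns file content into rows or throws.
function handleImport(callable $parse, string $content): array
{
    try {
        return ['status' => 200, 'body' => rowsToCsv($parse($content))];
    } catch (Exception $e) {
        // Assumed convention: report a parse failure as a validation error.
        return ['status' => 422, 'body' => $e->getMessage()];
    }
}

// Stand-in parser for the example: expects a JSON list of rows.
$parse = function (string $content): array {
    $rows = json_decode($content, true);
    if (!is_array($rows)) {
        throw new RuntimeException('Invalid metadata file.');
    }

    return $rows;
};

$ok = handleImport($parse, '[{"filename":"a.jpg","lat":"52.5"}]');
$bad = handleImport($parse, 'not json');
```

The form would place the returned body into the metadata field on success and show the error message otherwise.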

@WaiiMCap

WaiiMCap commented Mar 9, 2023

@mzur Thank you for the explanation. Could you explain the process of injecting a class file at build time, please?

@mzur
Member Author

mzur commented Mar 9, 2023

The build happens in the build directory of the distribution configuration. You can add the class file to this directory and then "inject" it into the base Docker image (build.dockerfile) when it is built. Here is an example of how the filesystems config is copied to the image (to replace the default config). You can add any PHP file this way (as long as it is placed at the correct location).

@WaiiMCap

@mzur Could you please tell me where to locate the config file that defines the available parsers and their associated parser classes in BIIGLE?

@mzur mzur moved this to Medium Priority in BIIGLE Roadmap Mar 24, 2023
@mzur
Member Author

mzur commented Mar 24, 2023

This feature does not exist in BIIGLE yet. This issue describes an idea for how new and more flexible parsers could be implemented. Currently, BIIGLE can parse its CSV file format and iFDO files. You can find all the relevant information about these in the first post of this issue.

@mzur
Member Author

mzur commented Jun 14, 2023

@WaiiMCap are you working on this issue or is it free for someone else to pick up?

@WaiiMCap

> @WaiiMCap are you working on this issue or is it free for someone else to pick up?

I'm working on it.

@mzur
Member Author

mzur commented Sep 21, 2023

@WaiiMCap Any news on this? We'll start working on iFDOv2 support soon and it would integrate nicely with this.

@mzur
Member Author

mzur commented Dec 6, 2023

FYI, a general framework for new metadata parsers is now taking shape in #709.

New metadata parsers should extend the new abstract class MetadataParser. An example of adding a custom metadata parser (iFDOv2) as a PHP package will also be developed.

@mzur mzur linked a pull request Apr 19, 2024 that will close this issue
38 tasks
@mzur mzur assigned mzur and unassigned WaiiMCap Apr 23, 2024
@mzur mzur closed this as completed in #709 Sep 4, 2024