import external segmentation #23

> Note: I am not sure if I understand this correctly. I have not used the app yet. Please correct me if my angle is all wrong!

In step 2 of your [getting started guide](https://glyphcollector.app), there is considerable manual effort.

Suppose that instead you have some glyph segmentation from an external annotation format and workflow, like OCR output in [hOCR](https://github.com/kba/hocr-spec) or [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML/) or [ALTO-XML](https://www.loc.gov/standards/alto/) format. (Really, any suitable segmentation-providing format would do.)
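To make the input side concrete, here is a rough sketch of reading glyph boxes from hOCR using only the Python standard library. It assumes character-level elements of class `ocrx_cinfo` whose `title` attribute carries an `x_bboxes` (or `bbox`) property, which not every OCR engine writes; the names `GlyphBoxParser` and `read_glyph_boxes` are made up for illustration and are not part of GlyphCollector.

```python
import re
from html.parser import HTMLParser

# Matches "bbox x0 y0 x1 y1" or "x_bboxes x0 y0 x1 y1" in an hOCR title attribute.
BBOX_RE = re.compile(r"\b(?:x_bboxes|bbox)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class GlyphBoxParser(HTMLParser):
    """Collects (text, (x0, y0, x1, y1)) for each element of class ocrx_cinfo.

    Assumption: glyph elements contain only text, no nested tags.
    """

    def __init__(self):
        super().__init__()
        self.glyphs = []
        self._box = None  # box of the glyph element currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        if "ocrx_cinfo" in classes and match:
            self._box = tuple(int(v) for v in match.groups())
            self._text = []

    def handle_data(self, data):
        if self._box is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._box is not None:
            self.glyphs.append(("".join(self._text).strip(), self._box))
            self._box = None


def read_glyph_boxes(hocr_text):
    """Return a list of (character, bounding box) pairs found in hOCR markup."""
    parser = GlyphBoxParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.glyphs


SAMPLE_HOCR = """
<span class='ocrx_word' title='bbox 10 20 70 60'>
  <span class='ocrx_cinfo' title='x_bboxes 10 20 38 60; x_conf 97.1'>a</span>
  <span class='ocrx_cinfo' title='x_bboxes 40 20 70 60; x_conf 95.3'>b</span>
</span>
"""

for char, box in read_glyph_boxes(SAMPLE_HOCR):
    print(char, box)
```

Each box is in (left, top, right, bottom) order, so it could be handed to an image library's crop call to cut out the sample. PAGE-XML and ALTO-XML would need their own small readers, but the result would be the same list of labelled boxes.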

Couldn't GlyphCollector then use these coordinates to crop sample images? A heuristic could keep only a subset of the samples, either those that are visually maximally distant from each other or those selected by their text (i.e. predefined), and then either ignore the rest or use them to bootstrap your search.
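As an illustration of the "visually maximally distant" heuristic, one option would be greedy farthest-point sampling over per-glyph feature vectors. This is only a sketch of that one technique, not a claim about how GlyphCollector works: the function name `farthest_point_subset` is hypothetical, and the feature vectors are assumed to come from somewhere else (for example, the pixel values of each crop after resizing to a common size).

```python
import math


def farthest_point_subset(features, k):
    """Greedily pick up to k indices whose feature vectors are far apart.

    features: list of equal-length numeric sequences, one per glyph sample.
    Returns indices into `features`, starting with index 0 as an arbitrary seed.
    """
    if not features or k <= 0:
        return []
    chosen = [0]
    # Distance from every sample to its nearest already-chosen sample.
    nearest = [math.dist(f, features[0]) for f in features]
    while len(chosen) < min(k, len(features)):
        candidate = max(range(len(features)), key=nearest.__getitem__)
        if nearest[candidate] == 0:
            break  # only exact duplicates of chosen samples remain
        chosen.append(candidate)
        nearest = [
            min(d, math.dist(f, features[candidate]))
            for d, f in zip(nearest, features)
        ]
    return chosen


# Toy 2-D features standing in for real glyph descriptors.
toy_features = [(0, 0), (1, 0), (10, 0), (0, 9)]
print(farthest_point_subset(toy_features, 3))
```

The samples that are not selected would then be the ones to ignore or to feed into the search as bootstrapping material.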

Also, would external high-quality binarization (i.e. bitonal reduction) help in any way?
