[Canonical Pages] - Add the Facsimile image height and width to importers #155

@piconti


For quality assessment of the bounding boxes in the canonical and rebuilt data, but also in general, it would be very helpful to add the facsimile width and height to the canonical pages.

Since a lot of data is already ingested and won't be reingested just for this, it will have to be patched (also based on the outputs of Simon's bboxqa), but this will be the subject of another issue.

However, these properties should be added to the canonical page, which means updating the importers accordingly.

Note, however, that depending on the importer and the information available during ingestion, the approach may need to be adapted for each one.
In particular, whether the image is available as a local copy or only through IIIF will affect this process, so small tests will be needed to find the best approach in each case. Querying the IIIF page manifest will probably create an unavoidable overhead, and I will also need to check whether the page size can be retrieved easily and correctly from the local image files when they are available.
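As a rough illustration of the IIIF side, here is a minimal sketch that reads the size from the IIIF Image API `info.json` document, which reports the full image `width` and `height`. Using `info.json` rather than a presentation manifest is an assumption on my part, and the function names are hypothetical, not taken from any existing importer.

```python
import json
from typing import Tuple
from urllib.request import urlopen


def info_json_url(image_base_uri: str) -> str:
    """Build the IIIF Image API info.json URL from an image's base URI."""
    return image_base_uri.rstrip("/") + "/info.json"


def size_from_info(info: dict) -> Tuple[int, int]:
    """Return (width, height) from a parsed IIIF image information document."""
    return int(info["width"]), int(info["height"])


def fetch_iiif_size(image_base_uri: str, timeout: float = 10.0) -> Tuple[int, int]:
    """Fetch info.json and return (width, height); costs one HTTP request per page."""
    with urlopen(info_json_url(image_base_uri), timeout=timeout) as response:
        return size_from_info(json.load(response))
```

The one request per page in `fetch_iiif_size` is where the overhead mentioned above would come from, so caching or batching may be worth testing.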

Action points for this issue are:

  • Experiments to check which approaches to fetching the image size, both from local files and from IIIF, are the most efficient and correct.
  • Updating each importer accordingly - in priority the ones for data that will be newly ingested.
    • BCUL importer (also needs the integration of the legacy ids with issue [Canonical] - Keep legacy IDs in the canonical data #146) - no local copies, IIIF
    • BL importer (which also needs finishing - and might need new importers) - local copies
    • KB importer (needs finalization) - local copies soon
    • BNF importer - no local copies, IIIF
    • BNF-EN importer - no local copies, IIIF
    • FedGaz importer - local copies
    • BNL importer - no local copies, IIIF
    • RERO importer - local copies
    • SWA importer - no local copies, IIIF (which needs to be checked)
    • TETML importer (NZZ, which will probably be shared in mets-alto format) - local copies
    • Olive importer - local copies, but several different coordinate issues
    • ONB-ANNO importer (also needs finalization) - no local copies, IIIF
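The experiments in the first action point could be organised along the lines of the sketch below, which times several size readers over the same pages and reports the pages on which they disagree. The name `compare_size_readers` and the reader callables are hypothetical; each reader is assumed to take a page id and return `(width, height)`.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

Size = Tuple[int, int]  # (width, height)


def compare_size_readers(
    page_ids: Iterable[str],
    readers: Dict[str, Callable[[str], Size]],
) -> Tuple[Dict[str, float], List[Tuple[str, Dict[str, Size]]]]:
    """Time each reader over page_ids and collect pages where readers disagree."""
    elapsed = {name: 0.0 for name in readers}
    mismatches = []
    for page_id in page_ids:
        sizes = {}
        for name, reader in readers.items():
            start = time.perf_counter()
            sizes[name] = reader(page_id)
            elapsed[name] += time.perf_counter() - start
        # More than one distinct size means at least two readers disagree.
        if len(set(sizes.values())) > 1:
            mismatches.append((page_id, sizes))
    return elapsed, mismatches
```

For a title with both local copies and IIIF access, passing one reader per source would give a first idea of the speed difference and of whether the two sources report the same dimensions.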

This will have to be done in parallel with the patching of the existing data.

Labels: enhancement, schema
