[Canonical-Olive] - Multipage articles only show first page #167

@piconti

As highlighted by Simon on this issue, some content items (CIs) that span multiple pages only show up on the first page.

An example provided for this is the following

A quick check of the data, namely EXP-1829-03-26-a-pages.jsonl, confirmed that the "pOf" elements only exist on the first page on which the CI appears; the following pages only have region boxes, without any "pOf" element.
This is not a new problem, since the data in question was generated in 2019, but it should be fixed, and all subsequent stages should then be recomputed.

The action points related to this issue are the following:

  • check the extent to which this problem spreads (e.g. dates and aliases)
  • see if the canonical code can be "easily" fixed, or if this should rather be patched
  • if patching is a better approach, correct the data and replace it in the corresponding canonical bucket
  • recompute the manifest accordingly
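
For the first action point, a minimal sketch of how the extent could be measured, assuming that each line of a `*-pages.jsonl` file is one page record, that its regions are listed under an `"r"` key, and that the alias and year can be read from the first two parts of a file name such as `EXP-1829-03-26-a-pages.jsonl`. The helpers `regions_missing_pof` and `summarize` are hypothetical names, not part of the canonical code.

```python
import json
from collections import Counter
from pathlib import Path


def regions_missing_pof(page: dict) -> int:
    """Count the regions of one page record that carry no "pOf" key."""
    # Assumption: regions are stored as a list under the "r" key.
    return sum(1 for region in page.get("r", []) if "pOf" not in region)


def summarize(paths) -> Counter:
    """Return a Counter mapping (alias, year) to the number of regions without "pOf"."""
    counts = Counter()
    for path in map(Path, paths):
        # Assumption: file names look like EXP-1829-03-26-a-pages.jsonl,
        # so the first two dash-separated parts are the alias and the year.
        alias, year = path.name.split("-")[:2]
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():  # skip blank lines
                    page = json.loads(line)
                    counts[(alias, year)] += regions_missing_pof(page)
    return counts
```

Calling `summarize` on a list of local copies of the pages files would give a per-alias, per-year count of affected regions, which may help decide between fixing the canonical code and patching the data.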

Metadata

Labels

bug (Something isn't working), data (issues that are related to the data), requires re-ingestion
