Investigating a report that a record was changed in Alma but has not changed in Books & Media

Determine the harvest that should have the record in it in the data pipeline

  • Look up the OAI-PMH record by identifier to see when the record was last published. It would then be harvested in the next run after that date/time

    • https://duke.alma.exlibrisgroup.com/view/oai/01DUKE_INST/request?verb=GetRecord&identifier=oai:alma.01DUKE_INST:INSERTMMSID&metadataPrefix=marc21

    • The record header has a <datestamp> value in UTC. In the example below, 2025-07-02T02:52:43Z is 10:52 PM Eastern on July 1, so you’d start searching harvest records with the 00:00 harvest on July 2.

      <header>
        <identifier>oai:alma.01DUKE_INST:99119755203608501</identifier>
        <datestamp>2025-07-02T02:52:43Z</datestamp>
        <setSpec>trln_discovery_spec</setSpec>
      </header>

  • You can also search in Alma > All Titles > Search by MMS ID

    • Other information > Click on entity ID next to “Publishing information for” physical or electronic inventory

    • Choose publishing profile for Summon and/or Primo and note when information was last published

  • Note that if the record was suppressed, you should see ‘d’ in position 05 of the leader (the sixth character, since leader positions are counted from 0) - that’s what the pipeline is looking for to delete the record from the Solr index
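
As a rough illustration of the checks above, the Python sketch below pulls the identifier and datestamp out of a saved GetRecord response, converts the UTC datestamp to local time, and tests leader position 05 for ‘d’. The function names (read_header, to_local, is_deleted) are invented for this example and are not part of the pipeline. It assumes the OAI-PMH response has already been saved as text.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone, tzinfo


def read_header(xml_text: str) -> dict:
    """Return the first <identifier> and <datestamp> found in the XML."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        # Strip any XML namespace so this works with or without one.
        name = element.tag.rsplit("}", 1)[-1]
        if name in ("identifier", "datestamp") and name not in fields:
            fields[name] = (element.text or "").strip()
    return fields


def to_local(datestamp: str, tz: tzinfo) -> datetime:
    """Convert an OAI-PMH UTC datestamp (ending in Z) to the given zone."""
    utc_time = datetime.strptime(datestamp, "%Y-%m-%dT%H:%M:%SZ")
    return utc_time.replace(tzinfo=timezone.utc).astimezone(tz)


def is_deleted(leader: str) -> bool:
    """True when leader position 05 (the sixth character) is 'd'."""
    return len(leader) > 5 and leader[5] == "d"


header = """<header>
  <identifier>oai:alma.01DUKE_INST:99119755203608501</identifier>
  <datestamp>2025-07-02T02:52:43Z</datestamp>
  <setSpec>trln_discovery_spec</setSpec>
</header>"""

fields = read_header(header)
# A fixed UTC-4 offset stands in for Eastern Daylight Time in this example.
local_time = to_local(fields["datestamp"], timezone(timedelta(hours=-4)))
print(fields["identifier"], local_time.isoformat())
```

For year-round use, zoneinfo.ZoneInfo("America/New_York") can be passed as tz instead of the fixed offset, so that daylight saving time is handled automatically.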

Determine the timing of the record change in Alma versus the last indexed date in Books & Media

  • Record change in Alma - last modified date on bib or holdings (or items?)

  • OAI-PMH should republish the record at the top of the next hour after the Alma record was changed

  • Then the harvest (the actual pipeline) would pick it up at the next scheduled run: midnight, 6 AM, 12 PM, or 6 PM

  • Enrichment happens at :35 past the hour

  • Records are sent to TRLN ingest at top of hour
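
The schedule above can be turned into a rough estimate of when a change should reach TRLN ingest. The sketch below is only an approximation under stated assumptions: all times are local wall-clock times, enrichment is assumed to run at :35 past the harvest hour, and records are assumed to be sent at the next top of the hour after enrichment. The function names are invented for this example.

```python
from datetime import datetime, timedelta

# Harvest runs at midnight, 6 AM, 12 PM, and 6 PM local time.
HARVEST_HOURS = (0, 6, 12, 18)


def next_top_of_hour(moment: datetime) -> datetime:
    """Return the first :00 strictly after the given time."""
    return moment.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def next_harvest(moment: datetime) -> datetime:
    """Return the first harvest slot at or after the given time.

    Assumption: a record published exactly on a slot is picked up by
    that run. In practice it might slip to the following one.
    """
    day_start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    for hour in HARVEST_HOURS:
        slot = day_start.replace(hour=hour)
        if slot >= moment:
            return slot
    return day_start + timedelta(days=1)  # midnight of the next day


def expected_timeline(changed_at: datetime) -> dict:
    """Estimate each pipeline stage for a record changed at changed_at."""
    published = next_top_of_hour(changed_at)
    harvested = next_harvest(published)
    enriched = harvested.replace(minute=35)  # assumed: :35 of the harvest hour
    sent = next_top_of_hour(enriched)        # assumed: next :00 after enrichment
    return {
        "published": published,
        "harvested": harvested,
        "enriched": enriched,
        "sent": sent,
    }


timeline = expected_timeline(datetime(2025, 7, 1, 21, 40))
for stage, moment in timeline.items():
    print(f"{stage:10s} {moment:%Y-%m-%d %H:%M}")
```

Indexing in Books & Media happens some time after the "sent" step, so these estimates are a lower bound on when the change can appear.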

So to analyze:

  • Last indexed date in Books & Media - append /raw to the end of the URL, look for index_date

  • General guideline - If OAI datestamp is same business day, ask staff member to check again next morning

  • If not same business day, something is likely wrong - keep investigating
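
The guideline above can be written down as a small decision helper. This is a sketch only: it treats "same business day" as "same calendar date", assumes all three times are in the same local time zone, and assumes you have already copied index_date from the /raw view by hand. The "indexed" outcome is an added assumption, covering the case where Books & Media was indexed after the OAI datestamp. The function name and return values are invented for this example.

```python
from datetime import datetime
from typing import Optional


def triage(oai_published: datetime,
           last_indexed: Optional[datetime],
           now: datetime) -> str:
    """Suggest a next step for a 'record has not changed' report."""
    if last_indexed is not None and last_indexed >= oai_published:
        # The record was indexed after it was published, so the pipeline
        # ran; the problem is likely in the record content, not the timing.
        return "indexed"
    if oai_published.date() == now.date():
        # Published today: ask the staff member to check again next morning.
        return "wait"
    # Published on an earlier day and still not indexed: keep investigating.
    return "investigate"


published = datetime(2025, 7, 1, 22, 52)
stale_index = datetime(2025, 6, 30, 1, 0)

print(triage(published, stale_index, datetime(2025, 7, 1, 23, 30)))
print(triage(published, stale_index, datetime(2025, 7, 3, 9, 0)))
```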

Investigate the harvest records

  • WinSCP > connect to Alma SFTP server > /share/alma/pipeline/soa-enrich

  • Open the folder where you think the record was harvested, based on the OAI-PMH datestamp. There should be folders for the midnight (00:00), 06:00 AM, 12:00 PM, and 06:00 PM harvests.

  • Within the folder, you’ll see a number of subfolders, each usually ending in a number. It’s almost always just 1, unless it was a particularly large harvest, in which case you might see sets of folders ending in 2 or 3. You want to investigate:

    • “harvestX” folder will have the OAI-PMH harvested file(s). Download and search these to verify you have the right harvest

    • “enrichX” folder will have the enriched output, still in MARCXML

    • “errorsX” folder will record errors that caused failure in enrichment

  • There are two types of errors that could be caught here - missing items, and missing Alma numbers
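
Once the harvest, enrich, and errors folders have been downloaded, searching them for the MMS ID can be scripted. The sketch below assumes the files sit in a local folder and simply looks for the MMS ID as a byte string in every file, without assuming any file naming or extension. The function name and the local path in the commented example are hypothetical.

```python
from pathlib import Path


def files_containing(folder: str, mms_id: str) -> list:
    """Return paths of all files under folder that mention the MMS ID."""
    needle = mms_id.encode("utf-8")
    hits = []
    for path in sorted(Path(folder).rglob("*")):
        # Each file is read whole, which is fine for a sketch but may be
        # slow or memory-hungry for very large harvest files.
        if path.is_file() and needle in path.read_bytes():
            hits.append(path)
    return hits


# Example with a hypothetical local download folder:
# for hit in files_containing("downloads/harvest1", "99119755203608501"):
#     print(hit)
```

Running the same search against the harvest, enrich, and errors folders in turn shows how far the record got: present in the harvest but absent from the enriched output suggests checking the errors folder next.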

Errors that can appear in the harvest

Missing items

  • Means that the record in the OAI-PMH file did not have any inventory (electronic or physical)

    • If the record does not have inventory yet in Alma, that’s why it didn’t publish to Books & Media - this is expected behavior.

    • If the record has Alma inventory, it may have been added after the harvest took place

Missing Alma numbers