Investigating a report that a record was changed in Alma but has not changed in Books & Media
Determine the harvest that should have the record in it in the data pipeline
Look up the record in OAI-PMH by identifier to see when it was last published. It would then be harvested in the next run after that date/time.
https://duke.alma.exlibrisgroup.com/view/oai/01DUKE_INST/request?verb=GetRecord&identifier=oai:alma.01DUKE_INST:INSERTMMSID&metadataPrefix=marc21
Record header has a <datestamp> value, which is in UTC (note the trailing Z) - e.g., the datestamp below is 10:52 PM Eastern on July 1, so you’d start searching harvest records at the 00:00 harvest on July 2.
<header> <identifier>oai:alma.01DUKE_INST:99119755203608501</identifier> <datestamp>2025-07-02T02:52:43Z</datestamp> <setSpec>trln_discovery_spec</setSpec> </header>
Can also search by Alma > All Titles > Search by MMS ID
Other information > Click on entity ID next to “Publishing information for” physical or electronic inventory
Choose publishing profile for Summon and/or Primo and note when information was last published
Note that if the record was suppressed, you should see ‘d’ in leader position 05 (the sixth character, since leader positions count from 0) - that’s what the pipeline is looking for to delete the record from the Solr index
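The OAI-PMH check above can be sketched in Python. This is a minimal sketch, not the pipeline's own code: the function names and the SAMPLE response are made up for illustration (the header values come from the example above, the leader is invented), and it assumes the MARC record in the response uses the standard MARC21 slim namespace and that local time is US Eastern.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import urlencode
from zoneinfo import ZoneInfo

OAI_BASE = "https://duke.alma.exlibrisgroup.com/view/oai/01DUKE_INST/request"
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "marc": "http://www.loc.gov/MARC21/slim",  # assumption: standard MARCXML namespace
}

# Hypothetical response body; only the header values come from the notes above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record>
    <header>
      <identifier>oai:alma.01DUKE_INST:99119755203608501</identifier>
      <datestamp>2025-07-02T02:52:43Z</datestamp>
      <setSpec>trln_discovery_spec</setSpec>
    </header>
    <metadata>
      <record xmlns="http://www.loc.gov/MARC21/slim">
        <leader>01234dam a2200301 a 4500</leader>
      </record>
    </metadata>
  </record></GetRecord>
</OAI-PMH>"""


def get_record_url(mms_id):
    """Build the GetRecord URL for one MMS ID."""
    params = {
        "verb": "GetRecord",
        "identifier": f"oai:alma.01DUKE_INST:{mms_id}",
        "metadataPrefix": "marc21",
    }
    return f"{OAI_BASE}?{urlencode(params, safe=':')}"


def summarize(xml_text):
    """Return (publish time in UTC, leader position 05) from a GetRecord response."""
    root = ET.fromstring(xml_text)
    stamp = root.find(".//oai:header/oai:datestamp", NS).text
    published = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    leader = root.find(".//marc:leader", NS)
    # Leader position 05 is the record status; 'd' marks a delete.
    status = leader.text[5] if leader is not None and leader.text else None
    return published, status


def to_local(published_utc, tz_name="America/New_York"):
    """Convert the UTC datestamp to local wall-clock time (Eastern is an assumption)."""
    return published_utc.astimezone(ZoneInfo(tz_name))
```

Calling summarize(SAMPLE) gives the UTC publish time and the status character; passing the publish time to to_local gives the local time to compare against the harvest schedule.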
Determine the timing of the record change in Alma versus the last indexed date in Books & Media
Record change in Alma - last modified date on bib or holdings (or items?)
OAI-PMH should republish the record at the top of the next hour after the Alma record was changed
Then the harvest (the actual pipeline) would pick it up at its next run: midnight, 6 AM, noon, or 6 PM
Enrichment happens at :35 past the hour
Records are sent to TRLN ingest at top of hour
So to analyze:
Last indexed date in Books & Media - append /raw to the end of the URL, look for index_date
General guideline - If OAI datestamp is same business day, ask staff member to check again next morning
If not same business day, something is likely wrong - keep investigating
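The timing check can be sketched as below. This is a rough sketch under assumptions: the harvest hours (0, 6, 12, 18) are taken to be local time, the next harvest is taken to be the first slot strictly after the OAI datestamp, and "same business day" is simplified to "same calendar date". The function names are made up for illustration.

```python
from datetime import datetime, timedelta

HARVEST_HOURS = (0, 6, 12, 18)  # midnight, 6 AM, noon, 6 PM (assumed local time)


def next_harvest(published_local):
    """Return the first harvest slot strictly after the local publish time."""
    day_start = published_local.replace(hour=0, minute=0, second=0, microsecond=0)
    for day_offset in (0, 1):
        for hour in HARVEST_HOURS:
            slot = day_start + timedelta(days=day_offset, hours=hour)
            if slot > published_local:
                return slot


def same_business_day(published_local, now_local):
    """Simplification: treat 'same business day' as the same calendar date."""
    return published_local.date() == now_local.date()


# The example datestamp, already converted to local time (10:52 PM on July 1).
published = datetime(2025, 7, 1, 22, 52, 43)
print(next_harvest(published))  # 2025-07-02 00:00:00
```

If same_business_day is true, the guideline is to wait and recheck next morning; otherwise open the harvest folder for the slot that next_harvest returns.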
Investigate the harvest records
WinSCP > connect to Alma SFTP server > /share/alma/pipeline/soa-enrich
Open the folder where you think the record was harvested, based on the OAI-PMH datestamp. There should be one folder for each harvest run: 00:00 (midnight), 06:00 AM, 12:00 PM, and 06:00 PM
Within the folder, you’ll see a number of subfolders, usually with a number at the end of the name. It’s almost always just 1, unless it was a particularly large harvest, in which case you might see a set of folders ending in 2 or 3. You want to investigate:
“harvestX” folder will have the OAI-PMH harvested file(s). Download and search these to verify you have the right harvest
“enrichX” folder will have the enriched output, still in MARCXML
“errorsX” folder will record errors that caused failure in enrichment.
There are two types of errors that could be caught here - missing items and missing Alma numbers
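Once the harvestX, enrichX and errorsX folders are downloaded, a short script can show which files mention the MMS ID. This is a minimal sketch: the function name is made up for illustration, and it assumes the downloaded files are uncompressed text such as MARCXML (compressed files would need to be unpacked first).

```python
from pathlib import Path


def files_containing(folder, mms_id):
    """Return every file under folder whose raw bytes include the MMS ID."""
    needle = mms_id.encode("ascii")
    hits = []
    for path in sorted(Path(folder).rglob("*")):
        # Plain substring match; assumes files are uncompressed.
        if path.is_file() and needle in path.read_bytes():
            hits.append(path)
    return hits
```

For example, files_containing("downloads", "99119755203608501") would list the matching files under a local downloads folder (a hypothetical path). A hit under harvestX but not under enrichX suggests looking in errorsX next.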
Errors that can appear in the harvest
Missing items
Means that the record in the OAI-PMH file did not have any inventory (electronic or physical)
If item does not have inventory yet in Alma, that’s why it didn’t publish to Books & Media - this is expected behavior.
If item has Alma inventory, it may have been added after the harvest took place
Missing Alma numbers
Means that the AWS physical or electronic table does not have an entity ID stored for the associated MMS ID
Can be verified using the page “How to see what is in AWS tables for Discovery Pipeline”