
October 14, 2024

Some very cool updates to report today.

First, last week we confirmed and processed a list of ~1.3 million records to be removed from the common data stream for the TRLN catalogs. These were largely duplicate e-resource records left over from our transition away from 360KB, but the list also included some DKU records. Now that these records are gone, search results should be less confusing and less likely to show incorrect information, which will be a big improvement to the user experience of our catalog. It should also fix the main remaining reason behind workaround #2: inactive records displaying in B&MC. (Other deleted and suppressed records that were originally showing up should also be gone now.)

Another big win last week: we fixed an issue with the “View Online” buttons for ~2 million e-resource records. These records were showing in the catalog, but they were missing the access URL, so the record pages looked strangely empty and users couldn’t tell where to go to get to the resource. Now that the buttons are showing up, this should fix the primary reason behind workaround #3: no View Online button.

We’re still working as fast as we can to get the data pipeline set up so it runs on an hourly schedule, and there’s progress on that every day, but it’s all behind the scenes right now. We’re also making great progress on other quality-of-life improvements, like more automated testing so we find bugs faster. Keep watching this space for the latest information!

October 9, 2024

The records from 9/18 are all fully ingested into the common data stream, so the catalog data should now be up to date as of 9/18.

Another fun change that came with this update: you can now search by MMS ID in the Books & Media Catalog! Searching for the MMS ID (the new Alma ID number) should work either for “all fields” searches or for “ISBN/ISSN/barcode” searches.

[Image: Screenshot 2024-10-09 at 3.53.26 PM.png]

October 7, 2024

Quick update on the data pipeline. The records harvested from Alma on 9/18/24 have just about finished their load into the TRLN common data stream. Many updates are already available in the catalog, like improvements to the New Titles results. In other cases, the updated data may just be the first step toward getting the catalog to work correctly.

We have also just published a document detailing Workarounds for Books & Media Catalog Issues. This new documentation goes into a bit more detail about the workarounds for several of the major issues still affecting the Books & Media Catalog. Spoiler alert: searching via Summon is often a good alternative.

Be on the lookout for invitations to staff listening sessions related to the catalog! Thomas Crichlow will be hosting a series of meetings to hear about issues people are experiencing and offer additional support.

October 3, 2024

This update is a bit of a state-of-the-union for the Books & Media Catalog extended universe.

...

  1. Set up a publishing profile in Alma to determine what metadata elements are included when we harvest records

  2. Kick off a publishing job in Alma

  3. Harvest all of the published records from Alma

  4. Enrich the records with additional information not available from the harvest (e.g., availability information)

  5. Transform the final records into the correct format for TRLN Discovery

  6. When needed, test a subset of records in a Duke sandbox to see how they will look in the catalog before we push them into the live application

  7. Publish the records into the TRLN Discovery data stream
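For readers who think in code, steps 3 through 7 above can be pictured as a chain of small functions. This is only an illustrative sketch, not the actual pipeline: the `Record` fields, the function names, the output format, and the `publish` and `sandbox` callables are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical minimal record; real records carry far more metadata.
    mms_id: str
    title: str
    extra: dict = field(default_factory=dict)


def harvest(published):
    """Step 3: collect every record the publishing job produced."""
    return [Record(mms_id=r["mms_id"], title=r["title"]) for r in published]


def enrich(records, availability):
    """Step 4: attach information the harvest does not carry."""
    for rec in records:
        rec.extra["available"] = availability.get(rec.mms_id, False)
    return records


def transform(records):
    """Step 5: reshape each record into an (assumed) target format."""
    return [
        {"id": rec.mms_id, "title": rec.title, "available": rec.extra["available"]}
        for rec in records
    ]


def run_pipeline(published, availability, publish, sandbox=None, sample_size=0):
    """Run steps 3-7 in order; the sandbox check (step 6) is optional."""
    docs = transform(enrich(harvest(published), availability))
    if sandbox is not None and sample_size > 0:
        sandbox(docs[:sample_size])  # Step 6: preview a subset first
    publish(docs)  # Step 7: push into the live data stream
    return docs
```

The point of the shape is that each stage only depends on the output of the one before it, which is why a change to an early stage means every record has to pass through the later stages again.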

A note about the stale data in the catalog: When we have to run the full pipeline on all ~8 million records that go into the catalog, the process can take almost a month to complete. Since the Alma cutover in July, we’ve had to make several changes to the pipeline, and each change meant we had to reprocess all of the records again. As the pipeline nears its final state, we are working to transition to “intermittent updates.” This means that instead of reprocessing all of our millions of records each time, we can just ask Alma for the records that have been updated since our last run. Switching to intermittent updates takes some extra coding to automate each part of the pipeline, and that work is currently underway. When we get to the point where we can switch to intermittent updates, we hope to be processing updated records every hour. In the meantime, changes in Alma will still take several weeks to appear in the catalog.
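The difference between a full run and the update approach described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the real implementation: records are plain dictionaries with a hypothetical `updated` timestamp, and `process` stands in for the rest of the pipeline.

```python
from datetime import datetime, timezone


def records_updated_since(records, last_run):
    """Keep only the records that changed after the previous run."""
    return [r for r in records if r["updated"] > last_run]


def run_incremental(records, last_run, process, now=None):
    """Process only changed records and return the new checkpoint.

    `records`, `process`, and the `updated` field are hypothetical;
    the real pipeline would ask Alma for the changed records instead
    of filtering an in-memory list.
    """
    now = now if now is not None else datetime.now(timezone.utc)
    changed = records_updated_since(records, last_run)
    for record in changed:
        process(record)
    # The returned timestamp becomes `last_run` for the next scheduled run.
    return now, len(changed)
```

Because each run only touches what changed since the saved checkpoint, the amount of work scales with the number of recent edits rather than with the full ~8 million records.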

Recent updates to the data pipeline

...