Leveraging Technology: Remote Cataloging Projects to Capture and Improve Coordinates Data
Traditionally, a library catalog search relies upon use of specific words or phrases to retrieve relevant results. For geographic queries, the political unit – country, state, etc. – is commonly used to identify relevant resources but it requires the user to know the appropriate jurisdictional name of the place of interest. In many cases, the location of interest can be described with a variety of place names – the political unit(s), the region and so on. Conversely, place names can represent a variety of locations; for example, there are many locations called Washington. Geographic coordinates, on the other hand, identify a unique location regardless of the names attached to the place. With the advent of geospatial portals, map-based search interfaces and increasing demand for digital geospatial data, availability of geographic coordinates in the bibliographic record is essential. How then can libraries make this happen? This paper describes two processes that are being used to enhance existing bibliographic records with coordinates data.
Standards for coordinates data in bibliographic records
Cataloging standards have long required that we include information about the scale of a cartographic object in our descriptions, whether or not scale is present on the item being cataloged. The situation is quite different for geographic coordinates: recording them remains optional, which limits the usability of these records for coordinate-based searching and retrieval. As digital cartographic resources, such as maps with accompanying metadata, continue to be added to geospatial portals, including coordinates in the record becomes a necessity.
There are two fields for recording coordinates data in the MARC format (https://www.loc.gov/marc/bibliographic/), the 255 and the 034. The former presents coordinates data in a human-friendly format, while the latter presents coordinates data in a more machine-friendly format.
MARC field 255: “Cartographic Mathematical Data” is used to record aspects such as scale and latitude/longitude bounding coordinates. For example:
255 ## $a Scale 1:750,000 $c (W 68°30ʹ–W 66°55ʹ/N 45°10ʹ–N 44°15ʹ).
MARC field 034: “Coded Cartographic Mathematical Data” is the coded or numerical-only form of data that appear in the 255 field for scale and coordinates, with individual bounding coordinates data coded in their own subfields rather than collectively in a single subfield. Additionally, the current recommended standard is to record the latitude/longitude values in decimal form. For example:
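034 1# $a a $b 750000 $d -068.5000 $e -066.9167 $f +045.1667 $g +044.2500
(Note: The ## symbols in the 255 field indicate that both indicator values are blank. In the 034 field, the first indicator value 1 denotes a single scale and the second indicator is blank.)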
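The relationship between the two notations is simple arithmetic: decimal degrees equal degrees plus minutes/60 plus seconds/3600, with west longitudes and south latitudes written as negative values. The following minimal Python sketch applies that arithmetic to the coordinates in the 255 example. The function names and the four-place default are illustrative assumptions, not part of any cataloging tool.

```python
def dms_to_decimal(hemisphere, degrees, minutes=0, seconds=0.0):
    """Convert a hemisphere letter plus degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # West longitudes and south latitudes are recorded as negative values.
    return -value if hemisphere.upper() in ("W", "S") else value


def format_034(value, places=4):
    """Format a decimal coordinate as sign, three integer digits, and a decimal fraction."""
    sign = "-" if value < 0 else "+"
    width = places + 4  # three integer digits + the decimal point + the fraction
    return f"{sign}{abs(value):0{width}.{places}f}"


# Coordinates from the 255 example: W 68°30ʹ, W 66°55ʹ, N 45°10ʹ, N 44°15ʹ
for hemisphere, degrees, minutes in [("W", 68, 30), ("W", 66, 55), ("N", 45, 10), ("N", 44, 15)]:
    print(format_034(dms_to_decimal(hemisphere, degrees, minutes)))
```

Passing places=6 to format_034 gives the same values at the six-place precision the 034 field permits, for example -066.916667 for W 66°55ʹ.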
Making changes to the international cataloging standard, RDA: Resource Description and Access (https://www.rdatoolkit.org/), is a slow and lumbering process at best, but the Map and Geospatial Information Round Table (MAGIRT) of ALA took the bull by the horns. In 2020, a cataloging working group was charged to begin the year-long process of creating a best practices document, Guidelines for Cataloging Cartographic Resources using RDA, which MAGIRT members formally accepted as a whole in January 2021. This document urges map catalogers to always include geographic coordinates when describing each title. More importantly, it calls for recording coordinates in decimal format so that computers can easily read and use them. The coordinates in the 255 field can continue to be recorded in the traditional human-readable degrees, minutes, seconds (DMS) format if desired, or they can be entered in decimal format.
Penn State’s coordinates improvement projects
In March 2020, the COVID-19 pandemic forced closure of the Penn State Libraries facilities, and personnel were asked to quarantine and work remotely where possible. Access to physical resources was restricted, so members of the Cataloging and Metadata Services Department undertook a variety of projects to enhance bibliographic records and improve metadata for digital objects, among other tasks. Projects to enhance descriptive records for our cartographic materials were included in these efforts. Members of the Maps Cataloging Team were already following the principles outlined in the best practices document and embarked on several projects related to coordinates, specifically adding, adding to, or updating the MARC 034 and 255 fields.
Adding 255 and 034 fields
The initial step for these projects was to identify records in Penn State’s catalog that lacked a 255 field, an 034 field, or both. In older records, the 255 field is usually present (in truly old records, the field tag is 507 and must be changed to 255) but the 034 is not. This is because in the ever-evolving MARC standard the 034 was not created and accepted for use until the late 1990s. Next, these local catalog records were compared with their OCLC counterparts via a batch search process to identify any matching OCLC records that had the needed field(s). When a match was found, MarcEdit was used to push the fields into our local records, and the file of merged records was then loaded back into the catalog, overwriting the existing records. Local records that lacked a corresponding OCLC record number were excluded from the batch search.
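The merge logic described above can be sketched in a few lines of Python. This is only an illustration of the decision rules (skip records without an OCLC number, copy a field only when the local record lacks it); it is not how MarcEdit or the OCLC batch search works internally. The record layout, a dictionary with an "oclc_number" key and a "fields" dictionary keyed by MARC tag, and both function names are hypothetical.

```python
def merge_missing_fields(local_fields, oclc_fields, tags=("255", "034")):
    """Copy fields from the OCLC record only for tags the local record lacks."""
    merged = {tag: list(values) for tag, values in local_fields.items()}
    added = []
    for tag in tags:
        if not merged.get(tag) and oclc_fields.get(tag):
            merged[tag] = list(oclc_fields[tag])
            added.append(tag)
    return merged, added


def batch_enhance(local_records, oclc_lookup):
    """Split records into those enhanced from OCLC and those left untouched."""
    enhanced, skipped = [], []
    for record in local_records:
        oclc_number = record.get("oclc_number")
        # Records without an OCLC number cannot be matched and are excluded.
        match = oclc_lookup.get(oclc_number) if oclc_number else None
        if match is None:
            skipped.append(record)
            continue
        merged, added = merge_missing_fields(record["fields"], match)
        if added:
            enhanced.append({"oclc_number": oclc_number, "fields": merged})
        else:
            skipped.append(record)
    return enhanced, skipped
```

Existing local fields are never overwritten in this sketch, so a locally edited 255 survives even when the OCLC record carries a different one. The skipped list is the natural input for the place-name approach.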
Only a small percentage, 3%-4%, of our map records lacked a 255 field, and approximately 2,700 records were enhanced using this method. A much higher percentage, about one-third of our records, lacked an 034 field. Around 500 of these had no OCLC number, so they were excluded from the batch search. Ultimately, 23%, or approximately 18,000 map records, were enhanced with the addition of an 034 field.
For some local records, such as those with no OCLC number, coordinates are being added by using place names from subject headings or titles within these records to determine bounding coordinates for the 255 field. For example, a map with a title indicating geographic coverage of “Maine” was assigned the known bounding coordinates for that location, generated using the Klokan Bounding Box tool (https://boundingbox.klokantech.com/). To date, approximately 3,100 records (~60% of records examined) have gained coordinate fields, both 255 and 034, via this method. The geographic boundaries of the other 40% of records examined could not be determined from either the title or the geographic subject heading, often because the map showed a multi-jurisdictional area such as central Europe or Eurasia. They await the addition of coordinates when the physical collections are once again accessible and we can determine coordinates by visual inspection with a map in hand.
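Once a bounding box has been obtained in decimal degrees, producing both fields is a formatting exercise. The sketch below assumes the box is supplied as west, east, north, and south values and reuses the coordinates from the 255 example earlier in this paper rather than actual values for any particular place. The function names are hypothetical, and minutes are rounded to the nearest whole minute as a simplifying assumption.

```python
def to_deg_min(value, positive, negative):
    """Render a decimal coordinate as hemisphere, degrees, and whole minutes."""
    hemisphere = negative if value < 0 else positive
    total_minutes = round(abs(value) * 60)  # round to the nearest minute
    degrees, minutes = divmod(total_minutes, 60)
    return f"{hemisphere} {degrees}°{minutes:02d}ʹ"


def field_255_c(west, east, north, south):
    """Build the 255 $c coordinates statement from a decimal bounding box."""
    return (f"({to_deg_min(west, 'E', 'W')}–{to_deg_min(east, 'E', 'W')}/"
            f"{to_deg_min(north, 'N', 'S')}–{to_deg_min(south, 'N', 'S')})")


def field_034_subfields(west, east, north, south, places=4):
    """Build the 034 $d, $e, $f, $g values from a decimal bounding box."""
    def fmt(value):
        sign = "-" if value < 0 else "+"
        return f"{sign}{abs(value):0{places + 4}.{places}f}"
    return {"d": fmt(west), "e": fmt(east), "f": fmt(north), "g": fmt(south)}


box = dict(west=-68.5, east=-66.9167, north=45.1667, south=44.25)
print(field_255_c(**box))
print(field_034_subfields(**box))
```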
Converting 034 fields to decimal form
Around two-thirds of all local map records now had 034 fields; however, many were not in decimal format. We extracted all records that included an 034 field, eliminated those already in decimal form, used the MarcEdit Decimal Conversion tool to convert the remaining 034 fields to decimal format, and then loaded the records back into the catalog, overwriting the older versions. Checking for 034 fields in non-decimal form is now a regular (every six months) task to ensure that all records moving forward are up to date according to national best practice guidelines.
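The screening step, separating 034 values already in decimal form from those still coded as hemisphere plus degrees, minutes, and seconds (hdddmmss, for example W0683000), can be sketched as follows. This is an illustrative stand-in written for this paper, not the MarcEdit Decimal Conversion tool. The function names and the subfield dictionary layout are assumptions, and only the hdddmmss and decimal-degree forms are handled.

```python
import re

# hdddmmss: hemisphere letter, 3-digit degrees, 2-digit minutes, 2-digit seconds
DMS_CODE = re.compile(r"^([NSEW])(\d{3})(\d{2})(\d{2})$")
# Decimal degrees with an optional leading sign or hemisphere letter
DECIMAL = re.compile(r"^[-+NSEW]?\d{3}\.\d{1,6}$")


def is_decimal(value):
    """Return True when a coordinate value is already in decimal-degree form."""
    return bool(DECIMAL.match(value))


def dms_code_to_decimal(value, places=4):
    """Convert an hdddmmss value such as W0665500 to a signed decimal string."""
    match = DMS_CODE.match(value)
    if match is None:
        raise ValueError(f"not an hdddmmss coordinate: {value!r}")
    hemisphere, degrees, minutes, seconds = match.groups()
    decimal = int(degrees) + int(minutes) / 60 + int(seconds) / 3600
    sign = "-" if hemisphere in ("W", "S") else "+"
    return f"{sign}{decimal:0{places + 4}.{places}f}"


def convert_034(subfields):
    """Return a copy of the 034 subfields with $d-$g converted where needed."""
    converted = dict(subfields)
    for code in ("d", "e", "f", "g"):
        value = converted.get(code)
        if value is not None and not is_decimal(value):
            converted[code] = dms_code_to_decimal(value)
    return converted


old = {"a": "a", "b": "750000", "d": "W0683000", "e": "W0665500",
       "f": "N0451000", "g": "N0441500"}
print(convert_034(old))
```

Values already in decimal form pass through unchanged, so running the check again on the same records, as in a recurring six-month review, does no harm.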
Side notes: The 034 field allows for up to six decimal places; however, MarcEdit takes the decimal conversion only to four places. It is not uncommon to see some variation in the level of precision of these values depending on the tool used. Also, as part of this process, Penn State tested a beta version of the decimal conversion tool in MarcEdit. A bug was identified in the tool and was quickly fixed by Terry Reese, the developer of MarcEdit and Head of Infrastructure Support and Digital Initiatives at The Ohio State University Libraries. We gratefully acknowledge his support and appreciate the opportunity to contribute to improving MarcEdit.
To summarize, working remotely posed serious challenges to members of the Penn State Libraries’ Cataloging and Metadata Services Department, yet the Maps Cataloging Team was able to continue adding value to descriptive records in our catalog despite being physically remote from the collections. Two-thirds of records for the maps collections now contain coordinates that our students, faculty and others can use for coordinate-based searching and retrieval. In addition, enhancements that would have taken many months had these records been worked on individually were accomplished in just a few weeks. These projects, which creatively leveraged existing technologies such as MarcEdit and the Klokan Bounding Box tool, illustrate that it is possible to improve discoverability of cartographic resources even while working remotely.
The members of the Maps Cataloging Team who made this project a success are John Hamilton, Stefan Kroger, and Paige Andrew with the able assistance of Jeff Edmunds, Digital Access Coordinator in the Cataloging and Metadata Services Department.
Linda Musser, Head of the Fletcher L. Byrom Earth and Mineral Sciences Library, Penn State University, 105 Deike Building, University Park PA 16802; email@example.com
Paige Andrew, Cartographic Resources Cataloging Librarian, Penn State University, 003 Paterno Library, University Park PA 16802; firstname.lastname@example.org