Masters Theses
Date of Award
12-2025
Degree Type
Thesis
Degree Name
Master of Science
Major
Ecology and Evolutionary Biology
Major Professor
Brian C. O'Meara
Committee Members
Charles Kwit, Jessica L. Allen
Abstract
Data deficiency remains a conservation barrier for many organismal groups, with declines in biodiversity and ecosystem health predicted to continue. The historical focus on groups generally considered more charismatic has led to a lack of data for assessing extinction risk in lesser-known taxa, and baseline taxonomic knowledge is lacking across many ecosystems. Fungi are one example of an understudied, speciose group that has recently been gaining conservation attention. While efforts to digitize natural history collections continue to increase our overall understanding of biodiversity, digitization cannot directly address the underlying sampling biases that skew organismal representation across physical collections. Additionally, many digitized collections accessed for conservation-related research are skeletal records that do not include latitude and longitude values indicating where the specimen was collected. To address some of these gaps in baseline taxonomic knowledge and physical collections holdings, 360 unidentified lichen specimens from an incomplete biodiversity inventory of Washington’s Palouse Prairie were identified to species. Digitized herbarium records were also analyzed to compare against current identifications and to synthesize historical information for Palouse lichens into a referenceable document. The digitization of newly identified collections and their submission to multiple herbaria increases the representation of dryland ecosystems in the northwestern U.S. and provides data for use in both local and state conservation efforts. To help resolve gaps in digital collections, the ability of Large Language Models (LLMs) to geocode locality strings held in digital occurrence data was tested to gain insight into the potential use of such tools for georeferencing tasks.
For the combination of prompts and LLMs tested, model selection greatly influenced the accuracy with which an LLM chose coordinates for 500 GBIF locality strings, whereas the specific prompt given made no difference. Additionally, when the models were asked to perform this geocoding task while disconnected from the internet, their chain-of-reasoning reflected actions that would not be possible without an internet connection. Overall, this work adds to ongoing efforts that address data deficiencies related to natural history collections and our knowledge of biodiversity.
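As an illustration of how geocoding accuracy of this kind might be scored, the following minimal sketch computes the great-circle (haversine) distance between a reference coordinate and a model-proposed coordinate, then summarizes the errors. The abstract does not state which error metric the thesis used, so the haversine metric, the function names `haversine_km` and `summarize_errors`, the 10 km threshold, and the sample coordinates are all assumptions made for illustration only.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def summarize_errors(pairs, threshold_km=10.0):
    """Summarize distance errors for (reference, predicted) coordinate pairs.

    Each element of `pairs` is ((ref_lat, ref_lon), (pred_lat, pred_lon)).
    The threshold is a hypothetical cut-off, not a value from the thesis.
    """
    errors = [haversine_km(*ref, *pred) for ref, pred in pairs]
    within = sum(1 for e in errors if e <= threshold_km)
    return {
        "median_km": statistics.median(errors),
        "fraction_within_threshold": within / len(errors),
    }


if __name__ == "__main__":
    # Invented example coordinates (not data from the thesis).
    example_pairs = [
        ((46.70, -117.20), (46.72, -117.18)),  # prediction close to the reference
        ((46.70, -117.20), (47.60, -122.30)),  # prediction far from the reference
    ]
    print(summarize_errors(example_pairs))
```

Comparing such summaries across models and across prompts is one way the relative effect of each could be examined, though the actual analysis in the thesis may differ.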
Recommended Citation
Chandler, Amanda, "ADDRESSING DATA DEFICIENCIES: LICHENS OF THE PALOUSE PRAIRIE (U.S.) AND THE POTENTIAL OF LARGE LANGUAGE MODELS FOR GEOCODING." Master's Thesis, University of Tennessee, 2025.
https://trace.tennessee.edu/utk_gradthes/15465