Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research

Date Issued
October 1, 2024
Author(s)
Xu, Haowen  
Li, Xueping  
Tupayachi, Jose  
Lian, Jianming (Jamie)  
Omitaomu, Olufemi
DOI
https://doi.org/10.1145/3681780.3697252
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/47462
Abstract

Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science, especially in high-impact journals, such as Nature Portfolios. However, traditional methods, which rely on keyword searches and basic NLP techniques, often fail to uncover valuable insights that are not explicitly stated in article titles or keywords. These approaches cannot perform semantic search or contextual understanding, which limits their effectiveness in classifying topics and characterizing studies. In this paper, we address these limitations by leveraging Generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis. We developed a technical workflow that integrates a vector database, Sentence Transformers, a Gaussian Mixture Model (GMM), a Retrieval Agent, and Large Language Models (LLMs) to enable contextual search, topic ranking, and characterization of research using customized prompt templates. A pilot study analyzing 223 urban science-related articles published in Nature Communications over the past decade highlights the effectiveness of our approach in generating insightful summary statistics on the quality, scope, and characteristics of papers in high-impact journals. This study introduces a new paradigm for enhancing bibliometric analysis and knowledge retrieval in urban research, positioning an AI agent as a powerful tool for advancing research evaluation and understanding.
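
The retrieval step that the abstract describes (embed the texts, rank them against a query, and pass the best matches to an LLM through a prompt template) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay self-contained it swaps in a toy bag-of-words vector for the Sentence Transformers embeddings, so it does not capture semantic similarity, and it omits the vector database, the GMM step, and the LLM call. The function names, the prompt wording, and the three-sentence corpus are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a term-count vector.
    Assumption: a real workflow would call a Sentence Transformers model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical prompt template; the paper's customized templates are not shown here.
PROMPT_TEMPLATE = (
    "Answer using only the context below.\n"
    "Context:\n{context}\n"
    "Question: {question}\n"
)


def build_prompt(question, documents, k=2):
    """Fill the template with the top-k retrieved documents."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Hypothetical mini-corpus, not drawn from the 223 articles in the pilot study.
corpus = [
    "Urban heat islands raise energy demand in dense cities",
    "Deep learning improves protein structure prediction",
    "Public transit networks shape urban mobility patterns",
]

if __name__ == "__main__":
    print(build_prompt("urban mobility and transit", corpus, k=2))
```

With this corpus, the prompt built for the query "urban mobility and transit" contains the transit and heat-island sentences and leaves out the protein one. In a fuller pipeline the resulting string would be sent to an LLM, which is the step where topic characterization would happen.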

Subjects

Bibliometrics Analysi...

Large Language Models...

Retrieval-Augmented G...

Transformers

Disciplines
Artificial Intelligence and Robotics
Computer Sciences
Engineering
Recommended Citation
Haowen Xu, Xueping Li, Jose Tupayachi, Jianming (Jamie) Lian, and Olufemi A Omitaomu. 2024. Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research. In 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI (UrbanAI’24), October 29-November 1, 2024, Atlanta, GA, USA. ACM, Seattle, WA, USA, 7 pages. https://doi.org/10.1145/3681780.3697252
Submission Type
Publisher's Version
File(s)
Name
3681780.3697252.pdf
Size
1.09 MB
Format
Adobe PDF
Checksum (MD5)
ed2de5d65a36b01e0598ff10c51d7dbb
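
The MD5 checksum listed above can be used to confirm that a downloaded copy of the PDF is intact. The sketch below shows one way to do that with Python's standard `hashlib` module. The helper names are hypothetical, and the local path assumes the file was saved under its repository name in the current directory.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Assumed local path; the expected value is the checksum from this record.
    ok = matches_checksum("3681780.3697252.pdf", "ed2de5d65a36b01e0598ff10c51d7dbb")
    print("checksum matches" if ok else "checksum mismatch")
```

MD5 is adequate for detecting accidental corruption during download, but it is not a safeguard against deliberate tampering.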
