Hierarchical Neural Architectures for Classifying Cancer Pathology Reports

Date Issued
December 15, 2019
Author(s)
Gao, Shang
Advisor(s)
Georgia Tourassi
Additional Advisor(s)
Arvind Ramanathan, Hairong Qi, Russell Zaretzki
Abstract

Electronic health records (EHRs) are the primary method for documenting and storing patient outcomes in modern healthcare; data mining and machine learning approaches use the information stored in EHRs to assist in clinical decision support and other critical healthcare applications. Important information in EHRs is often stored in the form of unstructured clinical text. Unfortunately, the state-of-the-art methods used to automatically extract useful information from unstructured clinical text lag significantly behind the state-of-the-art methods used in the general natural language processing (NLP) community for other tasks such as machine translation, question answering, and sentiment analysis. In this work, we attempt to bridge this gap by applying and developing hierarchical neural approaches to classify key data elements in cancer pathology reports, such as cancer site, histology, grade, and behavior. We (1) show that a hierarchical attention network (HAN), which has strong performance on classifying general text such as Yelp reviews and news snippets, achieves better classification accuracy and macro F-score on identifying cancer site and grade than previous state-of-the-art approaches, (2) develop a novel hierarchical self-attention network (HiSAN) which not only achieves better accuracy and macro F-score in cancer pathology report classification than the HAN but also trains over 10x faster, and (3) introduce a hierarchical framework for incorporating case-level context when classifying cancer pathology reports and show that it gives a significant boost in accuracy and macro F-score.

Subjects

Deep learning

natural language proc...

pathology reports

neural networks

clinical text

electronic health rec...

Degree
Doctor of Philosophy
Major
Computer Science
Comments
This dissertation is a manuscript-style dissertation in which each of the three core chapters is a previously published journal paper or is a paper currently undergoing the submission process.
Embargo Date
December 15, 2020
File(s)
Name

utk.ir.td_12920.pdf

Size

12.72 MB

Format

Adobe PDF

Checksum (MD5)

ea6b1c93cec01007d436932f0507381c

  • Libraries at University of Tennessee, Knoxville