Analysis of Literal Text from the Death Certificate
This R Markdown report, published to RPubs, applies text mining to a representative sample of death records drawn from public data available for purchase from the Washington State Department of Health. It is an exploratory step ahead of a larger examination of United States records from 2003 to 2017. The analysis focuses primarily on drug overdose deaths, identified by the following ICD-10 codes: X40-X44, X60-X64, X85, and Y10-Y14.
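The overdose definition above amounts to a range check on the ICD-10 category. The report itself is written in R and its code is not reproduced here, so the following Python sketch is only an illustration. The function name `is_overdose_code` and the rule of keeping just the first three characters of a code are assumptions, not details taken from the report.

```python
# ICD-10 ranges named in the text, inclusive, as (letter, low, high).
OVERDOSE_RANGES = [("X", 40, 44), ("X", 60, 64), ("X", 85, 85), ("Y", 10, 14)]


def is_overdose_code(code: str) -> bool:
    """Return True if an ICD-10 code falls in one of the ranges above.

    Assumption: only the three-character category matters, so "X42.0"
    and "x420" are both treated as "X42".
    """
    category = code.strip().upper()[:3]
    if len(category) != 3 or not category[1:].isdecimal():
        return False
    letter, number = category[0], int(category[1:])
    return any(
        letter == range_letter and low <= number <= high
        for range_letter, low, high in OVERDOSE_RANGES
    )
```

A flag like this could be used to split records into overdose and non-overdose groups before comparing their vocabulary.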
Sample Analyses
- Top words among all death records.
- Top words by ICD-10 code and drug overdose status, examining raw counts and term frequency-inverse document frequency (TF-IDF).
- Top bigrams, including network graph visualizations of the relationships among common word pairs.
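The TF-IDF and bigram steps listed above can be sketched in a few lines. This is a minimal Python illustration, not the report's R code. The example records are invented for demonstration, the function names are hypothetical, and the use of the natural logarithm for the inverse document frequency is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(groups):
    """Score terms per group.

    groups maps a label to a list of tokens. For each term,
    tf = count / tokens in the group, and
    idf = ln(number of groups / number of groups containing the term).
    """
    n_groups = len(groups)
    group_freq = Counter()
    for tokens in groups.values():
        group_freq.update(set(tokens))
    scores = {}
    for label, tokens in groups.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[label] = {
            term: (count / total) * math.log(n_groups / group_freq[term])
            for term, count in counts.items()
        }
    return scores


def bigram_counts(records):
    """Count adjacent word pairs within each record, never across records."""
    pairs = Counter()
    for text in records:
        tokens = tokenize(text)
        pairs.update(zip(tokens, tokens[1:]))
    return pairs


# Invented example text, not taken from the Washington State data.
overdose_records = [
    "acute heroin intoxication",
    "acute heroin and alcohol intoxication",
]
other_records = ["acute myocardial infarction"]

groups = {
    "overdose": [t for r in overdose_records for t in tokenize(r)],
    "other": [t for r in other_records for t in tokenize(r)],
}
scores = tf_idf(groups)
pairs = bigram_counts(overdose_records)
```

A word that appears in every group, such as "acute" in this toy data, receives a TF-IDF score of zero, which is why TF-IDF can surface terms that raw counts bury. Each bigram count can serve as a weighted edge between two word nodes when drawing a network graph.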
Refer to the RPubs report for the full analyses.