Lexical Stability

Code for the paper "Lexical Stability of Psychiatric Clinical Notes from Electronic Health Records over a Decade". The raw data cannot be shared, but the aggregated data (keyword counts, descriptive statistics, novelty) are available under data/. Run r_src/changepoint_analysis.Rmd to recreate the analysis from the paper.

Pipeline

src/extract_descriptive_statistics.py extracts descriptive statistics with textdescriptives.
src/summarize_descriptive_statistics.py aggregates them into quarterly bins.
src/get_keyword_counts.py extracts keyword counts and aggregates them quarterly.
src/extract_novelty.py calculates novelty from the keyword proportions.
r_src/changepoint_analysis.Rmd conducts the changepoint analysis.
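
The keyword steps above can be illustrated with a small, self-contained sketch. It is not the code in src/utils/infodynamics.py; it assumes that novelty is the mean Kullback-Leibler divergence between a quarter's keyword proportions and those of the preceding quarters, which is one common definition and may differ from the one used in the paper. The column names, the `window` parameter, and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def quarterly_proportions(counts: pd.DataFrame, date_col: str = "date") -> pd.DataFrame:
    """Sum keyword counts per calendar quarter and normalise each quarter to proportions."""
    quarters = pd.to_datetime(counts[date_col]).dt.to_period("Q")
    summed = counts.drop(columns=[date_col]).groupby(quarters).sum()
    return summed.div(summed.sum(axis=1), axis=0)


def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL divergence in bits; eps smoothing avoids log(0) and division by zero."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log2(p / q)))


def novelty(props: np.ndarray, window: int = 2) -> np.ndarray:
    """Mean divergence of each row from the `window` rows before it (assumed definition)."""
    out = np.full(len(props), np.nan)  # the first `window` quarters have no full history
    for t in range(window, len(props)):
        out[t] = np.mean(
            [kl_divergence(props[t], props[t - k]) for k in range(1, window + 1)]
        )
    return out


# Toy note-level keyword counts (hypothetical keywords and dates).
toy = pd.DataFrame(
    {
        "date": ["2013-01-15", "2013-02-20", "2013-04-03", "2013-07-09", "2013-10-01"],
        "depression": [3, 1, 2, 2, 0],
        "anxiety": [1, 1, 2, 2, 4],
    }
)
props = quarterly_proportions(toy)
scores = novelty(props.to_numpy(), window=1)
```

Here `props` has one row per quarter and `scores` has one novelty value per quarter, with NaN where there is not enough history. A quarter whose keyword proportions match the previous quarter scores zero, and larger values indicate a larger shift in keyword usage.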

Directory structure

├── data/
│   ├── entropy_prop.csv # novelty
│   ├── keyword_counts.pkl # keyword counts
│   ├── keywords.yaml # the extracted keywords
│   └── td_stats.csv # extracted descriptive statistics
├── pretty_path.py
├── r_src/ # R scripts
│   ├── changepoint_analysis.Rmd # main analysis
│   ├── novelty_visualization.Rmd # to reproduce figure 5
│   ├── number_of_notes.Rmd # to reproduce table 2
│   └── r_utils/
│       └── change_point_detection.R # functions for change point detection
├── README.md
├── requirements.txt
├── src/ # python scripts
│   ├── __init__.py
│   ├── extract_descriptive_statistics.py 
│   ├── extract_novelty.py
│   ├── figures_and_tables/ # scripts to reproduce figures and tables
│   │   ├── dep_visualization.py # to reproduce figure 6
│   │   ├── note_description.py # to create data for table 2
│   │   └── patient_description.py # to reproduce table 1
│   ├── get_keyword_counts.py
│   ├── summarize_descriptive_statistics.py
│   └── utils/
│       ├── __init__.py
│       ├── infodynamics.py # code for calculating novelty
│       └── utils.py # misc. utilities

Aarhus-Psychiatry-Research/lexical-stability
