👋 Hello there!

I am an aspiring data scientist and Master's student in the Department of Statistical Science at Duke University, where I also obtained my Bachelor's in Statistical Science (Data Science Concentration) and a Minor in Computer Science.

  • My academic journey has been marked by a deep research commitment to statistical analysis, machine learning, and data science, with a particular focus on natural language processing, Bayesian statistics, and creative data visualization. Take a look at my previous research and projects in R and Python, and let me know if you are interested!
  • Beyond academia, I actively contribute as a Project Manager & Data Analyst @ Duke Impact Investing Group and as the Chief Technology Officer @ Duke Statistical Science Majors Union. I also have 3+ years of experience as a teaching assistant. Feel free to reach out for project advice and business case studies.
  • In my free time, I do 🥊 / 🚴‍♀️ / 🎹 / 🧁

🏫 Education

| Institution | Degree | Field of Study | Dates |
| --- | --- | --- | --- |
| Duke University | M.S. Student | Statistics | May 2025 |
| Duke University | B.S. | Statistical Science (Data Science Concentration), Minor in Computer Science | May 2023 |
| University of California, Santa Barbara | (Transfer Out) | Statistics and Data Science | June 2021 |

⚙️ Skillset

[Figure: skillset word cloud]

© Visualization created by scraping my resume with the R wordcloud2 package.
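For readers curious about the counting step behind a word cloud like this one, here is a minimal Python sketch. It is not the R wordcloud2 code used for the figure: the `term_frequencies` helper, the stopword list, and the sample sentence are all hypothetical, and it only produces the word-frequency pairs that a word cloud tool would take as input.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption; a real one would be much longer).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to", "with"}

def term_frequencies(text, top_n=5):
    """Return the top_n (word, count) pairs, ignoring stopwords."""
    # Lowercase the text and keep alphabetic tokens (allowing '+' and '#' so
    # skill names such as 'c++' or 'c#' stay in one piece).
    words = re.findall(r"[a-z][a-z+#]*", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Hypothetical resume snippet, used only to exercise the function.
sample = "Built Shiny apps in R. Fine-tuned NLP models in Python and R."
freqs = term_frequencies(sample)
print(freqs)
```

The most frequent term ends up first in the list, which is what drives the relative word sizes in a word cloud.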

👩‍💻 Highlights & Updates

Invitee | R Dev Day @ Hutch @ (Aug 2024)

Opportunity Scholar | posit::conf(2024) @ (Aug 2024)

Masters Statistician Intern @ (May 2024 – Aug 2024) Diabetes Common Safety Tables, Figures, Lists (TFLs) Automation
  • Developed and launched a Shiny app to automate the creation, execution, and review of common safety TFLs, integrating R and SAS code with output formatting, progress tracking, and error reporting through front-end UI design and back-end cloud system engineering; consolidated 30+ common safety TFLs from 300+ listings across 5+ Diabetes studies by building a flexible internal TAFFY template project; reimagined the clinical reporting pipeline with enhanced efficiency and consistency
  • Orchestrated regular meetings with senior leadership; pitched the app to 600+ global employees; achieved successful implementations in Diabetes, with ongoing rollouts to Neuroscience and other therapeutic areas
Student Research Affiliate @ (May 2022 – Dec 2022) Lab Test Harmonization: Bio-BERT Based Deduplication of Test Labels
  • Optimized lab test deduplication of grouper labels by fine-tuning Bio-BERT, an NLP model pre-trained on biomedical corpora; established a new method of cross-comparison similarity evaluation based on ground-truth text embeddings; achieved a 95% performance boost when applied to Duke Hospital’s lab database
  • Demonstrated academic distinction by contributing to the Duke AI Health 2022 cohort as the sole undergraduate participant; effectively communicated research outcomes through a well-received presentation at the Duke AI Health Poster Showcase 2022
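As a generic illustration of embedding-based deduplication (not the cross-comparison method developed in the project), the sketch below flags label pairs whose embedding vectors have high cosine similarity. The toy vectors, the label strings, the `near_duplicates` helper, and the 0.95 threshold are all assumptions; in practice the vectors would come from a model such as Bio-BERT.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def near_duplicates(embeddings, threshold=0.95):
    """Return label pairs whose cosine similarity is >= threshold."""
    labels = sorted(embeddings)
    pairs = []
    for i, a in enumerate(labels):
        # Compare each label only with the labels after it, so every
        # unordered pair is checked exactly once.
        for b in labels[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                pairs.append((a, b))
    return pairs

# Hand-made toy vectors standing in for real text embeddings (hypothetical).
toy = {
    "glucose, serum": [0.9, 0.1, 0.0],
    "serum glucose": [0.88, 0.12, 0.01],
    "hemoglobin a1c": [0.1, 0.9, 0.2],
}
pairs = near_duplicates(toy)
print(pairs)
```

Only the two glucose labels clear the threshold here; raising or lowering it trades missed duplicates against false merges.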
Data Science Intern @ (Jun 2022 – Aug 2022) Hiya Shield Project: Robocall Identification & Screening
  • Spearheaded an NLP-based robocall detection system built on internal audio databases, leveraging SBERT, unsupervised learning, statistical analysis, and AWS Cloud for text- and audio-space manipulation
  • Enhanced classification efficiency by discovering optimal audio truncation length and similarity thresholds, driving a 67% faster user experience with a customizable accuracy screening feature for the Hiya mobile app
Lead Author & Research Assistant @ (Jun 2020 – Mar 2021) Cross-Media Retrieval Based on Big Data Technology
  • Refined traditional permutation invariant training with mean squared error loss through BLSTM/LSTM and CNN in a key media separation technique; introduced two new separation methods, the FIX strategy and the masking-based data augmentation strategy, demonstrating notable performance gains
  • Publication: Audio-Visual Single-Channel Signal Separation based on Big Data Augmentation in IEEE (IICSPI 2020)
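
For context, permutation invariant training (PIT) scores a separation model against every possible assignment of its outputs to the reference sources and keeps the lowest loss. The sketch below shows only that textbook idea with a mean squared error loss. It does not cover the FIX strategy or the masking-based augmentation from the paper, and the `pit_mse` helper and toy signals are hypothetical.

```python
from itertools import permutations

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pit_mse(estimates, targets):
    """Return (loss, perm): the lowest average MSE over all assignments.

    perm[i] is the index of the target matched to estimates[i].
    """
    n = len(targets)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

# Toy signals (hypothetical): the model emits the two sources in swapped order.
targets = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
estimates = [[0.1, 0.0, 0.1], [0.9, 1.0, 1.1]]
loss, perm = pit_mse(estimates, targets)
print(loss, perm)
```

Because the loss is taken over the best assignment, the model is not penalized for emitting the sources in a different order than the references. The exhaustive search grows factorially with the number of sources, so it is only practical for small source counts.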