Data Science/ML Projects

Text Segmentation from Engineering Drawings

Problem: Segmenting semantically meaningful text from engineering drawings (such as blueprints and schematics) is considerably harder than running OCR on standardized printed pages like books, newspapers, or magazines.

Approach: Fine-tuning the Kraken convolutional neural network, which was trained on pages and images with more distortion and other aberrations. I developed and ran ML experiments for optical character recognition, iterating between data preprocessing and experimental testing. Along the way, I streamlined the data processing pipeline, improved training data accounting and quality, and strengthened results assessment and validation. Finally, I contributed to the final report for the Department of Energy grant supporting this research.


Onboard Data Science Tutorials [git, medium]

A series of data science tutorials and open source evangelism for Onboard Data, Inc., including:

  • tutorials on the company's API and API client [one, two, three]
  • timeseries cleaning and basic imputation techniques [colab, medium]
  • feature engineering and selection [colab, medium]
  • timeseries forecasting with Facebook's Prophet [colab, medium]
  • outlier and anomaly detection [colab, medium]
  • fault detection in HVAC systems [medium]

Cognitive Science Research

A Bayesian Model of Cue-based Cardinal Direction Estimation [pdf]


Absolute Direction Feedback Impacts Environmental Knowledge [pdf]


How Do You Know If You’re Lost or Not? Epistemic and Pragmatic Action During Navigation [pdf]


Navigational Feedback Technology Alters Environment Awareness [pdf]


Lost and Confused: Measuring Uncertainty in Navigation [pdf]


Software Projects

QuantAQ R Client

  • Developing the company's R API client from scratch, following CRAN standards.

Onboard R Client

  • Refined the company's internal R client, bringing it up to CRAN standards and making it ready for public release. Improved parity with the company's public-facing Python API client. Updated the company's API client docs (ReadTheDocs) to reflect the newly public R client.

open-fdd

  • Overhauled one portion of this open source AHU (air handling unit) fault detection project. Reduced redundant code, shifted to an object-oriented design, and improved modularization.