I'm Plamen, and I've been passionately contributing to the open-source software community since 2018.
My open-source journey began as a software engineer at DAI-Lab, where I worked through a wide range of challenging problems. That experience sharpened my skills and broadened my perspective on open source and software engineering.
I'm currently a part of the fantastic team at DataCebo, the proud creators of SDV, the largest ecosystem for synthetic data generation and evaluation. My role involves developing new features, refactoring, and maintaining various projects within this ecosystem. Here are some of the key projects I'm actively involved in:
- SDV: The Synthetic Data Vault, a powerful synthetic data generation tool that maintains the same format and statistical properties as the real data.
- RDT: Reversible Data Transforms, a Python library for transforming raw data into fully numerical data.
- CTGAN: A collection of deep learning-based synthetic data generators for single table data.
- Copulas: A Python library for modeling multivariate distributions and sampling from them using copula functions.
- DeepEcho: A synthetic data generation Python library for mixed-type, multivariate time series.
- SDMetrics: A library that evaluates synthetic data by comparing it to the real data you're trying to mimic.
- SDGym: Synthetic Data Gym, a framework for benchmarking the performance of synthetic data generators based on SDV and SDMetrics.
Over the years, I've had the privilege of contributing to several public repositories, including:
- SteganoGAN: A tool for creating steganographic images using adversarial training.
- MLPrimitives: Pipelines and primitives for machine learning and data science.
- MLBlocks: A simple framework for composing end-to-end tunable machine learning pipelines.
- BTB: Bayesian Tuning and Bandits, a tool for hyperparameter tuning and model selection.
- AutoBazaar: An AutoML system combining BTB, MLPrimitives, and MLBlocks.
- mit-d3m-ta2: MIT-Featuretools TA2 submission for the D3M program.
- ATM: Auto Tune Models, an AutoML system designed with ease of use in mind.
- Orion: A machine learning library for unsupervised time series anomaly detection.
- SigPro: An end-to-end solution for efficiently applying multiple signal processing techniques to raw time series data.
- Draco: A collection of end-to-end solutions for machine learning problems commonly found in monitoring wind energy production systems.
Feel free to explore these projects, and don't hesitate to reach out if you have any questions or would like to collaborate!