Commit
jwmueller committed Aug 2, 2024
1 parent 21f8ecd commit 35d16d7
Showing 190 changed files with 12,865 additions and 12,405 deletions.
2 changes: 1 addition & 1 deletion master/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 59e8199fdccc20214d3ba6bfb97e71d7
+config: da399c314656edf666511f8f45e8bb47
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file modified master/.doctrees/cleanlab/benchmarking/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/benchmarking/noise_generation.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/classification.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/count.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/data_valuation.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/datalab.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/guide/index.doctree
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/guide/table.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/data.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/data_issues.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/factory.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/issue_finder.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/model_outputs.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/report.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/internal/task.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/datalab/optional_dependencies.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/dataset.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/experimental/cifar_cnn.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/experimental/coteaching.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/experimental/index.doctree
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/experimental/mnist_pytorch.doctree
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/filter.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/label_quality_utils.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/latent_algebra.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/multiannotator_utils.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/multilabel_scorer.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/multilabel_utils.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/neighbor/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/neighbor/knn_graph.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/neighbor/metric.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/neighbor/search.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/outlier.doctree
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/util.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/internal/validation.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/models/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/models/keras.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/multiannotator.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/multilabel_classification/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/object_detection/filter.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/object_detection/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/object_detection/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/object_detection/summary.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/outlier.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/regression/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/regression/learn.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/regression/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/segmentation/filter.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/segmentation/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/segmentation/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/segmentation/summary.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/token_classification/filter.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/token_classification/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/token_classification/rank.doctree
Binary file not shown.
Binary file modified master/.doctrees/cleanlab/token_classification/summary.doctree
Binary file not shown.
Binary file modified master/.doctrees/environment.pickle
Binary file not shown.
Binary file modified master/.doctrees/index.doctree
Binary file not shown.
Binary file modified master/.doctrees/migrating/migrate_v2.doctree
Binary file not shown.
145 changes: 80 additions & 65 deletions master/.doctrees/nbsphinx/tutorials/clean_learning/tabular.ipynb
@@ -113,10 +113,10 @@
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:34.527671Z",
"iopub.status.busy": "2024-07-30T16:31:34.527492Z",
"iopub.status.idle": "2024-07-30T16:31:36.140632Z",
"shell.execute_reply": "2024-07-30T16:31:36.140024Z"
"iopub.execute_input": "2024-08-02T23:17:23.433118Z",
"iopub.status.busy": "2024-08-02T23:17:23.432923Z",
"iopub.status.idle": "2024-08-02T23:17:24.941638Z",
"shell.execute_reply": "2024-08-02T23:17:24.941075Z"
},
"nbsphinx": "hidden"
},
@@ -126,7 +126,7 @@
"dependencies = [\"cleanlab\"]\n",
"\n",
"if \"google.colab\" in str(get_ipython()): # Check if it's running in Google Colab\n",
" %pip install git+https://github.com/cleanlab/cleanlab.git@774f5b4625f50853a4527b3bf0414f14a7116208\n",
" %pip install git+https://github.com/cleanlab/cleanlab.git@b699edd9acff56a96f5d8635fc51bcc94bc9a1ed\n",
" cmd = ' '.join([dep for dep in dependencies if dep != \"cleanlab\"])\n",
" %pip install $cmd\n",
"else:\n",
@@ -151,10 +151,10 @@
"execution_count": 2,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.143586Z",
"iopub.status.busy": "2024-07-30T16:31:36.143047Z",
"iopub.status.idle": "2024-07-30T16:31:36.178768Z",
"shell.execute_reply": "2024-07-30T16:31:36.178228Z"
"iopub.execute_input": "2024-08-02T23:17:24.944158Z",
"iopub.status.busy": "2024-08-02T23:17:24.943875Z",
"iopub.status.idle": "2024-08-02T23:17:24.963528Z",
"shell.execute_reply": "2024-08-02T23:17:24.962963Z"
}
},
"outputs": [],
@@ -195,10 +195,10 @@
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.181589Z",
"iopub.status.busy": "2024-07-30T16:31:36.181045Z",
"iopub.status.idle": "2024-07-30T16:31:36.338074Z",
"shell.execute_reply": "2024-07-30T16:31:36.337466Z"
"iopub.execute_input": "2024-08-02T23:17:24.966010Z",
"iopub.status.busy": "2024-08-02T23:17:24.965604Z",
"iopub.status.idle": "2024-08-02T23:17:25.079442Z",
"shell.execute_reply": "2024-08-02T23:17:25.078863Z"
}
},
"outputs": [
@@ -305,10 +305,10 @@
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.372204Z",
"iopub.status.busy": "2024-07-30T16:31:36.371964Z",
"iopub.status.idle": "2024-07-30T16:31:36.377781Z",
"shell.execute_reply": "2024-07-30T16:31:36.377262Z"
"iopub.execute_input": "2024-08-02T23:17:25.111044Z",
"iopub.status.busy": "2024-08-02T23:17:25.110645Z",
"iopub.status.idle": "2024-08-02T23:17:25.114497Z",
"shell.execute_reply": "2024-08-02T23:17:25.114027Z"
}
},
"outputs": [],
@@ -329,10 +329,10 @@
"execution_count": 5,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.380079Z",
"iopub.status.busy": "2024-07-30T16:31:36.379702Z",
"iopub.status.idle": "2024-07-30T16:31:36.389163Z",
"shell.execute_reply": "2024-07-30T16:31:36.388645Z"
"iopub.execute_input": "2024-08-02T23:17:25.116536Z",
"iopub.status.busy": "2024-08-02T23:17:25.116200Z",
"iopub.status.idle": "2024-08-02T23:17:25.124454Z",
"shell.execute_reply": "2024-08-02T23:17:25.123892Z"
}
},
"outputs": [],
@@ -384,10 +384,10 @@
"execution_count": 6,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.391552Z",
"iopub.status.busy": "2024-07-30T16:31:36.391341Z",
"iopub.status.idle": "2024-07-30T16:31:36.394409Z",
"shell.execute_reply": "2024-07-30T16:31:36.393862Z"
"iopub.execute_input": "2024-08-02T23:17:25.126998Z",
"iopub.status.busy": "2024-08-02T23:17:25.126543Z",
"iopub.status.idle": "2024-08-02T23:17:25.129409Z",
"shell.execute_reply": "2024-08-02T23:17:25.128804Z"
}
},
"outputs": [],
@@ -409,10 +409,10 @@
"execution_count": 7,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.396451Z",
"iopub.status.busy": "2024-07-30T16:31:36.396262Z",
"iopub.status.idle": "2024-07-30T16:31:36.936436Z",
"shell.execute_reply": "2024-07-30T16:31:36.935844Z"
"iopub.execute_input": "2024-08-02T23:17:25.131344Z",
"iopub.status.busy": "2024-08-02T23:17:25.131035Z",
"iopub.status.idle": "2024-08-02T23:17:25.655338Z",
"shell.execute_reply": "2024-08-02T23:17:25.654793Z"
}
},
"outputs": [],
@@ -446,10 +446,10 @@
"execution_count": 8,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:36.939261Z",
"iopub.status.busy": "2024-07-30T16:31:36.938884Z",
"iopub.status.idle": "2024-07-30T16:31:39.269788Z",
"shell.execute_reply": "2024-07-30T16:31:39.269009Z"
"iopub.execute_input": "2024-08-02T23:17:25.657837Z",
"iopub.status.busy": "2024-08-02T23:17:25.657465Z",
"iopub.status.idle": "2024-08-02T23:17:27.751426Z",
"shell.execute_reply": "2024-08-02T23:17:27.750727Z"
}
},
"outputs": [
@@ -481,10 +481,10 @@
"execution_count": 9,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.273002Z",
"iopub.status.busy": "2024-07-30T16:31:39.272142Z",
"iopub.status.idle": "2024-07-30T16:31:39.283199Z",
"shell.execute_reply": "2024-07-30T16:31:39.282635Z"
"iopub.execute_input": "2024-08-02T23:17:27.754463Z",
"iopub.status.busy": "2024-08-02T23:17:27.753684Z",
"iopub.status.idle": "2024-08-02T23:17:27.764911Z",
"shell.execute_reply": "2024-08-02T23:17:27.764361Z"
}
},
"outputs": [
@@ -605,10 +605,10 @@
"execution_count": 10,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.285386Z",
"iopub.status.busy": "2024-07-30T16:31:39.285054Z",
"iopub.status.idle": "2024-07-30T16:31:39.289139Z",
"shell.execute_reply": "2024-07-30T16:31:39.288681Z"
"iopub.execute_input": "2024-08-02T23:17:27.767199Z",
"iopub.status.busy": "2024-08-02T23:17:27.766875Z",
"iopub.status.idle": "2024-08-02T23:17:27.770951Z",
"shell.execute_reply": "2024-08-02T23:17:27.770498Z"
}
},
"outputs": [],
@@ -633,10 +633,10 @@
"execution_count": 11,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.291246Z",
"iopub.status.busy": "2024-07-30T16:31:39.290919Z",
"iopub.status.idle": "2024-07-30T16:31:39.298453Z",
"shell.execute_reply": "2024-07-30T16:31:39.297891Z"
"iopub.execute_input": "2024-08-02T23:17:27.772966Z",
"iopub.status.busy": "2024-08-02T23:17:27.772625Z",
"iopub.status.idle": "2024-08-02T23:17:27.779796Z",
"shell.execute_reply": "2024-08-02T23:17:27.779212Z"
}
},
"outputs": [],
@@ -658,10 +658,10 @@
"execution_count": 12,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.301188Z",
"iopub.status.busy": "2024-07-30T16:31:39.300801Z",
"iopub.status.idle": "2024-07-30T16:31:39.419299Z",
"shell.execute_reply": "2024-07-30T16:31:39.418728Z"
"iopub.execute_input": "2024-08-02T23:17:27.781951Z",
"iopub.status.busy": "2024-08-02T23:17:27.781645Z",
"iopub.status.idle": "2024-08-02T23:17:27.895282Z",
"shell.execute_reply": "2024-08-02T23:17:27.894690Z"
}
},
"outputs": [
@@ -691,10 +691,10 @@
"execution_count": 13,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.421608Z",
"iopub.status.busy": "2024-07-30T16:31:39.421234Z",
"iopub.status.idle": "2024-07-30T16:31:39.424361Z",
"shell.execute_reply": "2024-07-30T16:31:39.423765Z"
"iopub.execute_input": "2024-08-02T23:17:27.897585Z",
"iopub.status.busy": "2024-08-02T23:17:27.897260Z",
"iopub.status.idle": "2024-08-02T23:17:27.899963Z",
"shell.execute_reply": "2024-08-02T23:17:27.899515Z"
}
},
"outputs": [],
@@ -715,10 +715,10 @@
"execution_count": 14,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:39.426671Z",
"iopub.status.busy": "2024-07-30T16:31:39.426252Z",
"iopub.status.idle": "2024-07-30T16:31:41.720026Z",
"shell.execute_reply": "2024-07-30T16:31:41.719145Z"
"iopub.execute_input": "2024-08-02T23:17:27.902092Z",
"iopub.status.busy": "2024-08-02T23:17:27.901699Z",
"iopub.status.idle": "2024-08-02T23:17:30.041948Z",
"shell.execute_reply": "2024-08-02T23:17:30.041308Z"
}
},
"outputs": [],
@@ -738,10 +738,10 @@
"execution_count": 15,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:41.723999Z",
"iopub.status.busy": "2024-07-30T16:31:41.722968Z",
"iopub.status.idle": "2024-07-30T16:31:41.736024Z",
"shell.execute_reply": "2024-07-30T16:31:41.735553Z"
"iopub.execute_input": "2024-08-02T23:17:30.045213Z",
"iopub.status.busy": "2024-08-02T23:17:30.044360Z",
"iopub.status.idle": "2024-08-02T23:17:30.055915Z",
"shell.execute_reply": "2024-08-02T23:17:30.055449Z"
}
},
"outputs": [
@@ -766,15 +766,30 @@
"We can see that the test set accuracy slightly improved as a result of the data cleaning. Note that this will not always be the case, especially when we evaluate on test data that are themselves noisy. The best practice is to run cleanlab to identify potential label issues and then manually review them, before blindly trusting any accuracy metrics. In particular, the most effort should be made to ensure high-quality test data, which is supposed to reflect the expected performance of our model during deployment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Spending too much time on data quality?\n",
"\n",
"Using this open-source package effectively can require significant ML expertise and experimentation, plus handling detected data issues can be cumbersome.\n",
"\n",
"That’s why we built [Cleanlab Studio](https://cleanlab.ai/blog/data-centric-ai/) -- an automated platform to find **and fix** issues in your dataset, 100x faster and more accurately. Cleanlab Studio automatically runs optimized data quality algorithms from this package on top of cutting-edge AutoML & Foundation models fit to your data, and helps you fix detected issues via a smart data correction interface. [Try it](https://cleanlab.ai/) for free!\n",
"\n",
"<p align=\"center\">\n",
" <img src=\"https://raw.githubusercontent.com/cleanlab/assets/master/cleanlab/ml-with-cleanlab-studio.png\" alt=\"The modern AI pipeline automated with Cleanlab Studio\">\n",
"</p>"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"execution": {
"iopub.execute_input": "2024-07-30T16:31:41.738164Z",
"iopub.status.busy": "2024-07-30T16:31:41.737959Z",
"iopub.status.idle": "2024-07-30T16:31:41.800777Z",
"shell.execute_reply": "2024-07-30T16:31:41.800288Z"
"iopub.execute_input": "2024-08-02T23:17:30.057830Z",
"iopub.status.busy": "2024-08-02T23:17:30.057653Z",
"iopub.status.idle": "2024-08-02T23:17:30.088205Z",
"shell.execute_reply": "2024-08-02T23:17:30.087743Z"
},
"nbsphinx": "hidden"
},