Add screenshots data explorer (#555)
fix #496

---------

Co-authored-by: Robbe Sneyders <[email protected]>
mrchtr and RobbeSneyders authored Oct 27, 2023
1 parent 2d01d5e commit e388a45
Showing 5 changed files with 28 additions and 20 deletions.
Binary file added docs/art/data_explorer/data_explorer.png
Binary file added docs/art/data_explorer/image_explorer.png
24 changes: 12 additions & 12 deletions docs/components/hub.md
@@ -10,6 +10,10 @@ Below you can find the reusable components offered by Fondant.

--8<-- "components/caption_images/README.md:1"

??? "chunk_text"

--8<-- "components/chunk_text/README.md:1"

??? "download_images"

--8<-- "components/download_images/README.md:1"
@@ -18,22 +22,18 @@ Below you can find the reusable components offered by Fondant.

--8<-- "components/embed_images/README.md:1"

??? "embedding_based_laion_retrieval"
??? "embed_text"

--8<-- "components/embedding_based_laion_retrieval/README.md:1"
--8<-- "components/embed_text/README.md:1"

??? "filter_comments"
??? "embedding_based_laion_retrieval"

--8<-- "components/filter_comments/README.md:1"
--8<-- "components/embedding_based_laion_retrieval/README.md:1"

??? "filter_image_resolution"

--8<-- "components/filter_image_resolution/README.md:1"

??? "filter_line_length"

--8<-- "components/filter_line_length/README.md:1"

??? "image_cropping"

--8<-- "components/image_cropping/README.md:1"
@@ -42,6 +42,10 @@ Below you can find the reusable components offered by Fondant.

--8<-- "components/image_resolution_extraction/README.md:1"

??? "index_weaviate"

--8<-- "components/index_weaviate/README.md:1"

??? "language_filter"

--8<-- "components/language_filter/README.md:1"
@@ -62,10 +66,6 @@ Below you can find the reusable components offered by Fondant.

--8<-- "components/minhash_generator/README.md:1"

??? "pii_redaction"

--8<-- "components/pii_redaction/README.md:1"

??? "prompt_based_laion_retrieval"

--8<-- "components/prompt_based_laion_retrieval/README.md:1"
24 changes: 16 additions & 8 deletions docs/data_explorer.md
@@ -1,5 +1,15 @@
# Data explorer

## Data explorer UI

The data explorer UI enables Fondant users to explore the inputs and outputs of their Fondant pipeline.

The user can specify a pipeline and a specific pipeline run and component to explore. The user will then be able to explore the different subsets produced by Fondant components.

The chosen subset (and the columns within the subset) can be explored in 3 tabs.

![data explorer](../art/data_explorer/data_explorer.png)

## How to use?
You can set up the data explorer container with the `fondant explore` CLI command, which is installed together with the Fondant Python package.

@@ -16,21 +26,19 @@ Example:
```bash
fondant explore --base_path gs://foo/bar --auth-gcp
```
## Data explorer UI

The data explorer UI enables Fondant users to explore the inputs and outputs of their Fondant pipeline.

The user can specify a pipeline and a specific pipeline run and component to explore. The user will then be able to explore the different subsets produced by Fondant components.

The chosen subset (and the columns within the subset) can be explored in 3 tabs.

### Sidebar
In the sidebar, the user can specify the path to a manifest file. This will load the available subsets into a dropdown, from which the user can select one of the subsets. Finally, the columns within the subset are shown in a multiselect box, and can be used to remove / select the columns that are loaded into the exploration tabs.
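
The sidebar flow described above (load a manifest, list its subsets, then pick columns) can be sketched in plain Python. This is only an illustration: the manifest layout, the subset and field names, and the helpers `list_subsets` and `select_columns` are all assumptions made for this example, not Fondant's actual API or file format.

```python
import json

# Hypothetical manifest contents. The real manifest layout may differ;
# the "subsets" -> "fields" nesting is an assumption for this sketch.
MANIFEST_JSON = """
{
  "subsets": {
    "images": {"fields": {"data": {"type": "binary"}, "width": {"type": "int32"}}},
    "text": {"fields": {"caption": {"type": "string"}}}
  }
}
"""


def list_subsets(manifest):
    """Return the subset names that would populate the dropdown."""
    return sorted(manifest["subsets"])


def select_columns(manifest, subset, wanted=None):
    """Return the columns of a subset, optionally narrowed to `wanted`."""
    available = list(manifest["subsets"][subset]["fields"])
    if wanted is None:
        return available
    missing = set(wanted) - set(available)
    if missing:
        raise KeyError(f"Unknown columns for subset {subset!r}: {sorted(missing)}")
    # Keep the manifest's column order rather than the caller's order.
    return [column for column in available if column in wanted]


manifest = json.loads(MANIFEST_JSON)
print(list_subsets(manifest))
print(select_columns(manifest, "images", wanted=["width"]))
```

Raising on unknown column names, instead of silently dropping them, mirrors how a multiselect box only offers columns that exist in the chosen subset.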

### Data explorer Tab
The data explorer shows an interactive table of the loaded subset DataFrame, with one sample per row. The table can be used to browse through a partition of the data, to visualize images inside image columns and more.

### Numeric analysis Tab
The numerical analysis tab shows statistics of the numerical columns of the loaded subset (mean, std, percentiles, ...) in a table. In the second part of the tab, the user can choose one of the numerical columns for in-depth exploration of the data by visualizing it in a variety of interactive plots.

![data explorer](../art/data_explorer/data_explorer_numeric_analysis.png)
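
To make the kind of statistics shown in this tab concrete, here is a minimal sketch that computes mean, standard deviation and quartiles per numeric column using only the Python standard library. It is a stand-in for illustration, not the explorer's implementation: the sample rows, column names and the `numeric_summary` helper are hypothetical.

```python
import statistics

# Hypothetical samples from a loaded subset; column names are illustrative.
rows = [
    {"width": 640, "height": 480, "caption": "a"},
    {"width": 800, "height": 600, "caption": "b"},
    {"width": 1024, "height": 768, "caption": "c"},
    {"width": 512, "height": 512, "caption": "d"},
]


def numeric_summary(rows):
    """Summarize every column whose values are all ints or floats."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if not all(isinstance(value, (int, float)) for value in values):
            continue  # skip non-numeric columns such as captions
        # statistics.quantiles with n=4 returns the three quartile cut points.
        q1, median, q3 = statistics.quantiles(values, n=4)
        summary[column] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "std": statistics.stdev(values),
            "min": min(values),
            "25%": q1,
            "50%": median,
            "75%": q3,
            "max": max(values),
        }
    return summary


for column, stats in numeric_summary(rows).items():
    print(column, stats)
```

Note that quartile values depend on the interpolation method, so other tools may report slightly different percentiles for the same data.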

### Image explorer Tab
The image explorer tab enables the user to choose one of the image columns and analyse these images.
The image explorer tab enables the user to choose one of the image columns and analyse these images.

![data explorer](../art/data_explorer/image_explorer.png)
