diff --git a/content/tutorials/_index.md b/content/tutorials/_index.md index b1554a53d..66c409328 100644 --- a/content/tutorials/_index.md +++ b/content/tutorials/_index.md @@ -28,7 +28,7 @@ See the following tutorials for step by step information on how to use popular M - [PyTorch](/tutorials/pytorch) - [PyTorch Lightning](/tutorials/lightning) -- [HuggingFace πŸ€— Transformers](/tutorials/huggingface) +- [HuggingFace Transformers](/tutorials/huggingface) - Tensorflow - [Track experiments](/tutorials/tensorflow) - [Tune hyperparameters](/tutorials/tensorflow_sweeps) diff --git a/content/tutorials/artifacts.md b/content/tutorials/artifacts.md index d0fa0931b..b350d80fd 100644 --- a/content/tutorials/artifacts.md +++ b/content/tutorials/artifacts.md @@ -13,7 +13,7 @@ Follow along with a [video tutorial](http://tiny.cc/wb-artifacts-video). ## About artifacts -An artifact, like a Greek [amphora 🏺](https://en.wikipedia.org/wiki/Amphora), +An artifact, like a Greek [amphora](https://en.wikipedia.org/wiki/Amphora), is a produced object -- the output of a process. In ML, the most important artifacts are _datasets_ and _models_. @@ -30,7 +30,9 @@ where a training run takes in a dataset and produces a model. Since one run can use another run's output as an input, `Artifact`s and `Run`s together form a directed graph (a bipartite [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph), with nodes for `Artifact`s and `Run`s and arrows that connect a `Run` to the `Artifact`s it consumes or produces. -# 0️⃣ Install and Import +## Use artifacts to track models and datatsets + +### Install and Import Artifacts are part of our Python library, starting with version `0.9.2`. @@ -49,7 +51,7 @@ import os import wandb ``` -# 1️⃣ Log a Dataset +### Log a Dataset First, let's define some Artifacts. @@ -160,7 +162,7 @@ def load_and_log(): load_and_log() ``` -### πŸš€ `wandb.init` +#### `wandb.init` When we make the `Run` that's going to produce the `Artifact`s, @@ -180,7 +182,7 @@ This keeps the graph of your Artifacts nice and tidy. > **Rule of πŸ‘**: the `job_type` should be descriptive and correspond to a single step of your pipeline. Here, we separate out `load`ing data from `preprocess`ing data. -### 🏺 `wandb.Artifact` +#### `wandb.Artifact` To log something as an `Artifact`, we have to first make an `Artifact` object. @@ -202,7 +204,7 @@ The `metadata` just needs to be serializable to JSON. > **Rule of πŸ‘**: the `metadata` should be as descriptive as possible. -### 🐣 `artifact.new_file` and ✍️ `run.log_artifact` +#### `artifact.new_file` and `run.log_artifact` Once we've made an `Artifact` object, we need to add files to it. @@ -227,7 +229,7 @@ including any `Artifact`s that got logged. We'll see some examples that make better use of the other components of the Run page below. -# 2️⃣ Use a Logged Dataset Artifact +### Use a Logged Dataset Artifact `Artifact`s in W&B, unlike artifacts in museums, are designed to be _used_, not just stored. @@ -325,7 +327,7 @@ steps = {"normalize": True, preprocess_and_log(steps) ``` -### βœ”οΈ `run.use_artifact` +#### `run.use_artifact` These steps are simpler. The consumer just needs to know the `name` of the `Artifact`, plus a bit more. @@ -342,8 +344,7 @@ so the `Artifact` we want is `mnist-raw:latest`. Use custom `alias`es like `latest` or `best` when you want an `Artifact` that satisifies some property -### πŸ“₯ `artifact.download` - +#### `artifact.download` Now, you may be worrying about the `download` call. 
If we download another copy, won't that double the burden on memory? @@ -363,10 +364,10 @@ Check out its contents with `!tree artifacts`: !tree artifacts ``` -### 🌐 The Artifacts page on [wandb.ai](https://wandb.ai) +#### The Artifacts page Now that we've logged and used an `Artifact`, -let's check out the Artifacts tab on the Run page. +let's check out the Artifacts tab on the Run page. Navigate to the Run page URL from the `wandb` output and select the "Artifacts" tab from the left sidebar @@ -384,7 +385,7 @@ with the `type`s of `Artifact`s and the `job_type`s of `Run` as the two types of nodes, with arrows to represent consumption and production. -# 3️⃣ Log a Model +### Log a Model That's enough to see how the API for `Artifact`s works, but let's follow this example through to the end of the pipeline @@ -478,7 +479,7 @@ model_config = {"hidden_layer_sizes": [32, 64], build_model_and_log(model_config) ``` -### βž• `artifact.add_file` +#### `artifact.add_file` Instead of simultaneously writing a `new_file` and adding it to the `Artifact`, @@ -489,7 +490,7 @@ and then `add` them to the `Artifact` in another. > **Rule of πŸ‘**: use `new_file` when you can, to prevent duplication. -# 4️⃣ Use a Logged Model Artifact +#### Use a Logged Model Artifact Just like we could call `use_artifact` on a `dataset`, we can call it on our `initialized_model` diff --git a/content/tutorials/experiments.md b/content/tutorials/experiments.md index fdba8df77..f21418b81 100644 --- a/content/tutorials/experiments.md +++ b/content/tutorials/experiments.md @@ -90,7 +90,7 @@ The following image shows what a dashboard can look like: Now that we know how to integrate W&B into a psuedo machine learning training loop, let's track a machine learning experiment using a basic PyTorch neural network. The following code will also upload model checkpoints to W&B that you can then share with other teams in your organization. -## Track a machine learning experiment using Pytorch +## Track a machine learning experiment using Pytorch The following code cell defines and trains a simple MNIST classifier. During training, you will see W&B prints out URLs. Click on the project page link to see your results stream in live to a W&B project. diff --git a/content/tutorials/integration-tutorials/huggingface.md b/content/tutorials/integration-tutorials/huggingface.md index d5d171644..60c6b92ef 100644 --- a/content/tutorials/integration-tutorials/huggingface.md +++ b/content/tutorials/integration-tutorials/huggingface.md @@ -74,7 +74,7 @@ Optionally, we can set environment variables to customize W&B logging. See [docu %env WANDB_WATCH=all ``` -# πŸ‘Ÿ Train the model +## Train the model Next, call the downloaded training script [run_glue.py](https://huggingface.co/transformers/examples.html#glue) and see training automatically get tracked to the Weights & Biases dashboard. This script fine-tunes BERT on the Microsoft Research Paraphrase Corpusβ€” pairs of sentences with human annotations indicating whether they are semantically equivalent. @@ -109,7 +109,7 @@ Here's an example comparing [BERT vs DistilBERT](https://app.wandb.ai/jack-morri {{< img src="/images/tutorials/huggingface-comparearchitectures.gif" alt="" >}} -### Track key information effortlessly by default +## Track key information effortlessly by default Weights & Biases saves a new run for each experiment. 
Here's the information that gets saved by default: - **Hyperparameters**: Settings for your model are saved in Config - **Model Metrics**: Time series data of metrics streaming in are saved in Log diff --git a/content/tutorials/integration-tutorials/keras.md b/content/tutorials/integration-tutorials/keras.md index 3220453a0..d1235f66f 100644 --- a/content/tutorials/integration-tutorials/keras.md +++ b/content/tutorials/integration-tutorials/keras.md @@ -14,7 +14,7 @@ Use Weights & Biases for machine learning experiment tracking, dataset versionin This Colab notebook introduces the `WandbMetricsLogger` callback. Use this callback for [Experiment Tracking](/guides/track). It will log your training and validation metrics along with system metrics to Weights and Biases. -## 🌴 Setup and Installation +## Setup and Installation First, let us install the latest version of Weights and Biases. We will then authenticate this colab instance to use W&B. @@ -43,31 +43,31 @@ If this is your first time using W&B or you are not logged in, the link that app wandb.login() ``` -## 🌳 Hyperparameters +## Hyperparameters Use of proper config system is a recommended best practice for reproducible machine learning. We can track the hyperparameters for every experiment using W&B. In this colab we will be using simple Python `dict` as our config system. ```python configs = dict( - num_classes = 10, - shuffle_buffer = 1024, - batch_size = 64, - image_size = 28, - image_channels = 1, - earlystopping_patience = 3, - learning_rate = 1e-3, - epochs = 10 + num_classes=10, + shuffle_buffer=1024, + batch_size=64, + image_size=28, + image_channels=1, + earlystopping_patience=3, + learning_rate=1e-3, + epochs=10, ) ``` -## 🍁 Dataset +## Dataset In this colab, we will be using [CIFAR100](https://www.tensorflow.org/datasets/catalog/cifar100) dataset from TensorFlow Dataset catalog. We aim to build a simple image classification pipeline using TensorFlow/Keras. 
```python -train_ds, valid_ds = tfds.load('fashion_mnist', split=['train', 'test']) +train_ds, valid_ds = tfds.load("fashion_mnist", split=["train", "test"]) ``` @@ -90,14 +90,10 @@ def parse_data(example): def get_dataloader(ds, configs, dataloader_type="train"): dataloader = ds.map(parse_data, num_parallel_calls=AUTOTUNE) - if dataloader_type=="train": + if dataloader_type == "train": dataloader = dataloader.shuffle(configs["shuffle_buffer"]) - - dataloader = ( - dataloader - .batch(configs["batch_size"]) - .prefetch(AUTOTUNE) - ) + + dataloader = dataloader.batch(configs["batch_size"]).prefetch(AUTOTUNE) return dataloader ``` @@ -108,17 +104,21 @@ trainloader = get_dataloader(train_ds, configs) validloader = get_dataloader(valid_ds, configs, dataloader_type="valid") ``` -# πŸŽ„ Model +## Model ```python def get_model(configs): - backbone = tf.keras.applications.mobilenet_v2.MobileNetV2(weights='imagenet', include_top=False) + backbone = tf.keras.applications.mobilenet_v2.MobileNetV2( + weights="imagenet", include_top=False + ) backbone.trainable = False - inputs = layers.Input(shape=(configs["image_size"], configs["image_size"], configs["image_channels"])) + inputs = layers.Input( + shape=(configs["image_size"], configs["image_size"], configs["image_channels"]) + ) resize = layers.Resizing(32, 32)(inputs) - neck = layers.Conv2D(3, (3,3), padding="same")(resize) + neck = layers.Conv2D(3, (3, 3), padding="same")(resize) preprocess_input = tf.keras.applications.mobilenet.preprocess_input(neck) x = backbone(preprocess_input) x = layers.GlobalAveragePooling2D()(x) @@ -134,33 +134,35 @@ model = get_model(configs) model.summary() ``` -## 🌿 Compile Model +## Compile Model ```python model.compile( - optimizer = "adam", - loss = "categorical_crossentropy", - metrics = ["accuracy", tf.keras.metrics.TopKCategoricalAccuracy(k=5, name='top@5_accuracy')] + optimizer="adam", + loss="categorical_crossentropy", + metrics=[ + "accuracy", + tf.keras.metrics.TopKCategoricalAccuracy(k=5, name="top@5_accuracy"), + ], ) ``` -## 🌻 Train +## Train ```python # Initialize a W&B run -run = wandb.init( - project = "intro-keras", - config = configs -) +run = wandb.init(project="intro-keras", config=configs) # Train your model model.fit( trainloader, - epochs = configs["epochs"], - validation_data = validloader, - callbacks = [WandbMetricsLogger(log_freq=10)] # Notice the use of WandbMetricsLogger here + epochs=configs["epochs"], + validation_data=validloader, + callbacks=[ + WandbMetricsLogger(log_freq=10) + ], # Notice the use of WandbMetricsLogger here ) # Close the W&B run diff --git a/content/tutorials/integration-tutorials/keras_models.md b/content/tutorials/integration-tutorials/keras_models.md index 583808e91..19dde3ad1 100644 --- a/content/tutorials/integration-tutorials/keras_models.md +++ b/content/tutorials/integration-tutorials/keras_models.md @@ -13,7 +13,7 @@ Use Weights & Biases for machine learning experiment tracking, dataset versionin This Colab notebook introduces the `WandbModelCheckpoint` callback. Use this callback to log your model checkpoints to Weight and Biases [Artifacts](/guides/artifacts). -## 🌴 Setup and Installation +## Setup and Installation First, let us install the latest version of Weights and Biases. We will then authenticate this colab instance to use W&B. 
@@ -43,7 +43,7 @@ If this is your first time using W&B or you are not logged in, the link that app wandb.login() ``` -## 🌳 Hyperparameters +## Hyperparameters Use of proper config system is a recommended best practice for reproducible machine learning. We can track the hyperparameters for every experiment using W&B. In this colab we will be using simple Python `dict` as our config system. @@ -61,7 +61,7 @@ configs = dict( ) ``` -## 🍁 Dataset +## Dataset In this colab, we will be using [CIFAR100](https://www.tensorflow.org/datasets/catalog/cifar100) dataset from TensorFlow Dataset catalog. We aim to build a simple image classification pipeline using TensorFlow/Keras. @@ -108,7 +108,7 @@ trainloader = get_dataloader(train_ds, configs) validloader = get_dataloader(valid_ds, configs, dataloader_type="valid") ``` -## πŸŽ„ Model +## Model ```python @@ -134,7 +134,7 @@ model = get_model(configs) model.summary() ``` -## 🌿 Compile Model +## Compile Model ```python @@ -145,7 +145,7 @@ model.compile( ) ``` -## 🌻 Train +## Train ```python diff --git a/content/tutorials/integration-tutorials/keras_tables.md b/content/tutorials/integration-tutorials/keras_tables.md index 06ba98b89..ef31d83cb 100644 --- a/content/tutorials/integration-tutorials/keras_tables.md +++ b/content/tutorials/integration-tutorials/keras_tables.md @@ -13,7 +13,7 @@ Use Weights & Biases for machine learning experiment tracking, dataset versionin This Colab notebook introduces the `WandbEvalCallback` which is an abstract callback that be inherited to build useful callbacks for model prediction visualization and dataset visualization. -## 🌴 Setup and Installation +## Setup and Installation First, let us install the latest version of Weights and Biases. We will then authenticate this colab instance to use W&B. @@ -45,7 +45,7 @@ If this is your first time using W&B or you are not logged in, the link that app wandb.login() ``` -## 🌳 Hyperparameters +## Hyperparameters Use of proper config system is a recommended best practice for reproducible machine learning. We can track the hyperparameters for every experiment using W&B. In this colab we will be using simple Python `dict` as our config system. @@ -63,7 +63,7 @@ configs = dict( ) ``` -## 🍁 Dataset +## Dataset In this colab, we will be using [CIFAR100](https://www.tensorflow.org/datasets/catalog/cifar100) dataset from TensorFlow Dataset catalog. We aim to build a simple image classification pipeline using TensorFlow/Keras. @@ -110,7 +110,7 @@ trainloader = get_dataloader(train_ds, configs) validloader = get_dataloader(valid_ds, configs, dataloader_type="valid") ``` -## πŸŽ„ Model +## Model ```python @@ -140,7 +140,7 @@ model = get_model(configs) model.summary() ``` -## 🌿 Compile Model +## Compile Model ```python @@ -154,7 +154,7 @@ model.compile( ) ``` -## πŸ’« `WandbEvalCallback` +## `WandbEvalCallback` The `WandbEvalCallback` is an abstract base class to build Keras callbacks for primarily model prediction visualization and secondarily dataset visualization. @@ -172,7 +172,7 @@ As an example, we have implemented `WandbClfEvalCallback` below for an image cla - performs inference and logs the prediction (`pred_table`) to W&B on every epoch end. -## ✨ How the memory footprint is reduced? +## How the memory footprint is reduced We log the `data_table` to W&B when the `on_train_begin` method is ivoked. Once it's uploaded as a W&B Artifact, we get a reference to this table which can be accessed using `data_table_ref` class variable. 
The `data_table_ref` is a 2D list that can be indexed like `self.data_table_ref[idx][n]` where `idx` is the row number while `n` is the column number. Let's see the usage in the example below. @@ -214,7 +214,7 @@ class WandbClfEvalCallback(WandbEvalCallback): return preds ``` -## 🌻 Train +## Train ```python diff --git a/content/tutorials/integration-tutorials/lightning.md b/content/tutorials/integration-tutorials/lightning.md index eee8a63d3..194bc5794 100644 --- a/content/tutorials/integration-tutorials/lightning.md +++ b/content/tutorials/integration-tutorials/lightning.md @@ -21,6 +21,7 @@ pip install wandb -qU ```python import lightning.pytorch as pl + # your favorite machine learning tracking tool from lightning.pytorch.loggers import WandbLogger @@ -43,7 +44,7 @@ Now you'll need to log in to your wandb account. wandb.login() ``` -## πŸ”§ DataModule - The Data Pipeline we Deserve +## DataModule - The Data Pipeline we Deserve DataModules are a way of decoupling data-related hooks from the LightningModule so you can develop dataset agnostic models. @@ -95,7 +96,7 @@ class CIFAR10DataModule(pl.LightningDataModule): return DataLoader(self.cifar_test, batch_size=self.batch_size) ``` -## πŸ“± Callbacks +## Callbacks A callback is a self-contained program that can be reused across projects. PyTorch Lightning comes with few [built-in callbacks](https://lightning.ai/docs/pytorch/latest/extensions/callbacks.html#built-in-callbacks) which are regularly used. Learn more about callbacks in PyTorch Lightning [here](https://lightning.ai/docs/pytorch/latest/extensions/callbacks.html). @@ -135,7 +136,7 @@ class ImagePredictionLogger(pl.callbacks.Callback): ``` -## 🎺 LightningModule - Define the System +## LightningModule - Define the System The LightningModule defines a system and not a model. Here a system groups all the research code into a single class to make it self-contained. `LightningModule` organizes your PyTorch code into 5 sections: - Computations (`__init__`). @@ -242,7 +243,7 @@ class LitModel(pl.LightningModule): ``` -## πŸš‹ Train and Evaluate +## Train and Evaluate Now that we have organized our data pipeline using `DataModule` and model architecture+training loop using `LightningModule`, the PyTorch Lightning `Trainer` automates everything else for us. @@ -288,7 +289,7 @@ trainer = pl.Trainer(max_epochs=2, checkpoint_callback], ) -# Train the model βš‘πŸš…βš‘ +# Train the model trainer.fit(model, dm) # Evaluate the model on the held-out test set ⚑⚑ diff --git a/content/tutorials/integration-tutorials/monai_3d_segmentation.md b/content/tutorials/integration-tutorials/monai_3d_segmentation.md index 044f54c19..772a78c89 100644 --- a/content/tutorials/integration-tutorials/monai_3d_segmentation.md +++ b/content/tutorials/integration-tutorials/monai_3d_segmentation.md @@ -26,7 +26,7 @@ This tutorial demonstrates how to construct a training workflow of multi-labels 4. Log and version model checkpoints as model artifacts on Weights & Biases. 5. Visualize and compare the predictions on the validation dataset using `wandb.Table` and interactive segmentation overlay on Weights & Biases. -## 🌴 Setup and Installation +## Setup and Installation First, install the latest version of both MONAI and Weights and Biases. @@ -75,7 +75,7 @@ Then, authenticate the Colab instance to use W&B. wandb.login() ``` -## 🌳 Initialize a W&B Run +## Initialize a W&B Run Start a new W&B run to start tracking the experiment. 
@@ -118,7 +118,7 @@ os.makedirs(config.dataset_dir, exist_ok=True) os.makedirs(config.checkpoint_dir, exist_ok=True) ``` -## πŸ’Ώ Data Loading and Transformation +## Data Loading and Transformation Here, use the `monai.transforms` API to create a custom transform that converts the multi-classes labels into multi-labels segmentation task in one-hot format. @@ -198,7 +198,7 @@ val_transform = Compose( ) ``` -### 🍁 The Dataset +### The Dataset The dataset used for this experiment comes from http://medicaldecathlon.com/. It uses multi-modal multi-site MRI data (FLAIR, T1w, T1gd, T2w) to segment Gliomas, necrotic/active tumour, and oedema. The dataset consists of 750 4D volumes (484 Training + 266 Testing). @@ -229,7 +229,7 @@ val_dataset = DecathlonDataset( **Note:** Instead of applying the `train_transform` to the `train_dataset`, apply `val_transform` to both the training and validation datasets. This is because, before training, you would be visualizing samples from both the splits of the dataset. {{% /alert %}} -### πŸ“Έ Visualizing the Dataset +### Visualizing the Dataset Weights & Biases supports images, video, audio, and more. You can log rich media to explore your results and visually compare our runs, models, and datasets. Use the [segmentation mask overlay system](/guides/track/log/media#image-overlays-in-tables) to visualize our data volumes. To log segmentation masks in [tables](/guides/tables), you must provide a `wandb.Image` object for each row in the table. @@ -375,7 +375,7 @@ Open an image and see how you can interact with each of the segmentation masks u **Note:** The labels in the dataset consist of non-overlapping masks across classes. The overlay logs the labels as separate masks in the overlay. {{% /alert %}} -### πŸ›« Loading the Data +### Loading the Data Create the PyTorch DataLoaders for loading the data from the datasets. Before creating the DataLoaders, set the `transform` for `train_dataset` to `train_transform` to pre-process and transform the data for training. @@ -400,7 +400,7 @@ val_loader = DataLoader( ) ``` -## πŸ€– Creating the Model, Loss, and Optimizer +## Creating the Model, Loss, and Optimizer This tutorial crates a `SegResNet` model based on the paper [3D MRI brain tumor segmentation using auto-encoder regularization](https://arxiv.org/pdf/1810.11654.pdf). The `SegResNet` model that comes implemented as a PyTorch Module as part of the `monai.networks` API as well as an optimizer and learning rate scheduler. @@ -467,7 +467,7 @@ def inference(model, input): return _compute(input) ``` -## 🚝 Training and Validation +## Training and Validation Before training, define the metric properties which will later be logged with `wandb.log()` for tracking the training and validation experiments. @@ -487,7 +487,7 @@ metric_values_whole_tumor = [] metric_values_enhanced_tumor = [] ``` -### 🍭 Execute Standard PyTorch Training Loop +### Execute Standard PyTorch Training Loop ```python # Define a W&B Artifact object @@ -593,7 +593,7 @@ Navigate to the artifacts tab in the W&B run dashboard to access the different v |:--:| | **An example of model checkpoints logging and versioning on W&B.** | -## πŸ”± Inference +## Inference Using the artifacts interface, you can select which version of the artifact is the best model checkpoint, in this case, the mean epoch-wise training loss. You can also explore the entire lineage of the artifact and use the version that you need. 
@@ -613,7 +613,7 @@ model.load_state_dict(torch.load(os.path.join(model_artifact_dir, "model.pth"))) model.eval() ``` -### πŸ“Έ Visualizing Predictions and Comparing with the Ground Truth Labels +### Visualizing Predictions and Comparing with the Ground Truth Labels Create another utility function to visualize the predictions of the pre-trained model and compare them with the corresponding ground-truth segmentation mask using the interactive segmentation mask overlay,. diff --git a/content/tutorials/integration-tutorials/pytorch.md b/content/tutorials/integration-tutorials/pytorch.md index 20276fcca..e88e21f05 100644 --- a/content/tutorials/integration-tutorials/pytorch.md +++ b/content/tutorials/integration-tutorials/pytorch.md @@ -12,16 +12,12 @@ Use [Weights & Biases](https://wandb.com) for machine learning experiment tracki {{< img src="/images/tutorials/huggingface-why.png" alt="" >}} -## What this notebook covers: +## What this notebook covers We show you how to integrate Weights & Biases with your PyTorch code to add experiment tracking to your pipeline. -## The resulting interactive W&B dashboard will look like: - {{< img src="/images/tutorials/pytorch.png" alt="" >}} -## In pseudocode, what we'll do is: - ```python # import the library import wandb @@ -48,8 +44,6 @@ model.to_onnx() wandb.save("model.onnx") ``` - - Follow along with a [video tutorial](http://wandb.me/pytorch-video). **Note**: Sections starting with _Step_ are all you need to integrate W&B in an existing pipeline. The rest just loads data and defines a model. @@ -108,9 +102,9 @@ import wandb wandb.login() ``` -# πŸ‘©β€πŸ”¬ Define the Experiment and Pipeline +## Define the Experiment and Pipeline -## 2️⃣ Step 2: Track metadata and hyperparameters with `wandb.init` +### Track metadata and hyperparameters with `wandb.init` Programmatically, the first thing we do is define our experiment: what are the hyperparameters? what metadata is associated with this run? @@ -211,7 +205,7 @@ def make(config): return model, train_loader, test_loader, criterion, optimizer ``` -# πŸ“‘ Define the Data Loading and Model +### Define the Data Loading and Model Now, we need to specify how the data is loaded and what the model looks like. @@ -277,13 +271,13 @@ class ConvNet(nn.Module): return out ``` -# πŸ‘Ÿ Define Training Logic +### Define Training Logic Moving on in our `model_pipeline`, it's time to specify how we `train`. Two `wandb` functions come into play here: `watch` and `log`. -### 3️⃣ Step 3. Track gradients with `wandb.watch` and everything else with `wandb.log` +## Track gradients with `wandb.watch` and everything else with `wandb.log` `wandb.watch` will log the gradients and the parameters of your model, every `log_freq` steps of training. @@ -354,7 +348,7 @@ def train_log(loss, example_ct, epoch): print(f"Loss after {str(example_ct).zfill(5)} examples: {loss:.3f}") ``` -# πŸ§ͺ Define Testing Logic +### Define Testing Logic Once the model is done training, we want to test it: run it against some fresh data from production, perhaps, @@ -362,7 +356,7 @@ or apply it to some hand-curated examples. -#### 4️⃣ Optional Step 4: Call `wandb.save` +## (Optional) Call `wandb.save` This is also a great time to save the model's architecture and final parameters to disk. 
@@ -401,7 +395,7 @@ def test(model, test_loader): wandb.save("model.onnx") ``` -# πŸƒβ€β™€οΈ Run training and watch your metrics live on wandb.ai +### Run training and watch your metrics live on wandb.ai Now that we've defined the whole pipeline and slipped in those few lines of W&B code, @@ -428,7 +422,7 @@ we'll also print a summary of the results in the cell output. model = model_pipeline(config) ``` -# 🧹 Test Hyperparameters with Sweeps +### Test Hyperparameters with Sweeps We only looked at a single set of hyperparameters in this example. But an important part of most ML workflows is iterating over @@ -453,11 +447,11 @@ That's all there is to running a hyperparameter sweep. {{< img src="/images/tutorials/pytorch-2.png" alt="" >}} -# πŸ–ΌοΈ Example Gallery +## Example Gallery See examples of projects tracked and visualized with W&B in our [Gallery β†’](https://app.wandb.ai/gallery) -# πŸ€“ Advanced Setup +## Advanced Setup 1. [Environment variables](/guides/hosting/env-vars): Set API keys in environment variables so you can run training on a managed cluster. 2. [Offline mode](../support/run_wandb_offline.md): Use `dryrun` mode to train offline and sync results later. 3. [On-prem](/guides/hosting/hosting-options/self-managed): Install W&B in a private cloud or air-gapped servers in your own infrastructure. We have local installations for everyone from academics to enterprise teams. diff --git a/content/tutorials/integration-tutorials/tensorflow.md b/content/tutorials/integration-tutorials/tensorflow.md index 52cb205cc..171c08c46 100644 --- a/content/tutorials/integration-tutorials/tensorflow.md +++ b/content/tutorials/integration-tutorials/tensorflow.md @@ -17,8 +17,7 @@ Use Weights & Biases for machine learning experiment tracking, dataset versionin * Easy integration of Weights and Biases with your TensorFlow pipeline for experiment tracking. * Computing metrics with `keras.metrics` * Using `wandb.log` to log those metrics in your custom training loop. - -## The interactive W&B dashboard will look like this: + {{< img src="/images/tutorials/tensorflow/dashboard.png" alt="dashboard" >}} @@ -35,9 +34,9 @@ import pandas as pd import matplotlib.pyplot as plt ``` -# πŸš€ Install, Import, Login +## Install, Import, Login -## Step 0️⃣: Install W&B +### Install W&B ```python @@ -45,7 +44,7 @@ import matplotlib.pyplot as plt !pip install wandb ``` -## Step 1️⃣: Import W&B and login +### Import W&B and login ```python @@ -57,7 +56,7 @@ wandb.login() > Side note: If this is your first time using W&B or you are not logged in, the link that appears after running `wandb.login()` will take you to sign-up/login page. Signing up is as easy as one click. -# πŸ‘©β€πŸ³ Prepare Dataset +### Prepare Dataset ```python @@ -75,7 +74,7 @@ val_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)) val_dataset = val_dataset.batch(BATCH_SIZE) ``` -# 🧠 Define the Model and the Training Loop +## Define the Model and the Training Loop ```python @@ -113,7 +112,7 @@ def test_step(x, y, model, loss_fn, val_acc_metric): return loss_value ``` -## Step 2️⃣: Add `wandb.log` to your training loop +## Add `wandb.log` to your training loop ```python @@ -160,9 +159,9 @@ def train(train_dataset, val_dataset, model, optimizer, 'val_acc':float(val_acc)}) ``` -# πŸ‘Ÿ Run Training +## Run Training -## Step 3️⃣: Call `wandb.init` to start a run +### Call `wandb.init` to start a run This lets us know you're launching an experiment, so we can give it a unique ID and a dashboard. 
@@ -210,11 +209,11 @@ train(train_dataset, run.finish() # In Jupyter/Colab, let us know you're finished! ``` -# πŸ‘€ Visualize Results +### Visualize Results Click on the [**run page**](/guides/runs/intro.md#view-logged-runs) link above to see your live results. -# 🧹 Sweep 101 +## Sweep 101 Use Weights & Biases Sweeps to automate hyperparameter optimization and explore the space of possible models. @@ -228,7 +227,7 @@ Use Weights & Biases Sweeps to automate hyperparameter optimization and explore {{< img src="/images/tutorials/tensorflow/sweeps.png" alt="Sweep result" >}} -# 🎨 Example Gallery +## Example Gallery See examples of projects tracked and visualized with W&B in our gallery of examples, [Fully Connected β†’](https://wandb.me/fc) @@ -239,7 +238,7 @@ See examples of projects tracked and visualized with W&B in our gallery of examp 4. **Notes**: Type notes in the table to track the changes between runs. 5. **Reports**: Take quick notes on progress to share with colleagues and make dashboards and snapshots of your ML projects. -## πŸ€“ Advanced Setup +## Advanced Setup 1. [Environment variables](/guides/hosting/env-vars): Set API keys in environment variables so you can run training on a managed cluster. 2. [Offline mode](../support/run_wandb_offline.md) 3. [On-prem](/guides/hosting/hosting-options/self-managed): Install W&B in a private cloud or air-gapped servers in your own infrastructure. We have local installations for everyone from academics to enterprise teams. diff --git a/content/tutorials/integration-tutorials/tensorflow_sweeps.md b/content/tutorials/integration-tutorials/tensorflow_sweeps.md index 7345ffbfb..d77fe380d 100644 --- a/content/tutorials/integration-tutorials/tensorflow_sweeps.md +++ b/content/tutorials/integration-tutorials/tensorflow_sweeps.md @@ -17,7 +17,7 @@ Use Weights & Biases Sweeps to automate hyperparameter optimization and explore {{< img src="/images/tutorials/tensorflow/sweeps.png" alt="" >}} -## πŸ€” Why Should I Use Sweeps? +## Why Should I Use Sweeps? * **Quick setup**: With just a few lines of code, you can run W&B sweeps. * **Transparent**: The project cites all algorithms used, and the [code is open source](https://github.com/wandb/wandb/blob/main/wandb/apis/public/sweeps.py). @@ -34,13 +34,9 @@ Use Weights & Biases Sweeps to automate hyperparameter optimization and explore **Note**: Sections starting with _Step_ are all you need to perform hyperparameter sweep in existing code. The rest of the code is there to set up a simple example. +## Install, Import, and Log in - - - -## πŸš€ Install, Import, and Log in - -### Step 0️⃣: Install W&B +### Install W&B ```bash @@ -48,7 +44,7 @@ The rest of the code is there to set up a simple example. !pip install wandb ``` -### Step 1️⃣: Import W&B and Login +### Import W&B and Login ```python @@ -73,7 +69,7 @@ wandb.login() > Side note: If this is your first time using W&B or you are not logged in, the link that appears after running `wandb.login()` will take you to sign-up/login page. Signing up is as easy as a few clicks. 
-## πŸ‘©β€πŸ³ Prepare Dataset +## Prepare Dataset ```python @@ -86,9 +82,7 @@ x_train = np.reshape(x_train, (-1, 784)) x_test = np.reshape(x_test, (-1, 784)) ``` -## 🧠 Define the Model and Training Loop - -## πŸ—οΈ Build a Simple Classifier MLP +## Build a Simple Classifier MLP ```python @@ -122,10 +116,7 @@ def test_step(x, y, model, loss_fn, val_acc_metric): return loss_value ``` -## πŸ” Write a Training Loop - -### Step 3️⃣: Log metrics with `wandb.log` - +## Write a Training Loop ```python def train( @@ -191,7 +182,7 @@ def train( ) ``` -### Step 4️⃣: Configure the Sweep +## Configure the Sweep This is where you will: * Define the hyperparameters you're sweeping over @@ -214,7 +205,7 @@ sweep_config = { } ``` -### Step 5️⃣: Wrap the Training Loop +## Wrap the Training Loop You'll need a function, like `sweep_train` below, that uses `wandb.config` to set the hyperparameters @@ -274,7 +265,7 @@ def sweep_train(config_defaults=None): ) ``` -### Step 6️⃣: Initialize Sweep and Run Agent +## Initialize Sweep and Run Agent ```python @@ -288,23 +279,23 @@ You can limit the number of total runs with the `count` parameter, we will limit wandb.agent(sweep_id, function=sweep_train, count=10) ``` -## πŸ‘€ Visualize Results +## Visualize Results Click on the **Sweep URL** link above to see your live results. -## 🎨 Example Gallery +## Example Gallery See examples of projects tracked and visualized with W&B in the [Gallery β†’](https://app.wandb.ai/gallery) -## πŸ“ Best Practices +## Best Practices 1. **Projects**: Log multiple runs to a project to compare them. `wandb.init(project="project-name")` 2. **Groups**: For multiple processes or cross validation folds, log each process as a runs and group them together. `wandb.init(group='experiment-1')` 3. **Tags**: Add tags to track your current baseline or production model. 4. **Notes**: Type notes in the table to track the changes between runs. 5. **Reports**: Take quick notes on progress to share with colleagues and make dashboards and snapshots of your ML projects. -## πŸ€“ Advanced Setup +## Advanced Setup 1. [Environment variables](/guides/hosting/env-vars): Set API keys in environment variables so you can run training on a managed cluster. 2. [Offline mode](../support/run_wandb_offline.md) 3. [On-prem](/guides/hosting/hosting-options/self-managed): Install W&B in a private cloud or air-gapped servers in your own infrastructure. Everyone from academics to enterprise teams use local installations. \ No newline at end of file diff --git a/content/tutorials/weave_models_registry.md b/content/tutorials/weave_models_registry.md index 9e70b8da4..19d37e61c 100644 --- a/content/tutorials/weave_models_registry.md +++ b/content/tutorials/weave_models_registry.md @@ -3,13 +3,11 @@ menu: tutorials: identifier: weave_models_registry parent: weave-and-models-tutorials -title: Weave and Models +title: Weave and Models integration demo --- {{< cta-button colabLink="https://colab.research.google.com/drive/1Uqgel6cNcGdP7AmBXe2pR9u6Dejggsh8?usp=sharing" >}} -# Models and Weave integration demo - This notebook shows how to use W&B Weave together with W&B Models. Specifically, this example considers two different teams. * **The Model Team:** the model building team fine-tunes a new Chat Model (Llama 3.2) and saves it to the registry using **W&B Models**. @@ -31,7 +29,7 @@ The workflow covers the following steps: The `RagModel` referenced below is top-level `weave.Model` that you can consider a complete RAG app. 
It contains a `ChatModel`, Vector database, and a Prompt. The `ChatModel` is also another `weave.Model` which contains the code to download an artifact from the W&B Registry and it can change to support any other chat model as part of the `RagModel`. For more details see [the complete model on Weave](https://wandb.ai/wandb-smle/weave-cookboook-demo/weave/evaluations?peekPath=%2Fwandb-smle%2Fweave-cookboook-demo%2Fobjects%2FRagModel%2Fversions%2Fx7MzcgHDrGXYHHDQ9BA8N89qDwcGkdSdpxH30ubm8ZM%3F%26). -# 1. Setup +## 1. Setup First, install `weave` and `wandb`, then log in with an API key. You can create and view your API keys at https://wandb.ai/settings. ```bash @@ -50,7 +48,7 @@ wandb.login() weave.init(ENTITY + "/" + PROJECT) ``` -# 2. Make `ChatModel` based on Artifact +## 2. Make `ChatModel` based on Artifact Retrieve the fine-tuned chat model from the Registry and create a `weave.Model` from it to directly plug into the [`RagModel`](https://wandb.ai/wandb-smle/weave-cookboook-demo/weave/object-versions?filter=%7B%22objectName%22%3A%22RagModel%22%7D&peekPath=%2Fwandb-smle%2Fweave-cookboook-demo%2Fobjects%2FRagModel%2Fversions%2FcqRaGKcxutBWXyM0fCGTR1Yk2mISLsNari4wlGTwERo%3F%26) in the next step. It takes in the same parameters as the existing [ChatModel](https://wandb.ai/wandb-smle/weave-cookboook-demo/weave/object-versions?filter=%7B%22objectName%22%3A%22RagModel%22%7D&peekPath=%2Fwandb-smle%2Fweave-rag-experiments%2Fobjects%2FChatModelRag%2Fversions%2F2mhdPb667uoFlXStXtZ0MuYoxPaiAXj3KyLS1kYRi84%3F%26) just the `init` and `predict` change. @@ -154,7 +152,7 @@ new_chat_model = UnslothLoRAChatModel( ) ``` - # 3. Integrate new `ChatModel` version into `RagModel` +## 3. Integrate new `ChatModel` version into `RagModel` Building a RAG app from a fine-tuned chat model can provide several advantages, particularly in enhancing the performance and versatility of conversational AI systems. Now retrieve the [`RagModel`](https://wandb.ai/wandb-smle/weave-cookboook-demo/weave/object-versions?filter=%7B%22objectName%22%3A%22RagModel%22%7D&peekPath=%2Fwandb-smle%2Fweave-cookboook-demo%2Fobjects%2FRagModel%2Fversions%2FcqRaGKcxutBWXyM0fCGTR1Yk2mISLsNari4wlGTwERo%3F%26) (you can fetch the weave ref for the current `RagModel` from the use tab as shown in the image below) from the existing Weave project and exchange the `ChatModel` to the new one. There is no need to change or re-create any of the other components (VDB, prompts, etc.)! @@ -176,7 +174,7 @@ PUB_REFERENCE = weave.publish(RagModel, "RagModel") await RagModel.predict("When was the first conference on climate change?") ``` -# 4. Run new `weave.Evaluation` connecting to the existing models run +## 4. Run new `weave.Evaluation` connecting to the existing models run Finally, evaluate the new `RagModel` on the existing `weave.Evaluation`. To make the integration as easy as possible, include the following changes. From a Models perspective: @@ -194,10 +192,10 @@ climate_rag_eval = weave.ref(WEAVE_EVAL).get() with weave.attributes({"wandb-run-id": wandb.run.id}): # use .call attribute to retrieve both the result and the call in order to save eval trace to Models - summary, call = await climate_rag_eval.evaluate.call(climate_rag_eval, `RagModel`) + summary, call = await climate_rag_eval.evaluate.call(climate_rag_eval, ` RagModel `) ``` -# 5. Save the new RAG model on the Registry +## 5. 
Save the new RAG model on the Registry
 In order to effectively share the new RAG Model, push it to the Registry as a reference artifact adding in the weave version as an alias.
 ```python
diff --git a/content/tutorials/workspaces.md b/content/tutorials/workspaces.md
index 8a675b8ba..31ceddc48 100644
--- a/content/tutorials/workspaces.md
+++ b/content/tutorials/workspaces.md
@@ -14,8 +14,7 @@ Organize and visualize your machine learning experiments more effectively by pro
 
 In this tutorial you will see how to use `wandb-workspaces` to create and customize workspaces by defining configurations, set panel layouts, and organize sections.
 
-
-### How to use this notebook
+## How to use this notebook
 
 * Run each cell one at a time.
 * Copy and paste the URL that is printed after you run a cell to view the changes made to the workspace.