zenml-io · avishniakov · Aug 27, 2024 · Aug 5, 2024 · Aug 5, 2024 · Aug 6, 2024
diff --git a/.typos.toml b/.typos.toml
@@ -35,6 +35,7 @@ daa = "daa"
 arange = "arange"
 cachable = "cachable"
 OT = "OT"
+cll = "cll"
 
 [default]
 locale = "en-us"
diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md
@@ -731,7 +731,7 @@ by adding support for `Schedule.start_time` to the HyperAI orchestrator.
 ## What's Changed
 * Really run migration testing by @avishniakov in https://github.com/zenml-io/zenml/pull/2562
 * Interact with feature gate by @AlexejPenner in https://github.com/zenml-io/zenml/pull/2492
-* Allow for logs to be unformatted / without colours by @strickvl in https://github.com/zenml-io/zenml/pull/2544
+* Allow for logs to be unformatted / without colors by @strickvl in https://github.com/zenml-io/zenml/pull/2544
 * Add VS Code extension to README / docs by @strickvl in https://github.com/zenml-io/zenml/pull/2568
 * Allow loading of artifacts without needing to activate the artifact store (again) by @avishniakov in https://github.com/zenml-io/zenml/pull/2545
 * Minor fix by @htahir1 in https://github.com/zenml-io/zenml/pull/2578
@@ -1302,7 +1302,7 @@ and some improvements to the Model Control Plane.
 ## What's Changed
 * Bump aquasecurity/trivy-action from 0.16.0 to 0.16.1 by @dependabot in https://github.com/zenml-io/zenml/pull/2244
 * Bump crate-ci/typos from 1.16.26 to 1.17.0 by @dependabot in https://github.com/zenml-io/zenml/pull/2245
-* Add YAML formatting standardisation to formatting & linting scripts by @strickvl in https://github.com/zenml-io/zenml/pull/2224
+* Add YAML formatting standardization to formatting & linting scripts by @strickvl in https://github.com/zenml-io/zenml/pull/2224
 * Remove text annotation by @strickvl in https://github.com/zenml-io/zenml/pull/2246
 * Add MariaDB migration testing by @strickvl in https://github.com/zenml-io/zenml/pull/2170
 * Delete artifact links from model version via Client, ModelVersion and API by @avishniakov in https://github.com/zenml-io/zenml/pull/2191
@@ -1383,7 +1383,7 @@ which allows you to define custom blocks for the Slack message.
 * Bump google-github-actions/auth from 1 to 2 by @dependabot in https://github.com/zenml-io/zenml/pull/2203
 * Bump aws-actions/amazon-ecr-login from 1 to 2 by @dependabot in https://github.com/zenml-io/zenml/pull/2200
 * Bump crate-ci/typos from 1.16.25 to 1.16.26 by @dependabot in https://github.com/zenml-io/zenml/pull/2207
-* Fix unreliable test behaviour when using hypothesis by @strickvl in https://github.com/zenml-io/zenml/pull/2208
+* Fix unreliable test behavior when using hypothesis by @strickvl in https://github.com/zenml-io/zenml/pull/2208
 * Added more pod spec properties for k8s orchestrator by @htahir1 in https://github.com/zenml-io/zenml/pull/2097
 * Fix API docs environment setup by @strickvl in https://github.com/zenml-io/zenml/pull/2190
 * Use placeholder runs to show pipeline runs in the dashboard without delay by @schustmi in https://github.com/zenml-io/zenml/pull/2048
@@ -2602,7 +2602,7 @@ improvements and bug fixes.
 * Delete extra word from `bentoml` docs by @strickvl in https://github.com/zenml-io/zenml/pull/1484
 * Remove top-level config from recommended repo structure by @schustmi in https://github.com/zenml-io/zenml/pull/1485
 * Bump `mypy` and `ruff` by @strickvl in https://github.com/zenml-io/zenml/pull/1481
-* ZenML Version Downgrade - Silence Warnning by @safoinme in https://github.com/zenml-io/zenml/pull/1477
+* ZenML Version Downgrade - Silence Warning by @safoinme in https://github.com/zenml-io/zenml/pull/1477
 * Update ZenServer recipes to include secret stores by @wjayesh in https://github.com/zenml-io/zenml/pull/1483
 * Fix alembic order by @schustmi in https://github.com/zenml-io/zenml/pull/1487
 * Fix source resolving for classes in notebooks by @schustmi in https://github.com/zenml-io/zenml/pull/1486

diff --git a/docs/book/component-guide/annotators/prodigy.md b/docs/book/component-guide/annotators/prodigy.md
@@ -76,7 +76,7 @@ workflow!
 With Prodigy, there is no need to specially start the annotator ahead of time
 like with [Label Studio](label-studio.md). Instead, just use Prodigy as per the
 [Prodigy docs](https://prodi.gy) and then you can use the ZenML wrapper / API to
-get your labelled data etc using our Python methods.
+get your labeled data etc using our Python methods.
 
 ZenML supports access to your data and annotations via the `zenml annotator ...`
 CLI command.

diff --git a/docs/book/component-guide/orchestrators/azureml.md b/docs/book/component-guide/orchestrators/azureml.md
@@ -114,7 +114,7 @@ from zenml import step, pipeline
 from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings
 
 azureml_settings = AzureMLOrchestratorSettings(
-  mode="serverless"  # It's the default behaviour
+  mode="serverless"  # It's the default behavior
 )
 
 @step

diff --git a/docs/book/component-guide/orchestrators/databricks.md b/docs/book/component-guide/orchestrators/databricks.md
@@ -124,7 +124,7 @@ The Databricks orchestrator only supports the `cron_expression`, in the `Schedul
 {% endhint %}
 
 {% hint style="warning" %}
-The Databricks orchestrator requires Java Timezone IDs to be used in the `cron_expression`. You can find a list of supported timezones [here](https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html), the timezone ID must be set in the settings of the orchestrator (see below for more imformation how to set settings for the orchestrator).
+The Databricks orchestrator requires Java Timezone IDs to be used in the `cron_expression`. You can find a list of supported timezones [here](https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html), the timezone ID must be set in the settings of the orchestrator (see below for more information how to set settings for the orchestrator).
 {% endhint %}
 
 **How to delete a scheduled pipeline**

diff --git a/docs/book/component-guide/orchestrators/skypilot-vm.md b/docs/book/component-guide/orchestrators/skypilot-vm.md
@@ -407,7 +407,7 @@ One of the key features of the SkyPilot VM Orchestrator is the ability to run ea
 
 The SkyPilot VM Orchestrator allows you to configure resources for each step individually. This means you can specify different VM types, CPU and memory requirements, and even use spot instances for certain steps while using on-demand instances for others.
 
-If no step-specific settings are specified, the orchestrator will use the resources specified in the orchestrator settings for each step and run the entire pipeline in one VM. If step-specific settings are specified, an orchestrator VM will be spun up first, which will subsequently spin out new VMs dependant on the step settings. You can disable this behavior by setting the `disable_step_based_settings` parameter to `True` in the orchestrator configuration, using the following command:
+If no step-specific settings are specified, the orchestrator will use the resources specified in the orchestrator settings for each step and run the entire pipeline in one VM. If step-specific settings are specified, an orchestrator VM will be spun up first, which will subsequently spin out new VMs dependent on the step settings. You can disable this behavior by setting the `disable_step_based_settings` parameter to `True` in the orchestrator configuration, using the following command:
 
 ```shell
 zenml orchestrator update <ORCHESTRATOR_NAME> --disable_step_based_settings=True

diff --git a/docs/book/how-to/build-pipelines/README.md b/docs/book/how-to/build-pipelines/README.md
@@ -41,7 +41,7 @@ When this pipeline is executed, the run of the pipeline gets logged to the ZenML
 at its DAG and all the associated metadata. To access the dashboard you need to have a ZenML server either running
 locally or remotely. See our documentation on this [here](../../getting-started/deploying-zenml/README.md).
 
-<figure><img src="../../.gitbook/assets/SimplePipelineDag.png" alt=""><figcaption><p>DAG representation in the ZenML Dahboard.</p></figcaption></figure>
+<figure><img src="../../.gitbook/assets/SimplePipelineDag.png" alt=""><figcaption><p>DAG representation in the ZenML Dashboard.</p></figcaption></figure>
 
 Check below for more advanced ways to build and interact with your pipeline.
 

diff --git a/docs/book/how-to/build-pipelines/schedule-a-pipeline.md b/docs/book/how-to/build-pipelines/schedule-a-pipeline.md
@@ -8,16 +8,24 @@ description: Learn how to set, pause and stop a schedule for pipelines.
 Schedules don't work for all orchestrators. Here is a list of all supported orchestrators.
 {% endhint %}
 
-| Orchestrator                                                                   | Scheduling Support |
-|--------------------------------------------------------------------------------|--------------------|
-| [LocalOrchestrator](../../component-guide/orchestrators/local.md)              | ⛔️                 |
-| [LocalDockerOrchestrator](../../component-guide/orchestrators/local-docker.md) | ⛔️                 |
-| [KubernetesOrchestrator](../../component-guide/orchestrators/kubernetes.md)    | ✅                  |
-| [KubeflowOrchestrator](../../component-guide/orchestrators/kubeflow.md)        | ✅                  |
-| [VertexOrchestrator](../../component-guide/orchestrators/vertex.md)            | ✅                  |
-| [TektonOrchestrator](../../component-guide/orchestrators/tekton.md)            | ⛔️                 |
-| [AirflowOrchestrator](../../component-guide/orchestrators/airflow.md)          | ✅                  |
-| [AzureMLOrchestrator](../../component-guide/orchestrators/azureml.md)          | ✅                  |
+| Orchestrator                                                                     | Scheduling Support |
+|----------------------------------------------------------------------------------|--------------------|
+| [AirflowOrchestrator](../../component-guide/orchestrators/airflow.md)            | ✅                 |
+| [AzureMLOrchestrator](../../component-guide/orchestrators/azureml.md)            | ✅                 |
+| [DatabricksOrchestrator](../../component-guide/orchestrators/databricks.md)      | ✅                 |
+| [HyperAIOrchestrator](../../component-guide/orchestrators/hyperai.md)            | ✅                 |
+| [KubeflowOrchestrator](../../component-guide/orchestrators/kubeflow.md)          | ✅                 |
+| [KubernetesOrchestrator](../../component-guide/orchestrators/kubernetes.md)      | ✅                 |
+| [LocalOrchestrator](../../component-guide/orchestrators/local.md)                | ⛔️                 |
+| [LocalDockerOrchestrator](../../component-guide/orchestrators/local-docker.md)   | ⛔️                 |
+| [SagemakerOrchestrator](../../component-guide/orchestrators/sagemaker.md)        | ⛔️                 |
+| [SkypilotAWSOrchestrator](../../component-guide/orchestrators/skypilot-vm.md)    | ⛔️                 |
+| [SkypilotAzureOrchestrator](../../component-guide/orchestrators/skypilot-vm.md)  | ⛔️                 |
+| [SkypilotGCPOrchestrator](../../component-guide/orchestrators/skypilot-vm.md)    | ⛔️                 |
+| [SkypilotLambdaOrchestrator](../../component-guide/orchestrators/skypilot-vm.md) | ⛔️                 |
+| [TektonOrchestrator](../../component-guide/orchestrators/tekton.md)              | ⛔️                 |
+| [VertexOrchestrator](../../component-guide/orchestrators/vertex.md)              | ✅                 |
+
 
 ### Set a schedule
 

diff --git a/docs/book/reference/environment-variables.md b/docs/book/reference/environment-variables.md
@@ -72,9 +72,9 @@ Set to `false` to disable the [`rich` traceback](https://rich.readthedocs.io/en/
 export ZENML_ENABLE_RICH_TRACEBACK=true
 ```
 
-## Disable colourful logging
+## Disable colorful logging
 
-If you wish to disable colourful logging, set the following environment variable:
+If you wish to disable colorful logging, set the following environment variable:
 
 ```bash
 ZENML_LOGGING_COLORS_DISABLED=true

diff --git a/...guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.md b/...guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.md
@@ -16,7 +16,7 @@ following:
 - finetune our model using the [Sentence
   Transformers](https://www.sbert.net/) library
 - evaluate the base and finetuned embeddings
-- visualise the results of the evaluation
+- visualize the results of the evaluation
 
 ![Embeddings finetuning pipeline with Sentence Transformers and
 ZenML](../../../.gitbook/assets/rag-finetuning-embeddings-pipeline.png)
@@ -94,7 +94,7 @@ The finetuning process leverages the capabilities of the Sentence Transformers l
 Our model is finetuned, saved in the Hugging Face Hub for easy access and
 reference in subsequent steps, but also versioned and tracked within ZenML for
 full observability. At this point the pipeline will evaluate the base and
-finetuned embeddings and visualise the results.
+finetuned embeddings and visualize the results.
 
 <!-- For scarf -->
 <figure><img alt="ZenML Scarf" referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" /></figure>

diff --git a/docs/book/user-guide/llmops-guide/reranking/evaluating-reranking-performance.md b/docs/book/user-guide/llmops-guide/reranking/evaluating-reranking-performance.md
@@ -105,7 +105,7 @@ Step retrieval_evaluation_full_with_reranking has finished in 4m20s.
 
 We can see here a specific example of a failure in the reranking evaluation. It's quite a good one because we can see that the question asked was actually an anomaly in the sense that the LLM has generated two questions and included its meta-discussion of the two questions it generated. Obviously this is not a representative question for the dataset, and if we saw a lot of these we might want to take some time to both understand why the LLM is generating these questions and how we can filter them out.
 
-### Visualising our reranking performance
+### Visualizing our reranking performance
 
 Since ZenML can display visualizations in its dashboard, we can showcase the results of our experiments in a visual format. For example, we can plot the failure rates of the retrieval system with and without reranking to see the impact of reranking on the performance.
 

diff --git a/scripts/format.sh b/scripts/format.sh
@@ -62,7 +62,7 @@ ruff check $SRC --select F401,F841 --fix --exclude "__init__.py" --isolated
 ruff check $SRC --select I --fix --ignore D
 ruff format $SRC
 
-# standardises / formats CI yaml files
+# standardizes / formats CI yaml files
 if [ "$SKIP_YAMLFIX" = false ]; then
     yamlfix .github tests --exclude "dependabot.yml"
 fi

diff --git a/src/zenml/cli/__init__.py b/src/zenml/cli/__init__.py
@@ -2186,7 +2186,7 @@ def my_pipeline(...):
 
 You can update a registered service connector by using the `update` command.
 Keep in mind that all service connector updates are validated before being
-applied. If you want to disable this behaviour please use the `--no-verify`
+applied. If you want to disable this behavior please use the `--no-verify`
 flag.
 
 ```bash

diff --git a/src/zenml/integrations/airflow/flavors/airflow_orchestrator_flavor.py b/src/zenml/integrations/airflow/flavors/airflow_orchestrator_flavor.py
@@ -119,6 +119,15 @@ class AirflowOrchestratorConfig(
 
     local: bool = True
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class AirflowOrchestratorFlavor(BaseOrchestratorFlavor):
     """Flavor for the Airflow orchestrator."""

diff --git a/src/zenml/integrations/azure/flavors/azureml.py b/src/zenml/integrations/azure/flavors/azureml.py
@@ -40,8 +40,8 @@ class AzureMLComputeSettings(BaseSettings):
 
     There are three possible use cases for this implementation:
 
-        1. Serverless compute (default behaviour):
-            - The `mode` is set to `serverless` (default behaviour).
+        1. Serverless compute (default behavior):
+            - The `mode` is set to `serverless` (default behavior).
             - All the other parameters become irrelevant and will throw a
             warning if set.
 

diff --git a/src/zenml/integrations/azure/flavors/azureml_orchestrator_flavor.py b/src/zenml/integrations/azure/flavors/azureml_orchestrator_flavor.py
@@ -80,6 +80,15 @@ def is_synchronous(self) -> bool:
         """
         return self.synchronous
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class AzureMLOrchestratorFlavor(BaseOrchestratorFlavor):
     """Flavor for the AzureML orchestrator."""

diff --git a/src/zenml/integrations/databricks/flavors/databricks_orchestrator_flavor.py b/src/zenml/integrations/databricks/flavors/databricks_orchestrator_flavor.py
@@ -102,6 +102,15 @@ def is_remote(self) -> bool:
         """
         return True
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class DatabricksOrchestratorFlavor(BaseOrchestratorFlavor):
     """Databricks orchestrator flavor."""

diff --git a/src/zenml/integrations/gcp/flavors/vertex_orchestrator_flavor.py b/src/zenml/integrations/gcp/flavors/vertex_orchestrator_flavor.py
@@ -162,6 +162,15 @@ def is_synchronous(self) -> bool:
         """
         return self.synchronous
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class VertexOrchestratorFlavor(BaseOrchestratorFlavor):
     """Vertex Orchestrator flavor."""

diff --git a/src/zenml/integrations/gcp/orchestrators/vertex_orchestrator.py b/src/zenml/integrations/gcp/orchestrators/vertex_orchestrator.py
@@ -32,7 +32,16 @@
 import os
 import re
 import types
-from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Type, cast
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    Dict,
+    List,
+    Optional,
+    Tuple,
+    Type,
+    cast,
+)
 from uuid import UUID
 
 from google.api_core import exceptions as google_exceptions

diff --git a/src/zenml/integrations/huggingface/steps/accelerate_runner.py b/src/zenml/integrations/huggingface/steps/accelerate_runner.py
@@ -82,7 +82,7 @@ def inner(*args: Any, **kwargs: Any) -> Any:
                 )
 
             with create_cli_wrapped_script(
-                entrypoint, flavour="accelerate"
+                entrypoint, flavor="accelerate"
             ) as (
                 script_path,
                 output_path,

diff --git a/src/zenml/integrations/hyperai/flavors/hyperai_orchestrator_flavor.py b/src/zenml/integrations/hyperai/flavors/hyperai_orchestrator_flavor.py
@@ -81,6 +81,15 @@ def is_remote(self) -> bool:
         """
         return True
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class HyperAIOrchestratorFlavor(BaseOrchestratorFlavor):
     """Flavor for the HyperAI orchestrator."""

diff --git a/src/zenml/integrations/kubeflow/flavors/kubeflow_orchestrator_flavor.py b/src/zenml/integrations/kubeflow/flavors/kubeflow_orchestrator_flavor.py
@@ -213,6 +213,15 @@ def is_synchronous(self) -> bool:
         """
         return self.synchronous
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class KubeflowOrchestratorFlavor(BaseOrchestratorFlavor):
     """Kubeflow orchestrator flavor."""

diff --git a/src/zenml/integrations/kubeflow/orchestrators/kubeflow_orchestrator.py b/src/zenml/integrations/kubeflow/orchestrators/kubeflow_orchestrator.py
@@ -32,7 +32,16 @@
 
 import os
 import types
-from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Type, cast
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    Dict,
+    List,
+    Optional,
+    Tuple,
+    Type,
+    cast,
+)
 from uuid import UUID
 
 import kfp

diff --git a/src/zenml/integrations/kubernetes/flavors/kubernetes_orchestrator_flavor.py b/src/zenml/integrations/kubernetes/flavors/kubernetes_orchestrator_flavor.py
@@ -125,6 +125,15 @@ def is_synchronous(self) -> bool:
         """
         return self.synchronous
 
+    @property
+    def is_schedulable(self) -> bool:
+        """Whether the orchestrator is schedulable or not.
+
+        Returns:
+            Whether the orchestrator is schedulable or not.
+        """
+        return True
+
 
 class KubernetesOrchestratorFlavor(BaseOrchestratorFlavor):
     """Kubernetes orchestrator flavor."""

diff --git a/src/zenml/integrations/kubernetes/orchestrators/kubernetes_orchestrator.py b/src/zenml/integrations/kubernetes/orchestrators/kubernetes_orchestrator.py
@@ -31,7 +31,16 @@
 """Kubernetes-native orchestrator."""
 
 import os
-from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Type, cast
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    Dict,
+    List,
+    Optional,
+    Tuple,
+    Type,
+    cast,
+)
 
 from kubernetes import client as k8s_client
 from kubernetes import config as k8s_config

diff --git a/src/zenml/integrations/prodigy/annotators/prodigy_annotator.py b/src/zenml/integrations/prodigy/annotators/prodigy_annotator.py
@@ -221,7 +221,7 @@ def delete_dataset(self, **kwargs: Any) -> None:
     def get_dataset(self, **kwargs: Any) -> Any:
         """Gets the dataset metadata for the given name.
 
-        If you would like the labelled data, use `get_labeled_data` instead.
+        If you would like the labeled data, use `get_labeled_data` instead.
 
         Args:
             **kwargs: Additional keyword arguments to pass to the Prodigy client.
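
Note on the `is_schedulable` hunks above: the diff adds the same read-only property, returning `True`, to seven orchestrator flavor configs, but does not show where the property is read. The sketch below is one way a caller might use such a flag. Everything in it is hypothetical and is not ZenML's code: the class names, the `ensure_schedule_supported` helper, and in particular the assumption that a base config defaults to `False`.

```python
class BaseOrchestratorConfigSketch:
    """Hypothetical stand-in for a base orchestrator config (not ZenML's class)."""

    @property
    def is_schedulable(self) -> bool:
        # Assumption: the base default is False, so only flavors that
        # override the property advertise scheduling support.
        return False


class AirflowLikeConfigSketch(BaseOrchestratorConfigSketch):
    """Mirrors the override pattern added to the flavor configs in this diff."""

    @property
    def is_schedulable(self) -> bool:
        return True


def ensure_schedule_supported(config: BaseOrchestratorConfigSketch, has_schedule: bool) -> None:
    """Raise a ValueError if a schedule is requested on a non-schedulable config."""
    if has_schedule and not config.is_schedulable:
        raise ValueError(
            f"{type(config).__name__} does not support scheduled runs."
        )


if __name__ == "__main__":
    # A schedulable config passes silently.
    ensure_schedule_supported(AirflowLikeConfigSketch(), has_schedule=True)
    # A non-schedulable config is rejected only when a schedule is requested.
    try:
        ensure_schedule_supported(BaseOrchestratorConfigSketch(), has_schedule=True)
    except ValueError as err:
        print(err)
```

Exposing the flag as a property on the config, rather than keeping a hard-coded list of orchestrator names, lets each flavor declare its own capability. Under that design, the support table in `schedule-a-pipeline.md` could in principle be checked against the code.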