Re-propagate changes to bare-metal.md
1 parent 737ce65, commit c3f4e80
Showing 1 changed file with 7 additions and 65 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,68 +19,6 @@ Reach out to the W&B Sales Team for related question: [[email protected]](mailto | |
|
||
Before you start deploying W&B, refer to the [reference architecture](./ref-arch.md#infrastructure-requirements), especially the infrastructure requirements. | ||
|
||
{{% alert %}} | ||
W&B strongly recommends deploying W&B Server into a Kubernetes cluster using the W&B Kubernetes Operator. Deploying to a Kubernetes cluster with the operator ensures that you can use all existing and new W&B features. | ||
{{% /alert %}} | ||
|
||
{{% alert color="secondary" %}} | ||
W&B application performance depends on scalable data stores that your operations team must configure and manage. The team must provide a MySQL 8 database cluster and an AWS S3 compatible object store for the application to scale properly. | ||
{{% /alert %}} | ||
|
||
### Application server | ||
|
||
W&B recommends deploying W&B Server into its own namespace, on a node group that spans two availability zones, with the following specifications to provide the best performance, reliability, and availability: | ||
|
||
| Specification | Value | | ||
|----------------------------|-----------------------------------| | ||
| Bandwidth | Dual 10 Gigabit+ Ethernet Network | | ||
| Root Disk Bandwidth (Mbps) | 4,750+ | | ||
| Root Disk Provision (GB) | 100+ | | ||
| Core Count | 4 | | ||
| Memory (GiB) | 8 | | ||
|
||
This ensures that W&B Server has sufficient disk space to process the application data and store temporary logs before they are externalized. | ||
|
||
|
||
|
||
It also ensures fast and reliable data transfer, the necessary processing power and memory for smooth operation, and that W&B will not be affected by any noisy neighbors. | ||
|
||
It is important to keep in mind that these specifications are minimum requirements, and actual resource needs may vary depending on the specific usage and workload of the W&B application. Monitoring the resource usage and performance of the application is critical to ensure that it operates optimally and to make adjustments as necessary. | ||
|
||
### Database server | ||
|
||
W&B recommends a [MySQL 8](#mysql-database) database as a metadata store. The shape of the model parameters and related metadata affects the performance of the database. The database grows as ML practitioners track more training runs, and it incurs a read-heavy load when queries run against run tables, user workspaces, and reports. | ||
|
||
To ensure optimal performance, W&B recommends deploying the W&B database onto a server with the following starting specifications: | ||
|
||
| Specification | Value | | ||
|--------------------------- |-----------------------------------| | ||
| Bandwidth | Dual 10 Gigabit+ Ethernet Network | | ||
| Root Disk Bandwidth (Mbps) | 4,750+ | | ||
| Root Disk Provision (GB) | 1000+ | | ||
| Core Count | 4 | | ||
| Memory (GiB) | 32 | | ||
|
||
Again, W&B recommends monitoring the resource usage and performance of the database to ensure that it operates optimally and to make adjustments as necessary. | ||
|
||
Additionally, W&B recommends the following [parameter overrides](#mysql-database) to tune the DB for MySQL 8. | ||
|
||
### Object storage | ||
|
||
W&B is compatible with any object store that supports the S3 API, signed URLs, and CORS. W&B recommends sizing the storage array for the current needs of your practitioners and revisiting the capacity plan on a regular cadence. | ||
|
||
More details on object store configuration can be found in the [how-to section](../self-managed/bare-metal.md#object-store). | ||
|
||
Some tested and working providers: | ||
- [MinIO](https://min.io/) | ||
- [Ceph](https://ceph.io/) | ||
- [NetApp](https://www.netapp.com/) | ||
- [Pure Storage](https://www.purestorage.com/) | ||
|
||
#### Secure Storage Connector | ||
|
||
The [Secure Storage Connector](../data-security/secure-storage-connector.md) is currently not available for teams in bare-metal deployments. | ||
|
||
## MySQL database | ||
|
||
{{% alert color="secondary" %}} | ||
|
@@ -116,6 +54,11 @@ CREATE DATABASE wandb_local CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci; | |
GRANT ALL ON wandb_local.* TO 'wandb_local'@'%' WITH GRANT OPTION; | ||
``` | ||
|
||
{{% alert %}} | ||
This works only if the SSL certificate is trusted. W&B does not support self-signed certificates. | ||
{{% /alert %}} | ||
|
||
|
||
### Parameter group configuration | ||
|
||
Ensure that the following parameter groups are set to tune the database performance: | ||
|
@@ -160,7 +103,7 @@ s3://$ACCESS_KEY:$SECRET_KEY@$HOST/$BUCKET_NAME?tls=true | |
``` | ||
|
||
{{% alert color="secondary" %}} | ||
This will only work if the SSL certificate is trusted. W&B does not support self-signed certificates. | ||
This works only if the SSL certificate is trusted. W&B does not support self-signed certificates. | ||
{{% /alert %}} | ||
|
||
Set `BUCKET_QUEUE` to `internal://` if you use third-party object stores. This tells the W&B server to manage all object notifications internally instead of depending on an external SQS queue or equivalent. | ||
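As a minimal sketch of how the two settings above might fit together, the following shell snippet builds the connection string in the format shown earlier and sets `BUCKET_QUEUE` alongside it. The credential, host, and bucket values are hypothetical placeholders, and the name `BUCKET` for the variable that holds the connection string is an assumption not confirmed by this section. How these variables reach the W&B Server depends on your deployment method.

```shell
# Hypothetical placeholder values; substitute the values for your own object store.
ACCESS_KEY="example-access-key"
SECRET_KEY="example-secret-key"
HOST="minio.example.internal:9000"
BUCKET_NAME="local-files"

# Connection string in the s3://...?tls=true format shown above.
# The variable name BUCKET is an assumption made for illustration.
BUCKET="s3://${ACCESS_KEY}:${SECRET_KEY}@${HOST}/${BUCKET_NAME}?tls=true"

# Handle object notifications internally instead of through an external queue.
BUCKET_QUEUE="internal://"

export BUCKET BUCKET_QUEUE
```

Credentials that contain characters such as `@`, `/`, or `:` would likely need URL-encoding before being placed in the connection string.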
|
@@ -189,7 +132,6 @@ mc mb --region=us-east1 local/local-files | |
|
||
The recommended installation method is with the official W&B Helm chart. Follow [this section](../operator.md#deploy-wb-with-helm-cli) to deploy the W&B Server application. | ||
|
||
|
||
### OpenShift | ||
|
||
W&B supports operating from within an [OpenShift Kubernetes cluster](https://www.redhat.com/en/technologies/cloud-computing/openshift). | ||
|
@@ -311,7 +253,7 @@ wandb login --host=https://YOUR_DNS_DOMAIN | |
wandb verify | ||
``` | ||
|
||
Check log files to view any errors the W&B Server hits at startup. Run the following commands: | ||
Check log files to view any errors the W&B Server hits at startup. Run the following commands: | ||
|
||
{{< tabpane text=true >}} | ||
{{% tab header="Docker" value="docker" %}} | ||
|