minor editing of databricks page (#4862)
dberenbaum authored Sep 18, 2023
1 parent 8bd5942 commit 2220e1d
Showing 1 changed file with 15 additions and 14 deletions.
29 changes: 15 additions & 14 deletions content/docs/user-guide/integrations/databricks.md
@@ -1,27 +1,27 @@
# Databricks

-As of September 2023 Databricks doesn't expose the underlying GIT repo in your
-project, so GIT-related DVC functionality within the repo provided by Databricks
-is not supported (e.g. [experiments], `--rev/--all-commits/--all-tags/etc`). But
-everything will operate as normal if you `git clone` a project yourself or use
-remote projects with DVC directly.
+As of September 2023 [Databricks Repos] don't expose the underlying Git repo, so
+Git-related DVC functionality within Databricks Repos is not supported (e.g.
+[experiments], `--rev/--all-commits/--all-tags/etc`). Everything will operate as
+normal if you `git clone` a project yourself or [use remote projects](#dvc-api)
+with DVC directly.

## Setup

```bash
%pip install dvc
```

-In order to be able to work in the project provided by databricks without GIT
-functionality, you'll need to use this workaround:
+In order to be able to work in [Databricks Repos], you'll need to use this
+workaround:

```bash
!dvc config core.no_scm true --local
```

## DVC API

-You can use your existing DVC projects through [Python API] as normal, for
+You can use your existing DVC projects through the [Python API] as normal, for
example:

```python
@@ -36,9 +36,8 @@ with dvc.api.open(

### Secrets

-If you need to use secrets to access your data, first add them to databricks
-secrets https://docs.databricks.com/en/security/secrets/index.html and then use
-them with DVC, for example:
+If you need to use secrets to access your data, first add them to [Databricks
+secrets] and then use them with DVC, for example:

```python
import dvc.api
@@ -75,18 +74,20 @@ normal.
!dvc add data
```

-Note that due to the limitations described in the beginning and `noscm`
-workaround, DVC won't be able to automatically add new entries to corresponding
-`.gitignore`s, so you'll need to do that manually.
+If working with [Databricks Repos], due to the limitations described in the
+beginning and `noscm` workaround, DVC won't be able to automatically add new
+entries to corresponding `.gitignore`s, so you'll need to do that manually.

### Example: import data

```bash
!dvc import-url https://archive.ics.uci.edu/static/public/186/wine+quality.zip
```

+[Databricks Repos]: https://docs.databricks.com/en/repos/index.html
[experiments]: /doc/start/experiments
[Python API]: /doc/api-reference
+[Databricks secrets]: https://docs.databricks.com/en/security/secrets/index.html
[magic commands]:
https://ipython.readthedocs.io/en/stable/interactive/magics.html
[web terminal]: https://docs.databricks.com/en/clusters/web-terminal.html
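
The changed page says that with the `noscm` workaround DVC won't add entries to `.gitignore`, so this has to be done by hand. A minimal sketch of that manual step, runnable from a notebook cell, might look like the following. The helper name `add_gitignore_entry` is hypothetical and not part of DVC, and the `/name` pattern written next to the tracked path is an assumption about what DVC would normally write:

```python
from pathlib import Path


def add_gitignore_entry(tracked_path, repo_root="."):
    """Append an ignore pattern for `tracked_path` to the .gitignore beside it.

    Hypothetical helper (not a DVC API). Returns True if an entry was written,
    False if the entry was already present.
    """
    target = Path(repo_root) / tracked_path
    gitignore = target.parent / ".gitignore"
    # Assumption: an anchored "/name" pattern in the same directory as the data.
    entry = f"/{target.name}"

    text = gitignore.read_text() if gitignore.exists() else ""
    if entry in text.splitlines():
        return False

    # Keep existing content and make sure the new entry starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + separator + entry + "\n")
    return True
```

After `!dvc add data`, calling `add_gitignore_entry("data")` from the project root would record `/data` in `.gitignore`; calling it again leaves the file unchanged.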
