- Migration Assessment Report
- Assessment Report Summary
- Assessment Widgets
- Assessment Finding Index
- AF101 - not supported DBR: ##.#.x-scala2.12
- AF102 - not supported DBR: ##.#.x-cpu-ml-scala2.12
- AF103 - not supported DBR: ##.#.x-gpu-ml-scala2.12
- AF111 - Uses azure service principal credentials config in cluster.
- AF112 - Uses azure service principal credentials config in Job cluster.
- AF113 - Uses azure service principal credentials config in pipeline.
- AF114 - unsupported config
- AF115 - unsupported config: spark.databricks.passthrough.enabled
- AF116 - No isolation shared clusters not supported in UC
- AF117 - cluster type not supported
- AF201 - Inplace Sync
- AF202 - Asset Replication Required
- AF203 - Data in DBFS Root
- AF204 - Data is in DBFS Mount
- AF210 - Non-DELTA format: CSV
- AF211 - Non-DELTA format: DELTA
- AF212 - Non-DELTA format
- AF221 - Unsupported Storage Type
- AF300 - AF399
- AF302.x - Arbitrary Java
- AF302.1 - Arbitrary Java (spark._jspark)
- AF302.2 - Arbitrary Java (spark._jvm)
- AF302.3 - Arbitrary Java (._jdf)
- AF302.4 - Arbitrary Java (._jcol)
- AF302.5 - Arbitrary Java (._jvm)
- AF302.6 - Arbitrary Java (._jvm.org.apache.log4j)
- AF303.1 - Java UDF (spark.udf.registerJavaFunction)
- AF304.1 - JDBC datasource (spark.read.format("jdbc"))
- AF305.1 - boto3
- AF305.2 - s3fs
- AF306.1 - dbutils...getContext (.toJson())
- AF306.2 - dbutils...getContext
- AF310.1 - credential passthrough (dbutils.credentials.)
- AF311.x - dbutils (dbutils)
- AF313.x - SparkContext
- AF313.1 - SparkContext (spark.sparkContext)
- AF313.2 - SparkContext (from pyspark.sql import SQLContext)
- AF313.3 - SparkContext (.binaryFiles)
- AF313.4 - SparkContext (.binaryRecords)
- AF313.5 - SparkContext (.emptyRDD)
- AF313.6 - SparkContext (.getConf)
- AF313.7 - SparkContext (.hadoopFile)
- AF313.8 - SparkContext (.hadoopRDD)
- AF313.9 - SparkContext (.init_batched_serializer)
- AF313.10 - SparkContext (.newAPIHadoopFile)
- AF313.11 - SparkContext (.newAPIHadoopRDD)
- AF313.12 - SparkContext (.parallelize)
- AF313.13 - SparkContext (.pickleFile)
- AF313.14 - SparkContext (.range)
- AF313.15 - SparkContext (.rdd)
- AF313.16 - SparkContext (.runJob)
- AF313.17 - SparkContext (.sequenceFile)
- AF313.18 - SparkContext (.setJobGroup)
- AF313.19 - SparkContext (.setLocalProperty)
- AF313.20 - SparkContext (.setSystemProperty)
- AF313.21 - SparkContext (.stop)
- AF313.22 - SparkContext (.textFile)
- AF313.23 - SparkContext (.uiWebUrl)
- AF313.24 - SparkContext (.union)
- AF313.25 - SparkContext (.wholeTextFiles)
- AF314.x - Distributed ML
- AF314.1 - Distributed ML (sparknlp)
- AF314.2 - Distributed ML (xgboost.spark)
- AF314.3 - Distributed ML (catboost_spark)
- AF314.4 - Distributed ML (ai.catboost:catboost-spark)
- AF314.5 - Distributed ML (hyperopt)
- AF314.6 - Distributed ML (SparkTrials)
- AF314.7 - Distributed ML (horovod.spark)
- AF314.8 - Distributed ML (ray.util.spark)
- AF314.9 - Distributed ML (databricks.automl)
- AF308.1 - Graphframes (from graphframes)
- AF309.1 - Spark ML (pyspark.ml.)
- AF315.1 - UDAF scala issue (UserDefinedAggregateFunction)
- AF315.2 - applyInPandas (applyInPandas)
- AF315.3 - mapInPandas (mapInPandas)
- AF330.x - Streaming
- AF330.1 - Streaming (.trigger(continuous)
- AF330.2 - Streaming (kafka.sasl.client.callback.handler.class)
- AF330.3 - Streaming (kafka.sasl.login.callback.handler.class)
- AF330.4 - Streaming (kafka.sasl.login.class)
- AF330.5 - Streaming (kafka.partition.assignment.strategy)
- AF330.6 - Streaming (kafka.ssl.truststore.location)
- AF330.7 - Streaming (kafka.ssl.keystore.location)
- AF330.8 - Streaming (cleanSource)
- AF330.9 - Streaming (sourceArchiveDir)
- AF330.10 - Streaming (applyInPandasWithState)
- AF330.11 - Streaming (.format("socket"))
- AF330.12 - Streaming (StreamingQueryListener)
- AF330.13 - Streaming (applyInPandasWithState)
- Common Terms
This document describes the Assessment Report generated from the UCX tools. The main assessment report includes dashlets, widgets and details of the assessment findings and common recommendations made based on the Assessment Finding (AF) Index entry.
The Assessment Report (Main) is the output of the Databricks Labs UCX assessment workflow. This report queries the $inventory database (e.g. ucx) and summarizes the findings of the assessment. The link to the Assessment Report (Main) can be found in your home folder, under .ucx in the README.py file. The user may also navigate directly to the Assessment Report by clicking the Dashboards icon on the left and locating the dashboard.
This is an overall summary of the readiness detailed in the Readiness dashlet. The value is based on the ratio of findings to the total number of assets scanned.
The total number of hive_metastore databases found during the assessment.
Total number of failures encountered by the crawler while extracting metadata from the Hive Metastore and REST APIs.
Total number of Hive Metastore tables discovered.
Total number of storage locations identified by scanning Hive Metastore tables and schemas.
Assessment widgets query tables in the $inventory database and summarize or detail out findings.
The second row of the report starts with "Job Count", "Readiness", "Assessment Summary", "Table counts by storage" and "Table counts by schema and format"
This is a rough summary of the workspace readiness to run Unity Catalog governed workloads. Each line item is the percent of compatible items divided by the total items in the class.
This is a summary count, per finding type of all of the findings identified during the assessment workflow. The assessment summary will help identify areas that need focus (e.g. Tables on DBFS or Clusters that need DBR upgrades)
This is a summary count of Hive Metastore tables per storage type (DBFS Root, DBFS Mount, and Cloud Storage, referred to as External). It also gives a summary count of tables using storage types that are unsupported in Unity Catalog (such as WASB or ADL in Azure). A count of tables created from Databricks demo datasets is also identified here.
This is a summary count by Hive Metastore (HMS) table formats (Delta and Non Delta) for each HMS schema
The third row continues with "Database Summary"
This is a Hive Metastore, database-by-database assessment summary along with an upgrade strategy. In Place Sync indicates that the SYNC command can be used to copy the metadata into a Unity Catalog catalog.
And the fourth row contains "External Locations" and "Mount Points"
Tables were scanned for LOCATION attributes and that list was distilled down to External Locations. In Unity Catalog, create a STORAGE CREDENTIAL that can access the External Locations, then define Unity Catalog EXTERNAL LOCATIONs for these items.
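For reference, a minimal sketch of defining an EXTERNAL LOCATION with SQL from a notebook. The location name, URL and group below are hypothetical, and the storage credential is assumed to already exist (created via the Catalog Explorer UI or API):

```python
# Hypothetical names/URL; assumes a storage credential "my_credential" already exists.
spark.sql("""
  CREATE EXTERNAL LOCATION IF NOT EXISTS landing_zone
  URL 'abfss://landing@mystorageaccount.dfs.core.windows.net/'
  WITH (STORAGE CREDENTIAL my_credential)
""")

# Allow a group to create external tables under this location (group name is a placeholder).
spark.sql("GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION landing_zone TO `data_engineers`")
```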
Mount points are a popular means to provide access to external buckets / storage accounts. The more secure equivalents in Unity Catalog are EXTERNAL LOCATIONs and VOLUMEs. EXTERNAL LOCATIONs are the basis for external tables, schemas, catalogs and VOLUMEs; VOLUMEs are the basis for managing files. The recommendation is to migrate mount points to either EXTERNAL LOCATIONs or VOLUMEs, as shown in the sketch below. The Unity Catalog Create External Location UI will prompt for mount points to assist in creating EXTERNAL LOCATIONs.
Unfortunately, as of January 2024, cross-cloud external locations are not supported. Databricks-to-Databricks Delta Sharing may assist in upgrading cross-cloud mounts.
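As an illustration, a hedged sketch of replacing a mount-based path with its Unity Catalog equivalents (all paths and names below are hypothetical):

```python
# Legacy pattern flagged by the assessment: reading through a DBFS mount.
df = spark.read.parquet("dbfs:/mnt/raw/events/")

# Option 1: read the same data through a Unity Catalog EXTERNAL LOCATION (direct cloud URI).
df = spark.read.parquet("abfss://raw@mystorageaccount.dfs.core.windows.net/events/")

# Option 2: read it through a Unity Catalog VOLUME defined over that location.
df = spark.read.parquet("/Volumes/main/raw/events_vol/events/")
```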
The next row contains the "Table Types" widget
This widget is a detailed list of each table: its format, storage type, location property and, for DBFS tables, the approximate table size. Upgrade strategies include the following (see the sketch after this list):
- DEEP CLONE or CTAS for DBFS root tables
- SYNC for DELTA tables (managed or external) stored on a non-DBFS root (mount point or direct cloud storage path)
- Managed non-DELTA tables need to be upgraded to Unity Catalog by either:
  - Using CTAS to convert the table, targeting the Unity Catalog catalog, schema and table name, or
  - Moving the data to an EXTERNAL LOCATION and creating an EXTERNAL table in Unity Catalog.
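A hedged sketch of the SYNC and CTAS strategies above, using hypothetical hive_metastore and Unity Catalog object names:

```python
# SYNC: upgrade an external Delta table in place -- metadata only, no data movement.
spark.sql("SYNC TABLE main.sales.orders FROM hive_metastore.sales.orders")

# CTAS: rewrite a managed non-Delta or DBFS-root table as a Unity Catalog managed Delta table.
spark.sql("""
  CREATE TABLE IF NOT EXISTS main.sales.customers
  AS SELECT * FROM hive_metastore.sales.customers
""")
```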
The following row includes "Incompatible Clusters" and "Incompatible Jobs".
This widget is a list of findings (reasons) and clusters that may need upgrading. See Assessment Finding Index (below) for specific recommendations.
This is a list of findings (reasons) and jobs that may need upgrading. See Assessment Findings Index for more information.
The final row includes "Incompatible Delta Live Tables" and "Incompatible Global Init Scripts"
These are Delta Live Table jobs that may be incompatible with Unity Catalog.
These are Global Init Scripts that are incompatible with Unity Catalog compute. As a reminder, global init scripts need to be on secure storage (Volumes or a cloud storage account, not DBFS).
This section will help explain UCX Assessment findings and provide a recommended action. The assessment finding index is grouped by:
- The 100 series findings are Databricks Runtime and compute configuration findings.
- The 200 series findings are centered around data related observations.
- The 300 series findings relate to Compute Access mode limitations for Unity Catalog
Short description: The compute runtime does not meet the requirements to use Unity Catalog. Explanation: Unity Catalog capabilities are fully enabled on Databricks Runtime 13.3 LTS. This is the current recommended runtime for production interactive clusters and jobs. This finding notes that the cluster or job compute configuration does not meet this threshold. Recommendation: Upgrade the DBR version to 13.3 LTS or later.
Currently, MLR (Machine Learning Runtime) and GPU SHARED clusters are not supported with Unity Catalog. Use Assigned or Job clusters instead.
Currently, MLR (Machine Learning Runtime) and GPU SHARED clusters are not supported with Unity Catalog. Use Assigned or Job clusters instead.
Azure service principals are replaced by Storage Credentials to access cloud storage accounts.
Create a STORAGE CREDENTIAL, then an EXTERNAL LOCATION and possibly external tables to provide data access.
If the service principal is used to access additional Azure cloud services, converting the cluster to an Assigned cluster type may work.
Azure service principals are replaced by Storage Credentials to access cloud storage accounts.
Create a STORAGE CREDENTIAL, then an EXTERNAL LOCATION and possibly external tables to provide data access.
If the service principal is used to access additional Azure cloud services, converting the job cluster to an Assigned cluster type may work.
Azure service principals are replaced by Storage Credentials to access cloud storage accounts. Create a STORAGE CREDENTIAL, then an EXTERNAL LOCATION and possibly external tables to provide data access.
Spark configurations for an external Hive Metastore were found in a cluster definition. Unity Catalog is the recommended approach for sharing data across workspaces. The recommendation is to remove the config after migrating the existing tables and views using UCX. As a transition strategy, "No Isolation Shared" clusters or "Assigned" clusters will work.
If spark.hadoop.javax.jdo.option.ConnectionURL is set, an external Hive Metastore is in use. Recommend migrating these tables and schemas to Unity Catalog external tables, where they can be shared across workspaces.
If spark.databricks.hive.metastore.glueCatalog.enabled is set, Glue is used as the external Hive Metastore. Recommend migrating these tables and schemas to Unity Catalog external tables, where they can be shared across workspaces.
The passthrough security model is not supported by Unity Catalog. Passthrough mode relied on file-based authorization, which is incompatible with the fine-grained access controls supported by Unity Catalog. Recommend mapping your passthrough security model to an External Location/Volume/Table/View based security model compatible with Unity Catalog.
Unity Catalog data cannot be accessed from No Isolation Shared clusters; they should not be used.
Only Assigned and Shared access modes are supported in UC. You must change your cluster configuration to match a UC-compliant access mode.
Short description: We found that the table or database can be SYNC'd without moving data because the data is stored directly on cloud storage, specified via a mount or a cloud storage URL (not DBFS).
How: Run the SYNC command on the table or schema. If the table (or source database) is 'managed', first set this Spark setting in your session or in the interactive cluster configuration: spark.databricks.sync.command.enableManagedTable=true
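A minimal sketch of that flow, with hypothetical schema and table names:

```python
# Needed only when the source table or schema is MANAGED in the Hive Metastore.
spark.conf.set("spark.databricks.sync.command.enableManagedTable", "true")

# Sync a single table, or an entire schema, into a Unity Catalog catalog (names are placeholders).
spark.sql("SYNC TABLE main.finance.transactions FROM hive_metastore.finance.transactions")
spark.sql("SYNC SCHEMA main.finance FROM hive_metastore.finance")
```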
We found that the table or database needs to have the data copied into a Unity Catalog managed location or table. Recommendation: Perform a 'deep clone' operation on the table to copy the files:
CREATE TABLE [IF NOT EXISTS] table_name
[SHALLOW | DEEP] CLONE source_table_name [TBLPROPERTIES clause] [LOCATION path]
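For example, a hedged sketch with hypothetical source and target names:

```python
# Copy both metadata and data files from the hive_metastore table into Unity Catalog.
spark.sql("""
  CREATE TABLE IF NOT EXISTS main.finance.transactions
  DEEP CLONE hive_metastore.finance.transactions
""")
```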
A table or schema refers to a location in DBFS and not a cloud storage location. The data must be moved from DBFS to a cloud storage location or to a Unity Catalog managed storage.
A table or schema refers to a location in a DBFS mount and not a direct cloud storage location. Mounts are not supported in Unity Catalog, so the mount source location must be de-referenced and the table/schema objects mapped to a UC external location.
Unity Catalog does not support managed CSV tables. Recommend converting the table to DELTA format or migrating the table to an External table.
This was a known issue of the UCX assessment job. This bug should be fixed with release 0.10.0
where format can be any of [PARQUET|JDBC|ORC|XML|JSON|HIVE|deltaSharing|com.databricks.spark.csv|...]
Unity Catalog managed tables only support DELTA format.
Recommend converting the table to Delta Lake format or converting it to an external table.
For deltaSharing, use Databricks-to-Databricks Delta Sharing if the provider is also on Databricks.
HIVE type tables are not supported.
For JDBC data sources:
Problem (on shared clusters): Accessing third-party databases—other than MySQL, PostgreSQL, Amazon Redshift, Snowflake, Microsoft SQL Server, Azure Synapse (SQL Data Warehouse) and Google BigQuery—will require additional permissions on a shared cluster if the user is not a workspace admin. This is due to the drivers not guaranteeing user isolation, e.g., as the driver writes data from multiple users to a widely accessible temp directory.
Workaround: Granting ANY FILE permissions will allow users to access untrusted databases. Note that ANY FILE will still enforce ACLs on any tables or external (storage) locations governed by Unity Catalog. Upgrade the DBR runtime to 13.3 LTS or higher to avoid cluster level firewall restrictions.
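A hedged sketch of the workaround grant, run by a workspace admin (the group name is a placeholder):

```python
# Legacy-style grant; Unity Catalog ACLs on tables and external locations still apply.
spark.sql("GRANT SELECT ON ANY FILE TO `data_engineers`")
```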
where the storage type can be any of adl://, wasb://, or wasbs://.
ADLS Gen 2 (abfss://) is the only Azure native storage type supported. Use a Deep Clone process to copy the table data.
CREATE TABLE [IF NOT EXISTS] table_name
[SHALLOW | DEEP] CLONE source_table_name [TBLPROPERTIES clause] [LOCATION path]
The 300 series findings relate to Compute Access mode limitations for Unity Catalog
Resolutions may include:
- Upgrade the cluster runtime to the latest LTS version (e.g. 14.3 LTS).
- Migrate users with GPU, Distributed ML, Spark Context, or R type workloads to Assigned clusters. Implement cluster policies and pools to even out startup time and limit the upper cost boundary.
- Upgrade SQL-only users (and BI tools) to SQL Warehouses (a much better SQL/warehouse experience and lower cost).
- For users with single-node Python ML requirements, Shared Compute with %pip install library support, or Personal Compute with pools and compute controls, may provide a better experience and better manageability.
- Single-node ML users on a crowded driver node of a large shared cluster will get a better experience with Personal Compute policies combined with (warm) compute pools.
The hive_metastore. prefix is used to refer to a 3-level namespace. Most customers will migrate hive_metastore to a workspace catalog (named based on the workspace name). The code should then map hive_metastore to the appropriate catalog name.
An easier solution is to define a default catalog for your session, job, cluster or workspace.
To set the workspace default, use the admin UI or the command line:
% databricks settings update-default-workspace-namespace
To set the cluster or job default catalog (in the Spark configuration settings):
spark.databricks.sql.initial.catalog.name my_catalog
For JDBC, add to the JDBC connection URL:
ConnCatalog=my_catalog (preferred)
databricks.Catalog=my_catalog (rare)
For the ODBC .ini file:
[Databricks]
Driver=<path-to-driver>
Host=<server-hostname>
Port=443
HTTPPath=<http-path>
ThriftTransport=2
SSL=1
AuthMech=3
UID=token
PWD=<personal-access-token>
Catalog=my_catalog
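For a notebook or job session, a minimal sketch (assuming a catalog named my_catalog with a schema my_schema):

```python
# Make unqualified table names resolve against the new catalog for this session.
spark.sql("USE CATALOG my_catalog")
spark.sql("USE SCHEMA my_schema")

# Verify the session defaults.
print(spark.sql("SELECT current_catalog(), current_schema()").first())
```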
When using %r command cells, the user will receive the message: "Your administrator has only allowed sql and python and scala commands on this cluster. This execution contained at least one disallowed language."
Recommend using Assigned (single user clusters).
Scala is supported on Databricks Runtime 13.3 and above.
Recommend upgrading your shared cluster DBR to 13.3 LTS or greater or using Assigned data security mode (single user clusters).
The minimum DBR version to access Unity Catalog was not met. The recommendation is to upgrade to the latest Long Term Supported (LTS) version of the Databricks Runtime.
The Databricks ML Runtime is not supported on Shared Compute mode clusters. Recommend migrating these workloads to Assigned clusters. Implement cluster policies and pools to even out startup time and limit upper cost boundry.
The Databricks ML Runtime is not supported on Shared Compute mode clusters. Recommend migrating these workloads to Assigned clusters. Implement cluster policies and pools to even out startup time and limit upper cost boundry.
The spark.catalog. pattern was found. Commonly used functions in spark.catalog, such as tableExists, listTables and setting the default catalog, are not allowed/whitelisted on shared clusters for security reasons. spark.sql("<sql command>") may be a better alternative.
The spark._jsparkSession.catalog pattern was found. Commonly used functions in spark.catalog, such as tableExists, listTables and setting the default catalog, are not allowed/whitelisted on shared clusters for security reasons. spark.sql("<sql command>") may be a better alternative.
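A hedged sketch of the SQL-based alternative, using hypothetical schema and table names:

```python
# Instead of spark.catalog.tableExists("sales.orders") on a shared cluster:
exists = spark.sql("SHOW TABLES IN sales LIKE 'orders'").count() > 0

# Instead of spark.catalog.listTables("sales"):
spark.sql("SHOW TABLES IN sales").show()
```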
With Spark Connect on Shared clusters it is no longer possible to directly access the host JVM from the Python process. This means it is no longer possible to interact with Java classes or instantiate arbitrary Java classes directly from Python, similar to the code below. Recommend finding the equivalent PySpark or Scala API.
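An illustrative sketch of the kind of JVM access this group of findings flags, together with a plain-Python replacement (the logger name is a placeholder):

```python
# Flagged pattern: reaching into the driver JVM from Python. This fails with
# Spark Connect on Shared clusters because there is no _jvm gateway.
log4j = spark._jvm.org.apache.log4j.LogManager.getLogger("my_app")
log4j.info("hello from the JVM")

# Preferred: use an equivalent Python (or PySpark) API instead.
import logging
logging.getLogger("my_app").info("hello from Python")
```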
The spark._jspark
is used to execute arbitrary Java code.
The spark._jvm
is used to execute arbitrary Java code.
The ._jdf
is used to execute arbitrary Java code.
The ._jcol
is used to execute arbitrary Java code.
The ._jvm
is used to execute arbitrary Java code.
The ._jvm.org.apache.log4j
is used to execute arbitrary Java code.
The spark.udf.registerJavaFunction
is used to register a Java UDF.
The spark.read.format("jdbc")
pattern was found and is used to read data from a JDBC datasource.
Accessing third-party databases—other than MySQL, PostgreSQL, Amazon Redshift, Snowflake, Microsoft SQL Server, Azure Synapse (SQL Data Warehouse) and Google BigQuery will require additional permissions on a shared cluster if the user is not a workspace admin. This is due to the drivers not guaranteeing user isolation, e.g., as the driver writes data from multiple users to a widely accessible temp directory.
Workaround: Granting ANY FILE permissions will allow users to access untrusted databases. Note that ANY FILE will still enforce ACLs on any tables or external (storage) locations governed by Unity Catalog. This requires DBR 12.2 or later (DBR 12.1 or before is blocked on the network layer)
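For reference, a hedged sketch of the flagged read pattern; the connection details and secret scope/keys are placeholders:

```python
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://dbhost.example.com:5432/mydb")  # placeholder URL
    .option("dbtable", "public.orders")
    .option("user", dbutils.secrets.get(scope="jdbc", key="user"))
    .option("password", dbutils.secrets.get(scope="jdbc", key="password"))
    .load()
)
```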
The boto3
library is used.
Instance profiles (AWS) are not supported from the Python/Scala REPL or UDFs (e.g. when using boto3 or s3fs); instance profiles are only applied from init scripts and (internally) by Spark. To access S3 objects, the recommendation is to use EXTERNAL VOLUMES mapped to the fixed S3 storage location.
Workarounds: For accessing cloud storage (S3), use storage credentials and external locations.
(AWS) Consider other ways to authenticate with boto3, e.g., by passing credentials from Databricks secrets directly to boto3 as a parameter, or loading them as environment variables. This page contains more information. Please note that unlike instance profiles, these methods do not provide short-lived credentials out of the box, and customers are responsible for rotating secrets according to their security needs.
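A minimal sketch of the secrets-based approach, assuming a hypothetical secret scope named aws and a bucket of your own:

```python
import boto3

# Hypothetical secret scope and keys; rotate these according to your security policy.
s3 = boto3.client(
    "s3",
    aws_access_key_id=dbutils.secrets.get(scope="aws", key="access_key_id"),
    aws_secret_access_key=dbutils.secrets.get(scope="aws", key="secret_access_key"),
)
response = s3.list_objects_v2(Bucket="my-bucket", Prefix="landing/")  # placeholder bucket
```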
The s3fs library is used, which provides POSIX-type semantics for S3 access. s3fs is based on the boto3 library and has similar restrictions. The recommendation is to use EXTERNAL VOLUMES mapped to the fixed S3 storage location, or MANAGED VOLUMES.
The dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
pattern was found. This function may trigger a security exception in DBR 13.0 and above.
The recommendation is to explore alternative APIs:
from dbruntime.databricks_repl_context import get_context
context = get_context()
context.__dict__
The dbutils.notebook.entry_point.getDbutils().notebook().getContext()
pattern was found. This function may trigger a security exception in DBR 13.0 and above.
from dbruntime.databricks_repl_context import get_context
context = get_context()
context.__dict__
The dbutils.credentials.
is used for credential passthrough. This is not supported by Unity Catalog.
DBUtils and other clients that directly read the data from cloud storage are not supported. Use Volumes or use Assigned clusters.
The dbutils.fs.
pattern was found. DBUtils and other clients that directly read the data from cloud storage are not supported. Please note that dbutils.fs
calls with /Volumes (Volumes) will work.
The dbutils.fs.mount
pattern was found. This is not supported by Unity Catalog. Use instead EXTERNAL LOCATIONS and VOLUMES.
The dbutils.fs.refreshMounts
pattern was found. This is not supported by Unity Catalog. Use instead EXTERNAL LOCATIONS and VOLUMES.
The dbutils.fs.unmount
pattern was found. This is not supported by Unity Catalog. Use instead EXTERNAL LOCATIONS and VOLUMES.
The dbfs:/mnt
is used as a mount point. This is not supported by Unity Catalog. Use instead EXTERNAL LOCATIONS and VOLUMES.
The dbfs:/ pattern was found. DBFS is not supported by Unity Catalog. Use EXTERNAL LOCATIONS and VOLUMES instead. There may be false positives with this pattern because dbfs:/Volumes/mycatalog/myschema/myvolume is legitimate usage.
Please note: dbfs:/Volumes/<catalog>/<schema>/<volume> is a supported access pattern for Spark.
The /dbfs/
pattern was found. DBFS is not supported by Unity Catalog. Use instead EXTERNAL LOCATIONS and VOLUMES.
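A hedged sketch of Volume-based file access, with hypothetical catalog, schema and volume names:

```python
# List and read files through a Unity Catalog volume instead of dbfs:/ or /dbfs/ paths.
display(dbutils.fs.ls("/Volumes/main/raw/landing_vol/"))
df = spark.read.csv("/Volumes/main/raw/landing_vol/orders.csv", header=True)

# Plain Python file APIs also work against /Volumes paths on UC-enabled compute.
with open("/Volumes/main/raw/landing_vol/notes.txt") as fh:
    print(fh.read())
```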
Spark Context(sc), spark.sparkContext, and sqlContext are not supported for Scala in any Databricks Runtime and are not supported for Python in Databricks Runtime 14.0 and above with Shared Compute access mode due to security restrictions. In Shared Compute mode, these methods do not support strict data isolation.
To run legacy workloads without modification, use access mode "Single User" type clusters.
Databricks recommends using the spark variable to interact with the SparkSession instance.
The following sc functions are also not supported in "Shared" Access Mode: emptyRDD, range, init_batched_serializer, parallelize, pickleFile, textFile, wholeTextFiles, binaryFiles, binaryRecords, sequenceFile, newAPIHadoopFile, newAPIHadoopRDD, hadoopFile, hadoopRDD, union, runJob, setSystemProperty, uiWebUrl, stop, setJobGroup, setLocalProperty, getConf.
The spark.sparkContext
pattern was found, use the spark
variable directly.
The from pyspark.sql import SQLContext
and import org.apache.spark.sql.SQLContext
are used. These are not supported in Unity Catalog. Possibly, the spark
variable will suffice.
The .binaryFiles
pattern was found, this is not supported by Unity Catalog.
Instead, please consider using spark.read.format('binaryFile').
The .binaryRecords
pattern was found, which is not supported by Unity Catalog Shared Compute access mode.
The .emptyRDD
pattern was found, which is not supported by Unity Catalog Shared Compute access mode.
Instead use:
%python
from pyspark.sql.types import StructType, StructField, StringType
schema = StructType([StructField("k", StringType(), True)])
spark.createDataFrame([], schema)
The .getConf
pattern was found. There may be significant false positives with this one as .getConf
is a common API pattern. In the case of sparkContext.getConf
or sc.getConf
, use spark.conf
instead.
- spark.conf.get() # retrieves a single value
- spark.conf.set() # sets a single value (If allowed)
- spark.conf.getAll() unfortunately does not exist
The .hadoopFile
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .hadoopRDD
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .init_batched_serializer
pattern was found. No suggestions available at this time.
The .newAPIHadoopFile
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .newAPIHadoopRDD
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .parallelize
pattern was found. Instead of:
json_content1 = "{'json_col1': 'hello', 'json_col2': 32}"
json_content2 = "{'json_col1': 'hello', 'json_col2': 'world'}"
json_list = []
json_list.append(json_content1)
json_list.append(json_content2)
df = spark.read.json(sc.parallelize(json_list))
display(df)
use:
from pyspark.sql import Row
import json
# Sample JSON payload as a string (stands in for e.g. an HTTP response body)
json_data_str = '{"json_col1": "hello", "json_col2": 32}'
json_data = [json.loads(json_data_str)]
# Convert dictionaries to Row objects
rows = [Row(**json_dict) for json_dict in json_data]
# Create DataFrame from list of Row objects
df = spark.createDataFrame(rows)
# Show the DataFrame
df.display()
The .pickleFile
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .range
pattern was found. Use spark.range()
instead of sc.range()
The .rdd
pattern was found. Use "Assigned" access mode compute or upgrade to the faster Spark DataFrame api.
The .runJob
pattern was found. Use "Assigned" access mode compute or upgrade to the faster Spark DataFrame api.
The .sequenceFile
pattern was found. Use "Assigned" access mode compute or make permanent file type changes.
The .setJobGroup
pattern was found.
spark.addTag()
can attach a tag, and getTags()
and interruptTag(tag)
can be used to act upon the presence/absence of a tag. These APIs only work with Spark Connect (Shared Compute Mode) and will not work in “Assigned” access mode.
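A hedged sketch of the tag-based replacement on Spark Connect compute (the tag name is a placeholder):

```python
# Replace sc.setJobGroup(...) with session job tags (Spark Connect / shared clusters).
spark.addTag("nightly_etl")
try:
    spark.range(1_000_000).count()   # queries run here carry the tag
finally:
    spark.removeTag("nightly_etl")

# Later in the same session, tagged operations can be interrupted:
# spark.interruptTag("nightly_etl")
```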
The .setLocalProperty
pattern was found.
The .setSystemProperty
pattern was found. Use "Assigned" access mode compute or find alternative within the spark API.
The .stop
pattern was found. Use "Assigned" access mode compute or find alternative within the spark API.
The .textFile
pattern was found. Use "Assigned" access mode compute or find alternative within the spark API.
The .uiWebUrl
pattern was found. Use "Assigned" access mode compute or find alternative within the spark API.
The .union pattern was found. Use "Assigned" access mode compute or find an alternative within the Spark API.
The .wholeTextFiles pattern was found. Use "Assigned" access mode compute or use the spark.read.text("file_name") API.
Databricks Runtime ML and Spark Machine Learning Library (MLlib) are not supported on shared Unity Catalog compute. The recommendation is to use an Assigned mode cluster; use cluster policies and (warm) compute pools to improve compute and cost management.
The sparknlp
pattern was found. Use "Assigned" access mode compute.
The xgboost.spark
pattern was found. Use "Assigned" access mode compute.
The catboost_spark
pattern was found. Use "Assigned" access mode compute.
The ai.catboost:catboost-spark
pattern was found. Use "Assigned" access mode compute.
The hyperopt
pattern was found. Use "Assigned" access mode compute.
The SparkTrials
pattern was found. Use "Assigned" access mode compute.
The horovod.spark
pattern was found. Use "Assigned" access mode compute.
The ray.util.spark
pattern was found. Use "Assigned" access mode compute.
The databricks.automl
pattern was found. Use "Assigned" access mode compute.
The from graphframes
pattern was found. Use "Assigned" access mode compute.
The pyspark.ml.
pattern was found. Use "Assigned" access mode compute.
The UserDefinedAggregateFunction
pattern was found. Use "Assigned" access mode compute.
The applyInPandas
pattern was found. Use "Assigned" access mode compute.
The mapInPandas
pattern was found. Use "Assigned" access mode compute.
Consult the Streaming limitations for Unity Catalog shared access mode documentation for more details.
See also Streaming limitations for Unity Catalog single user access mode and Streaming limitations for Unity Catalog shared access mode.
The assessment patterns and specifics are as follows:
The .trigger(continuous
pattern was found. Continuous processing mode is not supported in Unity Catalog shared access mode.
Apache Spark continuous processing mode is not supported. See Continuous Processing in the Spark Structured Streaming Programming Guide.
The kafka.sasl.client.callback.handler.class
pattern was found. SASL features are not supported in Unity Catalog shared access mode.
The kafka.sasl.login.callback.handler.class
pattern was found. SASL features are not supported in Unity Catalog shared access mode.
The kafka.sasl.login.class
pattern was found. SASL features are not supported in Unity Catalog shared access mode.
The kafka.partition.assignment.strategy
pattern was found. Kafka features are not supported in Unity Catalog shared access mode.
The kafka.ssl.truststore.location
pattern was found. SSL features are not supported in Unity Catalog shared access mode.
The kafka.ssl.keystore.location
pattern was found. SSL features are not supported in Unity Catalog shared access mode.
The cleanSource
pattern was found. The cleanSource operation is not supported in Unity Catalog shared access mode.
The sourceArchiveDir
pattern was found. The sourceArchiveDir operation is not supported in Unity Catalog shared access mode.
The applyInPandasWithState
pattern was found. The applyInPandasWithState operation is not supported in Unity Catalog shared access mode.
The .format("socket")
pattern was found. Socket source is not supported in Unity Catalog shared access mode.
The StreamingQueryListener
pattern was found. StreamingQueryListener is not supported in Unity Catalog shared access mode.
The applyInPandasWithState
pattern was found. The applyInPandasWithState operation is not supported in Unity Catalog shared access mode.
UC: Abbreviation for Unity Catalog.
DELTA: Refers to the table format for Delta Lake tables.
CTAS: Abbreviation for Create Table As Select, which is a method of copying table data from one source to another. The CREATE statement can include USING and LOCATION keywords, while the SELECT portion can cast columns to other data types.
DEEP CLONE: A shortcut for CREATE TABLE DEEP CLONE.
EXTERNAL LOCATION: A UC object type describing a URL to a cloud storage bucket + folder, or a storage account + container and folder.
STORAGE CREDENTIAL: A UC object encapsulating the credentials necessary to access cloud storage.
"Assigned Clusters" are Interactive clusters assigned to a single principal. Implicit in this term is that these clusters are enabled for Unity Catalog. Publically available today, "Assigned Clusters" can be assigned to a user and the user's identity is used to access data resources. The access to the cluster is restricted to that single user to ensure accountability and accuracy of the audit logs.
"Single User Clusters" are Interactive clusters that name one specific user account as user.
The data_security_mode
for these clusters are SINGLE_USER
"Shared Clusters are Interactive or Job clusters with an access mode of "SHARED".
The data_security_mode
for these clusters are USER_ISOLATION
.