Support FHIR Extensions in Spark Datasets #68

mtsargent · 2020-01-08T19:51:23Z

Please fill out the below template as best you can.

Description of Issue

I am currently attempting to read in FHIR Bundles from a directory that contains JSON files and then extract certain resource types to Spark Datasets. While Datasets are being successfully created, Extensions that were part of resources in my FHIR bundle are being dropped altogether.

If I am looking at the correct places in the code, the lack of Extension support seems to have been a conscious decision:

bunsen/bunsen-core/src/main/scala/com/cerner/bunsen/EncoderBuilder.scala

Line 191 in e6a58c6

// Contained resources and extensions not yet supported.

bunsen/bunsen-core/src/main/scala/com/cerner/bunsen/SchemaConverter.scala

Line 36 in e3c1d5e

// Contained resources and extensions not yet supported.

I would like to be able to create Datasets for FHIR resources that still contain the Extensions from the original resources.

System Configuration

Project Version

Using Bunsen 0.4.9

Steps to Reproduce the Issue

Run this Scala code (or Java equivalent):

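// Package for Bundles assumed from the bunsen-core 0.4.x source layout (com/cerner/bunsen).
import com.cerner.bunsen.Bundles
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object BunsenExample {
  def main(args: Array[String]): Unit = {
    failBundles()
  }

  def failBundles(): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .set("spark.sql.crossJoin.enabled", "true")
    val spark = SparkSession.builder().config(conf).getOrCreate()

    // Load every JSON bundle in the directory into at least 2 partitions.
    val bundles = Bundles.forStu3()
    val data = bundles.loadFromDirectory(spark, "/path/to/bundles/with/resource/extensions", 2).cache()

    // Extract the Patient entries; the resulting schema has no place for extensions.
    val patients = bundles.extractEntry(spark, data, "Patient")
    patients.show()
    patients.printSchema()
  }
}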

The patients dataset will not contain the extensions that were originally part of the Patient FHIR resources in the bundle. There does not appear to be a place for extensions to exist in the schema for the Dataset. I verified that the Extensions are being parsed successfully and are accessible through the BundleContainers returned if you run data.collect() and dive into the result.
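
To check this without reading the printSchema() output by eye, a small helper can walk the Dataset schema and list every field with a given name. This is only a minimal sketch: the object name ExtensionSchemaCheck and the function findFields are hypothetical, and searching for a field literally named "extension" is an assumption, since a Bunsen version that supports extensions may name those columns differently. It uses only the Spark SQL types API.

```scala
import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructType}

// Hypothetical helper, not part of Bunsen.
object ExtensionSchemaCheck {

  /** Returns the dotted path of every field called `fieldName`,
    * descending into nested structs, array elements and map values.
    */
  def findFields(dataType: DataType, fieldName: String, prefix: String = ""): Seq[String] =
    dataType match {
      case struct: StructType =>
        struct.fields.toSeq.flatMap { field =>
          val path = if (prefix.isEmpty) field.name else s"$prefix.${field.name}"
          val here = if (field.name == fieldName) Seq(path) else Seq.empty[String]
          here ++ findFields(field.dataType, fieldName, path)
        }
      case ArrayType(elementType, _) => findFields(elementType, fieldName, prefix)
      case MapType(_, valueType, _)  => findFields(valueType, fieldName, prefix)
      case _                         => Seq.empty[String]
    }
}
```

With the patients Dataset from the example above, ExtensionSchemaCheck.findFields(patients.schema, "extension") would return an empty sequence if no such field exists anywhere in the schema.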

Expected Outcomes

Add support for Extensions to be included in Datasets when they are created by extracting resources from a collection of FHIR Bundles.


bdrillard · 2020-03-07T18:22:45Z

Extensions and Contained resources are now supported in Bunsen 0.5.x, which applies a different paradigm to creating Spark rows from FHIR resources. The Bundles API in this new major version is still much the same, so try loading your data in the latest version to see if you get the support you require.

I believe Contained resource support was added in Bunsen 0.4.9, but Extensions were known to be more difficult to implement with the earlier approach, so I don't think users can expect Extension support to be back-ported.

mtsargent · 2020-04-20T23:09:38Z

I tried running an example similar to the one I posted (except using Observations instead of Patients), and I am still not seeing extensions when the resources are extracted from the bundle. Using a debugger, I can see the extensions exist on the resources in the Bundle. I am using com.cerner.bunsen:bunsen-spark-shaded:0.5.4. Is this the correct dependency?

Also, is there R4 support in Bunsen 0.5.x? I was unable to find documentation for the 0.5.x releases comparable to what is published here for 0.4.6: https://engineering.cerner.com/bunsen/0.4.6/

Teej42 · 2020-04-28T15:04:07Z

Can we have an update on this question Matt posed here, please?

dhallam · 2020-09-01T11:34:51Z

From looking at the codebase, it looks like commit a24851b on the 0.5.0-dev branch deleted the Python tests for R4 and removed classes such as FhirEncoders, which are still used by the bunsen-r4 sub-project. It looks like R4 has been abandoned in Bunsen. Is that right?
