
Serialization

Liudmila Molkova edited this page Jul 20, 2021 · 6 revisions

Customizable serialization

Azure Core offers the ObjectSerializer and JsonSerializer interfaces as a standard way to handle serialization and deserialization for specific serialization formats. Azure Core ships a default implementation of these interfaces, but users and Azure SDKs can supply their own through a service provider interface (SPI). The SPI makes the serialization layer pluggable and removes dependencies on any particular implementation.

If you're an Azure Core developer and want customers to be able to provide their own serializer for some Azure Core models, use the ObjectSerializer abstraction and resolve an instance with JsonSerializerProviders.createInstance.

  1. com.azure.core.util.serializer.ObjectSerializer - Generic interface for (de)serializing objects. Abstracts format-specific serializers.
  2. com.azure.core.util.serializer.JsonSerializer - Generic interface covering basic JSON (de)serialization methods.
  3. com.azure.core.util.serializer.JsonSerializerProvider - Extension point that resolves the JsonSerializer implementation at runtime.

Implementations

  1. Package azure-core-serializer-json-jackson: com.azure.core.serializer.json.jackson.JacksonJsonSerializer - JsonSerializer based on Jackson, defaulting to the Azure Core internal implementation.
  2. Package azure-core-serializer-json-gson: com.azure.core.serializer.json.gson.GsonJsonSerializer - JsonSerializer based on Gson. Intended for consumers.
  3. Package azure-core-serializer-avro-apache: includes the experimental com.azure.core.serializer.avro.apache.ApacheAvroSerializer - AvroSerializer based on the Apache Avro (de)serializer. Intended for consumers.

All of them come with a builder and provider.

Replacing the serializer

If you're a library developer and want users to be able to provide their own serializer, take a dependency on one of the implementation packages. We recommend azure-core-serializer-json-jackson for JSON and azure-core-serializer-avro-apache for Avro.

If you don't want users to be able to replace serializers and can't use the default one, you can create your implementation and instantiate it directly without calling into JsonSerializerProviders.createInstance.

JsonSerializer usage

  • Creation: Get a new instance with JsonSerializerProviders.createInstance(useDefaultIfAbsent).
    • If you don't want to use the default implementation, pass useDefaultIfAbsent=false and make sure to include a custom implementation (e.g. the one in azure-core-serializer-json-jackson) and configure it.
  • Changing configuration:
    • It's not possible to change the serialization configuration of an instance obtained with JsonSerializerProviders.createInstance unless you fully replace the serializer.
    • Azure SDK developers should use the JacksonJsonSerializerBuilder.serializer method to pass a customized ObjectMapper. Consumers won't be able to replace or customize this instance.
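
The lookup behind createInstance(useDefaultIfAbsent) can be pictured with the standard java.util.ServiceLoader API. The following is a minimal sketch of that pattern, not the Azure Core source: all type names are hypothetical stand-ins, and the exception thrown when nothing is found is an assumption.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

/** Hypothetical stand-in for a serializer type; not the Azure Core interface. */
interface SketchSerializer {
    String name();
}

/** Hypothetical stand-in for a provider discovered through the Java SPI. */
interface SketchSerializerProvider {
    SketchSerializer createInstance();
}

final class SketchSerializerProviders {
    private SketchSerializerProviders() {
    }

    /** Uses the first provider found on the classpath, otherwise optionally falls back to a default. */
    static SketchSerializer createInstance(boolean useDefaultIfAbsent) {
        Iterator<SketchSerializerProvider> providers =
            ServiceLoader.load(SketchSerializerProvider.class).iterator();
        if (providers.hasNext()) {
            // A provider registered under META-INF/services takes precedence over the default.
            return providers.next().createInstance();
        }
        if (useDefaultIfAbsent) {
            // Built-in fallback, standing in for the default implementation.
            return () -> "default";
        }
        // Assumption: the caller asked for a custom serializer only and none is registered.
        throw new IllegalStateException("No serializer provider found and the default is disabled.");
    }
}
```

With no provider registered, createInstance(true) yields the fallback and createInstance(false) fails fast, which is why a custom implementation has to be on the classpath when useDefaultIfAbsent is false.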

Please refer to the reference docs for more details.

Example

Using customizable serializer instance

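import com.azure.core.util.serializer.JsonSerializer;
import com.azure.core.util.serializer.JsonSerializerProviders;
import com.azure.core.util.serializer.TypeReference;
import com.fasterxml.jackson.annotation.JsonProperty;

import java.nio.charset.StandardCharsets;

public final class User {
    // Resolves a custom serializer from the classpath, falling back to the default one.
    private static final JsonSerializer SERIALIZER = JsonSerializerProviders.createInstance(true);

    public User() {}

    @JsonProperty
    public String firstName;

    @JsonProperty
    public String lastName;

    public static User fromString(String str) {
        return SERIALIZER.deserializeFromBytes(str.getBytes(StandardCharsets.UTF_8), TypeReference.createInstance(User.class));
    }

    @Override
    public String toString() {
        return new String(SERIALIZER.serializeToBytes(this), StandardCharsets.UTF_8);
    }
}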

Using JacksonJsonSerializerBuilder

If you don't need to support customizable serializers for the model, but want to configure some settings on the serializer, please use JacksonJsonSerializerBuilder from azure-core-serializer-json-jackson. A similar approach can be used with ApacheAvroSerializerBuilder or GsonJsonSerializerBuilder.

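import com.azure.core.serializer.json.jackson.JacksonJsonSerializerBuilder;
import com.azure.core.util.serializer.JsonSerializer;
import com.azure.core.util.serializer.TypeReference;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.module.SimpleModule;

import java.nio.charset.StandardCharsets;

public final class User {

    // UserSerializer and UserDeserializer are custom Jackson (de)serializers for User (not shown).
    private static final JsonSerializer SERIALIZER = new JacksonJsonSerializerBuilder()
            .serializer(new ObjectMapper().registerModule(
                    new SimpleModule().addSerializer(User.class, new UserSerializer())
                            .addDeserializer(User.class, new UserDeserializer())))
            .build();

    public User() {}

    @JsonProperty
    public String firstName;

    @JsonProperty
    public String lastName;

    public static User fromString(String str) {
        return SERIALIZER.deserializeFromBytes(str.getBytes(StandardCharsets.UTF_8), TypeReference.createInstance(User.class));
    }

    @Override
    public String toString() {
        return new String(SERIALIZER.serializeToBytes(this), StandardCharsets.UTF_8);
    }
}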

Writing a custom serializer plugin

You need to implement both the JsonSerializer and JsonSerializerProvider interfaces. Make sure to provide the configuration that lets Azure Core discover your implementation.

This serializer will be used for some of the Azure Core models (CloudEvent, BinaryData, RequestContent) and SDK models.

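import com.azure.core.util.serializer.JsonSerializer;
import com.azure.core.util.serializer.JsonSerializerProvider;
import com.azure.core.util.serializer.TypeReference;
import reactor.core.publisher.Mono;

import java.io.InputStream;
import java.io.OutputStream;

// Each public class below belongs in its own source file.
public final class MySerializerProvider implements JsonSerializerProvider {
    @Override
    public JsonSerializer createInstance() {
        return new MySerializer();
    }
}

public final class MySerializer implements JsonSerializer {
    MySerializer() {}

    @Override
    public <T> T deserialize(InputStream stream, TypeReference<T> typeReference) {
        throw new UnsupportedOperationException("TODO: implement");
    }

    @Override
    public <T> Mono<T> deserializeAsync(InputStream stream, TypeReference<T> typeReference) {
        throw new UnsupportedOperationException("TODO: implement");
    }

    @Override
    public void serialize(OutputStream stream, Object value) {
        throw new UnsupportedOperationException("TODO: implement");
    }

    @Override
    public Mono<Void> serializeAsync(OutputStream stream, Object value) {
        throw new UnsupportedOperationException("TODO: implement");
    }
}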
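
Assuming Azure Core discovers providers through the standard java.util.ServiceLoader mechanism, the discovery configuration is a provider-configuration file on the classpath named after the provider interface. A minimal sketch, where the com.example package is a hypothetical location for MySerializerProvider:

```
# File: src/main/resources/META-INF/services/com.azure.core.util.serializer.JsonSerializerProvider
com.example.MySerializerProvider
```

The file contains one fully qualified provider class name per line; lines starting with # are comments.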

Default serialization

If you're an Azure Core or Azure SDK developer and don't want customers to provide their own serializer, use the SerializerAdapter abstraction and the JacksonAdapter implementation.

  1. com.azure.core.util.serializer.SerializerAdapter - interface for all serialization (Json and XML).
  2. com.azure.core.util.serializer.JacksonAdapter - implements SerializerAdapter - default (de)serialization (json and XML) implementation.

JacksonAdapter usage

  • Creation: Create a new instance with the JacksonAdapter() constructor, or use the static createDefaultSerializerAdapter() method, which returns a singleton instance.
  • Changing configuration: JacksonAdapter.serializer() returns the underlying com.fasterxml.jackson.databind.ObjectMapper, so you can add to or override the default configuration. WARNING: avoid changing configuration on the singleton instance returned by createDefaultSerializerAdapter; Azure Core functionality depends on it.
  • Serialization methods: Please refer to the reference docs for more details. Notably, serializeRaw, serializeList, serializeIterable, and <T> deserialize(HttpHeaders headers, Type deserializedHeadersType) only support JSON.

Default configuration settings

Here is the default configuration applied on json/xml serialization in Azure Core.

Applies to:

  • JacksonAdapter - default implementation. Note: if there is no custom provider on the classpath, JsonSerializerProviders.createInstance(/*useDefaultIfAbsent*/ true) resolves to a serializer that wraps JacksonAdapter, i.e. the configuration below applies.
  • JacksonJsonSerializer - (unless custom ObjectMapper is used) with some exceptions mentioned below.

Configuration does not apply to GsonJsonSerializer, JacksonAvroSerializer or ApacheAvroSerializer - they use corresponding underlying package (jackson, gson, apache avro) defaults unless customized.

Note: Jackson behavior for null/empty (de)serialization edge cases depends on the version and is subject to change (especially in XML). Please support a variety of cases and avoid depending on specific behavior.

Null/Empty properties

JacksonJsonSerializer uses Jackson defaults (unless customized).

  • Null fields/properties:
    • serialization:
      • json:
        • JacksonAdapter: no
        • JacksonJsonSerializer: yes, as null.
      • xml: no
    • deserialization: as nulls
  • Empty strings
    • serialization:
      • json: empty string
      • xml: empty element (<a></a>)
    • deserialization to string:
      • json: as empty string
      • xml:
        • empty element (<a></a>) - empty string
        • self-closing tag (<a/>) - null
    • deserialization to an arbitrary object:
      • json: as null
      • xml:
        • empty element (<a></a>) - null
        • self-closing tag (<a/>) - null

Case Sensitivity

  • json: sensitive
  • xml: not sensitive

Xml Declaration

  • serialization: yes
  • deserialization: not required

Arrays and collections

  • Null arrays/collections
    • serialization:
      • json: no
      • xml: no
    • deserialization
      • json: no element/element with null value - null (map/iterable/list/array)
      • xml:
        • no element: null (map/iterable/list/array)
        • self-closing tag (<a/>) - null (map/iterable/list/array) since Jackson 2.12.4
  • Empty arrays/collections
    • serialization:
      • json: yes (as empty element)
      • xml: yes as a self-closing tag (<a/>)
        • empty Iterable is not serialized at all (TODO)
    • deserialization
      • json: empty array/collection
      • xml: there is no way to express empty array with XML (with wrapping turned off)
        • empty element (<a></a>), depends on type:
          • collection of Strings: [""]
          • array of not-Strings: [null] (because of DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY setting)
          • list of not-Strings: null
        • self-closing tag (<a/>) - null since Jackson 2.12.4
  • Single element
    • serialization:
      • json: as array
      • xml: as one element with value: a=[1] -> <a>1</a>
    • deserialization
      • json:
        • array: array
        • number: array
      • xml
        • one element with value <a>1</a>: 1-element array
        • empty element (<a></a>), depends on type:
          • collection of Strings: [""]
          • array of not-Strings: [null] (because of DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY setting that works for arrays only)
          • list of not-Strings: null
  • Multiple elements:
    • json: array/collection with multiple elements
    • xml: no wrapping for serialization, array/collection with multiple elements
  • Byte Arrays
    • serialization:
      • json/xml: base64-encoded string [1,2,3,4,5,6,7,8] -> AQIDBAUGBwg=
        • jackson default: array
    • deserialization: as array (base64-string serialization is not supported)
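
The base64 value quoted for byte arrays can be reproduced with the JDK encoder. This is only a check of the sample value with a hypothetical helper class; it does not show how Azure Core wires the encoding into Jackson.

```java
import java.util.Base64;

final class ByteArraySketch {
    private ByteArraySketch() {
    }

    /** Encodes raw bytes with the standard (padded) base64 alphabet. */
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

Calling ByteArraySketch.encode(new byte[] {1, 2, 3, 4, 5, 6, 7, 8}) returns "AQIDBAUGBwg=".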

Visibility

Annotated properties/fields are always serialized regardless of visibility. For auto-detected fields/properties/etc JacksonAdapter and JacksonJsonSerializer have different settings. JacksonJsonSerializer uses jackson defaults (unless customized).

  • Not-annotated fields:

    • JacksonAdapter: serialized regardless of visibility
    • JacksonJsonSerializer: public-only not-annotated fields
  • Not-annotated properties (getter + setter):

    • JacksonAdapter: none
    • JacksonJsonSerializer: public-only getters, any setters
  • Not-annotated Is-Getter

    • JacksonAdapter: none
    • JacksonJsonSerializer: public-only
  • Creators: public-only.

Error tolerance

  • Unknown properties: ignored
  • Empty beans serialization (serialization of classes that are not serializable): empty object ({})
  • Exception handling:
    • JacksonAdapter: Jackson exceptions are not handled and not logged.
    • JacksonJsonSerializer: all IOExceptions (including MismatchedInputException, JsonMappingException and JsonProcessingException) are logged and re-thrown wrapped into UncheckedIOException.
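
The wrap-and-rethrow behavior described for JacksonJsonSerializer follows a common JDK pattern. The sketch below illustrates that pattern with hypothetical names (UncheckedIoSketch, IoCall) and omits logging. It is not the library code, but it shows where a caller finds the original exception.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class UncheckedIoSketch {
    private UncheckedIoSketch() {
    }

    /** Hypothetical functional interface for an operation that may fail with IOException. */
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    /** Runs the call and converts a checked IOException into UncheckedIOException. */
    static <T> T wrap(IoCall<T> call) {
        try {
            return call.call();
        } catch (IOException e) {
            // The original exception is preserved as the cause.
            throw new UncheckedIOException(e);
        }
    }

    /** Caller side: returns the original failure message, or "ok" when the call succeeds. */
    static String describeFailure(IoCall<?> call) {
        try {
            wrap(call);
            return "ok";
        } catch (UncheckedIOException e) {
            return e.getCause().getMessage();
        }
    }
}
```

Callers that need the Jackson-specific exception type can inspect getCause() on the UncheckedIOException.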

AdditionalProperties

Json-specific. Not supported in XML.

additionalProperties is a magic word that allows a map with String keys to be serialized as top-level properties.

Please use @JsonAnyGetter/@JsonAnySetter instead (if you can) for performance reasons.

Limitations

  • whole word
  • case insensitive
  • fields only
  • has to be annotated with @JsonProperty (auto-detection does not work)
  • any visibility
  • can work with non-string keys if corresponding KeySerializer is provided.
  • can work with non-string values

Behavior

  • serialization: map is serialized as top-level properties
  • deserialization: any unknown properties are populated on the additionalProperties as long as value type allows that (and throws otherwise)
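
The two directions described above can be pictured with plain maps. This is a conceptual sketch only: the class and method names are hypothetical, and it models JSON objects as java.util.Map rather than using Jackson or the Azure Core implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class AdditionalPropertiesSketch {
    private AdditionalPropertiesSketch() {
    }

    /** Serialization side: map entries are written next to the declared properties. */
    static Map<String, Object> toTopLevel(Map<String, Object> declared, Map<String, Object> additional) {
        Map<String, Object> json = new LinkedHashMap<>(declared);
        json.putAll(additional);
        return json;
    }

    /** Deserialization side: every property that is not declared lands in the map. */
    static Map<String, Object> collectUnknown(Map<String, Object> json, Set<String> declaredNames) {
        Map<String, Object> additional = new LinkedHashMap<>();
        json.forEach((name, value) -> {
            if (!declaredNames.contains(name)) {
                additional.put(name, value);
            }
        });
        return additional;
    }
}
```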

additionalProperties vs @JsonAnyGetter/@JsonAnySetter

additionalProperties is an Azure SDK concept, while @JsonAnyGetter/@JsonAnySetter are Jackson annotations. Please use @JsonAnyGetter/@JsonAnySetter when possible. Using additionalProperties adds performance overhead (~10x) and is not recommended in perf-sensitive scenarios.

Flattening

Flattening is an Azure SDK concept that allows you to write more compact models.

Class-level flattening example

@JsonFlatten
class Model {
  @JsonProperty("property.name")
  private String name = "foo";

  @JsonProperty("property.value")
  private String value = "bar";

  @JsonProperty("property\\.escaped")
  private String escaped = "baz";
}

translates into

{
  "property" : {
    "name" : "foo",
    "value" : "bar"
  },
  "property.escaped" : "baz" 
}

Field-level flattening example

class Model {
  @JsonFlatten
  @JsonProperty("property.name")
  private String name = "foo"; 

  @JsonFlatten
  @JsonProperty("property.value")
  private String value = "bar";

  @JsonProperty("property.not.escaped")
  private String notEscaped = "baz";
}

translates into

{
  "property" : {
    "name" : "foo",
    "value" : "bar"
  },
  "property.not.escaped" : "baz" 
}

Limitations

  • class or any field has to have @JsonFlatten annotation, no auto-detection
  • properties (getters/setters) are not supported
  • there is no limitation on depth

Behavior

  • class-level annotation:
    • serialization: all fields that have unescaped . are populated as nested fields
    • deserialization: all fields with unescaped . in model definition are populated from nested nodes
  • field-level annotation:
    • serialization: all annotated fields (with unescaped .) are populated as nested fields
      • if there are multiple fields with the same root, one root node is created
    • deserialization: all annotated fields (with unescaped .) in model definition are populated from nested nodes
  • escaping: \. is used when the class is annotated with @JsonFlatten and a specific field should not be flattened.
  • Using additionalProperties (or @JsonAnyGetter/@JsonAnySetter) along with @JsonFlatten on class or field levels is not supported (flattening is not possible).
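
The splitting rule (nest on every unescaped dot, treat \. as a literal dot) can be sketched with plain maps. This is an illustration of the rule under the assumption that property names are processed one by one; the FlattenSketch class is hypothetical and is not the Azure Core implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FlattenSketch {
    private FlattenSketch() {
    }

    /** Expands keys such as "property.name" into nested maps; "\." stays a literal dot. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> expand(Map<String, Object> flat) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : flat.entrySet()) {
            // Split only on dots that are not preceded by a backslash.
            String[] path = entry.getKey().split("(?<!\\\\)\\.");
            Map<String, Object> node = root;
            for (int i = 0; i < path.length - 1; i++) {
                // Fields that share a root reuse the same intermediate node.
                node = (Map<String, Object>) node.computeIfAbsent(
                    unescape(path[i]), key -> new LinkedHashMap<String, Object>());
            }
            node.put(unescape(path[path.length - 1]), entry.getValue());
        }
        return root;
    }

    private static String unescape(String segment) {
        return segment.replace("\\.", ".");
    }
}
```

Feeding it the three property names from the class-level example ("property.name", "property.value", "property\\.escaped") produces one nested "property" node with two children and a top-level "property.escaped" entry.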

Dates, times and duration

  • Duration:

    • values: ISO 8601 String with a days component (e.g. "P1DT10H17M36.789S"), controlled by DurationSerializer
      • jackson default 123456.789 - number of seconds with nano precision
    • keys (controlled by JavaTimeModule)
      • serialization: not supported
      • deserialization: Jackson default (Duration.parse(ISO 8601 String))
  • Instant:

    • values: instant UTC ISO string ("2021-07-06T19:47:12.728012100Z"), controlled by MapperBuilder.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
      • jackson default: epoch time (number)
    • keys (controlled by JavaTimeModule)
      • serialization: not supported
      • deserialization: Jackson default (Instant.parse(ISO string))
  • OffsetDateTime

    • values: UTC date-time ISO string ("2021-07-06T20:09:01.465447100Z"), controlled by DateTimeSerializer
      • jackson default: local date-time ISO string "2021-07-06T13:11:08.3230678-07:00", controlled by MapperBuilder.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
    • keys (controlled by JavaTimeModule)
      • serialization: not supported
      • deserialization: Jackson default (OffsetDateTime.parse(ISO string))
  • UnixTime

    • values: epoch seconds number (1.625602953E9), controlled by UnixTimeSerializer
      • jackson default (UnixTime not supported) - {"dateTime":"2021-07-06T20:20:15.193979500Z"} and depends on JavaTimeModule and WRITE_DATES_AS_TIMESTAMPS
    • keys: not supported
  • DateTimeRfc1123

    • values: RFC1123 string ("Tue, 06 Jul 2021 20:31:19 GMT"), controlled by DateTimeRfc1123Serializer
      • jackson default - not supported - {"dateTime":"2021-07-06T20:20:15.193979500Z"} and depends on JavaTimeModule and WRITE_DATES_AS_TIMESTAMPS
    • keys: not supported
  • ZonedDateTime:

    • values: ISO local date-time string with time zone "2021-07-06T14:08:08.0519546-07:00" (JavaTimeModule and WRITE_DATES_AS_TIMESTAMPS)
    • keys (controlled by JavaTimeModule)
      • serialization: not supported
      • deserialization: Jackson default (ZonedDateTime.parse(key, DateTimeFormatter.ISO_OFFSET_DATE_TIME))

A few more with the same (de)serialization behavior as ZonedDateTime:

  • LocalDateTime - "2021-07-06T14:08:08.0389576"
  • LocalDate - "2021-07-06"
  • LocalTime - "14:08:08.0379605"
  • MonthDay - "--07-06"
  • OffsetTime - "14:08:08.050955100-07:00"
  • Period - "P10D"
  • Year - "2021"
  • YearMonth - "2021-07"
  • ZoneId - "-07:00"
  • ZoneOffset - "-07:00"
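
Several of the string formats listed in this section (Duration with a days component, UTC OffsetDateTime, RFC1123) can be reproduced with java.time alone. The helper below is a hypothetical sketch for checking sample values, not the code of DurationSerializer, DateTimeSerializer, or DateTimeRfc1123Serializer, and its duration helper assumes non-negative input.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class DateFormatSketch {
    // Two-digit day of month, matching the RFC1123 sample shown in this section.
    private static final DateTimeFormatter RFC1123 =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    private DateFormatSketch() {
    }

    /** ISO 8601 duration with an explicit days component (non-negative durations only). */
    static String durationWithDays(Duration duration) {
        long days = duration.toDays();
        if (days == 0) {
            return duration.toString();
        }
        // Duration.toString() never emits days, so they are split off by hand.
        String timePart = duration.minusDays(days).toString().substring(1); // drop the leading "P"
        return "P" + days + "D" + timePart;
    }

    /** Converts to UTC before formatting, so the offset is always rendered as "Z". */
    static String utcIso(OffsetDateTime value) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(value.withOffsetSameInstant(ZoneOffset.UTC));
    }

    /** RFC1123-style string in GMT. */
    static String rfc1123(OffsetDateTime value) {
        return RFC1123.format(value.withOffsetSameInstant(ZoneOffset.UTC));
    }
}
```

For example, rfc1123(OffsetDateTime.parse("2021-07-06T13:31:19-07:00")) returns "Tue, 06 Jul 2021 20:31:19 GMT", and durationWithDays(Duration.parse("P1DT10H17M36.789S")) returns "P1DT10H17M36.789S", whereas Duration.toString() alone would print the same value as "PT34H17M36.789S".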

com.azure.core.util.Base64Url

  • serialization: as a string
  • deserialization: implicitly works through public constructor Base64Url(String)

com.azure.core.http.HttpHeaders

Serialized as Map<String, String>. If a header has multiple values, they are joined in a comma-separated string. Using headers along with additionalProperties (or @JsonAnyGetter/@JsonAnySetter) is not supported.
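
The joining rule can be sketched with plain collections. This is not the HttpHeaders implementation; the class name is hypothetical, and the exact separator (a bare comma with no space) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HeaderJoinSketch {
    private HeaderJoinSketch() {
    }

    /** Collapses multi-valued headers into one comma-separated string per header name. */
    static Map<String, String> toSingleValued(Map<String, List<String>> headers) {
        Map<String, String> result = new LinkedHashMap<>();
        headers.forEach((name, values) -> result.put(name, String.join(",", values)));
        return result;
    }
}
```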

Serializable models included in azure-core

The following models are (de)serialized using the serializer provided through JsonSerializerProvider, defaulting to JacksonAdapter when no custom serializer is provided:

  • CloudEvent
  • BinaryData
  • RequestContent
  • JsonPatchDocument, with the exception that it does not allow using the default implementation (TODO: is this a bug?)

Consumers and Azure SDKs are encouraged to bring their own serializers to customize serialization for these models.

GeoObject and its friends are pure models and don't handle their own (de)serialization. DynamicRequest accepts any ObjectSerializer implementation in constructor to serialize request body.

Misc assumptions

  • If multiple (de)serializers are registered for the type, the last one to be set wins.
  • additionalProperties and JsonFlatten should not be used in performance-sensitive cases. They have overhead on each (de)serialization call. Please consider adjusting models for data-plane scenarios. We assume they are used in management SDKs and management SDKs are less sensitive to perf.