
How to have Versioned Serialization in Rust

Wenqi Mou edited this page Apr 20, 2020 · 1 revision

Goals

Because certain structs in the Rust client may change from one version to the next, those structs need versioned serialization so that Pravega can support rolling upgrades.

Two concepts are applied here: Revision and Version.

Revision

A Revision is an incremental change to a struct, which means that two different Revisions of the same struct are compatible with each other.

Version

A Version can be seen as multiple Revisions bundled together. A new Version becomes necessary whenever an incompatible change is introduced.

Conventions

We use Bincode to serialize structs into byte arrays. Bincode relies on the fact that the #[derive(Deserialize, Serialize)] implementations of the Serde traits deserialize fields in exactly the same order as they were serialized. Check out here for a very intuitive explanation of how Bincode works.
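To make the ordering point concrete, here is a minimal sketch of the byte layout, written by hand with only the standard library. It assumes Bincode 1.x with its default options (fixed-width little-endian integers, strings as a u64 length followed by UTF-8 bytes); `Foo`, `write_string` and `encode_foo` are hypothetical names for illustration and are not part of Bincode or the Pravega client.

```rust
// Hand-rolled stand-in for what Bincode 1.x writes with default options.
// Field names and types are never written, so only the order of the
// fields gives the bytes their meaning.

struct Foo {
    id: i32,
    name: String,
}

// Strings are assumed to be written as a u64 length plus the UTF-8 bytes.
fn write_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

fn encode_foo(foo: &Foo) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&foo.id.to_le_bytes()); // field 1: 4 bytes
    write_string(&mut out, &foo.name); // field 2: 8-byte length + data
    out
}
```

Under these assumptions, `Foo { id: 1, name: "ab" }` becomes 14 bytes: 4 for `id`, 8 for the length of `name`, and 2 for its contents.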

Revision

As explained above, Revisions should be compatible with each other. For example, the change below can be treated as a new Revision:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct FooRv1 {
  id: i32,
  name: String,
}

// Same as FooRv1, with one field appended at the end.
#[derive(Serialize, Deserialize)]
struct FooRv2 {
  id: i32,
  name: String,
  description: String,
}

FooRv1 should be able to deserialize what FooRv2 serialized, and vice versa. In the first direction, Bincode does the work for us: FooRv2 is serialized into a byte array laid out as i32 + String + String, and FooRv1 simply reads the fields it knows, which in this case is i32 + String, and ignores the rest. In the other direction, FooRv1 is serialized as i32 + String; FooRv2 reads id and name, reaches the end of the input, and the remaining field description should be initialized as an empty String. Note that, as far as stock Bincode 1.x is concerned, a derived Deserialize reports an unexpected end of input here instead of filling in a default, so this direction needs explicit handling in the deserialization code.
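The two directions can be sketched with a hand-written encoder and decoder that use only the standard library. This is an illustration of the intended behavior under the assumed Bincode 1.x default layout, not the derived Serde code itself; every helper name here (`write_string`, `take`, `read_i32`, `read_string`, `encode_rv1`, `encode_rv2`, `decode_rv1`, `decode_rv2`) is hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct FooRv1 {
    id: i32,
    name: String,
}

#[derive(Debug, PartialEq)]
struct FooRv2 {
    id: i32,
    name: String,
    description: String,
}

fn write_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

fn encode_rv1(foo: &FooRv1) -> Vec<u8> {
    let mut out = foo.id.to_le_bytes().to_vec();
    write_string(&mut out, &foo.name);
    out
}

fn encode_rv2(foo: &FooRv2) -> Vec<u8> {
    let mut out = foo.id.to_le_bytes().to_vec();
    write_string(&mut out, &foo.name);
    write_string(&mut out, &foo.description);
    out
}

/// Splits `n` bytes off the front of `buf`, or returns None if it is too short.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let whole: &'a [u8] = *buf;
    if whole.len() < n {
        return None;
    }
    let (head, tail) = whole.split_at(n);
    *buf = tail;
    Some(head)
}

fn read_i32(buf: &mut &[u8]) -> Option<i32> {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(take(buf, 4)?);
    Some(i32::from_le_bytes(raw))
}

fn read_string(buf: &mut &[u8]) -> Option<String> {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(take(buf, 8)?);
    let len = u64::from_le_bytes(raw) as usize;
    String::from_utf8(take(buf, len)?.to_vec()).ok()
}

/// Older revision: reads the fields it knows and ignores trailing bytes.
fn decode_rv1(mut bytes: &[u8]) -> Option<FooRv1> {
    let id = read_i32(&mut bytes)?;
    let name = read_string(&mut bytes)?;
    Some(FooRv1 { id, name })
}

/// Newer revision: if the input ends after `name`, default `description`.
fn decode_rv2(mut bytes: &[u8]) -> Option<FooRv2> {
    let id = read_i32(&mut bytes)?;
    let name = read_string(&mut bytes)?;
    let description = if bytes.is_empty() {
        String::new()
    } else {
        read_string(&mut bytes)?
    };
    Some(FooRv2 { id, name, description })
}
```

The design choice worth noting is that new fields are only ever appended: `decode_rv1` can stop early and `decode_rv2` can default what is missing only because the shared fields keep their positions.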

Version

Versions are potentially incompatible with each other. For example:

#[derive(Serialize, Deserialize)]
struct FooV1 {
  id: i32,
  name: String,
}

#[derive(Serialize, Deserialize)]
struct FooV2 {
  id: i64,
  name: String,
}
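A small sketch of what goes wrong if bytes written for FooV1 are read with the FooV2 layout. It again assumes the Bincode 1.x default layout and uses only the standard library; `encode_v1` and `read_id_as_v2` are hypothetical helpers for illustration.

```rust
/// Bytes written for FooV1 { id: i32, name: String }: 4-byte id,
/// then an 8-byte string length, then the string contents.
fn encode_v1(id: i32, name: &str) -> Vec<u8> {
    let mut out = id.to_le_bytes().to_vec();
    out.extend_from_slice(&(name.len() as u64).to_le_bytes());
    out.extend_from_slice(name.as_bytes());
    out
}

/// What a FooV2 reader takes as its first field: 8 bytes as an i64.
/// Returns None if fewer than 8 bytes are available.
fn read_id_as_v2(bytes: &[u8]) -> Option<i64> {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes.get(..8)?);
    Some(i64::from_le_bytes(raw))
}
```

For `id = 1` and `name = "ab"`, the FooV2 reader consumes the 4 bytes of the id plus the first 4 bytes of the string length and comes back with 8589934593 instead of 1. Every field after that is read from the wrong offset.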

In the case above, the serialized length of the first field, id, changes from 4 bytes to 8 bytes, which makes the two Versions incompatible. The way to solve that is to wrap the Foo structs in an enum:

#[derive(Serialize, Deserialize)]
enum FooVersioned {
  V1(FooV1),
  V2(FooV2),
}

We can serialize this enum and send it over the wire. After deserializing the byte array back into the enum, we can use pattern matching to find the correct Version.
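The round trip can be sketched by hand as follows. The sketch assumes the Bincode 1.x default layout, in which an enum is written as a u32 variant index followed by the variant's payload; it uses only the standard library, and `encode`, `decode`, `read_array`, `read_string`, `write_string` and `widest_id` are hypothetical names, not Bincode or Pravega APIs.

```rust
#[derive(Debug, PartialEq)]
struct FooV1 {
    id: i32,
    name: String,
}

#[derive(Debug, PartialEq)]
struct FooV2 {
    id: i64,
    name: String,
}

#[derive(Debug, PartialEq)]
enum FooVersioned {
    V1(FooV1),
    V2(FooV2),
}

fn write_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

fn encode(value: &FooVersioned) -> Vec<u8> {
    let mut out = Vec::new();
    match value {
        FooVersioned::V1(foo) => {
            out.extend_from_slice(&0u32.to_le_bytes()); // variant index 0
            out.extend_from_slice(&foo.id.to_le_bytes());
            write_string(&mut out, &foo.name);
        }
        FooVersioned::V2(foo) => {
            out.extend_from_slice(&1u32.to_le_bytes()); // variant index 1
            out.extend_from_slice(&foo.id.to_le_bytes());
            write_string(&mut out, &foo.name);
        }
    }
    out
}

/// Reads exactly N bytes from the front of `buf`, advancing it.
fn read_array<const N: usize>(buf: &mut &[u8]) -> Option<[u8; N]> {
    let whole: &[u8] = *buf;
    if whole.len() < N {
        return None;
    }
    let (head, tail) = whole.split_at(N);
    *buf = tail;
    let mut raw = [0u8; N];
    raw.copy_from_slice(head);
    Some(raw)
}

fn read_string(buf: &mut &[u8]) -> Option<String> {
    let len = u64::from_le_bytes(read_array::<8>(buf)?) as usize;
    let whole: &[u8] = *buf;
    if whole.len() < len {
        return None;
    }
    let (head, tail) = whole.split_at(len);
    *buf = tail;
    String::from_utf8(head.to_vec()).ok()
}

/// The variant index decides which payload layout to read next.
fn decode(mut buf: &[u8]) -> Option<FooVersioned> {
    match u32::from_le_bytes(read_array::<4>(&mut buf)?) {
        0 => {
            let id = i32::from_le_bytes(read_array::<4>(&mut buf)?);
            let name = read_string(&mut buf)?;
            Some(FooVersioned::V1(FooV1 { id, name }))
        }
        1 => {
            let id = i64::from_le_bytes(read_array::<8>(&mut buf)?);
            let name = read_string(&mut buf)?;
            Some(FooVersioned::V2(FooV2 { id, name }))
        }
        _ => None, // a Version this client does not know about
    }
}

/// Pattern matching picks the right Version after decoding.
fn widest_id(value: &FooVersioned) -> i64 {
    match value {
        FooVersioned::V1(foo) => i64::from(foo.id),
        FooVersioned::V2(foo) => foo.id,
    }
}
```

Because the variant index is written first, a reader never has to guess whether the next field is 4 or 8 bytes wide. An index it does not recognize is reported as a failure rather than being misread.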

Example

// Each module represents a different version of the Rust client.
mod v1rv1 {
  use serde::{Deserialize, Serialize};

  #[derive(Serialize, Deserialize)]
  enum FooVersioned {
    V1(FooV1Rv1)
  }

  #[derive(Serialize, Deserialize)]
  struct FooV1Rv1 {
    id: i32,
    name: String,
  }
}

mod v1rv2 {
  use serde::{Deserialize, Serialize};

  #[derive(Serialize, Deserialize)]
  enum FooVersioned {
    V1(FooV1Rv2)
  }

  // New Revision of V1: a field is appended at the end.
  #[derive(Serialize, Deserialize)]
  struct FooV1Rv2 {
    id: i32,
    name: String,
    description: String,
  }
}

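mod v2rv1 {
  use serde::{Deserialize, Serialize};

  // V1 keeps its position in the enum so that its variant index is unchanged.
  #[derive(Serialize, Deserialize)]
  enum FooVersioned {
    V1(FooV1Rv2),
    V2(FooV2Rv1)
  }

  #[derive(Serialize, Deserialize)]
  struct FooV1Rv2 {
    id: i32,
    name: String,
    description: String,
  }

  // New Version: the type of id changes, which is an incompatible change.
  #[derive(Serialize, Deserialize)]
  struct FooV2Rv1 {
    id: i64,
    name: String,
    description: String,
  }
}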