Serialization

Data Serialization

Custom data serialization is one of the core features of AtomicAssets. It is inspired by Google's Protobuf, and the serialized data is stored on the blockchain as a byte vector (vector<uint8_t>).
It is expected to save between 30% and 80% of RAM for the majority of asset collections compared to traditional methods such as storing JSON strings.

Schemes

Each preset (and thus also each asset) references a scheme that is used for serialization. A scheme describes the format of the data that can be serialized. In practice, each scheme stores a vector of FORMAT types, each of which describes a single attribute that can be serialized. A scheme can be extended by adding more FORMATs to the vector, but previously added FORMATs can never be removed. This ensures that any data serialized with a previous version of a scheme can still be deserialized with the new version.

FORMAT is a struct with a name and a type. FORMAT names must be unique within a given scheme.

struct FORMAT {
  std::string name;
  std::string type;
};
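The extension rule can be sketched as follows. This is a minimal illustration, not the contract's actual code: `extend_scheme` is a hypothetical helper name, and the choice to append at the end and to reject the whole batch on a duplicate name are assumptions made for the example.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

struct FORMAT {
  std::string name;
  std::string type;
};

// Hypothetical helper: adds new FORMATs to the end of an existing scheme.
// Existing entries are never removed or reordered, so data serialized
// with the older version of the scheme stays readable.
void extend_scheme(std::vector<FORMAT>& scheme,
                   const std::vector<FORMAT>& additions) {
  std::unordered_set<std::string> names;
  for (const FORMAT& f : scheme) {
    names.insert(f.name);
  }
  // Validate everything first so the scheme is unchanged on failure.
  for (const FORMAT& f : additions) {
    if (!names.insert(f.name).second) {
      throw std::invalid_argument("duplicate FORMAT name: " + f.name);
    }
  }
  scheme.insert(scheme.end(), additions.begin(), additions.end());
}
```

Calling `extend_scheme` on a scheme that already contains `name` with another FORMAT called `name` throws and leaves the scheme untouched.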

Valid types are:

int8 / int16 / int32 / int64
uint8 / uint16 / uint32 / uint64
fixed8 / fixed16 / fixed32 / fixed64
float / double / string / ipfs / bool / byte

Any valid type can also be followed by [] to describe a vector.
Nested vectors (e.g. uint64[][]) are not allowed.
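The type rules above can be checked with a short function. This is a sketch under the assumption that type strings are compared exactly as written in the list; `is_valid_type` is a hypothetical name, not part of the contract. It requires C++17 for `std::string_view`.

```cpp
#include <array>
#include <string_view>

// Hypothetical validator for FORMAT type strings.
// Accepts a base type, optionally followed by exactly one "[]".
bool is_valid_type(std::string_view type) {
  static constexpr std::array<std::string_view, 18> base_types = {
      "int8",   "int16",   "int32",   "int64",
      "uint8",  "uint16",  "uint32",  "uint64",
      "fixed8", "fixed16", "fixed32", "fixed64",
      "float",  "double",  "string",  "ipfs", "bool", "byte"};

  // Strip a single trailing "[]". A nested vector such as "uint64[][]"
  // still ends in "[]" afterwards and therefore matches no base type.
  if (type.size() >= 2 && type.substr(type.size() - 2) == "[]") {
    type.remove_suffix(2);
  }
  for (std::string_view base : base_types) {
    if (type == base) {
      return true;
    }
  }
  return false;
}
```

With this sketch, `is_valid_type("string[]")` is true, while `is_valid_type("uint64[][]")` and `is_valid_type("int128")` are false.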

How the serialization works

Prerequisites

Just like with Protobuf, understanding the AtomicAssets serialization requires first understanding varints (variable-size integers). The Protobuf documentation on encoding explains them in detail.
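For readers who want a quick refresher, here is a minimal sketch of the unsigned base-128 varint encoding described in the Protobuf documentation: each byte carries 7 bits of the value, least significant group first, and the high bit signals that another byte follows. The function name `encode_varint` is chosen for this example, and the sketch assumes AtomicAssets uses the same byte layout as Protobuf.

```cpp
#include <cstdint>
#include <vector>

// Encodes an unsigned integer as a Protobuf-style base-128 varint.
std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
  std::vector<std::uint8_t> out;
  while (value >= 0x80) {
    // Low 7 bits of the value, with the continuation bit set.
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  // Final byte: continuation bit clear.
  out.push_back(static_cast<std::uint8_t>(value));
  return out;
}
```

Small values fit in a single byte (4 becomes `0x04`), while 300 becomes the two bytes `0xAC 0x02`, matching the example in the Protobuf documentation.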

The data to be serialized is passed to the smart contract as an ATTRIBUTE_MAP, which maps attribute names to their values.

Pseudo Algorithm

vector<uint8_t> main(vector<FORMAT> format_lines, ATTRIBUTE_MAP data) {
   
   serialized_data = empty uint8_t vector
   //0-3 are reserved for possible later extensions
   identifier = 4
   
   For each line in format_lines {
      If line.name is defined in data {
         Append varint(identifier) to serialized_data
         linedata = data[line.name]
         Append serialize(linedata, line.type) to serialized_data
      }
      identifier += 1
   }
   
   return serialized_data
}

Explanation:

Data is serialized in the order of the respective FORMATs within a scheme's format. Attributes that are not defined within the provided ATTRIBUTE_MAP are skipped completely and do not take up any space.
Each serialized attribute is preceded by a varint-encoded identifier. This identifier depends on the position of the attribute within the format vector. Because the identifiers 0-3 are reserved, the first attribute has identifier 4.
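The pseudo algorithm can be sketched in C++ as shown below. To avoid guessing how each individual type is encoded, this example assumes the attribute values have already been serialized to bytes. `SerializedAttributes` is therefore a simplified stand-in for the contract's ATTRIBUTE_MAP, and the names `append_varint` and `serialize_attributes` are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct FORMAT {
  std::string name;
  std::string type;
};

// Assumption: stand-in for ATTRIBUTE_MAP. Values are pre-serialized bytes,
// so the per-type encoding step is outside the scope of this sketch.
using SerializedAttributes = std::map<std::string, std::vector<std::uint8_t>>;

// Appends a Protobuf-style base-128 varint to out.
void append_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

std::vector<std::uint8_t> serialize_attributes(
    const std::vector<FORMAT>& format_lines,
    const SerializedAttributes& data) {
  std::vector<std::uint8_t> serialized_data;
  std::uint64_t identifier = 4;  // 0-3 are reserved for later extensions

  for (const FORMAT& line : format_lines) {
    auto it = data.find(line.name);
    if (it != data.end()) {
      // Identifier first, then the attribute's bytes.
      append_varint(serialized_data, identifier);
      serialized_data.insert(serialized_data.end(),
                             it->second.begin(), it->second.end());
    }
    // The identifier advances even for skipped attributes, so it always
    // reflects the attribute's position in the format vector.
    ++identifier;
  }
  return serialized_data;
}
```

For a format of `name`, `level`, `img` (in that order) and data that only defines `level` with the payload byte `0x2A`, the result is `0x05 0x2A`: identifier 5 for the second attribute, followed by its payload, with nothing stored for the two missing attributes.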

atomicassets.io - developed with ❤️ by pink.network