
timeemit/sol-glyph


Sol Glyph

Generate NFTs with on-chain deep learning models computed entirely on the Solana blockchain

This project is my (Liam Norris) submission to the 2022 Solana Riptide hackathon. If you find this work interesting, exciting, or possibly even lucrative, please consider voting your support for my submission here. A prototype can be found at www.sol-glyph.com but in case my poorly maintained and unreliable validator borks when you happen to visit the website, you can find my video presentation at demo.sol-glyph.com.

Deepfakes Using Generative Adversarial Networks (GANs) Computed on the Solana Blockchain

Sol Glyph opens the door to dynamically creating a compelling collection of unique NFTs by using the first deep learning smart contract to generate deepfakes resembling portraits of people. It's important to note that the content of the images generated by the deep learning contract is simply a reflection of the images the model is trained with. If you're a Disney or a Coca-Cola, these models can generate Marvel characters or soda cans to be traded as NFTs.

The demo at www.sol-glyph.com demonstrates an effect of generative adversarial networks that is surprisingly pertinent to the NFT space. While generating an image requires nothing more than a random vector, blending two different images can be achieved simply by interpolating their respective generating vectors. So it's perfectly possible to achieve a Crypto Kitties-esque dynamic where you combine your McDonald's Big Mac NFT with one your friend has to get a new Big Mac NFT that blends the two.
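The blend can be sketched in a few lines of NumPy. The latent dimension (64) and the variable names below are illustrative, not the values the prototype actually uses:

```python
import numpy as np

def interpolate_latents(z_a: np.ndarray, z_b: np.ndarray, t: float) -> np.ndarray:
    """Linearly interpolate between two generating vectors.

    t = 0.0 reproduces z_a, t = 1.0 reproduces z_b, and values in
    between blend the two images the generator would produce.
    """
    return (1.0 - t) * z_a + t * z_b

# Hypothetical latent vectors behind two already-minted images.
rng = np.random.default_rng(seed=0)
z_parent_a = rng.standard_normal(64)
z_parent_b = rng.standard_normal(64)

# A 50/50 blend, which the generator would turn into the "child" NFT.
z_child = interpolate_latents(z_parent_a, z_parent_b, 0.5)
```

Feeding `z_child` through the on-chain generator yields the blended image, which is all the "breeding" mechanic requires.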

But How?

The Solana blockchain is unique in that it is able to execute arbitrary instructions compiled by LLVM, typically the result of Rust or C programs. But the LLVM compiler also happens to be supported within the PyTorch ecosystem by a little-known library called Glow. Glow takes a platform-independent representation of a deep learning model called ONNX and makes it available for Ahead-of-Time (AOT) inference through an LLVM-compatible C API. So the end-to-end deployment looks something like this:

  1. Train a Generative Adversarial Network with PyTorch.
  2. Export the Network into (a series of) ONNX file(s).
  3. Use Glow to create C APIs from the exported ONNX file(s).
  4. Compile a Solana entrypoint program that invokes the linked C API.

Which is what the prototype achieves. Additional work is necessary to make the generated images SPL-compatible tokens, the standard Solana requires to make them transferable.

The Technical Challenges

Putting a deep learning model onto the Solana blockchain has been an experience akin to shoving a skyscraper into a drainage pipe. Yes, the current model only produces blurry, grayscale images after 15 seconds. But that's more a constraint of the time I was able to commit to the Solana Riptide hackathon. There are three hard constraints that I consistently had to work around: compute, heap, and parameter size. Interestingly, the runtime limit of 30 seconds never actually became a concern, likely because of workarounds I implemented for the other constraints.

Heap Constraint

Solana only permits 32 KiB of space for the heap. Generous by most standards, but cramped for the purposes of machine learning. Glow needs memory for three different purposes:

  1. To store the weights of a trained machine learning model (referred to as Constant Memory).
  2. To store the input and output vectors of the model (referred to as Mutable Memory).
  3. To store the intermediary calculations of the model (referred to as Activation Memory).

Typically, Mutable Memory << Activation Memory << Constant Memory, each differing from the previous by almost an order of magnitude. Constant Memory can consume as much as a dozen megabytes! Glow, thankfully, offers two APIs for executing the model: a dynamic one and a static one. Although the documentation presents these as an either/or choice, I actually use both APIs to step around Solana's memory constraints. The dynamic API is useful for reading and writing dynamically allocated, contiguous blocks of memory from the heap, while the static API lets me store the Constant Memory as a const static variable statically linked at compile time. This frees the heap at the inconvenience of my wallet's balance when paying for the storage necessary to upload the program.
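A back-of-the-envelope sizing of a hypothetical two-layer generator shows why only the smaller regions need to come out of the heap. All dimensions below are made up for illustration; a real Glow bundle reports its actual region sizes in the generated header:

```python
# Hypothetical sizing for a toy two-layer generator; every tensor is
# assumed to be float32 (4 bytes per element).
BYTES_PER_FLOAT = 4
HEAP_LIMIT = 32 * 1024  # Solana's 32 KiB heap

latent, hidden, pixels = 64, 256, 16 * 16

# Constant Memory: trained weights and biases, statically linked
# into the program instead of allocated at runtime.
constant = (latent * hidden + hidden + hidden * pixels + pixels) * BYTES_PER_FLOAT

# Mutable Memory: the input latent vector plus the output image.
mutable = (latent + pixels) * BYTES_PER_FLOAT

# Activation Memory: the largest intermediate result between layers.
activation = hidden * BYTES_PER_FLOAT

# Only the mutable and activation regions must fit on the heap;
# the (much larger) constants live in the program binary.
assert mutable + activation < HEAP_LIMIT
```

Even in this toy model the constants dwarf the 32 KiB heap, which is exactly why they have to be linked statically rather than allocated dynamically.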

Compute Constraint

Solana mainnet only supports 200K compute units in a single transaction, which turned out to be such a tight constraint that I had to build all of this repository's functionality on v1.9.9. That version lets clients request a higher compute limit (for a higher price) of up to 1.4M units, but it had only partially rolled out to devnet by the conclusion of the hackathon. Unfortunately, even with the increased compute cap I was only able to render 8x8 images on the blockchain, which were totally indiscernible blobs. I was able to bump the resolution up to 16x16 pixel images by breaking down the dozen or so layers that constitute the Generator, deploying each layer as its own blockchain program, and successively calling the next layer from the client with the results of the one before it.
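The layer-splitting scheme can be sketched with toy stand-ins for the deployed layer programs. The shapes and random weights below are illustrative; on-chain, each list entry is a separately deployed Glow-compiled program and each call is its own transaction:

```python
import numpy as np

# Toy stand-ins for the generator's layers; on-chain, each would be
# its own deployed program wrapping one Glow-compiled slice of the model.
rng = np.random.default_rng(seed=1)
layers = [
    lambda x, w=rng.standard_normal((64, 128)): np.maximum(x @ w, 0.0),
    lambda x, w=rng.standard_normal((128, 256)): np.maximum(x @ w, 0.0),
    lambda x, w=rng.standard_normal((256, 16 * 16)): np.tanh(x @ w),
]

def run_pipeline(latent: np.ndarray) -> np.ndarray:
    """Client-side driver: feed each layer's output into the next,
    as the prototype does with successive transactions."""
    activations = latent
    for layer in layers:
        activations = layer(activations)
    return activations

image = run_pipeline(rng.standard_normal(64)).reshape(16, 16)
```

The client, not the chain, orchestrates the sequence: each transaction's result becomes the next transaction's input, so no single program ever exceeds the per-transaction compute cap.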

Parameter Constraint

Breaking the model into a program pipeline introduced a new problem: Solana only allows clients to send 1232 bytes to the blockchain in any single message, impeding my ability to communicate an intermediate computation between two layers of the model. To sidestep this issue, intermediary results are stored on-chain in randomly generated Accounts.
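Client-side, the workaround amounts to chunking each intermediate result into account-sized writes. The 900-byte payload budget below is an assumption made to leave room for the transaction envelope (signatures, account keys, instruction header); the real usable payload depends on the message's contents:

```python
MAX_MESSAGE = 1232  # Solana's maximum serialized message size in bytes
CHUNK_SIZE = 900    # assumed payload budget after the transaction envelope

def to_chunks(data: bytes) -> list[bytes]:
    """Split an intermediate result into account-sized writes."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def from_chunks(chunks: list[bytes]) -> bytes:
    """Reassemble the intermediate result read back from its accounts."""
    return b"".join(chunks)

# A 16x16 float32 activation is 1024 bytes, already more than one message.
intermediate = bytes(16 * 16 * 4)
chunks = to_chunks(intermediate)
assert from_chunks(chunks) == intermediate
```

The next layer's program then reads the reassembled activation out of those accounts instead of receiving it in the instruction data.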

Code Disclaimers

As mentioned above, mainnet is blocked by the feature "transaction wide compute cap" (5ekBxc8itEnPv4NzGJtr8BVVQLNMQuLMNQQj7pHoLNZ9). The demo website is running on a single-tenant validator that is liable to fail.

Please note that the codebase is a reflection of a rapid development and iteration process. It is not currently very user friendly or easily reproducible. There may be shoddy documentation, absolute paths or URLs specific to a custom development environment, etc. If you are trying to use this codebase and find yourself stuck, please don't hesitate to open an issue or a pull request.