This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Provide API to just encode / decode data without publishing a block #194

Open
Gozala opened this issue Mar 6, 2019 · 7 comments

Comments

@Gozala
Gozala commented Mar 6, 2019

  • Version: 0.34.4
  • Platform: OSX
  • Subsystem: *

Type: Enhancement

Severity: Other

Description: Provide an API to encode data without publishing a block

This is related to ipld/ipld#64. More specifically, I wish to encode content with any supported IPLD format, but then encrypt that buffer before publishing it. However, the API available right now doesn't provide a way to accomplish that, because put first encodes the data to a buffer and then publishes it. It is also impossible to retrieve the buffer for that CID, as it is automatically decoded with the corresponding format.

If the public API were extended with encode(data, options): Buffer, that would allow one to do multiple encoding passes before a block is published. The reverse operation would also be required, so that a get can unwrap the multiple layers.

@Gozala
Author

Gozala commented Mar 6, 2019

Since submitting, I discovered that I can do the following:

const cid = await ipfs.dag.put(buffer, {onlyHash:true})
const {data} = await ipfs.block.get(cid)

However, that is awkward, and I think something like the following would make a lot more sense:

const buffer = await ipfs.dag.encode(data)
const encrypted = await crypto.encrypt(secretKey, buffer)
const cid = await ipfs.dag.put(encrypted)

Then doing the reverse could be something like:

const encrypted = await ipfs.dag.get(cid)
const buffer = await crypto.decrypt(secretKey, encrypted)
const data = await ipfs.dag.decode(buffer)
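
For illustration, here is a minimal sketch of that round trip which runs without IPFS. Everything in it is an assumption made for the example: encode/decode are JSON stand-ins for the proposed ipfs.dag.encode/ipfs.dag.decode, and encrypt/decrypt are hypothetical helpers built on Node's built-in crypto module (AES-256-GCM), not an existing IPFS or IPLD API.

```javascript
const crypto = require('crypto')

// Stand-ins for the proposed ipfs.dag.encode / ipfs.dag.decode (assumption).
// JSON keeps the sketch dependency-free; a real IPLD codec would go here.
const encode = (data) => Buffer.from(JSON.stringify(data))
const decode = (buffer) => JSON.parse(buffer.toString())

// Hypothetical helper. Output layout: iv (12 bytes) | auth tag (16 bytes) | ciphertext.
const encrypt = (secretKey, buffer) => {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', secretKey, iv)
  const ciphertext = Buffer.concat([cipher.update(buffer), cipher.final()])
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext])
}

// Hypothetical helper. Throws if the key is wrong or the bytes were tampered with.
const decrypt = (secretKey, encrypted) => {
  const iv = encrypted.subarray(0, 12)
  const tag = encrypted.subarray(12, 28)
  const decipher = crypto.createDecipheriv('aes-256-gcm', secretKey, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(encrypted.subarray(28)), decipher.final()])
}

const secretKey = crypto.randomBytes(32)

// Encode first, then encrypt; `encrypted` is the buffer that would be published.
const encrypted = encrypt(secretKey, encode({ hello: 'world' }))

// The reverse: decrypt first, then decode.
const data = decode(decrypt(secretKey, encrypted))
```

The point of the sketch is only the ordering: once encoding is a separate step, any number of byte-level transforms can sit between it and publishing.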

@vmx
Member

vmx commented Mar 6, 2019

As you've probably seen there's a huge API rewrite, which will hopefully be merged soon.

Let me explain my current ideas about abstractions in IPLD. I see js-ipld as a library where you don't really get in touch with the binary representation. You start with your own data, hand it over to js-ipld to do some serialisation, but you don't actually care about what it does internally. You never see the binary data. When you retrieve the data, you get the already deserialised data back. So js-ipld is about structured data and CIDs.

At a lower level there are the IPLD Formats (which, by the way, will also see an API rewrite soon). There it's about structured data and its serialisation/binary encoding.

So currently (as you found out), you would use the Block API from IPFS to work on a block/binary level.

I need to put more thought into this. But I think it would be great if we could solve this on an IPLD Format level.

So it might look like this (based on the new API):

const cid = await ipld.put([data], /* format */ multicodec.encrypted, { key: secretKey })
// the reverse
const data = await ipld.get([cid], { key: secretKey })

And the actual encryption would be handled within its own special IPLD Format.

@vmx
Member

vmx commented Mar 6, 2019

Funnily enough, I was working on something today that also needed the serialised data without storing it. After a quick chat with @mikeal, I think that we might need some layer between what IPLD Formats and js-ipld are today. js-ipld would then use that layer to store the blocks.

@Gozala give me a bit of time on this (I'm not sure I'll find the time this week, and next week I'll be at a conference). I want to think more about the layers we need in order to also support ipld/ipld#64. I like the idea of the encode()/decode() step.

@Gozala
Author

Gozala commented Mar 7, 2019

@vmx I have posted feedback on the new API in this post: https://gozala.hashbase.io/posts/Constraints%20of%20an%20API%20design/

@Gozala
Author

Gozala commented Mar 7, 2019

I need to put more thought into this. But I think it would be great if we could solve this on an IPLD Format level.

So it might look like this (based on the new API):

const cid = await ipld.put([data], /* format */ multicodec.encrypted, { key: secretKey })
// the reverse
const data = await ipld.get([cid], { key: secretKey })

And the actual encryption would be handled within its own special IPLD Format.

I think the solution should be somewhat more general. What I mean is that there might be several layers of encoding, e.g. JSON -> dag-cbor -> symmetric encryption -> asymmetric encryption ..., and somehow each pass needs to encode information + metadata about the codec, so that when you do the reverse you can decode each layer.

The problem I'm running into right now (unless I'm missing something) is this: say I encode with dag-cbor and then encrypt the result with a secret key. The knowledge that the encoding is 'dag-cbor' is stored in the CID, not in the buffer itself, which means that after the decryption step I no longer know which codec / format to use for the next decode step.

Which in some ways suggests there is a need for some canonical registry for codecs.

@Gozala
Author

Gozala commented Mar 7, 2019

Ok, ignore all of the above; clearly that's already being considered: https://github.com/multiformats/multicodec
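
As a rough sketch of how a multicodec-style prefix addresses the lost-codec problem: the inner payload carries its own codec code, so it survives the encryption step. The codes 0x55 (raw) and 0x71 (dag-cbor) come from the multicodec table; the wrap/unwrap helpers are hypothetical, and this simplified version assumes codes below 0x80, which fit in a single varint byte.

```javascript
// Codec codes as listed in the multicodec table.
const CODES = { raw: 0x55, 'dag-cbor': 0x71 }
const NAMES = { 0x55: 'raw', 0x71: 'dag-cbor' }

// Hypothetical helper: prefix the payload with its codec code.
// Only single-byte codes (< 0x80) are handled in this sketch.
const wrap = (codec, payload) => {
  const code = CODES[codec]
  if (code === undefined || code >= 0x80) {
    throw new Error(`unsupported codec: ${codec}`)
  }
  return Buffer.concat([Buffer.from([code]), payload])
}

// Hypothetical helper: read the prefix back and return the codec name
// together with the remaining bytes.
const unwrap = (bytes) => {
  const codec = NAMES[bytes[0]]
  if (codec === undefined) {
    throw new Error(`unknown codec code: ${bytes[0]}`)
  }
  return { codec, payload: bytes.subarray(1) }
}

const encoded = Buffer.from([0xa0]) // an empty map encoded as CBOR
const wrapped = wrap('dag-cbor', encoded)

// `wrapped` is what would be encrypted and published. After fetching and
// decrypting, the prefix says which codec to use for the next decode step.
const { codec, payload } = unwrap(wrapped)
```

Each further layer could add its own prefix in the same way, so that unwrapping proceeds one layer at a time.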

@mikeal

mikeal commented Apr 4, 2019

This should be resolved once we migrate to https://github.com/ipld/js-ipld-stack, as you can create all the block data lazily without publishing it, and it still does all the codec lookup for you.
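
To make "lazily" concrete, here is a minimal sketch of the idea, not the js-ipld-stack API: LazyBlock is a hypothetical class, JSON stands in for a real codec, and a plain SHA-256 hex digest stands in for a CID. Bytes are produced only when asked for, and nothing is stored or published.

```javascript
const crypto = require('crypto')

// Hypothetical class for illustration only.
class LazyBlock {
  constructor (data, encode) {
    this.data = data
    this._encode = encode
    this._bytes = null
  }

  // Serialise on first use and cache the result.
  encode () {
    if (this._bytes === null) {
      this._bytes = this._encode(this.data)
    }
    return this._bytes
  }

  // Stand-in for CID creation: a SHA-256 hex digest of the encoded bytes.
  digest () {
    return crypto.createHash('sha256').update(this.encode()).digest('hex')
  }
}

// Count encoder invocations to show that the work is deferred.
let calls = 0
const jsonEncode = (data) => {
  calls++
  return Buffer.from(JSON.stringify(data))
}

// Creating the block does not encode anything yet.
const block = new LazyBlock({ hello: 'world' }, jsonEncode)
```

Calling block.encode() returns the bytes for further processing (for example encryption), and block.digest() reuses the cached bytes rather than encoding again.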

3 participants