Ent (named after the Ent species from The Lord of the Rings by J. R. R. Tolkien) is an experimental universal, scalable, general purpose, Content-Addressable Store (CAS) to explore verifiable data structures, policies and graphs.
This is not an officially supported Google product.
Ent encourages a model in which files are referred to by their digest (as a proxy for their content), instead of which server they happen to be located (which is what a URL normally is for).
For example, instead of referring to the image below by its URL
https://upload.wikimedia.org/wikipedia/commons/thumb/4/48/The_Calling_of_Saint_Matthew-Caravaggo_%281599-1600%29.jpg/405px-The_Calling_of_Saint_Matthew-Caravaggo_%281599-1600%29.jpg
,
in Ent it would be referred to by its digest
sha256:f3e737f4d50fbf6bb6053e3b8c72d6bf7f1a7229aacf2e9b4c97e9dd27cb1dcf
.
The digest of a file is a stable cryptographic identifier for the actual data that is contained in the file, and does not depend on which server the file happens to be hosted on, or at what path. If at some point in the future the file were to disappear from the original location and be made available at a different location, the original URL would stop working, but the digest of the file would remain the same, and can be used to refer to that file forever.
Additionally, using a digest to refer to a file is useful for security and trustworthiness: if someone sends you the digest of a file to download (e.g. a program to install on your computer), you can be sure that, by resolving that digest to an actual file via Ent, the resulting file is exactly the one that the sender intended, without having to trust the Ent Server, the Ent Index or the server where the file is ultimately hosted.
The Ent CLI can be built and installed with the following command, after having cloned this repository locally:
go install ./cmd/ent
And is then available via the binary called ent
:
ent help
In order to fetch a file with a given digest, the ent get
subcommand can be
used.
You can try the following command in your terminal, which fetches the text of the Treasure Island book:
$ ent get sha256:4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192 | head
The Project Gutenberg EBook of Treasure Island, by Robert Louis Stevenson
This eBook is for the use of anyone anywhere in the United States and most
other parts of the world at no cost and with almost no restrictions
whatsoever. You may copy it, give it away or re-use it under the terms of
the Project Gutenberg License included with this eBook or online at
www.gutenberg.org. If you are not located in the United States, you'll have
to check the laws of the country where you are located before using this ebook.
Title: Treasure Island
The Ent CLI queries the default Ent index (https://github.com/tiziano88/ent-index) to resolve the digest to a URL, and then fetches the file at that URL, and also verifies that it corresponds to the expected digest. It first buffers the entire file internally in order to verify its digest, and only prints it to stdout if it does match the expected digest.
You can also manually double check that the returned file does in fact correspond to the expected digest:
$ ent get sha256:4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192 | sha256sum
4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192 -
An Ent Server provides access to an underlying Ent store via an HTTP-based REST API.
An Ent Server may be running locally (on port 27333 by default), or remotely.
Some Ent Servers require the user to be authenticated in order for the user to read and / or write, which is performed via an API key.
The JSON API allows retrieving and creating multiple objects at once. All the
methods below are invoked via HTTP POST
methods.
-
/api/v1/blobs/get
Get one or more objects by their digest.
-
/api/v1/blobs/put
Put one or more objects.
The raw HTTP API is meant to be used by existing basic tools without requiring any special serialization.
The API supports the following HTTP operations:
GET /raw/:digest
PUT /raw
For instance, this API may be used directly from the terminal via curl
:
-
Get an object by digest:
$ curl --header 'x-api-key: xxx' localhost:27333/raw/sha256:4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192 | sha256sum % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 390k 0 390k 0 0 5818k 0 --:--:-- --:--:-- --:--:-- 5832k 4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192 -
-
Put an object:
$ curl --header 'x-api-key: yyy' --upload-file README.md --head localhost:27333/raw % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0HTTP/1.1 100 Continue HTTP/1.1 201 Created Location: /raw/sha256:c1a4c83dfeca632af8dcac3591f4b01a303342cf0bae0a63d9a5d7688b0e77cc Date: Thu, 10 Mar 2022 18:00:45 GMT Content-Length: 0 100 8013 0 0 100 8013 0 132k --:--:-- --:--:-- --:--:-- 134k
An Ent index is a "cheap" way to provide access to existing (location-addressed) content on the internet, but in a content-addressable way.
It consists of a static website, which serves an entry for each digest, listing one or more "traditional" URLs which may provide the file in question.
For instance, it may be serialized as a Git repository with a directory structure corresponding to the digest of each entry, and a JSON file for each entry that lists one or more URLs at which the object may be found.
The directory path is obtained by grouping sets of two digits from the digest, and creating a nested folder for each of them; this is in order to limit the number of files or directories inside each directory, since that would otherwise not scale when there are millions of entries in the index.
For instance, the file with digest
sha256:4c350163715b7b1d0fc3bcbf11bfffc0cf2d107f69253f237111a7480809e192
is
stored in the Ent index under the file
/sha256/4c/35/01/63/71/5b/7b/1d/0f/c3/bc/bf/11/bf/ff/c0/cf/2d/10/7f/69/25/3f/23/71/11/a7/48/08/09/e1/92/entry.json
,
which contains the following entry:
Note that the Ent index only stores URLs, not actual data, under the assumption that each URL will keep pointing to the same file forever.
The client querying the index is responsible to verify that the target file still corresponds to the expected digest; if this validation fails, it means that the URL was moved to point to a diferent file after it was added to the Ent index.
Currently, entries may be added to the default index by creating a comment in tiziano88/ent-index#1 containing the URL of the file to index. A GitHub actions is then triggered that fetches the file, creates an appropriate entry in the index, and commits that back into the git repository.
You can try this by picking a URL of an existing file, and creating a comment in tiziano88/ent-index#1 ; after a few minutes, the GitHub action should post another comment in reply, confirming that the entry was correctly incorporated in the index, and printing out its digest, which may then be used with the Ent CLI as above.
If the URL stops pointing to the file that was originally indexed, the Ent CLI will detect that and produce an error.
There is no process for cleaning up / fixing inconsistent entries in the index (yet).
IPFS aims to be a fully decentralized and censorship-resistant protocol and ecosystem, which heavily relies on content-addressability.