
Data Integrity

John Bradley edited this page Apr 20, 2017 · 6 revisions

Back End Data Store

The current DukeDS backend storage is OpenStack Swift.

Each file is replicated a number of times based on configuration.

Uploaded Data

Uploaded files are MD5 checksummed in two ways.

Whole File Checksum

First, we compute a checksum over the entire file. This checksum is saved with the file on the DukeDS server. The DukeDS team is planning a service to re-check these checksums.
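As an illustration of a whole-file MD5 checksum, here is a minimal Python sketch. It is not the DukeDS client code; the function name `md5_of_stream` and the `block_size` parameter are hypothetical, chosen only for this example. It reads the data in blocks so that a large file does not have to fit in memory.

```python
import hashlib
import io


def md5_of_stream(stream, block_size=1024 * 1024):
    """Return the hex MD5 digest of everything readable from a binary stream.

    Hypothetical helper for illustration; not part of DukeDS.
    """
    digest = hashlib.md5()
    # Read fixed-size blocks until read() returns empty bytes (end of stream).
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


# In-memory stand-in for a file opened with open(path, "rb").
example = io.BytesIO(b"hello world")
print(md5_of_stream(example))
```

For a real file, the same function could be called on the object returned by `open(path, "rb")`.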

Upload File Chunk Checksums

Second, the file is split into chunks based on the chunk size. Each chunk is checksummed, and the checksum is sent along with the chunk. The Swift backend object store receives the data, recalculates the checksum, and raises an error if it doesn't match. If this happens, the upload terminates. See the OpenStack Large Object Documentation for more details.
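The chunk-level check can be sketched in Python as follows. This is an assumption-laden illustration, not the DukeDS or Swift implementation: `iter_chunks_with_md5`, `receive_chunk`, and `ChecksumMismatch` are hypothetical names, and `receive_chunk` only stands in for the comparison the object store performs on its side.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when a received chunk does not match its claimed checksum."""


def iter_chunks_with_md5(data, chunk_size):
    """Yield (chunk, md5_hex) pairs for consecutive slices of data.

    Hypothetical sender-side helper; the last chunk may be shorter.
    """
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield chunk, hashlib.md5(chunk).hexdigest()


def receive_chunk(chunk, claimed_md5):
    """Recalculate the chunk's MD5 and raise if it differs from the claim.

    Hypothetical stand-in for the receiving store's verification step.
    """
    actual_md5 = hashlib.md5(chunk).hexdigest()
    if actual_md5 != claimed_md5:
        raise ChecksumMismatch(
            "expected %s but calculated %s" % (claimed_md5, actual_md5)
        )
    return actual_md5


# Simulated upload: any mismatch raises and ends the loop, which mirrors
# the upload terminating on a failed check.
payload = b"example payload split into small chunks"
for chunk, checksum in iter_chunks_with_md5(payload, chunk_size=8):
    receive_chunk(chunk, checksum)
```

In this sketch a corrupted chunk surfaces as an exception at the point where it is received, so no later chunks are sent.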