Shanfang/Sketch_Samples

Introduction

This repo contains implementations of sketching algorithms for join-size estimation. The update performance of a sketch can be improved significantly by sketching only a sample of the data, without significant degradation in accuracy. This repo uses Bernoulli sampling. For details of the sampling algorithms and sketching techniques, please check out the references page.
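
To make the idea concrete, here is a minimal Python sketch of join-size estimation with an F-AGMS-style sketch built over a Bernoulli sample. It is an illustration under stated assumptions, not the code in this repo: the class `FagmsSketch`, the helpers `_hash64`, `bernoulli_sample` and `estimate_join_size`, and the use of BLAKE2 as the hash are all hypothetical choices made for this example. The names `buckets_no` and `rows_no` are borrowed from the parameter list below on the assumption that they denote the number of counters per row and the number of independent rows.

```python
import hashlib
import random
import statistics


def _hash64(seed, row, key):
    """Deterministic 64-bit hash of (seed, row, key); BLAKE2 is an illustrative choice."""
    data = f"{seed}:{row}:{key}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class FagmsSketch:
    """F-AGMS-style sketch: rows_no rows of buckets_no signed counters (hypothetical helper)."""

    def __init__(self, buckets_no, rows_no, seed=0):
        self.buckets_no = buckets_no
        self.rows_no = rows_no
        self.seed = seed
        self.counters = [[0] * buckets_no for _ in range(rows_no)]

    def update(self, key, count=1):
        # In every row, the key goes to one bucket and contributes +count or -count.
        for row in range(self.rows_no):
            h = _hash64(self.seed, row, key)
            sign = 1 if h & 1 else -1
            bucket = (h >> 1) % self.buckets_no
            self.counters[row][bucket] += sign * count


def bernoulli_sample(stream, prob, rng):
    """Keep each tuple independently with probability prob."""
    return [x for x in stream if rng.random() < prob]


def estimate_join_size(sketch_r, sketch_s, p_r=1.0, p_s=1.0):
    """Median over rows of the row-wise dot product, scaled by 1 / (p_r * p_s).

    The scaling assumes the two relations were sampled independently with
    probabilities p_r and p_s before being sketched.
    """
    shape_r = (sketch_r.buckets_no, sketch_r.rows_no, sketch_r.seed)
    shape_s = (sketch_s.buckets_no, sketch_s.rows_no, sketch_s.seed)
    if shape_r != shape_s:
        raise ValueError("sketches must share dimensions and hash seed")
    per_row = [
        sum(a * b for a, b in zip(row_r, row_s))
        for row_r, row_s in zip(sketch_r.counters, sketch_s.counters)
    ]
    return statistics.median(per_row) / (p_r * p_s)
```

A possible usage, with made-up data: draw two key streams, pass each through `bernoulli_sample` with the same probability `p`, feed the surviving keys into two `FagmsSketch` objects created with the same `seed`, and call `estimate_join_size(sketch_r, sketch_s, p, p)`. Only the sampled tuples touch the sketch, which is where the update-time saving comes from; the division by `p * p` compensates for the tuples that were dropped.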

Prerequisite

Install GSL

If you are using a Mac, follow these steps:

  1. launch the terminal
  2. run
    ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
  3. run
    brew install gsl

For other systems, please check out the GSL documentation.

How to run the code

  1. run make
  2. run ./sketch_bernoulli_sampling.out with the following parameters:
    dom_size
    tuples_no
    buckets_no
    rows_no
    DIST_PARAM
    DIST_SHUFF
    SAMP_PROB
    num_runs


For details of the corresponding parameters, please check out the documentation on the GitHub Wiki.

  3. run make clean to remove all intermediate files.

About

FAGMS sketch and sketch over Bernoulli sampling
