This repository contains the source code of the microbenchmarks and use cases featured in the research paper "On Data Processing Through the Lenses of S3 Object Lambda", published in IEEE INFOCOM 2023.
The benchmarks and use cases have been written and executed using Python 3.8 running on Ubuntu 20.04.
- Clone this repository on your local machine:
git clone https://github.com/pablogs98/Object-Lambda-Benchmark
- From the repository's root directory, install its Python dependencies:
pip3 install -r requirements.txt
- Make sure that your AWS account and AWS CLI are correctly set up. More information available here.
- Install any additional dependencies. For instance, PycURL has additional requirements, namely, libcurl.
- Make sure `PYTHONPATH` points to the repository's root directory.
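The last two setup steps can be checked before running anything. The following is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, and it only assumes that the AWS CLI executable is called `aws` and that `PYTHONPATH` holds directories separated by `os.pathsep`. It does not verify AWS credentials.

```python
import os
import shutil
from pathlib import Path


def check_environment(repo_root, environ=None):
    """Return a list of problems found; an empty list means both checks passed.

    Hypothetical helper, not provided by this repository.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # PYTHONPATH is a list of directories separated by os.pathsep.
    root = Path(repo_root).resolve()
    entries = [
        Path(entry).resolve()
        for entry in environ.get("PYTHONPATH", "").split(os.pathsep)
        if entry
    ]
    if root not in entries:
        problems.append(f"PYTHONPATH does not include {root}")

    # Only checks that an executable named 'aws' is on PATH,
    # not that the account or credentials are configured.
    if shutil.which("aws") is None:
        problems.append("the 'aws' executable was not found on PATH")

    return problems


if __name__ == "__main__":
    # Assumes the script is launched from the repository's root directory.
    for problem in check_environment(Path.cwd()):
        print("WARNING:", problem)
```

Run from the repository's root directory, the script prints one warning per failed check and nothing otherwise.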
Functions are automatically deployed when an example is executed. However, the deployment packages must be generated beforehand and placed in the root directory of the microbenchmark/use case (or in a configurable, specified path). The utils module provides scripts that generate the deployment packages for Node.js and Python.
More information on Java function deployments here.
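To illustrate what a Python deployment package is, the sketch below builds a .zip archive with the function's files at the archive root, which is the layout AWS Lambda expects for .zip deployments. It is not the script shipped in the utils module, whose interface is not described here; the function name `build_zip_package` and the example paths are hypothetical.

```python
import zipfile
from pathlib import Path


def build_zip_package(source_dir, output_zip):
    """Zip every file under source_dir, storing paths relative to source_dir.

    Hypothetical helper, not the repository's own packaging script.
    """
    source = Path(source_dir).resolve()
    output = Path(output_zip).resolve()
    with zipfile.ZipFile(output, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories, and the archive itself if it lives inside source_dir.
            if path.is_file() and path != output:
                archive.write(path, arcname=path.relative_to(source).as_posix())
    return output


if __name__ == "__main__":
    # Example paths are placeholders; adjust them to the function being packaged.
    src = Path("function_src")
    if src.is_dir():
        print(build_zip_package(src, "function.zip"))
```

Any third-party dependencies of the function would need to be installed into the source directory before zipping so that they end up inside the archive as well.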
The datasets used for experimentation are publicly available and can be downloaded from the following locations:
Use case | Dataset |
---|---|
Grep | GHTorrent |
Parallel tree reduction (streaming pipelines) | HDFS logs |
Pablo Gimeno Sarroca, Marc Sànchez-Artigas. On Data Processing Through the Lenses of S3 Object Lambda, in IEEE INFOCOM 2023.
This project has received funding from the European Union's Horizon Europe (HE) Research and Innovation Programme (RIA) under Grant Agreement No. 101092646 and No. 101092644.