This repository aims at releasing smart contract datasets used in our works, to facilitate community research. Also, we present instructions on how to label a certain type of vulnerability and show the detailed pattern designs of investigated vulnerabilities.
-
This dataset consists of over 40K real world Ethereum smart contracts.
-
Download this resource at Ethereum_smart_contract.
-
Please cite one of the papers if you want to use the dataset in your paper.
@inproceedings{zhuangsmart,
title={Smart Contract Vulnerability Detection using Graph Neural Network},
author={Zhuang, Yuan and Liu, Zhenguang and Qian, Peng and Liu, Qi and Wang, Xiang and He, Qinming},
booktitle={IJCAI},
pages={3283--3290},
year={2020}
}
@inproceedings{liu2021smart,
title={Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion},
author={Liu, Zhenguang and Qian, Peng and Wang, Xiang and Zhu, Lei and He, Qinming and Ji, Shouling},
booktitle={IJCAI},
pages={2751--2759},
year={2021}
}
@article{liu2021combining,
title={Combining Graph Neural Networks with Expert Knowledge for Smart Contract Vulnerability Detection},
author={Liu, Zhenguang and Qian, Peng and Wang, Xiaoyang and Zhuang, Yuan and Qiu, Lin and Wang, Xun},
journal={IEEE Transactions on Knowledge and Data Engineering},
year={2021},
publisher={IEEE}
}
- This dataset concerns four types of vulnerabilities (i.e., reentrancy, timestamp dependency, integer overflow, dangerous delegatecall), where we give the preprocessing method.
- Check instructions for how to label these vulnerabilities.
- Download this resource at Dataset_preprocessing.
Please cite our paper if you want to use the dataset in your paper.
@inproceedings{10.1145/3543507.3583367,
author = {Qian, Peng and Liu, Zhenguang and Yin, Yifang and He, Qinming},
title = {Cross-Modality Mutual Learning for Enhancing Smart Contract Vulnerability Detection on Bytecode},
year = {2023},
isbn = {9781450394161},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the ACM Web Conference 2023},
pages = {2220–2229},
numpages = {10},
location = {Austin, TX, USA},
series = {WWW '23}
}
-
This dataset contains over 12K Ethereum smart contracts (where inherited contracts are also included) and concerns eight types of vulnerabilities.
-
Check the pattern design for more details.
-
Download this resource at Dataset.
-
Please cite our paper if you want to use the dataset in your paper.
@article{liu2023rethinking,
title={Rethinking Smart Contract Fuzzing: Fuzzing With Invocation Ordering and Important Branch Revisiting},
author={Liu, Zhenguang and Qian, Peng and Yang, Jiaxu and Liu, Lingfeng and Xu, Xiaojun and He, Qinming and Zhang, Xiaosong},
journal={arXiv preprint arXiv:2301.03943},
year={2023}
}
-
Here, we present three datasets to evaluate the performance of smart contract analyzers.
-
The first dataset D1 (released by [1]) is used to measure the branch coverage of fuzzers. The second dataset D2 (released by [2, 3, 4]) aims to evaluate the performance of vulnerability detection tools, while the purpose of the third dataset D3 (released by [5]) is to validate the effectiveness of our system in handling real-world contracts that involve large-scale transactions.
-
Download this resource at Dataset.
-
Please cite our paper if you want to use the dataset in your paper.
Coming soon.
[1] Christof Ferreira Torres, et al. CONFUZZIUS: A Data Dependency-Aware Hybrid Fuzzer for Smart Contracts. EuroSP 2021.
[2] SmartBug: https://github.com/smartbugs/smartbugs-wild
[3] VeriSmart: https://github.com/kupl/VeriSmart-benchmarks
[4] SWC registry: https://swcregistry.io
[5] Jaeseung Choi, et al. SMARTIAN: Enhancing Smart Contract Fuzzing with Static and Dynamic Data-Flow Analyses. ASE 2021.