EdSimChecker is a tool that detects similarity between source code files, even when they have been obfuscated using various techniques. It is aimed at programming teachers and students who want to verify the originality of submitted code.
- Similarity Detection: Detects similarity between source code files, even when obfuscation techniques have been applied to them.
- Advanced Analysis: Uses abstract syntax trees, tokenization, and edit distance to perform the analysis.
- No Additional Dependencies: Implemented in pure Python, with no need to install additional libraries.
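As an illustration of why syntax trees help against simple obfuscation, the following is a minimal sketch that uses only the standard `ast` module. It is not EdSimChecker's actual implementation: the class `_RenameIdentifiers` and the function `normalized_dump` are hypothetical names introduced here. The sketch replaces every identifier with a positional placeholder, so two programs that differ only in the names they use produce the same tree dump.

```python
import ast


class _RenameIdentifiers(ast.NodeTransformer):
    """Replace names with positional placeholders (hypothetical helper)."""

    def __init__(self):
        self._names = {}

    def _placeholder(self, name):
        # The same original name always maps to the same placeholder.
        return self._names.setdefault(name, f"id{len(self._names)}")

    def visit_FunctionDef(self, node):
        node.name = self._placeholder(node.name)
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node

    def visit_Name(self, node):
        # Simplification: builtins such as print are renamed as well.
        node.id = self._placeholder(node.id)
        return node


def normalized_dump(source):
    """Return a textual dump of the AST with identifiers normalized."""
    tree = _RenameIdentifiers().visit(ast.parse(source))
    return ast.dump(tree)
```

With this sketch, `def add(x, y): ...` and a copy in which the function and variables have been renamed yield identical dumps, while a copy that changes an operator does not.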
- Python: Main programming language.
- Abstract Syntax Trees (AST): To analyze the structure of the code.
- Tokenization: To break down the code into basic elements.
- Edit Distance: To measure the similarity between different code fragments.
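To make the combination of tokenization and edit distance concrete, here is a small sketch built on the standard `tokenize` module. The function names (`token_strings`, `edit_distance`, `similarity`) and the normalization by the longer token sequence are assumptions made for illustration; the tool's own scoring may differ.

```python
import io
import tokenize


def token_strings(source):
    """Return the tokens of Python source, skipping comments and layout tokens."""
    skipped = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
               tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skipped]


def edit_distance(a, b):
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(src_a, src_b):
    """Score in [0.0, 1.0]; 1.0 means the token sequences are identical."""
    a, b = token_strings(src_a), token_strings(src_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

Because comments and whitespace never reach the distance computation, adding or removing them does not change the score in this sketch.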
No additional dependencies are required. You only need to have Python installed on your system.
- Clone the repository:
git clone https://github.com/your-username/edsimchecker.git
- Navigate to the project directory:
cd edsimchecker
- Install the package:
pip install .
- `--path`, `-p`: Path to the directory containing the source code files.
- `--files`, `-f`: Specific input files to compare.
- `--recursive`, `-r`: Recursively search through directories.
- `--threshold`, `-t`: Similarity threshold (default: 0.75, range: 0.0 - 1.0).
- `--level`, `-l`: Obfuscation level (default: 0, range: 0 - 4).
- `--window-percentage`, `-w`: Window percentage (default: 1.0, range: 0.0 - 1.0).
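For readers who want to see how options with these names, defaults, and ranges could be declared, here is a sketch using the standard `argparse` module. How EdSimChecker actually parses its arguments is not documented here, so `build_parser` and `unit_interval` are hypothetical helpers, and the range checks are an assumption based on the ranges stated above.

```python
import argparse


def unit_interval(text):
    """Parse a float and reject values outside 0.0 - 1.0."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{value} is outside the range 0.0 - 1.0")
    return value


def build_parser():
    """Declare the documented options (illustrative, not the tool's own code)."""
    parser = argparse.ArgumentParser(prog="edsimchecker")
    parser.add_argument("--path", "-p",
                        help="directory containing the source code files")
    parser.add_argument("--files", "-f", nargs="+",
                        help="specific input files to compare")
    parser.add_argument("--recursive", "-r", action="store_true",
                        help="recursively search through directories")
    parser.add_argument("--threshold", "-t", type=unit_interval, default=0.75,
                        help="similarity threshold")
    parser.add_argument("--level", "-l", type=int, choices=range(0, 5), default=0,
                        help="obfuscation level")
    parser.add_argument("--window-percentage", "-w", type=unit_interval,
                        default=1.0, help="window percentage")
    return parser
```

Calling `build_parser().parse_args(["--path", "src", "-t", "0.8"])` in this sketch returns a namespace in which the options that were not given keep the defaults listed above.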
EdSimChecker can be used from the command line. Here are some usage examples:
edsimchecker --files file1.py file2.py
edsimchecker --path /path/to/directory --recursive
edsimchecker --path /path/to/directory --threshold 0.8
edsimchecker --path /path/to/directory --level 2
edsimchecker --path /path/to/directory --window-percentage 0.5
Contributions are welcome. Please follow these steps to contribute:
- Fork the repository.
- Create a new branch (`git checkout -b feature/new-feature`).
- Make your changes and commit them (`git commit -am 'Add new feature'`).
- Push to the branch (`git push origin feature/new-feature`).
- Open a Pull Request.
For more information on the techniques used, you can refer to the following resources:
This project is licensed under the MIT License. See the LICENSE file for more details.