Beginner Tutorial

Preparation

make install

If you have multiple Python versions installed, set ENVPIP to choose which pip is used:

make install ENVPIP=pip3.9

Alternatively, install with pip directly:

pip install -e . --config-settings editable_mode=compat

(Optional) To include the spider dependencies, use the following command (the quotes keep shells such as zsh from treating the brackets as a glob pattern):

pip install -e ".[spider]" --config-settings editable_mode=compat

cd scripts/beginner
python beginner.py

If you see *** Spark *** in the terminal, the installation works.

Then, run the notebook eda.ipynb in scripts/EDA.

NOTICE: Download the data before you run the scripts, and place the files in the data directory as follows:

data/
├── test_X.xlsx
├── test_y.xlsx
└── train.xlsx
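
Before running the scripts, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the project: the helper name `missing_data_files` is hypothetical, and it assumes you run it from the project root so that `data` resolves to the directory shown above.

```python
from pathlib import Path

# File names taken from the data/ layout shown above.
EXPECTED_FILES = ("test_X.xlsx", "test_y.xlsx", "train.xlsx")


def missing_data_files(data_dir):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is the project root.
    missing = missing_data_files("data")
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All data files found.")
```

An empty result means all three files are where the scripts expect them.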

To run the tests, install the test dependencies and then run pytest:

pip install -e ".[test]" --config-settings editable_mode=compat
pytest

# After you have trained the lgb model
tsl lgb imp scripts/lgb_model/lgb.dill

For M1/M2/M3 Mac Users: If you encounter issues with installing LightGBM, create a conda virtual environment, and install it using conda:
```
conda install -c conda-forge lightgbm
```
File Not Found Error: If you see an error like "No such file or directory: '../../data/train.xlsx'" even though the files are in the data directory, make sure you are running the script from its own directory (e.g., .../spark_learning/scripts/lgb_model) rather than from the project root (e.g., .../spark_learning). The path is relative, so it is resolved against the current working directory. VSCode uses the project directory as the working directory by default, so run the script from the command line instead.
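
The error above comes from resolving a relative path against the working directory. One way to avoid it is to anchor the path at the script's own location. This is a hedged sketch, not how the project's scripts are currently written: the function `resolve_data_file` and its `levels_up` parameter are hypothetical, and it assumes the script sits two directories below the project root (as scripts/lgb_model does).

```python
from pathlib import Path


def resolve_data_file(script_path, name, levels_up=2):
    """Return <project root>/data/<name>, anchored at the script's location.

    script_path: path of the running script (typically __file__).
    levels_up:   how many directories the script's folder sits below the
                 project root (2 for scripts/lgb_model). This is an assumption
                 about the layout, not something the project defines.
    """
    script_dir = Path(script_path).resolve().parent
    # parents[0] is one level up from the script's folder, parents[1] two levels.
    project_root = script_dir.parents[levels_up - 1]
    return project_root / "data" / name
```

Inside a script in scripts/lgb_model, `resolve_data_file(__file__, "train.xlsx")` would point at the same file as '../../data/train.xlsx', regardless of the directory the script is launched from.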
