
# MELT: Mobile Evaluation of Language Transformers

This is the catch-all repository for the codebase of our on-device evaluation of LLMs.

## Components

### Structure

├── README.md
├── blade/  # PhoneLab infrastructure for automated evaluation
├── frameworks/  # LLM frameworks supported by MELT
├── jetsonlab/   # JetsonLab infrastructure for automated evaluation
├── melt_models/ # HF models
├── melt_models_converted/ # Converted/quantized models for each backend
└── src/          # Custom code for model conversion, prompt analysis, model evaluation, and result parsing.
    ├── configs/  # Configuration per model
    ├── model_evaluation/ # Code for the model evaluation on datasets
    ├── models/   # Model conversion logic
    ├── parsers/  # Results parsing logic
    └── prompts/  # Prompt analysis logic

### Organisation

The codebase is structured with git submodules to maintain some level of separation between components. To check everything out, please run:

git submodule update --init --recursive

This command will check out the latest working version of each component, recursively.
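Alternatively, if you are cloning the repository from scratch, the submodules can be fetched in the same step (the repository URL below is a placeholder):

```sh
# Clone the repository and all of its submodules in one step.
# Replace <repo-url> with the actual URL of this repository.
git clone --recurse-submodules <repo-url>
```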

## How to run

The general workflow for running an experiment is as follows:

  1. Go to frameworks/MLC/mlc-llm or frameworks/llama.cpp/llama.cpp and compile each framework. Please see the documentation (#1, #2) for more.
  2. Go to src/models and download and convert the models. Please see the documentation there for more.
  3. After you have built the models, you need to build the apps that will be installed on the phones. To do so, please follow the rest of the documentation in (#1, #2).
  4. Go to blade/experiments/ and follow the documentation there. You need to install the applications, transfer the models to the local directories, and then run the automated scripts (a minimal manual sketch of these device-side steps is shown after this list).
  5. If the experiment has run successfully, the blade/experiment_outputs/ directory will be populated. You can run the notebooks in blade/experiments/ to analyse the results.
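For orientation, the sketch below shows roughly what steps 2 and 4 amount to when done by hand with standard tools. All model names, package names, and device paths are illustrative assumptions, not MELT defaults; the scripts in src/models and blade/experiments/ remain the supported way to do this.

```sh
# Step 2 (sketch): fetch a model from Hugging Face before conversion.
# The model id and output directory here are examples only.
huggingface-cli download TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
    --local-dir melt_models/TinyLlama-1.1B-Chat-v1.0
# Conversion/quantisation for each backend is then handled by the scripts in src/models.

# Step 4 (sketch): manual equivalent of what the automated scripts do via adb.
adb install path/to/app.apk                                # install the benchmark app
adb push melt_models_converted/<model> /data/local/tmp/    # copy a converted model to the device
adb pull /sdcard/results/ blade/experiment_outputs/        # collect results afterwards (path is illustrative)
```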

To run on the Jetson platform, you need to build each framework with the appropriate script (see #1). See also this documentation for more.
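As a rough illustration only, a CUDA-enabled build of llama.cpp on a Jetson typically looks like the following; the exact flags depend on the llama.cpp version, and the per-framework scripts mentioned above should be preferred:

```sh
# Illustrative CUDA build of llama.cpp on a Jetson (flag names vary by version;
# older releases use -DLLAMA_CUBLAS=ON instead of -DGGML_CUDA=ON).
cd frameworks/llama.cpp/llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```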

## Further documentation

Additional documentation on how to run is provided in each of the subdirectories, as separate README files.

## Supported frameworks

## Supported infrastructure backends

## Authors/Maintainers

## Citation

If you found this repo useful, please cite our paper "MELTing point: Mobile Evaluation of Language Transformers":

@article{laskaridis2024melting,
  title={MELTing point: Mobile Evaluation of Language Transformers},
  author={Laskaridis, Stefanos and Katevas, Kleomenis and Minto, Lorenzo and Haddadi, Hamed},
  journal={arXiv preprint arXiv:2403.12844},
  year={2024}
}