ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning

Overview

ChemAgent uses a self-updating memory system to improve the performance of large language models on complex scientific tasks, with a particular focus on chemistry.

Key Features

Self-updating Memory System: Continuously improves model performance
Specialized for Chemistry: Tailored for chemical reasoning tasks
Modular Architecture: Includes task splitting, execution, association, and reflection modules

Project Structure

XAgent/agent/: Contains prompts and code for functional modules
dev_test.py: Execution code for memory pool construction
assets/config.yml: Configuration and module activation management
output/: Experimental results for GPT-4 and GPT-3.5 across four datasets
memory/: Plan Memory and Execute Memory for the four datasets
dataset/: Experimental datasets from SciBench

Getting Started

Installation: Install the Python dependencies listed in requirements.txt, for example with pip install -r requirements.txt
Configuration: Modify assets/config.yml to suit your needs
Running Experiments: Use dev_test.py with appropriate parameters

For detailed usage instructions, see the "How to run" and "How to dev" sections below.

How to run

Run on dataset

python dev_test.py --tool python --mode test --list_source <put_the_datasetname_here>

You can choose from the following datasets located in the dataset folder: atkins, chemmc, matter, or quan.
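To evaluate all four datasets in one pass, the command above can be wrapped in a small script. This is a minimal sketch, not part of the repository: the helper names build_command and run_all are hypothetical, and it assumes the script is started from the repository root so that dev_test.py resolves. Only the flags shown in this README are used.

```python
import subprocess
import sys

# Dataset names listed in this README (folders under dataset/).
DATASETS = ("atkins", "chemmc", "matter", "quan")


def build_command(dataset, mode="test"):
    """Return the argv list for one dev_test.py run (hypothetical helper)."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if mode not in ("test", "dev"):
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        sys.executable, "dev_test.py",
        "--tool", "python",
        "--mode", mode,
        "--list_source", dataset,
    ]


def run_all(mode="test"):
    """Run dev_test.py once per dataset, stopping at the first failure."""
    for name in DATASETS:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(name, mode), check=True)
```

Calling run_all() launches the four runs sequentially. Rejecting unknown dataset names up front avoids starting a long run with a mistyped argument.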

How to dev

Before running tasks with this framework, you must first construct the memory pools, following the process described in the paper.

Add Few-shot Examples: In the XAgent/agent/simple_agent/prompt.py file, select two examples from fewshot_examples and incorporate them into the model prompt.
Update Configuration File: Modify the assets/config.yml file to adjust the configuration according to your requirements.

Run Development Tests: Execute the development test mode using the following command:

python dev_test.py --tool python --mode dev --list_source <put_the_datasetname_here>
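Because test mode depends on memory built in dev mode, it can help to check that the memory pools exist before starting an evaluation. The sketch below is an illustration under stated assumptions, not the framework's own logic: memory_ready is a hypothetical helper, and it treats a pool as constructed once its directory contains at least one entry. The two paths are the ones configured as save_dir and plan_save_dir in assets/config.yml.

```python
from pathlib import Path


def memory_ready(save_dir, plan_save_dir):
    """Return True if both memory pool directories exist and are non-empty.

    Assumption: a pool counts as "constructed" once its directory holds
    at least one entry; the framework itself may use a stricter criterion.
    """
    for pool in (Path(save_dir), Path(plan_save_dir)):
        if not pool.is_dir() or not any(pool.iterdir()):
            return False
    return True
```

For example, memory_ready("./memory/exec_memory/storage_matter_gpt4", "./memory/plan_memory/plan_storage_matter_gpt4") returning False suggests running the dev command above for that dataset first.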

Configuration

The assets/config.yml file allows you to customize various settings for the ChemAgent framework:

Model Settings

default_completion_kwargs:
  model: gpt-3.5-turbo-16k
  eva_model: gpt-4
  temperature: 0.2
  request_timeout: 60

To change other settings:

save_dir: "./memory/exec_memory/storage_matter_gpt4"
plan_save_dir: "./memory/plan_memory/plan_storage_matter_gpt4"
score: False
refine: False
image: False

The configuration parameters control various aspects of the system:

Memory Pools:
- plan_save_dir: Specifies the memory pool for Plan Memory
- save_dir: Specifies the memory pool for Execution Memory
Knowledge Memory:
- image: Controls the activation of Knowledge Memory
Task Execution:
- refine: Determines if the system should refine the task strategy upon execution failure
Evaluation:
- score: Decides if the evaluation module is enabled
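When switching datasets or models, both memory pool paths need to change together. The following sketch generates the two entries from a dataset name and a model tag. It is only an illustration: memory_dirs and as_yaml_lines are hypothetical helpers, and the naming pattern is inferred from the single matter/gpt4 example shown above, so the actual folder names under memory/ should be checked before relying on it.

```python
def memory_dirs(dataset, model_tag):
    """Build save_dir and plan_save_dir following the README's example paths.

    Assumption: other pools follow the same naming pattern as the
    documented matter/gpt4 pair; check the memory/ folder for real names.
    """
    return {
        "save_dir": f"./memory/exec_memory/storage_{dataset}_{model_tag}",
        "plan_save_dir": f"./memory/plan_memory/plan_storage_{dataset}_{model_tag}",
    }


def as_yaml_lines(settings):
    """Render a flat dict of strings as quoted `key: "value"` lines."""
    return "\n".join(f'{key}: "{value}"' for key, value in settings.items())
```

Printing as_yaml_lines(memory_dirs("matter", "gpt4")) reproduces the two path lines from the example configuration, which can then be pasted into assets/config.yml.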
