Experiments with conceptual engineering using an LLM-based implementation of Jennifer Nado's classification procedures


Experiments using prompt programming of large language models to implement targets of conceptual engineering as zero-shot chain-of-thought classifiers.



This repository contains code for the paper "Conceptual Engineering Using Large Language Models", accepted for presentation at the 5th Conference on Philosophy of Artificial Intelligence, Erlangen, 15-16 December 2023.


  • Python 3.11 or higher.
  • A valid OpenAI API key.


Set the ‘OPENAI_API_KEY’ environment variable to your OpenAI account key, clone this repository, change directory into the repository directory, and then execute the following commands:

python -m venv env
source env/bin/activate
pip install -r requirements.txt
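
Before running the notebooks, it can help to confirm that the two requirements above are met. The following is a minimal sketch of such a check; the function name `preflight` is hypothetical and is not part of this repository.

```python
import os
import sys


def preflight(environ=None, version=None):
    """Return a list of setup problems; an empty list means the setup looks ready.

    `environ` and `version` default to the running process's environment and
    Python version, and can be overridden for testing.
    """
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)

    problems = []
    if version < (3, 11):
        problems.append(
            f"Python 3.11 or higher is required, found {version[0]}.{version[1]}"
        )
    # An empty string counts as unset.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `preflight()` with no arguments inspects the current interpreter and environment and returns any problems as plain strings.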




Conceptual engineering (CE) is a philosophical methodology concerned with the assessment and improvement of concepts [1]. Koch, Löhr and Pinder have surveyed recent work on the theory of CE, discussing different theories defining the targets of CE, i.e., "what conceptual engineers are (or should be) trying to engineer" [2]. In one such theory, Nado proposes that the targets are classification procedures, defined as abstract 'recipes' which sort entities "into an 'in'-group and an 'out'-group" [3]. Our work builds on Nado's idea by defining a method for implementing classification procedures consistent with this definition.

A large language model (LLM) is a probabilistic model trained on a natural language corpus that, given a sequence of tokens from a vocabulary occurring in the corpus, generates a continuation of the input sequence. LLMs exhibit remarkable capabilities for natural language processing and generation [4]. Our work uses prompt engineering [5] of LLMs to implement classification procedures.

A knowledge graph represents knowledge using nodes for entities and edges for relations [6]. Knowledge graphs are key information infrastructure for many Web applications [7]. Our work leverages knowledge graphs as a source of entities used to evaluate classification procedures.


Figure 1 illustrates our method for implementing classification procedures as zero-shot chain-of-thought [8] classifiers. Given a concept's name and intensional definition and an entity's name and description, we prompt an LLM to generate a rationale arguing for or against the entity as an element of the concept's extension, followed by a final 'positive' or 'negative' answer.

Figure 1. A classification procedure using the 24 August 2006 version of the IAU definition of PLANET, implemented as a zero-shot chain-of-thought classifier, and being applied to the description of the entity DENIS-P J08230313-491201 b.
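
The prompt-and-parse pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the template wording, the `Answer:` convention, and the names `build_prompt` and `parse_response` are all hypothetical, and the call to the LLM itself is omitted.

```python
import re

# Hypothetical prompt wording; the prompt used in the paper may differ.
PROMPT_TEMPLATE = (
    "Concept: {concept}\n"
    "Definition: {definition}\n"
    "Entity: {entity}\n"
    "Description: {description}\n\n"
    "Using only the definition above, reason step by step about whether the "
    "entity falls under the concept. Finish with a final line of the form "
    "'Answer: positive' or 'Answer: negative'."
)

# Assumed output convention: the final answer appears as 'Answer: <label>'.
_ANSWER = re.compile(r"answer\s*:\s*(positive|negative)\b", re.IGNORECASE)


def build_prompt(concept, definition, entity, description):
    """Fill the zero-shot template with one concept and one entity."""
    return PROMPT_TEMPLATE.format(
        concept=concept,
        definition=definition,
        entity=entity,
        description=description,
    )


def parse_response(text):
    """Split a model response into (rationale, answer).

    The last 'Answer: ...' occurrence is taken as the final answer, and
    everything before it is treated as the rationale.
    """
    matches = list(_ANSWER.finditer(text))
    if not matches:
        raise ValueError("no final 'positive' or 'negative' answer found")
    last = matches[-1]
    return text[: last.start()].strip(), last.group(1).lower()
```

Taking the last match rather than the first guards against rationales that mention a provisional answer before the final one.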


To evaluate classification procedures built using this method, we sample positive and negative examples of a concept from the Wikidata collaborative knowledge graph [9], retrieving for each entity a summary of its Wikipedia page to use as its description. We then apply the classification procedure for a given definition of the concept to each example and compute a confusion matrix from the classifications, from which we derive performance metrics for the classification procedure. We review false positives and false negatives to determine whether a given error arises from the concept's definition or from the entity's description. All concept definitions are used verbatim. For the LLM, we use GPT-4 [10] with a temperature setting of 0.1.
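
The confusion-matrix step can be sketched in a few lines. This is a minimal illustration, assuming labels are the strings 'positive' and 'negative'; the function names are hypothetical, and the metrics shown (accuracy, precision, recall, F1) are common choices rather than necessarily the ones reported in the paper.

```python
from collections import Counter


def confusion_matrix(gold, predicted):
    """Count (gold, predicted) label pairs for a binary classifier."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    counts = Counter(zip(gold, predicted))
    return {
        "tp": counts[("positive", "positive")],
        "fn": counts[("positive", "negative")],
        "fp": counts[("negative", "positive")],
        "tn": counts[("negative", "negative")],
    }


def metrics(cm):
    """Derive standard performance metrics from a confusion matrix."""

    def ratio(numerator, denominator):
        # Report 0.0 instead of dividing by zero on degenerate inputs.
        return numerator / denominator if denominator else 0.0

    precision = ratio(cm["tp"], cm["tp"] + cm["fp"])
    recall = ratio(cm["tp"], cm["tp"] + cm["fn"])
    return {
        "accuracy": ratio(cm["tp"] + cm["tn"], sum(cm.values())),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


def errors(entities, gold, predicted):
    """List the misclassified entities that would be reviewed by hand."""
    return [e for e, g, p in zip(entities, gold, predicted) if g != p]
```

The `errors` helper returns the false positives and false negatives together, which is the set that the review step examines.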


  • A Python file defining a Python class ClassificationProcedure, which is used in the accompanying notebooks to define classification procedures for a given concept.
  • planet_experiment.ipynb: A Python notebook that evaluates three definitions for PLANET: one from the Oxford English Dictionary (OED) [11] and two from the 2006 International Astronomical Union (IAU) General Assembly [12,13]. We sampled 50 positive examples that are instances (P31) of planet (Q634), and 50 negative examples that are instances of substellar object (Q3132741), but not of planet.
  • woman_experiment.ipynb: A Python notebook that evaluates three definitions for WOMAN: one from the OED [14], the definition provided in Haslanger’s 2000 paper [15], and one from the Homosaurus vocabulary of LGBTQ+ terms [16,17]. We sampled 50 positive examples whose sex or gender (P21) is either female (Q6581072) or trans woman (Q1052281), and 50 negative examples whose sex or gender is either male (Q6581097), non-binary (Q48270), or trans man (Q2449503).
  • planet_experiment.json: A JSON file containing the experimental results of the evaluation implemented in planet_experiment.ipynb.
  • woman_experiment.json: A JSON file containing the experimental results of the evaluation implemented in woman_experiment.ipynb.
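
The sampling criteria for the PLANET experiment can be expressed as SPARQL queries against Wikidata. The sketch below only builds the query text and does not contact any endpoint; the function name `instance_query` is hypothetical, and the queries actually used in the notebooks may differ (for example, in how they randomize the sample, since a plain LIMIT does not sample at random).

```python
def instance_query(include_class, exclude_class=None, limit=50):
    """Build a SPARQL query for entities that are instances (P31) of a class.

    If `exclude_class` is given, entities that are also instances of that
    class are filtered out. The `wd:` and `wdt:` prefixes are predefined by
    the Wikidata Query Service, so they are not declared here.
    """
    lines = [
        "SELECT DISTINCT ?entity WHERE {",
        f"  ?entity wdt:P31 wd:{include_class} .",
    ]
    if exclude_class is not None:
        lines.append(
            f"  FILTER NOT EXISTS {{ ?entity wdt:P31 wd:{exclude_class} . }}"
        )
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Positive examples: instances of planet (Q634).
positive_query = instance_query("Q634")

# Negative examples: instances of substellar object (Q3132741), but not of planet.
negative_query = instance_query("Q3132741", exclude_class="Q634")
```

The same pattern extends to the WOMAN experiment by swapping the property for sex or gender (P21) and the class identifiers for the values listed above.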


      title={Conceptual Engineering Using Large Language Models}, 
      author={Bradley P. Allen},


