One potentially promising application of language models is using them to create natural language interfaces to software libraries. Many language models can already generate code directly, using functions that are built into popular languages or available in open-source and widely used libraries. But it could also be interesting to use these models to build natural language interfaces to proprietary/private libraries, which are unlikely to appear in the models' training data. There are a few reasons why this might be compelling:
- Making applications easier to use. Most obviously, these sorts of interfaces could make applications easier to use: rather than figuring out how to do something through a potentially unfamiliar graphical interface or formal language, users can simply say what they want the application to do in natural language.
- Improving the tradeoff between flexibility and usability. Natural language interfaces to software libraries could enable applications that strike a better balance between flexibility and usability than was previously possible. Giving users more ability to customize what an application does has often meant building more cluttered, complicated interfaces, which can be aesthetically displeasing and make the application harder and more time-consuming to learn. Powerful language models could enable natural language interfaces that are simple, clean, and easy to learn, while still letting users leverage the full power of the software underlying an application.
- Building a competitive advantage. An obvious question when building an application on top of powerful language models is how to differentiate yourself from both the companies that provide the models and the other companies building on top of them. Building proprietary software libraries that are specific to your application, and using language models as interfaces to those libraries, might be one way to achieve some initial differentiation. In addition, if current language models turn out not to be useful enough for a particular application (e.g. because they are not reliable enough), having already built a library of useful functions may let you pivot to a more traditional application more quickly than if you had spent most of your time prompt engineering, trying to coax models into performing tasks directly and reliably.
This repository contains a short program I wrote that uses the OpenAI API to test how viable this approach is with current language models. The program takes as input a description of a computational task the user wants to complete and is supposed to output a program that leverages private software libraries (and potentially publicly available libraries as well) to complete that task. It works as follows (illustrative sketches of each step appear after the list):
1. When the user runs the program from the command line, they pass the prompt, a path to a file that provides high-level information on each of the private libraries that are available (including the library name, a brief description of what it can be used for, and a file path that points to more detailed documentation for the library), and a path to a file where a log of model prompts and responses should be written.
2. The program makes a call to the OpenAI GPT API and asks the model to identify which of the libraries described in the high-level documentation might be useful and to respond with the file paths to those libraries surrounded by tags.
3. For each library the model identified as potentially useful, the program opens the more detailed documentation and makes a call to the OpenAI GPT API asking the model to identify which of the functions in that library might be useful for solving the task posed by the user. In this way, the program incrementally builds a table of functions from the private libraries that might be useful for the user's task.
4. Finally, the program makes a call to the OpenAI GPT API and asks it to write a program that accomplishes the user's task leveraging the functions identified in step 3.
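To make step 1 concrete, here is a minimal sketch in Python of how the command-line arguments and documentation files might be handled. This is not the repository's actual code; the argument names and helper function are assumptions made for illustration.

```python
# Hypothetical sketch of step 1: parse the command-line arguments described
# above and read the high-level library documentation into a string.
import argparse


def parse_args():
    parser = argparse.ArgumentParser(
        description="Generate a program that uses private libraries to complete a task."
    )
    # Argument names are illustrative assumptions, not the repository's actual CLI.
    parser.add_argument("prompt", help="natural language description of the task")
    parser.add_argument("library_overview_path", help="file with high-level info on each private library")
    parser.add_argument("log_path", help="file where model prompts and responses are logged")
    return parser.parse_args()


def load_text(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```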
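A sketch of step 2 might look like the following, assuming the openai v1 Python client (with an OPENAI_API_KEY in the environment); the model name and the <library>...</library> tag format are illustrative assumptions.

```python
# Hypothetical sketch of step 2: ask the model which private libraries look
# useful for the task and parse the documentation paths it returns in tags.
import re

from openai import OpenAI


def identify_useful_libraries(client: OpenAI, task_prompt: str, overview_text: str) -> list[str]:
    prompt = (
        f"Task: {task_prompt}\n\n"
        "Here is high-level documentation for the available private libraries:\n"
        f"{overview_text}\n\n"
        "Respond with the detailed-documentation file path of each library that "
        "might be useful, each wrapped in <library>...</library> tags."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    reply = response.choices[0].message.content
    return re.findall(r"<library>(.*?)</library>", reply, flags=re.DOTALL)
```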
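Step 3 could then, under the same assumptions, build up the table of candidate functions one library at a time, as in this sketch (the <function>...</function> tag format is again an assumption):

```python
# Hypothetical sketch of step 3: for each library flagged as useful, read its
# detailed documentation and ask the model which of its functions might help,
# accumulating a table (here, a dict keyed by documentation path).
import re

from openai import OpenAI


def build_function_table(client: OpenAI, task_prompt: str, library_doc_paths: list[str]) -> dict[str, list[str]]:
    function_table: dict[str, list[str]] = {}
    for doc_path in library_doc_paths:
        with open(doc_path, "r", encoding="utf-8") as f:
            detailed_docs = f.read()
        prompt = (
            f"Task: {task_prompt}\n\n"
            f"Detailed documentation for one private library:\n{detailed_docs}\n\n"
            "List the functions from this library that might be useful for the task, "
            "each wrapped in <function>...</function> tags."
        )
        response = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        reply = response.choices[0].message.content
        function_table[doc_path] = re.findall(r"<function>(.*?)</function>", reply, flags=re.DOTALL)
    return function_table
```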
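Finally, a sketch of step 4 might assemble the collected functions into a single code-generation prompt; the prompt wording here is an assumption:

```python
# Hypothetical sketch of step 4: ask the model to write a program that uses the
# candidate functions gathered in step 3.
from openai import OpenAI


def generate_program(client: OpenAI, task_prompt: str, function_table: dict[str, list[str]]) -> str:
    function_summary = "\n\n".join(
        f"From the library documented at {path}:\n" + "\n".join(functions)
        for path, functions in function_table.items()
    )
    prompt = (
        f"Task: {task_prompt}\n\n"
        "Write a complete program that accomplishes this task, using the following "
        f"private-library functions where appropriate:\n{function_summary}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Chained together, the sketches would roughly amount to: create a client, identify useful libraries, build the function table, and generate the final program (logging each prompt and response to the log file along the way).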
The purpose of first asking the model to identify which libraries might be useful and then asking the model to identify useful functions in each of those libraries separately is to make sure the prompts submitted to the API stay within the model’s token limit (and leave sufficient room for the response). If model context windows become wide enough, it may eventually be possible to submit the user’s prompt along with detailed documentation for all the private libraries in one API call (though that may still be prohibitively costly for some applications).
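For intuition about the token budget, a check along the following lines could estimate whether a single prompt would fit; tiktoken is a real tokenizer library, but the specific context limit and response reservation here are illustrative assumptions that depend on the model.

```python
# Hypothetical sketch of a token-budget check: the two-stage design above is a
# way of keeping each individual prompt under a limit like this one.
import tiktoken


def fits_in_context(prompt_text: str, model: str = "gpt-4",
                    context_limit: int = 8192, reserved_for_response: int = 1500) -> bool:
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(prompt_text)) + reserved_for_response <= context_limit
```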
I created some dummy/mock documentation for a few libraries to test out the program, and from what I can tell it seems to work pretty well (at least for an initial prototype). The repository contains a file (revenue_regression_log.txt) that shows the complete history of prompts sent to the API and model responses for a particular user prompt/task.