Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp.
The project started as a Go port of https://github.com/jxnl/instructor/, but evolved into a more general-purpose library.
Structured maps data described by an arbitrary JSON schema to an arbitrary Go struct (or just plain JSON).
It also features a language-agnostic HTTP server that you can set up in front of llama.cpp.
It is focused on llama.cpp. Support for other vendor APIs (like OpenAI or Anthropic) might be added in the future.
- Language-agnostic HTTP server
- Go library with a simple API
- Model agnostic
- Focused on llama.cpp
Download the latest release from the releases page.
Alternatively you can clone the repository and build it yourself:
git clone git@github.com:distantmagic/structured.git
cd structured
go build
sequenceDiagram
You->>Structured: JSON schema + data
Structured->>llama.cpp: extract
llama.cpp->>Structured: extracted entity
Structured->>Structured: validates extracted entity (double check)
Structured-->>llama.cpp: retry if validation fails
Structured->>You: JSON matching your schema
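The validate-and-retry loop from the diagram can be sketched roughly as below. This is not Structured's actual implementation: extractFunc, validateFunc, requireKeys and extractWithRetry are hypothetical names, the toy validator only checks required keys, and the sketch assumes that the retry limit counts retries after the first attempt.

```go
package main

import (
	"errors"
	"fmt"
)

// extractFunc stands in for a call to llama.cpp (hypothetical).
type extractFunc func(schema map[string]any, data string) (map[string]any, error)

// validateFunc stands in for JSON schema validation (hypothetical).
type validateFunc func(schema, entity map[string]any) error

// extractWithRetry calls extract, validates the result and retries on failure.
// Assumption: maxRetries counts retries after the first attempt.
func extractWithRetry(extract extractFunc, validate validateFunc, schema map[string]any, data string, maxRetries int) (map[string]any, error) {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		entity, err := extract(schema, data)
		if err == nil {
			err = validate(schema, entity)
		}
		if err == nil {
			return entity, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("extraction failed after %d attempts: %w", maxRetries+1, lastErr)
}

// requireKeys is a toy validator: it only checks that "required" keys exist.
// It expects "required" to be a []string, as in the literal schema below.
func requireKeys(schema, entity map[string]any) error {
	required, _ := schema["required"].([]string)
	for _, key := range required {
		if _, ok := entity[key]; !ok {
			return errors.New("missing required property: " + key)
		}
	}
	return nil
}

func main() {
	schema := map[string]any{"type": "object", "required": []string{"hello"}}
	calls := 0
	// Fake model: the first answer is invalid, the second one is valid.
	fake := func(_ map[string]any, _ string) (map[string]any, error) {
		calls++
		if calls == 1 {
			return map[string]any{}, nil
		}
		return map[string]any{"hello": "world"}, nil
	}
	entity, err := extractWithRetry(fake, requireKeys, schema, "Say 'world'", 3)
	fmt.Println(entity, err, calls) // map[hello:world] <nil> 2
}
```

Validating on the caller's side means a malformed model answer never reaches the user; the cost is at most one extra model call per failed attempt.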
Start a server and point it to your local llama.cpp instance:
./structured \
    --llamacpp-host 127.0.0.1 \
    --llamacpp-port 8081 \
    --port 8080
Structured server connects to llama.cpp to extract the data.
Include schema and data in your POST body.
The server will respond with JSON matching your schema:
Request:
POST http://127.0.0.1:8080/extract/entity
{
  "schema": {
    "type": "object",
    "properties": {
      "hello": {
        "type": "string"
      }
    },
    "required": ["hello"]
  },
  "data": "Say 'world'"
}
Response:
{
  "hello": "world"
}
Instead of using the HTTP API, you can use the Go library directly.
The API may change until all features are implemented.
Point it to your local llama.cpp instance:
import (
	"fmt"
	"net/http"
	"github.com/distantmagic/paddler/llamacpp"
	"github.com/distantmagic/paddler/netcfg"
	"github.com/distantmagic/structured/structured"
)
// entityExtractor talks to the llama.cpp instance listening on 127.0.0.1:8081.
var entityExtractor = &structured.EntityExtractor{
	LlamaCppClient: &llamacpp.LlamaCppClient{
		HttpClient: http.DefaultClient,
		LlamaCppConfiguration: &llamacpp.LlamaCppConfiguration{
			HttpAddress: &netcfg.HttpAddressConfiguration{
				Host:   "127.0.0.1",
				Port:   8081,
				Scheme: "http",
			},
		},
	},
	MaxRetries: 3,
}
After initializing the entity extractor, you can extract structured data from a string by providing a JSON schema and the string:
import "github.com/distantmagic/structured/structured"
// Results (or errors) are delivered through this channel.
responseChannel := make(chan structured.EntityExtractorResult)
go entityExtractor.ExtractFromString(
	responseChannel,
	map[string]any{
		"type": "object",
		"properties": map[string]any{
			"name": map[string]string{
				"type": "string",
			},
			"surname": map[string]string{
				"type": "string",
			},
			"age": map[string]string{
				"description": "Age in years.",
				"type":        "integer",
			},
		},
	},
	"I am John Doe - living for 40 years and I still like to play chess.",
)
for result := range responseChannel {
	if result.Error != nil {
		panic(result.Error)
	}
	// e.g. map[age:40 name:John surname:Doe]
	fmt.Print(result.Result)
}
Once you obtain the result you can map it to an arbitrary struct:
import (
	"fmt"
	"github.com/distantmagic/structured/structured"
)
type myTestPerson struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     int    `json:"age"`
}
func DoUnmarshalsToStruct(result structured.EntityExtractorResult) {
	var person myTestPerson
	// Copy the extracted fields into the struct, matching on the json tags.
	err := structured.UnmarshalToStruct(result, &person)
	if err != nil {
		panic(err)
	}
	fmt.Println(person.Name)    // John
	fmt.Println(person.Surname) // Doe
}
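For intuition, mapping a generic result onto a struct can be done by round-tripping through JSON with the standard library. The sketch below shows that general technique only; it is not necessarily how UnmarshalToStruct works internally, and mapToStruct and person are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// person mirrors the struct from the example above.
type person struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     int    `json:"age"`
}

// mapToStruct copies a generic map into target by encoding it to JSON
// and decoding it again, so fields are matched through the json tags.
func mapToStruct(entity map[string]any, target any) error {
	encoded, err := json.Marshal(entity)
	if err != nil {
		return err
	}
	return json.Unmarshal(encoded, target)
}

func main() {
	entity := map[string]any{"name": "John", "surname": "Doe", "age": 40}
	var p person
	if err := mapToStruct(entity, &p); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p) // {Name:John Surname:Doe Age:40}
}
```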
Paddler - (work in progress) llama.cpp load balancer, supervisor and request queue