Ask questions about a PDF file using Retrieval-Augmented Generation with pgvector and OpenAI.
- Node.js LTS (>v20.9.0) - We recommend using Volta as a version manager.
- A Heroku account
- Heroku CLI
- PostgreSQL psql client
- AWS Command Line Interface
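Before continuing, it can help to confirm that the command-line tools listed above are on your `PATH`. The following is a minimal sketch, not part of the project: `missing_tools` is a hypothetical helper name, and it only checks that each command exists, not its version.

```shell
# Print the name of every command in the argument list that is not found on PATH.
# (Hypothetical helper for illustration; not provided by the project.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: check the prerequisites listed above.
# missing_tools node npm heroku psql aws
```

An empty result means every tool was found. Run `node --version` separately to confirm the Node.js version.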
Install dependencies by running:
npm install
Create a Heroku application with:
heroku create <app-name>
Add the Heroku Postgres add-on, which provides pgvector:
heroku addons:create heroku-postgresql:essential-0
Add the Bucketeer add-on:
heroku addons:create bucketeer:hobbyist
Once the PostgreSQL database is created, set up the database schema with:
heroku pg:psql < data/database.sql
Set up the Bucketeer public access policy. Replace <bucket-name> and run:
aws s3api put-public-access-block --bucket <bucket-name> --public-access-block-configuration BlockPublicAcls=FALSE,IgnorePublicAcls=FALSE,BlockPublicPolicy=FALSE,RestrictPublicBuckets=FALSE
Create a policy.json file with the following content, replacing <bucket-name>:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicReadGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": ["s3:GetObject"],
"Resource": ["arn:aws:s3:::<bucket-name>/public/*"]
}
]
}
Then run the following, replacing <bucket-name>:
aws s3api put-bucket-policy --bucket <bucket-name> --policy file://policy.json
Note: To run the aws commands, you need to configure your credentials first by running:
aws configure
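Instead of editing policy.json by hand, the file can be generated from the bucket name. This is a sketch under stated assumptions: `write_policy` is a hypothetical helper, and it assumes the bucket name is available in a shell variable (for example, read from the `BUCKETEER_BUCKET_NAME` config var). The policy body is the same one shown above.

```shell
# Write the public-read bucket policy for the given bucket to a file.
# Usage: write_policy <bucket-name> [output-file]   (default output: policy.json)
# (Hypothetical helper for illustration; not provided by the project.)
write_policy() {
  bucket="$1"
  out="${2:-policy.json}"
  cat > "$out" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::${bucket}/public/*"]
    }
  ]
}
EOF
}

# Example, assuming BUCKETEER_BUCKET_NAME is set in the current shell:
# write_policy "$BUCKETEER_BUCKET_NAME"
# aws s3api put-bucket-policy --bucket "$BUCKETEER_BUCKET_NAME" --policy file://policy.json
```

Generating the file avoids leaving a stray `<bucket-name>` placeholder in the policy, which would make the policy apply to the wrong resource.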
Create a .env file with the following information:
BUCKETEER_AWS_ACCESS_KEY_ID=<value>
BUCKETEER_AWS_REGION=us-east-1
BUCKETEER_AWS_SECRET_ACCESS_KEY=<value>
BUCKETEER_BUCKET_NAME=<value>
DATABASE_URL=<value>
OPENAI_API_KEY=<value>
Note: BUCKETEER_* and DATABASE_URL can be fetched from Heroku using:
heroku config --shell > .env
The OPENAI_API_KEY can be set on the Heroku application using:
heroku config:set OPENAI_API_KEY=<value>
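Since the local server needs all of the variables listed above, a quick check of the .env file can catch a missing or empty value before starting the app. The following is a minimal sketch: `missing_env_keys` is a hypothetical helper, and it only tests that each key is assigned at least one character, not that the value is valid.

```shell
# Print each required variable that has no non-empty assignment in the env file.
# Usage: missing_env_keys [env-file]   (default: .env)
# (Hypothetical helper for illustration; not provided by the project.)
missing_env_keys() {
  env_file="${1:-.env}"
  for key in BUCKETEER_AWS_ACCESS_KEY_ID BUCKETEER_AWS_REGION \
             BUCKETEER_AWS_SECRET_ACCESS_KEY BUCKETEER_BUCKET_NAME \
             DATABASE_URL OPENAI_API_KEY; do
    # A key counts as present when a line starts with KEY= followed by any character.
    grep -q "^${key}=." "$env_file" 2>/dev/null || printf '%s\n' "$key"
  done
}

# Example:
# missing_env_keys .env
```

An empty result means every key has a value. Note that `heroku config --shell > .env` overwrites the file, so any variable that is not set on the Heroku application has to be added to .env again afterwards.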
Run the project locally with:
npm run dev
To deploy to Heroku manually, run:
git push heroku main