A bit like django-haystack, but everything lives in PostgreSQL and is accessible via the Django ORM, using PostgreSQL's fulltext search capabilities. The goal is to ease setup and maintenance for small and medium-sized projects, without dependencies on search technology like Elasticsearch, Solr or Whoosh.
During conception, I thought about developing a backend for django-haystack, but decided against it, in order to develop from the ground up and keep things as simple as possible. The project could still provide a haystack backend one day, but it was just not my priority.
- Searchindex in PostgreSQL
- No dependencies besides Django and PostgreSQL
- contrib.djangocms, for easy indexing of django-cms sites
Describe, index, search.
Default value, simplest possible configuration:
POSTGRES_SEARCHINDEX = {
    "default": {},
}
Example for a multilanguage setup:
POSTGRES_SEARCHINDEX = {
    "de": {
        "kwargs": {
            "language": "de",
        }
    },
    "fr": {
        "kwargs": {
            "language": "fr",
        }
    },
}
More complex configurations could include Django's SITE_ID or other relevant information in the search index key and kwargs.
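As a rough sketch of such a setup, the dictionary can also be built programmatically, with one index per language and site. The key format and the site_id kwarg are assumptions for illustration only; how kwargs other than language are used depends on your own index sources.

```python
# settings.py (sketch): one index per language/site combination.
# The "site_id" kwarg and the "<language>-<site_id>" key format are
# hypothetical; only "language" appears in the examples above.
SEARCH_LANGUAGES = ["de", "fr"]
SEARCH_SITE_IDS = [1, 2]

POSTGRES_SEARCHINDEX = {
    f"{language}-{site_id}": {
        "kwargs": {
            "language": language,
            "site_id": site_id,
        }
    }
    for language in SEARCH_LANGUAGES
    for site_id in SEARCH_SITE_IDS
}
```

This yields the keys "de-1", "de-2", "fr-1" and "fr-2", each carrying its own kwargs.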
Example, hopefully self-explanatory:
import html
from django.utils.html import strip_tags
from postgres_searchindex.base import IndexSource  # or MultiLanguageIndexSource
from postgres_searchindex.source_pool import source_pool
from news.models import News


@source_pool.register
class NewsIndexSource(IndexSource):  # use MultiLanguageIndexSource for multilanguage setups
    model = News

    def get_title(self, obj):
        return strip_tags(obj.description)

    def get_content(self, obj):
        # strip markup, then convert HTML entities back to plain characters
        return html.unescape(strip_tags(obj.description))

    def get_queryset(self):
        # only index published entries
        return self.model.objects.published()
Place this code in index_sources.py of your app, and it will be autodiscovered.
Run ./manage.py postgres_searchindex_update to update/build the index.
» ./manage.py postgres_searchindex_update
====================================
Updating index "de" with kwargs {'language': 'de'}
Person. Indexing 5 entries
> Done. Removed from index: 0
Project. Indexing 66 entries
> Done. Removed from index: 0
Media. Indexing 36 entries
> Done. Removed from index: 2
====================================
Updating index "fr" with kwargs {'language': 'fr'}
Person. Indexing 5 entries
> Done. Removed from index: 0
Project. Indexing 66 entries
> Done. Removed from index: 0
Media. Indexing 36 entries
> Done. Removed from index: 2
If you want to verify how things were indexed, you can check your IndexEntry instances in the Django admin.
You can now search your index. You are free to use Django's built-in fulltext search features as you like, either as in the following example or in a far more advanced manner.
from django.contrib.postgres.search import SearchVector
from postgres_searchindex.models import IndexEntry
# this will return entries containing "überhaupt" and "uberhaupt"
IndexEntry.objects.annotate(
    search=SearchVector("content", "title", config="german")
).filter(index_key=self.request.LANGUAGE_CODE, search="uberhaupt")
There is a full example in the source: views.py and urls.py will give you an idea.
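As one possible "more advanced" variant, the following sketch weights title matches above content matches and orders results by rank, using Django's SearchQuery and SearchRank. The function name and its parameters are hypothetical; the IndexEntry fields are the ones used in the example above. Imports are kept inside the function so the sketch can be loaded outside a configured Django project.

```python
def search_index(query, index_key, config="german"):
    """Return IndexEntry rows matching `query`, best matches first (sketch)."""
    # Local imports: these need a configured Django project with PostgreSQL.
    from django.contrib.postgres.search import (
        SearchQuery,
        SearchRank,
        SearchVector,
    )
    from postgres_searchindex.models import IndexEntry

    # Title matches (weight "A") count more than content matches (weight "B").
    vector = SearchVector("title", weight="A", config=config) + SearchVector(
        "content", weight="B", config=config
    )
    search_query = SearchQuery(query, config=config)

    return (
        IndexEntry.objects.filter(index_key=index_key)
        .annotate(search=vector, rank=SearchRank(vector, search_query))
        .filter(search=search_query)
        .order_by("-rank")
    )
```

In a view, this could be called as search_index("uberhaupt", self.request.LANGUAGE_CODE).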
To be done: a |highlight:query template filter, to highlight the search query in the search result text.
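Until such a filter exists, something along these lines could serve as a starting point. This is only a sketch of the idea, not part of the package: the function name is hypothetical, matching is a plain case-insensitive word match (no stemming, no accent folding), and it would still need to be registered as a Django template filter and its output marked safe.

```python
import html
import re


def highlight(text, query):
    """Wrap each case-insensitive occurrence of a query word in <mark> tags."""
    words = [re.escape(word) for word in query.split()]
    if not words:
        return html.escape(text)
    pattern = re.compile("(" + "|".join(words) + ")", re.IGNORECASE)
    # With a single capturing group, re.split() returns the matches at the
    # odd indexes and the text in between at the even indexes.
    parts = pattern.split(text)
    return "".join(
        f"<mark>{html.escape(part)}</mark>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )
```

The text is escaped piece by piece, so HTML in the indexed content cannot leak into the result and a query word never matches inside an escaped entity.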
Either you regularly run ./manage.py postgres_searchindex_update, or you implement a realtime or near-realtime solution with signals, through the POSTGRES_SEARCHINDEX_SIGNAL_PROCESSOR setting.
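For the first option, a scheduler (cron, Celery beat, or similar) only needs something that invokes the management command. A minimal sketch, assuming a configured Django project; the function name is hypothetical:

```python
def update_search_index():
    """Run the index update command, e.g. from a periodic task (sketch)."""
    # Local import so this module can be loaded before Django is set up.
    from django.core.management import call_command

    # Same effect as running ./manage.py postgres_searchindex_update
    call_command("postgres_searchindex_update")
```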
There are currently no built-in processors (not yet). Planned are:
postgres_searchindex.signal_processors.RealtimeSyncedSignalProcessor
postgres_searchindex.signal_processors.RealtimeCelerySignalProcessor
The async (Celery) signal processor will require you to have Celery configured.
A few tools to speed up indexing of django-cms sites.
Add postgres_searchindex.contrib.djangocms to settings.INSTALLED_APPS.
Configure one of your cms pages to use the app hook "Search Form (postgres_searchindex)". It will provide a very basic search form, and you can override the template postgres_searchindex/search.html if you want.
Add postgres_searchindex.contrib.djangocms to settings.INSTALLED_APPS, and set settings.POSTGRES_SEARCHINDEX_USE_CMS_INDEX = True to have your django-cms pages indexed automagically (with the next call of ./manage.py postgres_searchindex_rebuild).
Example Event model, with a PlaceholderField called "content":
import html
from django.utils.html import strip_tags
from postgres_searchindex.base import MultiLanguageIndexSource
from postgres_searchindex.contrib.djangocms.base import PlaceholderIndexSourceMixin
from postgres_searchindex.source_pool import source_pool
from .models import Event
@source_pool.register
class EventIndexSource(PlaceholderIndexSourceMixin, MultiLanguageIndexSource):
    model = Event
    placeholder_field_name = "content"

    def get_content(self, obj):
        c = strip_tags(obj.description)  # prepend with preview/description
        c += super().get_content(obj)  # render placeholder
        c = html.unescape(c)  # convert HTML entities such as &amp; to &
        return c

    def get_queryset(self):
        return self.model.objects.published()
I used django-haystack for a decade, and I really like the concept. Building my first index, though, was quite time-intensive. Since development of haystack and some of its backends has stalled at times, I regularly thought about writing my own search index, with PostgreSQL only.
See open issues.