This repository has been archived by the owner on Sep 18, 2019. It is now read-only.

Lunr 2.1 update & configurable indexed and template fields #118

Open
wants to merge 12 commits into
base: master

Conversation

@nkuehn nkuehn commented May 22, 2017

Hi @slashdotdash, starting from my issue #117 I found that migrating to lunr 2.0.x reduces the index size far more, so I started developing and testing a bit.
Here's the result. It required some shuffling around in the indexer: the new lunr index is immutable, so we need to go through the Builder class if we don't want to hold the complete documents in Ruby memory in parallel.

  • Index size on the site I test with dropped from 1.2 MB to 490 KB (which means we can now consider using lunr at all)
  • No client JS changes necessary as far as I can see, but projects that keep a hard copy of lunr.js in their project will break because the index format is completely incompatible. So it should be worth setting a higher version number and adding some more documentation. This is a project release policy topic, so I leave that up to you; for that reason I haven't made any changes to the gem description etc.
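
For readers unfamiliar with the lunr 2.x Builder mentioned above, here is a minimal sketch of building an index through it. This is not the code from this pull request: `buildIndex`, its parameters, and the `id` ref field are illustrative assumptions, and `lunr` is expected to be the lunr.js 2.x module passed in by the caller.

```javascript
// Sketch only: `buildIndex` and its parameters are illustrative names,
// not code from this pull request. `lunr` is the lunr.js 2.x module.
function buildIndex(lunr, docs, fields) {
  var builder = new lunr.Builder();

  // Same pipeline setup that the lunr() convenience function applies.
  builder.pipeline.add(lunr.trimmer, lunr.stopWordFilter, lunr.stemmer);
  builder.searchPipeline.add(lunr.stemmer);

  // Assumption: every document carries a unique `id` property.
  builder.ref('id');
  fields.forEach(function (name) { builder.field(name); });

  // Documents are added one at a time; `docs` can be any iterable
  // (for example a generator), so the caller need not keep them all.
  for (var doc of docs) {
    builder.add(doc);
  }

  // build() returns the immutable index, which can be serialized
  // with JSON.stringify for the client.
  return builder.build();
}
```

A call might look like `buildIndex(require('lunr'), pages, ['title', 'body'])`, where `pages` and the field names are placeholders.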

To everyone: please test. This project has no built-in tests, so we need a bunch of feedback from real-world sites.

Nikolaus Kühn added 3 commits May 22, 2017 21:02
  - lunr 2.x index is immutable so it’s necessary to go via the Builder class
  - lunr 2.x is not distributed with built-in .min.js in bower, so locally using non-minified js file
  - some more git ignores
@slashdotdash
Owner

Awesome work @nkuehn. Thanks for taking the time to get this done. I'll test it out locally and get it merged in and released.

@nkuehn
Author

nkuehn commented May 24, 2017

Better stop testing in depth: while researching the weird behavior of the results I stumbled over olivernn/lunr.js#263 and learned there that field boosting was moved to query time in the new index structure.

It's an improvement, but it means no field is boosted at all now; in particular, the title no longer plays any role.

It looks like I'll have to touch the client code, too. Alternatively, wait for lunr.js 2.1, which introduces per-field vectors in the index and behaves pretty well without any boosting at all.

@nkuehn nkuehn changed the title from "Lunr 2.0.x clean" to "Lunr 2.1 update & configurable indexed and template fields" Jun 12, 2017
@nkuehn
Author

nkuehn commented Jun 12, 2017

@slashdotdash Lunr.js has released 2.1 to production now ( olivernn/lunr.js@cf96052 ) and I am pretty happy with the results I see, especially in comparison to the 2.0.x series.

So I'm skipping 2.0.x altogether for this upgrade. I have it in use on our site and am happy with the stability, but haven't actively played with other configurations (lack of available sites to test with).

The key changes here are:

  • (caused by lunr 2.1): Index-time field boosting is no longer available. Boosting can be done at query time in the search expression language, but I have not felt any need for it in my index, since the fields are balanced well automatically in the new index structure.
  • (indirectly caused by lunr 2.1): Since lunr 2.1 keeps a term vector per field, the number and choice of indexed fields becomes more important than other factors. So I introduced the ability to configure which of the built-in fields, or any custom front matter fields, are indexed at all. Defaults should be backwards-compatible.
  • If a user has overridden the field boosting in _config.yml (possible but not configured), that _config.yml is no longer compatible, since the structure is now an array instead of an object (to reflect the lunr.js 2.1 config).
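
For anyone who does want boosting back, the query-time route from the first bullet can be sketched roughly as below. `boostedQuery` is a hypothetical helper, not part of this pull request or of lunr.js; it only builds a query string using the lunr 2.x `field:term` and `term^boost` syntax, and the field names and boost values are made up for illustration.

```javascript
// Sketch only: `boostedQuery` is a hypothetical helper; field names and
// boost values are chosen by the caller and are not defined by this PR.
function boostedQuery(input, boosts) {
  // Split the raw user input into whitespace-separated terms.
  var terms = input.trim().split(/\s+/).filter(Boolean);
  var clauses = [];

  terms.forEach(function (term) {
    Object.keys(boosts).forEach(function (field) {
      var boost = boosts[field];
      // lunr 2.x query syntax: field:term restricts a term to one field,
      // term^N boosts that clause by N. A boost of 1 is left implicit.
      clauses.push(field + ':' + term + (boost !== 1 ? '^' + boost : ''));
    });
  });

  return clauses.join(' ');
}
```

With that, `idx.search(boostedQuery('lunr update', { title: 10, body: 1 }))` would search for `title:lunr^10 body:lunr title:update^10 body:update`. Note that the sketch does not escape characters that are special in the query language (such as `:`, `^`, `~` or `*`), so real user input would need sanitizing first.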

Client side was deliberately kept compatible although it would be nice to support some more of the query language features and make it easier to integrate into a bigger site as a JS dependency.

@slashdotdash
Owner

@nkuehn Sorry I haven't made time to merge your pull request.

Would you be interested in becoming a contributor to this project so that you can merge PRs yourself?

@nkuehn
Author

nkuehn commented Mar 5, 2018

Hi @slashdotdash, it's pretty certain now that I won't be contributing any more, so that won't help. I have not been able to tune the underlying lunr.js well enough to match the use case and content size of the site that's driving my motivation here. We have now switched to a SaaS search offering.

@slashdotdash
Owner

@nkuehn No problem, thanks for letting me know.
