Fix interdoc links and formatting of products
Signed-off-by: Eric Pugh <[email protected]>
epugh committed Sep 18, 2024
1 parent 1b1b13e commit 53fcd3e
Showing 5 changed files with 18 additions and 19 deletions.
5 changes: 2 additions & 3 deletions _search-plugins/ltr/building-features.md
@@ -294,7 +294,7 @@ without creating an error.
You'll notice we *appended* to the feature set. Feature sets perhaps
ought to be really called "lists". Each feature has an ordinal (its
place in the list) in addition to a name. Some LTR training
-applications, such as Ranklib, refer to a feature by ordinal (the
+applications, such as RankLib, refer to a feature by ordinal (the
"1st" feature, the "2nd" feature). Others more conveniently refer to
the name. So you may need both/either. You'll see that when features
are logged, they give you a list of features back to preserve the
@@ -304,8 +304,7 @@ ordinal.
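
The appending described above goes through the feature set API. A minimal sketch, assuming the plugin's `_addfeatures` endpoint and a hypothetical `user_rating` feature:

```json
POST _ltr/_featureset/more_movie_features/_addfeatures
{
  "features": [
    {
      "name": "user_rating",
      "params": [],
      "template_language": "mustache",
      "template": {
        "function_score": {
          "query": { "match_all": {} },
          "field_value_factor": {
            "field": "vote_average",
            "missing": 0
          }
        }
      }
    }
  ]
}
```

The appended feature takes the next ordinal in the list.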

Feature engineering is a complex part of OpenSearch Learning to Rank,
and additional features (such as features that can be derived from other
-features) are listed in `advanced-functionality`{.interpreted-text
-role="doc"}.
+features) are listed in [advanced functionality]({{site.url}}{{site.baseurl}}/search-plugins/ltr/advanced-functionality/).

Next up, we'll talk about some specific use cases you'll run into in
[Feature Engineering]({{site.url}}{{site.baseurl}}/search-plugins/ltr/feature-engineering/).
2 changes: 1 addition & 1 deletion _search-plugins/ltr/core-concepts.md
@@ -185,7 +185,7 @@ In actual systems, you might log these values after the fact, gathering
them to annotate a judgement list with feature values. In others the
judgement list might come from user analytics, so it may be logged as the
user interacts with the search application. More on this when we cover
-it in `logging-features`{.interpreted-text role="doc"}.
+it in [logging features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/).
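
For illustration, a judgement list row that starts as a grade plus query and document identifiers picks up feature values once logging runs (a schematic sketch in RankLib's format; grades, IDs, and values are hypothetical):

```
# before logging: grade and query id per document
4  qid:1  # 7555
# after logging values for features 1 and 2
4  qid:1  1:12.3  2:9.8  # 7555
```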

## Training a ranking function

6 changes: 3 additions & 3 deletions _search-plugins/ltr/fits-in.md
@@ -30,7 +30,7 @@ development.
Then other tools take over. With a logged set of features for documents,
you join that data with the judgement lists you've developed on your own.
You've now got a training set you can use to train and test ranking models.
-Using of a tool like Ranklib or XGBoost, you'll hopefully arrive at a
+Using a tool like RankLib or XGBoost, you'll hopefully arrive at a
satisfactory model.

With a ranking model, you turn back to the plugin. You upload the model
@@ -52,12 +52,12 @@ company or Mechanical Turk.

The plugin does not train or test models. This also happens offline in
tools appropriate to the task. Instead the plugin uses models generated
-by XGboost and Ranklib libraries. Training and testing models is CPU
+by XGBoost and RankLib libraries. Training and testing models is a CPU-
intensive task involving data scientist supervision and offline
testing. Most organizations want some data science supervision on model
development, and you would not want this running in your production
OpenSearch cluster.

The rest of this guide is dedicated to walking you through how the
plugin works to get you there. Continue on to
-`building-features`{.interpreted-text role="doc"}.
+[building features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/building-features/).
6 changes: 3 additions & 3 deletions _search-plugins/ltr/index.md
@@ -25,9 +25,9 @@ OpenSearch.

- Want a quickstart? Check out the demo in
[hello-ltr](https://github.com/o19s/hello-ltr).
-- Brand new to learning to rank? head to
-`core-concepts`{.interpreted-text role="doc"}.
-- Otherwise, start with `fits-in`{.interpreted-text role="doc"}
+- Brand new to learning to rank? Head to
+[core concepts]({{site.url}}{{site.baseurl}}/search-plugins/ltr/core-concepts/).
+- Otherwise, start with how the plugin [fits in]({{site.url}}{{site.baseurl}}/search-plugins/ltr/fits-in/).

## Installing

18 changes: 9 additions & 9 deletions _search-plugins/ltr/training-models.md
@@ -17,9 +17,9 @@ an extensive overview) and then dig into uploading a model.

## RankLib training

-We provide two demos for training a model. A fully-fledged [Ranklib
+We provide two demos for training a model. A fully-fledged [RankLib
Demo](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo)
-uses Ranklib to train a model from OpenSearch queries. You can see
+uses RankLib to train a model from OpenSearch queries. You can see
how features are
[logged](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo/collectFeatures.py)
and how models are
@@ -36,16 +36,16 @@ Here for query id 1 (Rambo) we've logged features 1 (a title `TF*IDF`
score) and feature 2 (a description `TF*IDF` score) for a set of
documents. In
[train.py](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo/train.py)
-you'll see how we call Ranklib to train one of it's supported models
+you'll see how we call RankLib to train one of its supported models
on this line:

cmd = "java -jar RankLib-2.8.jar -ranker %s -train%rs -save %s -frate 1.0" % (whichModel, judgmentsWithFeaturesFile, modelOutput)

Our "judgmentsWithFeatureFile" is the input to RankLib. Other
-parameters are passed, which you can read about in [Ranklib's
+parameters are passed, which you can read about in [RankLib's
documentation](https://sourceforge.net/p/lemur/wiki/RankLib/).
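
The file passed to `-train` is plain text, one graded document per row in RankLib's judgment format (a representative sketch; grades and feature values are hypothetical):

```
4  qid:1  1:12.3  2:9.8  # 7555  rambo
3  qid:1  1:10.9  2:8.2  # 1370  rambo iii
0  qid:1  1:0.0   2:6.9  # 1369  rocky
```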

-Ranklib will output a model in it's own serialization format. For
+RankLib will output a model in its own serialization format. For
example, a LambdaMART model is an ensemble of regression trees. It looks
like:

@@ -65,8 +65,8 @@ like:
Notice how each tree examines the value of features, makes a decision
based on the value of a feature, then ultimately outputs the relevance
score. You'll note features are referred to by ordinal, starting from
"1" with Ranklib (this corresponds to the 0th feature in your feature
set). Ranklib does not use feature names when training.
"1" with RankLib (this corresponds to the 0th feature in your feature
set). RankLib does not use feature names when training.
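
Concretely, the ordinal-to-name mapping is off by one (feature names hypothetical):

```
RankLib ordinal    feature set position    feature name
1                  0                       title_tfidf
2                  1                       desc_tfidf
```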

## XGBoost example
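
This section's sample is collapsed in the diff. For orientation, XGBoost models are exchanged as the library's JSON dump of trees, roughly as follows (a hedged sketch; the split feature name is hypothetical):

```json
[
  {
    "nodeid": 0,
    "split": "title_tfidf",
    "split_condition": 10.66,
    "yes": 1,
    "no": 2,
    "missing": 1,
    "children": [
      { "nodeid": 1, "leaf": 0.75 },
      { "nodeid": 2, "leaf": -0.5 }
    ]
  }
]
```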

@@ -130,10 +130,10 @@ to upload it to OpenSearch LTR. Models are uploaded specifying the
following arguments:

- The feature set that was trained against
-- The type of model (such as ranklib or xgboost)
+- The type of model (such as RankLib or XGBoost)
- The model contents

-Uploading a Ranklib model trained against `more_movie_features` looks
+Uploading a RankLib model trained against `more_movie_features` looks
like:
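
The original request body is collapsed in this diff; the sketch below assumes the plugin's `_createmodel` endpoint, a hypothetical model name, and a truncated definition.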

```json
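POST _ltr/_featureset/more_movie_features/_createmodel
{
  "model": {
    "name": "my_ranklib_model",
    "model": {
      "type": "model/ranklib",
      "definition": "## LambdaMART\n<ensemble>...</ensemble>"
    }
  }
}
```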
