diff --git a/_search-plugins/ltr/building-features.md b/_search-plugins/ltr/building-features.md
index c6eb625251..7e30ccb77b 100644
--- a/_search-plugins/ltr/building-features.md
+++ b/_search-plugins/ltr/building-features.md
@@ -294,7 +294,7 @@ without creating an error.
 You'll notice we *appended* to the feature set. Feature sets perhaps
 ought to be really called "lists". Each feature has an ordinal (its
 place in the list) in addition to a name. Some LTR training
-applications, such as Ranklib, refer to a feature by ordinal (the
+applications, such as RankLib, refer to a feature by ordinal (the
 "1st" feature, the "2nd" feature). Others more conveniently refer to
 the name. So you may need both/either. You'll see that when features
 are logged, they give you a list of features back to preserve the
@@ -304,8 +304,7 @@ ordinal.
 
 Feature engineering is a complex part of OpenSearch Learning to Rank,
 and additional features (such as features that can be derived from other
-features) are listed in `advanced-functionality`{.interpreted-text
-role="doc"}.
+features) are listed in [advanced functionality]({{site.url}}{{site.baseurl}}/search-plugins/ltr/advanced-functionality/).
 
 Next-up, we'll talk about some specific use cases you\'ll run into when
 [Feature Engineering]({{site.url}}{{site.baseurl}}/search-plugins/ltr/feature-engineering/).
diff --git a/_search-plugins/ltr/core-concepts.md b/_search-plugins/ltr/core-concepts.md
index adfc4b1f79..cdfc8e3d5b 100644
--- a/_search-plugins/ltr/core-concepts.md
+++ b/_search-plugins/ltr/core-concepts.md
@@ -185,7 +185,7 @@ In actual systems, you might log these values after the fact, gathering
 them to annotate a judgement list with feature values. In others the
 judgement list might come from user analytics, so it may be logged as
 the user interacts with the search application. More on this when we cover
-it in `logging-features`{.interpreted-text role="doc"}.
+it in [logging features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/).
 
 ## Training a ranking function
 
diff --git a/_search-plugins/ltr/fits-in.md b/_search-plugins/ltr/fits-in.md
index 3f2dadff37..a1f3958f0f 100644
--- a/_search-plugins/ltr/fits-in.md
+++ b/_search-plugins/ltr/fits-in.md
@@ -30,7 +30,7 @@ development. Then other tools take over.
 With a logged set of features for documents, you join data with your
 Judgement lists you've developed on your own. You've now got a training
 set you can use to test/train ranking models.
-Using of a tool like Ranklib or XGBoost, you'll hopefully arrive at a
+Using a tool like RankLib or XGBoost, you'll hopefully arrive at a
 satisfactory model.
 
 With a ranking model, you turn back to the plugin. You upload the model
@@ -52,7 +52,7 @@ company or mechanical turk.
 
 The plugin does not train or test models. This also happens offline
 in tools appropriate to the task. Instead the plugin uses models generated
-by XGboost and Ranklib libraries. Training and testing models is CPU
+by XGBoost and RankLib libraries. Training and testing models is CPU
 intensive task that, involving data scientist supervision and offline
 testing. Most organizations want some data science supervision on model
 development. And you would not want this running in your production
@@ -60,4 +60,4 @@ Elasticsearch cluster!
 
 The rest of this guide is dedicated to walking you through how the
 plugin works to get you there. Continue on to
-`building-features`{.interpreted-text role="doc"}.
+[building features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/building-features/).
diff --git a/_search-plugins/ltr/index.md b/_search-plugins/ltr/index.md
index 5ddadd264c..52c94dcb37 100644
--- a/_search-plugins/ltr/index.md
+++ b/_search-plugins/ltr/index.md
@@ -25,9 +25,9 @@ OpenSearch.
 
 - Want a quickstart? Check out the demo in
   [hello-ltr](https://github.com/o19s/hello-ltr).
-- Brand new to learning to rank? head to
-  `core-concepts`{.interpreted-text role="doc"}.
-- Otherwise, start with `fits-in`{.interpreted-text role="doc"}
+- Brand new to learning to rank? Head to
+  [core concepts]({{site.url}}{{site.baseurl}}/search-plugins/ltr/core-concepts/).
+- Otherwise, start with how the plugin [fits in]({{site.url}}{{site.baseurl}}/search-plugins/ltr/fits-in/).
 
 ## Installing
 
diff --git a/_search-plugins/ltr/training-models.md b/_search-plugins/ltr/training-models.md
index e5dc63c7b2..0a0bcd07c7 100644
--- a/_search-plugins/ltr/training-models.md
+++ b/_search-plugins/ltr/training-models.md
@@ -17,9 +17,9 @@ an extensive overview) and then dig into uploading a model.
 
 ## RankLib training
 
-We provide two demos for training a model. A fully-fledged [Ranklib
-Demo](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo)
-uses Ranklib to train a model from OpenSearch queries. You can see
+We provide two demos for training a model. A fully-fledged [RankLib
+Demo](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo)
+uses RankLib to train a model from OpenSearch queries. You can see
 how features are
 [logged](http://github.com/o19s/elasticsearch-learning-to-rank-learning-to-rank/tree/master/demo/collectFeatures.py)
 and how models are
@@ -36,16 +36,16 @@ Here for query id 1 (Rambo) we've logged features 1 (a title
 `TF*IDF` score) and feature 2 (a description `TF*IDF` score) for a set
 of documents. In
 [train.py](http://github.com/o19s/elasticsearch-learning-to-rank/demo/train.py)
-you'll see how we call Ranklib to train one of it's supported models
+you'll see how we call RankLib to train one of its supported models
 on this line:
 
     cmd = "java -jar RankLib-2.8.jar -ranker %s -train%rs -save %s -frate 1.0" % (whichModel, judgmentsWithFeaturesFile, modelOutput)
 
 Our "judgmentsWithFeatureFile" is the input to RankLib. Other
-parameters are passed, which you can read about in [Ranklib's
-documentation](https://sourceforge.net/p/lemur/wiki/RankLib/).
+parameters are passed, which you can read about in [RankLib's
+documentation](https://sourceforge.net/p/lemur/wiki/RankLib/).
 
-Ranklib will output a model in it's own serialization format. For
+RankLib will output a model in its own serialization format. For
 example a LambdaMART model is an ensemble of regression trees. It looks
 like:
 
@@ -65,8 +65,8 @@ like:
 Notice how each tree examines the value of features, makes a decision
 based on the value of a feature, then ultimately outputs the relevance
 score. You'll note features are referred to by ordinal, starting by
-"1" with Ranklib (this corresponds to the 0th feature in your feature
-set). Ranklib does not use feature names when training.
+"1" with RankLib (this corresponds to the 0th feature in your feature
+set). RankLib does not use feature names when training.
 
 ## XGBoost example
 
@@ -130,10 +130,10 @@ to upload it to OpenSearch LTR.
 Models are uploaded specifying the following arguments
 
 - The feature set that was trained against
-- The type of model (such as ranklib or xgboost)
+- The type of model (such as RankLib or XGBoost)
 - The model contents
 
-Uploading a Ranklib model trained against `more_movie_features` looks
+Uploading a RankLib model trained against `more_movie_features` looks
 like:
 
 ```json