-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: update limit for es response #4303
Conversation
315212d
to
8f47923
Compare
conn.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is global setting, it should be set only once. it should not be set in a periodically running management command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setting is added to each index separately. Having it in this place specifically works as all the new/updated indexes will have this setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the operation idempotent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is.
c58dc9f
to
b24cdba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need it in 2 management commands? Wouldn't the settings update when only one runs? Would it be nicer to extract this into a separate management command?
To update this setting for both cases where we're either updating (through update_index) or creating new/rebuilding indexes (through search_index).
In case when we create new index, it won't have the settings. These settings work for pre-existing indexes.
That's debatable although the same command will be needed to run in both of the commands above. |
try: | ||
self.es.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) | ||
except NotFoundError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally am not a fan of this, specially in unit tests. Its a recipe for weirdness and unexpected behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, no setting is added when there is no index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under what Testcases is this valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see any tests for this. Not sure why this should be required in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some testing, can you try this in setUpclass? I still dont understand why this is needed for tests as there are not any items in unit tests thaa can cross the threshold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The settings needs to be updated bcz I changed ELASTICSEARCH_DSL_QUERYSET_PAGINATION
, ELASTICSEARCH_DSL_LOAD_PER_QUERY
to 15000 in base.py
.
Should I set this to 10k for tests only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, yeah I think we can keep the settings as it is for tests and not complicate this unnecessarily.
@@ -8,6 +9,7 @@ | |||
from course_discovery.apps.course_metadata.tests.factories import CourseFactory, CourseRunFactory | |||
|
|||
|
|||
@pytest.mark.django_db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed? I am assuming APITestCase is child of django or DRF's TestCase which takes care of Django DB ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, it can be removed. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, go ahead then.
try: | ||
es_connection = get_connection() | ||
es_connection.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) | ||
except NotFoundError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what cases will NotFoundError raise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case if no index is created, we won't be adding this setting to any index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? This is being done inside _create. Under what circumstances will we not create an index despite calling create?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to put this logic inside the ElasticSearchUtils
class. Then you might be able to get by by calling it within ElasticsearchUtils.create_index
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zawan-ila can you elaborate on where I should put these settings in the utils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what @zawan-ila is referring to https://github.com/openedx/course-discovery/blob/master/course_discovery/apps/core/utils.py#L28. When calling create_index, this result_window update can be done there (possibly without try/catch?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried adding this settings isn't being added for all indexes. Seems like it needs to be added after all the indexes and aliases are defined.
try: | ||
self.es.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) | ||
except NotFoundError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see any tests for this. Not sure why this should be required in the first place?
try: | ||
es_connection = get_connection() | ||
es_connection.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) | ||
except NotFoundError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to put this logic inside the ElasticSearchUtils
class. Then you might be able to get by by calling it within ElasticsearchUtils.create_index
.
@@ -127,6 +127,8 @@ def _update(self, models, options): | |||
for index_alias_mapper in alias_mappings: | |||
index_alias_mapper.registered_index._name = index_alias_mapper.alias # pylint: disable=protected-access | |||
|
|||
conn.indices.put_settings(body={"index": {"max_result_window": settings.MAX_RESULT_WINDOW}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my comment above, can we place this inside ElasticsearchUtils
?
3cc82b8
to
bee2823
Compare
@@ -2624,8 +2624,7 @@ def get_expected_data(cls, person, request): | |||
|
|||
|
|||
@pytest.mark.django_db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might not need django_db mark if you are already using Django testcase.
69709a0
to
141c3c9
Compare
8d5f9cc
to
7b2a58e
Compare
@@ -117,11 +117,13 @@ def _update(self, models, options): | |||
) | |||
if record_count_is_sane: | |||
ElasticsearchUtils.set_alias(conn, alias, new_index_name) | |||
ElasticsearchUtils.update_max_result_window(conn, settings.MAX_RESULT_WINDOW, new_index_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be really nice if we could do this just once at the start of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be done only after set_alias
.
PROD-3986
Adds a setting to
update_index
. This will increase themax_result_window
for all existing indexes and subsequently update any newly created index as well.Also updates the unit test that previously didn't support a custom value ofmax_result_window
in ES related logic.Keeping the same limit count for unit tests as we won't reach this limit while testing.