Error indexing posts: Limit of total fields in index has been exceeded #101

Open
danielbachhuber opened this issue Sep 28, 2018 · 2 comments

@danielbachhuber

When I run:

$ wp searchpress index --flush --put-mapping

I eventually see this error:

Warning: Error indexing post 5021340; HTTP response code: 400; Data: {"index":{"_index":"bgr.test","_type":"post","_id":"5021340","status":400,"error":{"type":"illegal_argument_exception","reason":"Limit of total fields [1000] in index [bgr.test] has been exceeded"}}}

The stats end up as:

Completed page 1/1 (29.36s / 101.48M current / 157.01M max), 7s remaining
Success: Index Complete!
1481	posts processed
1081	posts indexed
400	errors/warnings
Replacing the default search with SearchPress...
Success: Successfully activated SearchPress!

Any ideas why there's a cap?

@mboynes
Contributor

mboynes commented Sep 28, 2018

Yes, Elasticsearch 5.0+ caps the total number of fields in an index's mapping, ultimately for performance. The cap defaults to 1000 and is controlled by the index.mapping.total_fields.limit index setting. We haven't decided how exactly SearchPress should address that in the long term, but here's a fix in the interim:

/**
 * Bump up field limit.
 *
 * @param array $mapping ES mapping.
 * @return array Updated mapping.
 */
add_filter( 'sp_config_mapping', function( $mapping ) {
	$mapping['settings']['index']['mapping']['total_fields']['limit'] = 5000;
	return $mapping;
} );

Ultimately, this happens because of the way SearchPress indexes post meta. For each unique meta key, the Elasticsearch schema gets upwards of 8 fields added. We do this to allow us to maintain parity with WP_Query, since ES doesn't (natively) do runtime type casting. The only time we've ever seen performance issues from this is in a case where a site was setting dynamic meta keys (e.g. related-post-{$post_id}) and not filtering them out prior to indexing -- that led to the ES schema inflating to hundreds of thousands of fields before it caused issues.
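For the dynamic meta key case described above, here is a minimal sketch of stripping such keys from a meta array before it gets indexed. The function name `example_strip_dynamic_meta` and the default pattern are hypothetical and not part of SearchPress. The sketch deliberately does not show which SearchPress hook to attach it to, since that depends on the plugin version; treat the wiring as something to check against the plugin source.

```php
<?php
/**
 * Drop dynamically named meta keys from a meta array.
 *
 * Hypothetical helper for illustration only; not part of SearchPress.
 *
 * @param array  $meta    Post meta, keyed by meta key.
 * @param string $pattern Regex matching the keys to drop. The default is an
 *                        assumed pattern based on the related-post-{$post_id}
 *                        example above.
 * @return array Meta without the matching keys.
 */
function example_strip_dynamic_meta( array $meta, $pattern = '/^related-post-\d+$/' ) {
	return array_filter(
		$meta,
		function ( $key ) use ( $pattern ) {
			// Numeric-looking keys are stored as integers, so cast before matching.
			return ! preg_match( $pattern, (string) $key );
		},
		ARRAY_FILTER_USE_KEY
	);
}
```

Called with an array holding `_thumbnail_id`, `related-post-5021340` and `related-post-17`, this keeps only `_thumbnail_id`, so the two per-post keys never add fields to the mapping.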

We have a few ideas for long-term fixes, but we need to do some benchmarking first. I'll leave this open until we do merge in a long-term fix.

@danielbachhuber
Author

danielbachhuber commented Sep 28, 2018

> We haven't decided how exactly SearchPress should address that in the long-term, but here's a fix in the interim:

Thanks! The filter works for my needs.
