[BUG] ContentTooLongException: entity content is too long [8175] for the configured buffer limit [104857600] #1370

Open
milind-rao-efx opened this issue Dec 23, 2024 · 1 comment
Labels
bug Something isn't working untriaged

Comments

@milind-rao-efx

What is the bug?

When I use the Java client to run an MSearch query, I get the error "ContentTooLongException: entity content is too long [8175] for the configured buffer limit [104857600]".

The default HeapBufferedResponseConsumerFactory is hardcoded to a 100 MB buffer limit, and the builder offers no way to change it. The only workaround I found is convoluted; see below.

How can one reproduce the bug?

Run an MSearch query whose response is larger than 100 MB.

What is the expected behavior?

I should be able to set the buffer size in the builder and successfully retrieve responses larger than 100 MB.

What is your host/environment?

RHEL, 8.10 (Ootpa)

Do you have any screenshots?

Caused by: org.apache.hc.core5.http.ContentTooLongException: entity content is too long [8175] for the configured buffer limit [104857600]
    at org.opensearch.client.transport.httpclient5.internal.HeapBufferedAsyncEntityConsumer.data(HeapBufferedAsyncEntityConsumer.java:101) ~[opensearch-java-2.14.0.jar:?]
    at org.apache.hc.core5.http.nio.entity.AbstractBinDataConsumer.consume(AbstractBinDataConsumer.java:75) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.http.nio.support.AbstractAsyncResponseConsumer.consume(AbstractAsyncResponseConsumer.java:134) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.client5.http.impl.async.HttpAsyncMainClientExec$1.consume(HttpAsyncMainClientExec.java:243) ~[httpclient5-5.2.1.jar:5.2.1]
    at org.apache.hc.core5.http.impl.nio.ClientHttp1StreamHandler.consumeData(ClientHttp1StreamHandler.java:255) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.http.impl.nio.ClientHttp1StreamDuplexer.consumeData(ClientHttp1StreamDuplexer.java:354) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.http.impl.nio.AbstractHttp1StreamDuplexer.onInput(AbstractHttp1StreamDuplexer.java:324) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.http.impl.nio.AbstractHttp1IOEventHandler.inputReady(AbstractHttp1IOEventHandler.java:64) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.http.impl.nio.ClientHttp1IOEventHandler.inputReady(ClientHttp1IOEventHandler.java:41) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.ssl.SSLIOSession.decryptData(SSLIOSession.java:609) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.ssl.SSLIOSession.access$200(SSLIOSession.java:74) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.ssl.SSLIOSession$1.inputReady(SSLIOSession.java:202) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.InternalDataChannel.onIOEvent(InternalDataChannel.java:142) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.InternalChannel.handleIOEvent(InternalChannel.java:51) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.SingleCoreIOReactor.processEvents(SingleCoreIOReactor.java:178) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.SingleCoreIOReactor.doExecute(SingleCoreIOReactor.java:127) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.AbstractSingleCoreIOReactor.execute(AbstractSingleCoreIOReactor.java:86) ~[httpcore5-5.2.5.jar:5.2.5]
    at org.apache.hc.core5.reactor.IOReactorWorker.run(IOReactorWorker.java:44) ~[httpcore5-5.2.5.jar:5.2.5]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]

Do you have any additional context?

I first tried the following code to set the buffer size:

private static class BigBufferTransportOptions extends RestClientOptions
{
    private final Map<String, String> params;

    public BigBufferTransportOptions(RequestOptions theOptions)
    {
        super(theOptions);
        params = Collections.emptyMap();
    }

    @Override
    public Map<String, String> queryParameters() {
        return params;
    }
}

    // Transport and client setup:
    ApacheHttpClient5TransportBuilder theBuilder =
        ApacheHttpClient5TransportBuilder.builder(theHttpHosts);
       
    theBuilder.setHttpClientConfigCallback(httpClientBuilder -> {
        TlsStrategy theTlsStrategy =
            ClientTlsStrategyBuilder.create()
                                    .setSslContext(theSSLContext)
                                    .build();
           
        PoolingAsyncClientConnectionManager theConnectionManager =
            PoolingAsyncClientConnectionManagerBuilder.create()
                                                      .setTlsStrategy(theTlsStrategy)
                                                      .build();

        return httpClientBuilder.setDefaultCredentialsProvider(theCredentialsProvider)
                                    .setConnectionManager(theConnectionManager);
    });
       
    OpenSearchTransport theTransport = theBuilder.build();

    // We need to increase the buffer size that is hard coded to
    // 100 MB in the HeapBufferedResponseConsumerFactory class
    // Set Buffer limit to 200 MB
    int BUFFER_SIZE = 200 * 1024 * 1024;
    HttpAsyncResponseConsumerFactory theFactory =
        new HeapBufferedResponseConsumerFactory(BUFFER_SIZE);
    RequestOptions.Builder theRequestOptionsBuilder =
        RequestOptions.DEFAULT.toBuilder();
    theRequestOptionsBuilder.setHttpAsyncResponseConsumerFactory(theFactory);
    BigBufferTransportOptions theTransportOptions =
        new BigBufferTransportOptions(theRequestOptionsBuilder.build());
       
    iOpenSearchClient = new OpenSearchClient(theTransport, theTransportOptions);

This didn't work; I still got the 100 MB error. I believe this is due to a bug in the following code.

In the ApacheHttpClient5Transport constructor:

this.transportOptions = (options == null) ? ApacheHttpClient5Options.initialOptions() : ApacheHttpClient5Options.of(options);

And in the ApacheHttpClient5Options class:

static ApacheHttpClient5Options of(TransportOptions options) {
    if (options instanceof ApacheHttpClient5Options) {
        return (ApacheHttpClient5Options) options;
    } else {
        final Builder builder = new Builder(DEFAULT.toBuilder());
        options.headers().forEach(h -> builder.addHeader(h.getKey(), h.getValue()));
        options.queryParameters().forEach(builder::setParameter);
        builder.onWarnings(options.onWarnings());
        return builder.build();
    }
}

If the options object is not an ApacheHttpClient5Options instance (as in my case), the headers, query parameters, and warnings handler are copied over, but the HttpAsyncResponseConsumerFactory is ignored, so the default 100 MB limit stays in effect.

I did get it working with the following code:
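To make the failure mode concrete, here is a minimal, self-contained sketch of the same pattern. It does not use the opensearch-java classes: GenericOptions, TransportSpecificOptions, ofLossy, and ofPreserving are hypothetical stand-ins, and a plain int buffer limit stands in for the HttpAsyncResponseConsumerFactory. It only illustrates how a conversion that copies some fields and rebuilds the rest from defaults silently discards a caller's setting, and what one possible fix could look like.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a generic options type: headers plus a buffer limit.
// This is a model for illustration, not an opensearch-java class.
class GenericOptions {
    final Map<String, String> headers;
    final int bufferLimitBytes;

    GenericOptions(Map<String, String> headers, int bufferLimitBytes) {
        this.headers = new LinkedHashMap<>(headers);
        this.bufferLimitBytes = bufferLimitBytes;
    }
}

// Hypothetical stand-in for the transport-specific options type.
final class TransportSpecificOptions {
    static final int DEFAULT_BUFFER_LIMIT_BYTES = 100 * 1024 * 1024; // 104857600

    final Map<String, String> headers;
    final int bufferLimitBytes;

    private TransportSpecificOptions(Map<String, String> headers, int bufferLimitBytes) {
        this.headers = headers;
        this.bufferLimitBytes = bufferLimitBytes;
    }

    // Mirrors the reported behavior: headers are copied, the buffer limit
    // is rebuilt from the default and the caller's value is lost.
    static TransportSpecificOptions ofLossy(GenericOptions options) {
        return new TransportSpecificOptions(
            new LinkedHashMap<>(options.headers), DEFAULT_BUFFER_LIMIT_BYTES);
    }

    // One possible fix: carry the caller's buffer limit across as well.
    static TransportSpecificOptions ofPreserving(GenericOptions options) {
        return new TransportSpecificOptions(
            new LinkedHashMap<>(options.headers), options.bufferLimitBytes);
    }
}
```

With a 200 MB limit on the input, ofLossy returns options that still carry the 100 MB default, which matches the symptom described above, while ofPreserving keeps the 200 MB value.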

private TransportOptions getTransportOptions(OpenSearchTransport theTransport,
                                             int theBufferSize)
{
    // We need to increase the buffer size that is hard coded to
    // 100 MB in the HeapBufferedResponseConsumerFactory class
    // Set Buffer limit to the specified value in application.properties
    theBufferSize = theBufferSize * 1024 * 1024;
    HeapBufferedResponseConsumerFactory theFactory =
        new HeapBufferedResponseConsumerFactory(theBufferSize);
    TransportOptions theDefaultOptions = theTransport.options();
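    // The transport's default options are ApacheHttpClient5Options,
    // so the builder can be cast to the transport-specific builder type.
    ApacheHttpClient5Options.Builder theOptionsBuilder =
        (ApacheHttpClient5Options.Builder) theDefaultOptions.toBuilder();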
    theOptionsBuilder.setHttpAsyncResponseConsumerFactory(theFactory);
    TransportOptions theTransportOptions = theOptionsBuilder.build();
   
    return theTransportOptions;
}
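One caveat with the workaround above: theBufferSize * 1024 * 1024 is evaluated in int arithmetic, so a configured value of 2048 MB or more would overflow silently. The helper below is a small sketch of a guarded conversion. BufferSizes and megabytesToBytes are hypothetical names, and the sketch assumes the factory constructor takes an int byte count, as the code above suggests.

```java
// Hypothetical helper, not part of opensearch-java.
final class BufferSizes {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    private BufferSizes() {}

    // Converts a size in megabytes to bytes, rejecting values that are
    // non-positive or that would not fit in an int.
    static int megabytesToBytes(int megabytes) {
        if (megabytes <= 0) {
            throw new IllegalArgumentException("buffer size must be positive: " + megabytes);
        }
        long bytes = megabytes * BYTES_PER_MB; // computed as long, so no int overflow
        if (bytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("buffer size too large for an int: " + megabytes + " MB");
        }
        return (int) bytes;
    }
}
```

In the method above, theBufferSize = BufferSizes.megabytesToBytes(theBufferSize) would replace the raw multiplication.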

There is no way to set the transport options in the builder directly. So we have to first build the transport from the builder, then get the options from the transport, set the bigger buffer on them, and finally pass those transport options to the client.

This is needlessly complicated. Either this is an oversight, or it is an intentional decision to keep the buffer at the default 100 MB for performance or other reasons. I'd like to know which it is. If it is intentional, what is the recommended way to use MSearch with large responses? If it is not intentional, then we need a method in the Builder to set the buffer limit.
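As a rough illustration of the kind of builder method being requested, here is a self-contained sketch. SketchTransport, SketchTransportBuilder, and setResponseBufferLimit are hypothetical names invented for this example; none of them exist in opensearch-java, and the real builder may need a different shape.

```java
// Hypothetical stand-ins; neither class exists in opensearch-java.
final class SketchTransport {
    final int responseBufferLimitBytes;

    SketchTransport(int responseBufferLimitBytes) {
        this.responseBufferLimitBytes = responseBufferLimitBytes;
    }
}

final class SketchTransportBuilder {
    // Same default as the limit reported in the exception message (100 MB).
    private int responseBufferLimitBytes = 100 * 1024 * 1024;

    // Hypothetical setter: the kind of method this issue asks for.
    SketchTransportBuilder setResponseBufferLimit(int bytes) {
        if (bytes <= 0) {
            throw new IllegalArgumentException("buffer limit must be positive");
        }
        this.responseBufferLimitBytes = bytes;
        return this;
    }

    SketchTransport build() {
        return new SketchTransport(responseBufferLimitBytes);
    }
}
```

With a setter like this, the limit would be chosen once at build time, instead of being patched into the options after the transport already exists.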

@reta
Collaborator

reta commented Dec 23, 2024

@milind-rao-efx would you mind submitting a pull request with the fix? Thank you.
