Question about large Spark applications being throttled #661

KiritoCurry · 2020-03-07T00:51:00Z

Hi, I'm fairly new to Dr. Elephant. While deploying and testing it on my machines, I found that large Spark application logs (larger than 100MB by default) are ignored and don't show up in the UI because of the throttle behavior. I may have missed something, but based on the code, does this mean Dr. Elephant skips every Spark application whose event log is larger than eventLogSizeLimitMb (100MB by default)? Is this how dataCollection.throttle() is expected to behave?

If my understanding is wrong, can someone explain how the throttle works for large Spark applications? If my understanding is right, is there a significant bottleneck in Dr. Elephant when processing large Spark applications? A Spark application's event log can easily grow to several GB, so how does Dr. Elephant plan to handle that? Thanks in advance for any help and suggestions!
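To make the behavior I'm describing concrete, here is a minimal Java sketch of the check as I understand it. It is not Dr. Elephant's actual code: the class name `EventLogThrottleSketch` and the method `shouldThrottle` are hypothetical, and only the `eventLogSizeLimitMb` setting and its 100MB default come from what I saw.

```java
// Hypothetical illustration only; not taken from the Dr. Elephant source.
public class EventLogThrottleSketch {
    // Default limit described above: 100 MB.
    static final long DEFAULT_EVENT_LOG_SIZE_LIMIT_MB = 100;

    /**
     * Returns true when the event log is larger than the configured limit,
     * i.e. when the application would be skipped under my reading of the code.
     */
    static boolean shouldThrottle(long eventLogSizeBytes, long eventLogSizeLimitMb) {
        long limitBytes = eventLogSizeLimitMb * 1024L * 1024L;
        return eventLogSizeBytes > limitBytes;
    }

    public static void main(String[] args) {
        long smallLog = 50L * 1024 * 1024;        // 50 MB event log
        long largeLog = 3L * 1024 * 1024 * 1024;  // 3 GB event log

        System.out.println(shouldThrottle(smallLog, DEFAULT_EVENT_LOG_SIZE_LIMIT_MB));
        System.out.println(shouldThrottle(largeLog, DEFAULT_EVENT_LOG_SIZE_LIMIT_MB));
    }
}
```

If this matches the real logic, then any application with a multi-GB event log would never be analyzed unless the limit is raised.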
