Skip to content
This repository has been archived by the owner on Dec 3, 2019. It is now read-only.

Don't fail on first error #138

Merged
merged 2 commits into from
Apr 19, 2018
Merged

Conversation

stevenkaras
Copy link
Contributor

When profiling a long lived multithreaded python process, sometimes the stack can't be read correctly.

This change simply handles such errors by logging the failure to stderr, and continues sampling.

Doesn't solve the underlying issue from #129, but it works around such failures (under the assumption it's ok to lose a handful of samples out of a few thousand)

@CLAassistant
Copy link

CLAassistant commented Jan 17, 2018

CLA assistant check
All committers have signed the CLA.

@eklitzke
Copy link
Contributor

Apologies for the delay on looking at this. In general I am kind of hesitant about these types of changes because it can lead to a situation where you just fail in a tight loop. Is this really just a rare failure, or does it keep happening?

@stevenkaras
Copy link
Contributor Author

I am able to reproduce the issue on a regular basis. It typically happens every few minutes when sampling over a long period of time. I agree that if the first cycle fails it probably indicates a real issue, so perhaps some logic to fail loudly for that case?

A better approach would be to include these as "failed" samples in the profiling data.

@eklitzke
Copy link
Contributor

In that case your suggestion of reporting them as failed makes sense, I would accept that. Should be pretty easy to do, see how idle is reported. I also updated Travis last night so the tests should pass if you get the latest changes in master.

@eklitzke
Copy link
Contributor

I was thinking about doing a small release (v1.6.4) to include #144, as I've gotten a number of complaints about the default sample rate setting. I will hold off on that for a bit so we can get this change in as well.

@stevenkaras
Copy link
Contributor Author

I set it up to log both to the summary as (failed), and when dumping with timestamps as (failed). I don't use chrome's cpu profiler, so I'd prefer to leave it to someone else to sort out that integration.

@eklitzke
Copy link
Contributor

Thanks for this, I will create a new release with this change.

@eklitzke eklitzke merged commit 6724797 into uber-archive:master Apr 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants