Editor constantly freezing in a large Rust codebase #19022

Open
1 task done
alexkirsz opened this issue Oct 10, 2024 · 21 comments
Assignees
Labels
bug [core label] panic / crash [core label] performance Feedback for performance issues, speed, memory usage, etc rust Rust programming language support

Comments

@alexkirsz

alexkirsz commented Oct 10, 2024

Check for existing issues

  • Completed

Describe the bug / provide steps to reproduce it

I'm trying out Zed again today after a few months of going back to VS Code because of usability issues with Zed.

We have a very large codebase consisting of some 360 Rust crates (+ about the same amount of Swift and Kotlin code). I'm seeing constant freezes in the editor, where the macOS beach ball appears, and I can't click anywhere or do anything while Zed finishes whatever it's doing.

These freezes usually last from 5s to a minute.

I just had one where Zed was hogging 100% CPU, and I had to kill it after 5 minutes of not responding.

image

Unfortunately, this, alongside the other reported issues, makes Zed completely unusable for us. These might be regressions, as I don't recall hitting nearly as many freezes the last time I attempted to switch to Zed.

Environment

Zed: v0.156.1 (Zed)
OS: macOS 14.6.1
Memory: 32 GiB
Architecture: aarch64

If applicable, add mockups / screenshots to help explain present your vision of the feature

No response

If applicable, attach your Zed.log file to this issue.

Zed.log

These were the last logs before Zed froze. I often see these so I don't think they're related.

2024-10-10T14:47:03.460827+02:00 [WARN] Generic lsp request to rust-analyzer failed: content modified
2024-10-10T14:47:03.462978+02:00 [ERROR] content modified
2024-10-10T14:47:03.563114+02:00 [INFO] Summarizing updated entries took 3.458µs
2024-10-10T14:47:03.676028+02:00 [ERROR] failed to fetch cached embeddings via cloud model

Caused by:
    RPC request GetCachedEmbeddings failed: permission denied
2024-10-10T14:47:05.709405+02:00 [WARN] Generic lsp request to rust-analyzer failed: content modified
2024-10-10T14:47:05.709505+02:00 [WARN] Generic lsp request to rust-analyzer failed: content modified
2024-10-10T14:47:05.7103+02:00 [ERROR] content modified
2024-10-10T14:47:05.974646+02:00 [INFO] Summarizing updated entries took 4.208µs
2024-10-10T14:47:06.12069+02:00 [ERROR] failed to fetch cached embeddings via cloud model

Caused by:
    RPC request GetCachedEmbeddings failed: permission denied
2024-10-10T14:47:11.829188+02:00 [ERROR] no worktree found for diagnostics path "/Users/alexandrekirszenberg/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.1/src/macros/select.rs"
2024-10-10T14:47:11.992336+02:00 [ERROR] no worktree found for diagnostics path "/Users/alexandrekirszenberg/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.1/src/macros/select.rs"
2024-10-10T14:47:11.993259+02:00 [ERROR] no worktree found for diagnostics path "/Users/alexandrekirszenberg/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.1/src/macros/select.rs"
@alexkirsz alexkirsz added admin read Pending admin review bug [core label] triage Maintainer needs to classify the issue labels Oct 10, 2024
@SomeoneToIgnore
Contributor

This report is not actionable without profile sample(s) or at least the repository in question + the sequence of actions to reproduce this.

@alexkirsz
Author

alexkirsz commented Oct 10, 2024

@SomeoneToIgnore I'd be happy to provide a profile sample! But I can't share the repository.

Is there a guide on how to best profile Zed on macOS?

@notpeter notpeter added performance Feedback for performance issues, speed, memory usage, etc panic / crash [core label] rust Rust programming language support and removed triage Maintainer needs to classify the issue admin read Pending admin review labels Oct 10, 2024
@SomeoneToIgnore
Contributor

Nice.

Then it's the only way, really: you need Instruments (it comes free with the macOS developer tools), and then you do the same things you'd do with any other profiler of that kind:

  • open Zed

  • open Instruments, pick a time profiler:
    image

  • Select Zed in a somewhat convoluted UI (shows process path on hover):
    image

  • Press record, switch over to Zed and start doing things that are slow.
    Ideally, there is some pattern to what you do, and some "before was good" -> "now it's bad" scenario in that pattern, but anything works, really.

  • After a few seconds of reproducing the issue (there's no point in recording a full minute of freezing, but one or two beachballs plus a few seconds around them seems fine), switch back to the profiler and hit stop

  • You'll see something like this (albeit mine is normal and very short, yours will be longer and much more red)

image

  • hit cmd-s and save the trace: that is what I need from you (one or several; I don't expect a miraculous fix, so I'll probably have to come back to you multiple times)

  • bonus points: check out the heaviest stack trace on the right and try to make sense of what exactly within Zed's own code takes the longest (note that multiple threads may have something stuck)

@alexkirsz
Author

Thanks for the guide, I'll go through this in a couple of hours.

Should I build Zed with debug symbols or are you good with the latest release?

@SomeoneToIgnore
Contributor

Good point. I expect the latest release to be fine, but you can also try building with --profile release-fast and see if the traces are more readable that way.

@gitsmol

gitsmol commented Oct 14, 2024

I'm seeing the same thing. It happened for me when I split my monorepo into a workspace with multiple crates. The project isn't huge, 15k loc in 110 files. I'm refactoring a lot of code and now whenever I break something (e.g. I change the function signature in a trait) and enter the diagnostics window, Zed hangs for a while.

I suspect this is related to #18658 and I also recall seeing a (fixed) bug in the issue tracker that relates to the order of operations when saving files and Zed/LSP polling for changed files?

@alexkirsz
Author

On my end, apologies for not getting back to this sooner. I've changed quite a few things in my Zed setup, and I'm no longer running into this issue. I will update this thread once I can reproduce the issue consistently.

@alexkirsz
Author

alexkirsz commented Oct 17, 2024

Hey @SomeoneToIgnore! I recorded two runs: one of them has small hangs, and the other has Zed completely frozen for a while (probably forever, but I killed it after 10 minutes).

The heaviest stack trace is <gpui[5720f5226d2db6bd]::app::entity_map::AnyWeakModel>::upgrade.

Here's the trace: https://www.icloud.com/iclouddrive/05bETfxUb_6jBA3BiGfGexrDw#Zed_Hangs

Let me know if you need me to upload it somewhere else. GitHub unfortunately won't accept it as it's 48MB.

EDIT: Recorded a third run with a forever hang: https://www.icloud.com/iclouddrive/027klYrmex898JZFKbbyrq1jw#Zed_Hangs_2

@gitsmol

gitsmol commented Oct 17, 2024

I'm seeing the same thing, the AnyWeakModel::upgrade call hangs Zed. Not indefinitely, but long enough for me to kill and restart most times. Here's another trace:
http://polyprax.nl/pub/zed_hangs_1_20241016_1826.trace.zip

@Goffen

Goffen commented Oct 21, 2024

We only have one crate and only a medium-sized repository, and I get hangs as well. rust-analyzer + Zed borking out

edit: I think for me this was related to disk also running out of space

@SomeoneToIgnore
Contributor

Thank you, there seems to be a collection of issues, I see at least two, but all somehow related to diagnostics.

Here's one from the long hangs. I won't cover it yet, as most of the comments here wanted to look at AnyWeakModel::upgrade (which also shows up in the 2nd profile)
second

So let's look at
first

The upgrade is a red herring: it is very visible on the trace, but bear in mind that you're looking at a sampled projection of what happened, so it lacks details about how many times things were called in the course of sampling.

upgrade turns a weak pointer into a strong pointer, so there's not much chance for it to become slow in one particular case:

pub fn upgrade(&self) -> Option<AnyModel> {
    let ref_counts = &self.entity_ref_counts.upgrade()?;
    let ref_counts = ref_counts.read();
    let ref_count = ref_counts.counts.get(self.entity_id)?;
    // entity_id is in dropped_entity_ids
    if ref_count.load(SeqCst) == 0 {
        return None;
    }
    ref_count.fetch_add(1, SeqCst);
    drop(ref_counts);
    Some(AnyModel {
        entity_id: self.entity_id,
        entity_type: self.entity_type,
        entity_map: self.entity_ref_counts.clone(),
        #[cfg(any(test, feature = "test-support"))]
        handle_id: self
            .entity_ref_counts
            .upgrade()
            .unwrap()
            .write()
            .leak_detector
            .handle_created(self.entity_id),
    })
}
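
To illustrate why a weak-to-strong upgrade is unlikely to be slow on its own, here is a minimal sketch that uses the standard library's Arc/Weak instead of gpui's AnyWeakModel (an assumption made purely for illustration; count_live_upgrades is a hypothetical helper, not Zed code). Each upgrade is little more than an atomic check-and-increment, so the total time is governed by how many times the surrounding loop runs.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical helper: upgrades `weak` `iterations` times and returns how
/// many upgrades succeeded. It stands in for a hot loop around a cheap call.
fn count_live_upgrades(weak: &Weak<String>, iterations: usize) -> usize {
    let mut live = 0;
    for _ in 0..iterations {
        // Each upgrade is an atomic check-and-increment. The temporary Arc
        // is dropped immediately, which decrements the count again.
        if weak.upgrade().is_some() {
            live += 1;
        }
    }
    live
}
```

With a live Arc every upgrade succeeds and the strong count is unchanged afterwards; once the Arc is dropped, every upgrade returns None. In a sampling profiler, a loop like this shows the upgrade call as the hottest frame even though no single call is slow.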

Such "fundamental" calls being "slow" on the profile usually indicates a hot loop around them.
On the same screenshot, above the upgrade call in the hottest trace, there's a ProjectDiagnosticsEditor::new_with_context, and looking at it, there's only one place that's able to produce such stack traces:

_update_excerpts_task: cx.spawn(move |this, mut cx| async move {
    while let Some((path, language_server_id)) = update_excerpts_rx.next().await {
        if let Some(buffer) = project_handle
            .update(&mut cx, |project, cx| project.open_buffer(path.clone(), cx))?
            .await
            .log_err()


I think this is a good place to inspect, and the profile cannot take us any further, so we need to compile Zed and debug it (or use dbg!) during the slow part.
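
As a rough sketch of the kind of instrumentation this could involve, the snippet below counts calls and times each one, so the numbers can be printed with dbg! while reproducing the hang. CallProbe and measure are hypothetical names invented for this example. Nothing here is taken from Zed's code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::{Duration, Instant};

/// Hypothetical probe that counts how often a suspected call runs.
#[derive(Default)]
struct CallProbe {
    calls: AtomicUsize,
}

impl CallProbe {
    /// Runs `work`, bumps the call counter, and returns the result together
    /// with the running call count and the duration of this call.
    fn measure<T>(&self, work: impl FnOnce() -> T) -> (T, usize, Duration) {
        let started = Instant::now();
        let result = work();
        let calls = self.calls.fetch_add(1, Ordering::Relaxed) + 1;
        (result, calls, started.elapsed())
    }
}
```

Wrapping the body of the suspected loop in probe.measure(...) and passing the returned count and duration to dbg! would show whether the loop runs thousands of times or whether a single iteration is slow.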

  • One way to investigate is to look at pub fn get_by_path and see how many buffers there are at the moment of opening.
    I do not expect it to be a problem but it's worth a check: overall, looking up a buffer by path by iterating over all open buffers sounds somewhat dangerous if we somehow "leak" them and keep too many items in that collection.
    Do you have many files open when the problem starts to appear? Or do things start to break only after some time? Then it's definitely worth checking.

  • The main issue still seems to be the diagnostics: I expect some issues with them catching up during ssh remoting, so will try to investigate things my way.
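
The concern about looking up a buffer by path can be sketched as follows. This is an illustration under assumptions, not Zed's implementation: Buffer, get_by_path_linear and index_by_path are hypothetical stand-ins. The linear version does work proportional to the number of buffers held, so a leaked and ever-growing collection makes every lookup slower, while an index keyed by path avoids the scan.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for an open buffer; not Zed's actual type.
struct Buffer {
    id: u64,
    path: PathBuf,
}

/// Linear scan: the cost grows with the number of buffers in the slice.
fn get_by_path_linear<'a>(buffers: &'a [Buffer], path: &Path) -> Option<&'a Buffer> {
    buffers.iter().find(|buffer| buffer.path.as_path() == path)
}

/// Builds a path -> position index so that later lookups avoid the scan.
fn index_by_path(buffers: &[Buffer]) -> HashMap<&Path, usize> {
    buffers
        .iter()
        .enumerate()
        .map(|(position, buffer)| (buffer.path.as_path(), position))
        .collect()
}
```

Logging buffers.len() at the moment of opening, as suggested above, would show whether the collection keeps growing over a session.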

But that means, when things get slow you have the diagnostics panel open? Are things better without it being open?
Do you receive many diagnostics for some files?
Or, if you do not have any such panel open, we need to look there.

Assuming that the diagnostics panel is open, we need to check how update_excerpts_rx.next()'s counterpart, update_paths_tx, is used: maybe it's being spammed with diagnostics data somehow.
One of the usages is enqueue_update_stale_excerpts, but that does not seem too hot, as it is called on a ! button toggle and during the initial construction.
But enqueue_update_stale_excerpts is used everywhere, so I would check whether it's called excessively somewhere.
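
One way to check for that kind of spam, and to see what coalescing would buy, is sketched below. It uses std::sync::mpsc instead of the async channel in the snippet above, and drain_coalesced is a hypothetical helper, so treat it as an illustration of the idea and not as the change that should go into Zed.

```rust
use std::collections::BTreeSet;
use std::path::PathBuf;
use std::sync::mpsc::Receiver;

/// Hypothetical helper: drains everything currently queued on `rx` and
/// returns the raw message count plus the distinct (path, server id) pairs.
fn drain_coalesced(rx: &Receiver<(PathBuf, u64)>) -> (usize, BTreeSet<(PathBuf, u64)>) {
    let mut received = 0;
    let mut unique = BTreeSet::new();
    // try_iter yields the pending messages without blocking.
    for update in rx.try_iter() {
        received += 1;
        unique.insert(update);
    }
    (received, unique)
}
```

If the raw count is far larger than the number of distinct pairs, the same files are being queued over and over, and handling each distinct pair once per batch would cut the work accordingly.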

@alexkirsz
Author

Yes, I think this is linked to the diagnostics panel in most cases for me. It's not the only diagnostics-panel-related issue I'm seeing; it may be related to #19019.

I have between 100 and 600 diagnostic issues active at all times, depending on what part of my project I'm currently looking at.
image

@SomeoneToIgnore
Contributor

Even when dragging the scrollbar with my mouse, I keep getting sent back to the top

And that sounds like enqueue_update_stale_excerpts in the works, spamming us with the diagnostics updates.
So, now we need to track down that spam source and maybe many diagnostics-related issues are solved.

@alexkirsz
Author

How can I help? I don't expect I'll have time to dive into Zed internals to figure out what's going wrong on my end, but I'd be happy to run a test build and try to repro the issue.

@SomeoneToIgnore
Contributor

Unfortunately, not sure there's another way except debugging around that method or sharing a project so someone else can debug around.

@plichard

plichard commented Oct 24, 2024

Just in case it helps, here's another trace recorded on Linux during what looks like the exact same freeze, while working on a relatively small Rust project. I believe I just pressed I to go into insert mode with vim mode.

My project only had 5 warnings at the time, with the diagnostics panel open in another hidden tab. Cannot share the project unfortunately.

I have v0.159.0 installed.

https://share.firefox.dev/4hk75qV

@jorikvanveen

jorikvanveen commented Oct 24, 2024

I have (unfortunately) been able to reproduce this issue consistently in this project: https://github.com/jorikvanveen/web-meteen with Zed 0.157.5 on NixOS (full system config can be found on my github page).

It takes about 5 minutes of jumping around and editing random stuff before problems start to occur.

No clue if it's related, but I am using vim keybinds as well.

I'm on a decently modern Intel laptop with 32GB of RAM and plenty of free storage; let me know if more details are needed.

@plichard

I have been using Zed the whole day and I did not have a single freeze since disabling vim mode. It could be a coincidence, but usually I would get one freeze per hour at the very least.

@gitsmol

gitsmol commented Oct 26, 2024

@plichard I'm not using vim mode, so it's probably not the sole cause of the freezes.

@SomeoneToIgnore
Contributor

Heads-up, there's #18658 which seems to be very close to this issue.
That one is being checked by @osiewicz, so hopefully some improvements will appear in the future.

ConradIrwin added a commit that referenced this issue Nov 4, 2024
Related to: #19022

Release Notes:

- Improve editor performance with large # of diagnostics.

---------

Co-authored-by: Conrad <[email protected]>
Co-authored-by: Conrad Irwin <[email protected]>
@osiewicz
Contributor

osiewicz commented Nov 7, 2024

A fix for this issue is available in the latest Preview (0.161.0) build. Check it out and see if it fixes the issue for you.


9 participants