Elfeed is too slow when there are a lot of RSS feeds #317
Fetching 30 to 50 feeds should only disrupt Emacs for a couple of
seconds, so this isn't normal.
The very first thing to check is that Elfeed is using curl to fetch
feeds. The variable elfeed-use-curl indicates whether curl will be used,
and it is automatically set to t if curl was found when Elfeed was
loaded. If it's nil, you should install curl and ensure it's on your
PATH (a quick check is sketched after the list below). This has two
major benefits:
* url-retrieve isn't nearly as asynchronous as advertised, especially
on certain platforms, but curl is always fully asynchronous. The
synchronous parts of url-retrieve will block Emacs, but they won't make
it use 100% CPU.
* url-queue-retrieve doesn't allow for custom headers, which means
Elfeed can't take advantage of ETag or If-Modified-Since. These features
prevent Elfeed from reparsing feeds that haven't changed. (It's also
much nicer to servers.) Elfeed spends most of its time parsing XML, and
XML parsing is not done asynchronously, so this is important. If you're
not using curl, Emacs will spend more time at 100% CPU when fetching.
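As a quick sanity check, you can inspect and force the curl backend from your init file. This is a minimal sketch; the absolute path below is only a hypothetical example for a curl binary that isn't on the default PATH:
;; Inspect whether Elfeed detected curl at load time; t means curl is used.
;; Evaluate with M-: elfeed-use-curl RET
;; If curl is installed but was not detected, enable it explicitly.
(setq elfeed-use-curl t
      elfeed-curl-program-name "/usr/local/bin/curl") ; hypothetical path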
Also make sure Elfeed has been byte-compiled. This should happen
automatically if you installed it via package.el.
Since you're seeing such excessive memory use, it sounds like one of the
feeds you're fetching may be humongous. A casual experiment suggests
that, on 64-bit computers, the s-expression representation of an RSS
feed is about four times larger than the XML content. This is due to all
the pointer overhead (lots of cons cells for each element). But if
you're seeing 4GB of memory use, then that puts the feed at about 800MB
(buffer + s-exp = 4GB), which seems unlikely. Even more so for the 8GB
case.
Take a look at each of the feeds in your list and see if any of them are
particularly large. Also have Elfeed fetch them one at a time to see
which one causes the problem. If you can narrow it down to a particular
feed, I'd like to know about it.
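To narrow it down, you can update one feed at a time; something like this should work (the URL is just a placeholder):
;; Update a single feed so you can watch CPU and memory for each in turn.
;; elfeed-update-feed takes the feed URL as it appears in elfeed-feeds.
(elfeed-update-feed "https://example.com/feed.xml")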
|
@skeeto, I have checked the elfeed-use-curl variable and it returns t, and Elfeed is already byte-compiled, as I can see the .elc files in my Elfeed package directory. Last night I tried again on another, more powerful computer and set gc-cons-threshold to 8 GB since I have 16 GB of RAM there, and yes, it used all 8 GB with 100% CPU again.
I will try them. I wonder if the Reddit feeds are the cause, since I have many of them. |
@skeeto, I have set my feed list to empty as below:
and then called elfeed-update to check the CPU usage, but it still took 89% CPU. I suspect this might be because I have too many entries listed (when calling elfeed): there are 6004 of them, ranging from 2019-05-02 back to 2018-12-16. |
When you set elfeed-feeds to an empty list, did you evaluate that
expression before using elfeed-search-fetch (G) or elfeed-update? That
should be a no-op when elfeed-feeds is empty.
The other thing to check is your search filter. The search buffer
listing is updated in full every time a feed completes. If your filter
is blank so that every entry in the database is listed, that will waste
a lot of time recomputing the listing again and again as feeds complete.
Make sure you have a time cutoff (e.g. @1-week-ago), and the more
recent the cutoff, the better.
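For example, the default filter can be given a time cutoff in your init file. A minimal sketch, assuming you want roughly a week of unread entries:
;; Limit the search listing to recent unread entries so each redraw stays cheap.
(setq elfeed-search-filter "@1-week-ago +unread")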
|
I tried deleting the ~/.elfeed folder for testing purposes. After deleting it, Elfeed was empty, so I called elfeed-update to fetch my feeds. Before that, I had already set the filter to show only the last 3 months. During that first elfeed-update everything seemed smooth, but after about 1000 entries had been fetched, things started to slow down. It took a while to reach 2000 entries, and it stayed stuck there for a long time with no more entries added beyond that. I wonder whether it is still fetching more data but only showing the last 3 months of entries. If you don't mind, here is my configuration:
|
Everything seems to work fine when I use this configuration directly,
but I can see how it would slow down over time. Several of these feeds
are very, very busy with several new entries per minute. After a few
days of regular pulling, you're going to end up with _tons_ of entries.
Elfeed can handle this, but only if you don't always ask to see so much
of it all at once. Your broad default filter is constantly refilling and
redrawing the search listing, which puts a huge load on Emacs. That's
the real issue here. Change the default filter to something like
"@3-days-ago +unread" to significantly constrain what's being shown.
FYI, I tested your configuration by putting it (minus the use-package
part) in a file named tmp.el in the repository, then:
$ make clean
$ HOME=. make virtual ARGS='-l tmp.el'
That provides a clean, empty, isolated, temporary test environment. The
special "virtual" target in the Makefile imports a copy of your real
database by default, so the HOME=. part stops this from happening.
|
Sorry for the late reply, I have been very busy this month.
What does this mean? How does it really work?
And especially this last part: how does it work, and what should I do with it?
|
I was just describing how I was testing in isolation so you, or anyone
else following along, could reproduce my tests if needed. You can just
ignore that if it doesn't make sense.
|
I have tried reducing the filter to "@15-days-ago +unread" and it shows around a thousand entries. At this number it seems to work well. |
@testinggithub1222 are you using Flycheck? If so, #448 could fix your problem. |
I have been using Elfeed for a while now and have built up a list of 30 to 50 RSS feeds. Whenever I call elfeed-update to update everything, it takes quite a long time and hangs Emacs the whole time. It might be 20 minutes, I am not sure. It consumes 100% CPU and 54% of my RAM, which is about 4 GB of my 8 GB, and garbage collection runs constantly. So my question is: is there a way to make this run faster without it interrupting or hanging Emacs? Does parsing the XML really consume that much CPU and take that long? Isn't there an async process?