RSS format issue #28

Open
H4lo opened this issue Aug 5, 2024 · 0 comments
H4lo commented Aug 5, 2024

In yarb.py, when parseThread parses the RSS XML content, some feeds put the updated_parsed field in the feed block rather than in entries, which raises an error:


'entries': [
]

...


'feed': {
        'title': 'Talkback Tech',
        'title_detail': {'type': 'text/plain', 'language': None, 'base': '', 'value': 'Talkback Tech'},
        'links': [
            {'rel': 'alternate', 'type': 'text/html', 'href': 'https://talkback.sh/tech/feed/'},
            {'href': 'https://talkback.sh/tech/feed/', 'rel': 'self', 'type': 'application/atom+xml'}
        ],
        'link': 'https://talkback.sh/tech/feed/',
        'subtitle': 'Latest technical resources on Talkback',
        'subtitle_detail': {'type': 'text/html', 'language': None, 'base': '', 'value': 'Latest technical resources on Talkback'},
        'language': 'en-us',
        'updated': 'Mon, 05 Aug 2024 03:08:08 +0000',
        'updated_parsed': time.struct_time(tm_year=2024, tm_mon=8, tm_mday=5, tm_hour=3, tm_min=8, tm_sec=8, tm_wday=0, tm_yday=218, tm_isdst=0)
    },
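A minimal sketch of how a missing date can surface as an error, assuming the failure comes from indexing a `None` value. The entry below is a hypothetical plain dict standing in for a parsed feed entry; it is not taken from the Talkback feed.

```python
import datetime

# Hypothetical entry: it has a title and a link but no date fields at all.
entry = {"title": "Example post", "link": "https://example.com/post"}

# Same lookup as in parseThread: both keys are absent, so d ends up as None.
d = entry.get("published_parsed") or entry.get("updated_parsed")

error = None
try:
    # Indexing None is what fails here, before any date comparison happens.
    pubday = datetime.date(d[0], d[1], d[2])
except TypeError as exc:
    error = exc

print(d)
print(type(error).__name__)
```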

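The fix adds a check on the variable d: when an entry carries no date, d is taken from the feed block instead.
In addition, some RSS feeds only contain links published on the current day, so links published today and yesterday are both collected to avoid missing the current day's content: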

...
        for entry in r.entries:
            d = entry.get('published_parsed') or entry.get('updated_parsed')

+           if not d:
+               d = r.feed.updated_parsed
            yesterday = datetime.date.today()  # + datetime.timedelta(-1)
            pubday = datetime.date(d[0], d[1], d[2])
-           if (pubday == yesterday) and filter(entry.title):
+           if (pubday == yesterday or datetime.date.today() + datetime.timedelta(-1) == pubday) and filter(entry.title):
                item = {entry.title: entry.link}
                # print(item)
                result |= item

...
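The same idea can be sketched as two standalone helpers, which may be easier to test than the inline version. This is only an illustration under assumptions: `entry_pubday` and `is_recent` are hypothetical names that do not exist in yarb.py, and plain dicts stand in for the dict-like objects the feed parser returns.

```python
import datetime
import time


def entry_pubday(entry, feed):
    """Return the entry's date, falling back to the feed-level date, else None."""
    d = (
        entry.get("published_parsed")
        or entry.get("updated_parsed")
        or feed.get("updated_parsed")
    )
    if d is None:
        return None
    # The first three fields of a struct_time are year, month, day.
    return datetime.date(d[0], d[1], d[2])


def is_recent(pubday, today=None):
    """True when pubday is today or yesterday; False for a missing date."""
    if pubday is None:
        return False
    today = today or datetime.date.today()
    return pubday in (today, today - datetime.timedelta(days=1))


# Hypothetical data: the date exists only at feed level, as in the dump above.
feed = {"updated_parsed": time.strptime("2024-08-05 03:08:08", "%Y-%m-%d %H:%M:%S")}
entry = {"title": "Example post", "link": "https://example.com/post"}

pubday = entry_pubday(entry, feed)
print(pubday)  # 2024-08-05
print(is_recent(pubday, today=datetime.date(2024, 8, 6)))  # True (yesterday)
print(is_recent(pubday, today=datetime.date(2024, 8, 7)))  # False (two days old)
```

Using `feed.get(...)` rather than attribute access means a feed with no date anywhere yields `None` and the entry is skipped instead of raising.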