link rel= detection misses some links #22

Open
tmcw opened this issue Jul 30, 2024 · 1 comment
Labels
Module: Parsers (Extracting information from raw data), Type: Bug

Comments


tmcw commented Jul 30, 2024

I think this is the code that's looking for a rel=me twitter account:

const match2 = html.match(

My website does have such a link tag, but it's all HTML5'd out: it uses unquoted attribute values and no self-closing slash, so this doesn't get caught:

<link href=https://mastodon.social/@tmcw rel=me>

This is probably one of the trickier variations to catch, but ideally shovel's parser would also be okay with a different attribute order.
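
To make the failure mode concrete, here is a minimal, dependency-free sketch in TypeScript. The regex in the issue is truncated, so `strictPattern` below is a hypothetical stand-in for a pattern that expects quoted values in a fixed order, not shovel's actual code. `parseAttributes` and `findRelMeLinks` are likewise illustrative names. The tolerant version still uses regexes, so it will break on a `>` inside a quoted attribute value, among other edge cases.

```typescript
// Hypothetical strict pattern (NOT shovel's real regex, which is truncated in the issue):
// it assumes rel comes before href and that both values are double-quoted.
const strictPattern = /<link\s+rel="me"\s+href="([^"]+)"\s*\/?>/i;

// Read the attributes of one start tag, accepting double-quoted,
// single-quoted, and unquoted values in any order.
function parseAttributes(tag: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  // Drop the leading "<name" and the closing ">".
  const body = tag.replace(/^<[a-zA-Z][^\s>]*/, "").replace(/>$/, "");
  const attrPattern = /([^\s=\/]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s]+)))?/g;
  let m: RegExpExecArray | null;
  while ((m = attrPattern.exec(body)) !== null) {
    const name = m[1].toLowerCase();
    const value = m[2] ?? m[3] ?? m[4] ?? "";
    // Keep the first occurrence of a repeated attribute.
    if (!Object.prototype.hasOwnProperty.call(attrs, name)) {
      attrs[name] = value;
    }
  }
  return attrs;
}

// Collect the href of every <link> whose rel token list contains "me".
function findRelMeLinks(html: string): string[] {
  const hrefs: string[] = [];
  const tagPattern = /<link\b[^>]*>/gi;
  let m: RegExpExecArray | null;
  while ((m = tagPattern.exec(html)) !== null) {
    const attrs = parseAttributes(m[0]);
    const relTokens = (attrs["rel"] ?? "").toLowerCase().split(/\s+/);
    const href = attrs["href"];
    if (relTokens.indexOf("me") !== -1 && href) {
      hrefs.push(href);
    }
  }
  return hrefs;
}

// The tag from this issue: unquoted values, href before rel, no closing slash.
const sample = "<link href=https://mastodon.social/@tmcw rel=me>";
console.log(strictPattern.test(sample));
console.log(findRelMeLinks(sample));
```

The strict pattern does not match the sample tag, while the attribute-by-attribute version picks up the `href` regardless of quoting or order.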

jmduke added the Type: Bug and Module: Parsers (Extracting information from raw data) labels Jul 30, 2024
jmduke (Member) commented Jul 30, 2024

Yup, we should absolutely use node-html-parser (which we already have as a dependency) for this. Thanks for the flag!
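
A rough sketch of what that could look like, assuming the `parse`, `querySelectorAll`, and `getAttribute` calls from node-html-parser's documented API. The helper name `findRelMeHrefs` is made up for illustration and is not taken from shovel's codebase. Treat it as a starting point rather than the eventual fix.

```typescript
import { parse } from "node-html-parser";

// Hypothetical helper: return the href of every <link> whose rel
// attribute contains the token "me", whatever the quoting or order.
function findRelMeHrefs(html: string): string[] {
  const root = parse(html);
  const hrefs: string[] = [];
  for (const link of root.querySelectorAll("link")) {
    // rel is a space-separated token list, e.g. rel="me noopener".
    const relTokens = (link.getAttribute("rel") ?? "").toLowerCase().split(/\s+/);
    const href = link.getAttribute("href");
    if (relTokens.indexOf("me") !== -1 && href) {
      hrefs.push(href);
    }
  }
  return hrefs;
}

// The tag reported in this issue, wrapped in a minimal document.
const page = "<html><head><link href=https://mastodon.social/@tmcw rel=me></head></html>";
console.log(findRelMeHrefs(page));
```

Filtering on the parsed `rel` tokens rather than a selector string keeps the check independent of attribute order and quoting, which is the variation the issue describes.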
