Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doesn't handle cookies where the Domain starts with a dot. #36

Open
TonyDev314 opened this issue Feb 28, 2024 · 7 comments
Open

Doesn't handle cookies where the Domain starts with a dot. #36

TonyDev314 opened this issue Feb 28, 2024 · 7 comments
Labels

Comments

@TonyDev314
Copy link

The AdvancedCookieProvider does not cater for cookies lacking the "www" on the front.

The website I am working with has cookies with the "www.websitename.com" domain and cookies with the ".websitename.com" domain.

On adding the cookies the code currently strips off the "." and when selecting cookies to send on the next request, ignores all these cookies because they don't match the host name.

Should the code be altering the Domain name of cookies??

I think it needs to leave the Domain alone when set, and do a comparison on the basis of the host and the cookie domain excluding a leading "www." or "."

@TonyDev314 TonyDev314 added the bug label Feb 28, 2024
@FlorianRappl
Copy link
Contributor

OK we need a spec to back this up. Can you point to a spec where this is written?

excluding a leading "www." or "."

I can already guarantee you that "www." should not be excluded. It's a subdomain like any other. Regarding the "." I'd like to see a spec that defines such edge cases and how they should be handled. I think this might be a case for some normalization, but I need some spec confirmation.

@TonyDev314
Copy link
Author

TonyDev314 commented Feb 29, 2024 via email

@FlorianRappl
Copy link
Contributor

Perhaps the change needs to be in the matching algorithm and instead of
looking for an absolute match, checking to see if the cookie domain forms
part of the URL domain?

I'd not change the matching algorithm - I'd rather normalize what we see in the domain field. From the RFC it seems that the leading dot is the change necessary, otherwise all algorithms can continue to work as-is. So if we normalize this input from the domain field (only chance a leading dot appears) then we should be fine.

@TonyDev314
Copy link
Author

TonyDev314 commented Feb 29, 2024 via email

@FlorianRappl
Copy link
Contributor

FlorianRappl commented Feb 29, 2024

Surely you have to change the matching algorithm though. At present the match must be exact between the URL domain and the cookie domain. So whether or not a leading dot is included, the cookie is excluded.

No - as I wrote; if we normalize the input then the matching algorithm does not need to be changed. There will never be a dot in front of a domain.

For me a change from cookie.Domain.Is(domain) to domain.EndsWith(cookie.Domain, StringComparison.OrdinalIgnoreCase) in
both FindCookie and FindCookies makes the behaviour as expected.

This will not work in general. Yes, of course it works in your case - but then you could also just "return true". Because if domain is "www.foobar.com" it would also match a cookie.Domain of "bar.com", but it should only match "foobar.com".

@TonyDev314
Copy link
Author

It works if you include the "." in the cookie retained.
That requires further change

Need to store a cookie ".mywebsite.com" such that it matches requests to www.mywebsite.com AND www.foo.mywebsite.com etc.
But as you point out, it should NOT match www.thisismywebsite.com.

There's no way around that without altering the matching algorithm.

@FlorianRappl
Copy link
Contributor

There's no way around that without altering the matching algorithm.

I agree that the matching algorithm in general needs to be changed, but not due to the dot. The dot has to be taken care of in an input normalization. We'll remove the leading dot and handle all those cases in the matching algorithm.

If you'd retain the dot you'd always need to take care of it in the matching algorithm, which does not make much sense. You can already remove it beforehand and drop that case immediately. Matching will be done more often than setting / writing - so normalizing the input makes sense also from a performance POV. It's just unnecessary work to retain the dot.

As the spec said - "foo.com" and ".foo.com" should behave the same. So let's normalize ".foo.com" to be "foo.com".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants