Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/publicsuffix: table.go has been stale for a year #56656

Closed
jub0bs opened this issue Nov 8, 2022 · 11 comments
Closed

x/net/publicsuffix: table.go has been stale for a year #56656

jub0bs opened this issue Nov 8, 2022 · 11 comments
Assignees
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@jub0bs
Copy link

jub0bs commented Nov 8, 2022

What version of Go are you using (go version)?

$ go version
go version go1.19.3 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What did you do?

package main

import (
	"fmt"

	"golang.org/x/net/publicsuffix"
)

func main() {
	fmt.Println(publicsuffix.EffectiveTLDPlusOne("foo.aivencloud.com"))
}

(playground)

What did you expect to see?

foo.aivencloud.com <nil>

aivencloud.com was indeed added to the Public Suffix List (PSL) as far back as February 24th 2022, i.e. much earlier than Go 1.19.3's release on November 1st 2022.

What did you see instead?

aivencloud.com <nil>

This undesired output can be explained be explained by the fact that file table.go hasn't been regenerated with an up-to-date version of the PSL since November 2021, which predates the addition of aivencloud.com to the PSL.

Why this matters

Of course, there's nothing specific about PSL entry aivencloud.com. I just happened to play a small part in its addition. But it's an example of a PSL entry missing from the current version of golang.org/x/net/publicsuffix.

The PSL is the foundation on which the modern concept of site is based. Go developers (I included) may depend on x/net/publicsuffix in order to decide whether to establish a trust relationship between different Web origins. However, if x/net/publicsuffix doesn't stay abreast of changes to the PSL (give or take a few weeks), it becomes somewhat undependable, as relying on it to make sensitive decisions may introduce security holes.

Is there a good reason why table.go does not get regenerated at each minor release (or even, ideally, at each patch release) of Go? I'd be more than willing to help to make this happen.

@gopherbot gopherbot added this to the Unreleased milestone Nov 8, 2022
@ianlancetaylor
Copy link
Contributor

CC @nigeltao @golang/release

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 8, 2022
@heschi
Copy link
Contributor

heschi commented Nov 8, 2022

I don't see any reason to tie this to Go releases. Other than that it seems ripe for automation. It should be a pretty straightforward extension to stuff we already have in relui, notably the recently-added support for regenerating the goimports stdlib. The owners (Ian, @neild) will need to be okay to receive CLs periodically.

This might be doable for a motivated contributor, though they won't have access to test it fully. It should be very doable for random members of the Go team. I don't know if the release team will get to it any time soon but we can discuss.

@jub0bs
Copy link
Author

jub0bs commented Nov 8, 2022

@heschi Thanks! That sounds very promising.

@neild neild self-assigned this Nov 8, 2022
@nigeltao
Copy link
Contributor

nigeltao commented Nov 9, 2022

I agree that there's no need to synchronize with Go releases.

Other than that it seems ripe for automation

It is automatable but there is a trade-off. The more frequently you update (re-generate) the table, the bigger (more disk) the git repository will weigh. A possible mitigation is in #15518 (comment)

One idea (not necessarily a good one, but I don't want to forget it) to cap git repo size concerns is to keep x/net/publicsuffix as the package name that callers should import, for back compat, but have its implementation delegate to x/publicsuffixvolume000 (yes, there's probably a better name), which is updated more frequently. Whenever that gets too big (for whatever the value of "big" is), roll it over to volume001, then volume002, etc. Over time, programmers who go get golang.org/x/net/publicsuffix might end up also cloning a moderately large git repo, but they shouldn't need to clone very large ones, nor should x/net grow too large with old PSL data, and they might be able to garbage collect no longer needed 'volumes'.

See also issue #38169 "x/net/publicsuffix: Release as a separate module".

As for the OP, it should also be fairly straightforward to fork x/net/publicsuffix and run "go generate" within it at whatever cadence you want.

@nigeltao
Copy link
Contributor

nigeltao commented Nov 9, 2022

@neild, if you're mucking about in x/net/publicsuffix, where we currently generate:

var nodes = [...]uint8{
    0x00, 0x00, 0x53, 0x0b, 0x03,
    0x00, 0x00, 0x5b, 0x6e, 0x44,
    // ...9000 more lines...
    0x00, 0x00, 0x40, 0x65, 0x43,
    0x00, 0x00, 0x41, 0xf1, 0x43,
}

// etc

var children = [...]uint32{
    // etc,
}

It might be better (in terms of git repo weight) to generate a binary data file and use the relatively new (Go 1.16) //go:embed directive.

@heschi
Copy link
Contributor

heschi commented Nov 9, 2022

Does proxy.golang.org change the tradeoffs here? Many fewer people clone the repository than in 2018.

@nigeltao
Copy link
Contributor

Can you elaborate on how this helps? I'm not very familiar with proxy.golang.org services. Even if people download a mirror of a large repo, it's still a large download, right?

@seankhliao
Copy link
Member

seankhliao commented Nov 10, 2022

People (via go) download singular versions from the proxy, containing only a snapshot of contents at that revision. This does not include git history.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/450835 mentions this issue: publicsuffix: update table to latest list from publicsuffix.org

@neild
Copy link
Contributor

neild commented Nov 15, 2022

I'm going to update the public suffix list manually this time, will consider automating the process in existing issue #15518.

It might be better (in terms of git repo weight) to generate a binary data file and use the relatively new (Go 1.16) //go:embed directive.

This appears to knock about 70% off the size of the git repo, FWIW. https://go.dev/cl/450935

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 17, 2022
@jub0bs
Copy link
Author

jub0bs commented Nov 17, 2022

@neild Thanks! That's a very good start. I know it will be useful to many people.

@golang golang locked and limited conversation to collaborators Nov 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants