Demonstrate internal job row type using struct conversion #37

brandur · 2023-11-18T20:36:09Z

I wanted to throw this one up as a possible alternative to a public
facing rivertype module in #36, in which we demonstrate an alternate
strategy using struct conversions. The problems it aims to solve are the
same:

A way to access core job types from within subpackages, allowing more
modularity (e.g. Rescuer could live with other maintenance services).
Long term, a way of giving drivers access to job primitives without
having to reference the top-level river.

The advantage of this approach is there's fewer user-facing changes, and
the user never has to interact with an alternate package.

It works by doing a struct conversion as necessary when moving from an
internal package to one that bubbles up to river. Go allows pure
struct conversions as long as their fields are the same, but there are
some limitations:

Substructs like AttemptError can't be converted, even if identical.
Custom types like JobState can't be converted, even if identical.

We work around these issues by:

Keeping errors as a [][]byte instead of it being unmarshaled by pgx.
A helper called ErrorsUnmarshaled() is added to the top-level
JobRow to give easy access to unmarshaled errors.
State is changed to a basic string akin to functions in http. This
is technically a little less type-safe, but practically speaking would
make very little difference because all these string-based types can
be converted to and from each other with no checks anyway. Also, users
aren't expected to access job state all that often. When they do, they
can still use job.State == river.JobStateAvailable because the
constants have been made available as strings.

I think this approach does also work reasonably well, although I worry a
little bit that there may be some sharp edge in the future that ends up
limiting us in some unexpected way. Also, having to use helpers like
ErrorsUnmarshaled() is definitely a little bit worse. That said, ~no
user-facing API changes, and has some marginal performance benefits:

Converting structs between each other should be ~free.
Errors probably aren't used that much, so by not unmarshaling them
most of the time, we save on a lot of unnecessary JSON parsing.

Unlike #36, I didn't push the rescuer down (just in case this is a total
dead end), but this approach should make that possible.

I wanted to throw this one up as a possible alternative to a public facing `rivertype` module in #36, in which we demonstrate an alternate strategy using struct conversions. The problems it aims to solve are the same: * A way to access core job types from within subpackages, allowing more modularity (e.g. `Rescuer` could live with other maintenance services). * Long term, a way of giving drivers access to job primitives without having to reference the top-level `river`. The advantage of this approach is there's fewer user-facing changes, and the user never has to interact with an alternate package. It works by doing a struct conversion as necessary when moving from an internal package to one that bubbles up to `river`. Go allows pure struct conversions as long as their fields are the same, but there are some limitations: * Substructs like `AttemptError` can't be converted, even if identical. * Custom types like `JobState` can't be converted, even if identical. We work around these issues by: * Keeping errors as a `[][]byte` instead of it being unmarshaled by pgx. A helper called `ErrorsUnmarshaled()` is added to the top-level `JobRow` to give easy access to unmarshaled errors. * `State` is changed to a basic string akin to functions in `http`. This is technically a little less type-safe, but practically speaking would make very little difference because all these string-based types can be converted to and from each other with no checks anyway. Also, users aren't expected to access job state all that often. When they do, they can still use `job.State == river.JobStateAvailable` because the constants have been made available as strings. I think this approach does also work reasonably well, although I worry a little bit that there may be some sharp edge in the future that ends up limiting us in some unexpected way. Also, having to use helpers like `ErrorsUnmarshaled()` is definitely a little bit worse. That said, ~no user-facing API changes, and has some marginal performance benefits: * Converting structs between each other should be ~free. * Errors probably aren't used that much, so by not unmarshaling them most of the time, we save on a lot of unnecessary JSON parsing. Unlike #36, I didn't push the rescuer down (just in case this is a total dead end), but this approach should make that possible.

brandur · 2023-11-18T20:39:10Z

client.go

@@ -809,7 +809,7 @@ func (c *Client[TTx]) distributeJobCompleterCallback(update jobcompleter.Complet
 		c.statsNumJobs++
 	}()

-	c.distributeJob(jobRowFromInternal(update.Job), jobStatisticsFromInternal(update.JobStats))
+	c.distributeJob((*JobRow)(dbsqlc.JobRowFromInternal(update.Job)), jobStatisticsFromInternal(update.JobStats))


Not that all of these currently convert river.JobRow <-> dbsqlc.RiverJob, but in the future we'd expect dbsqlc to get pushed entirely into the driver.

(dbsqlc is pgx v5 specific, so if we were to support an alternate package, it'd need its own dbsqlc-like package. For example, if we supported pgx v4, we'd have a dbsqlc configured to emit pgx v4, and that'd live in the riverpgxv4 driver.)

So the driver would convert to and return a rivertype.JobRow, and then all these conversion would become simple (*JobRow)(update.Job).

brandur · 2023-11-18T20:39:35Z

internal/rivertype/river_type.go

+// JobRow contains the properties of a job that are persisted to the database.
+// Use of `Job[T]` will generally be preferred in user-facing code like worker
+// interfaces.
+type JobRow struct {


Identical to river.JobRow so we're allowed to convert between them.

brandur · 2023-11-18T20:40:11Z

job.go

-// there's no way to put a helper in a shared package that can produce one,
-// which is why we have this copy/pasta. There are some potential alternatives
-// to this, but none of them are great.
-func jobRowFromInternal(internal *dbsqlc.RiverJob) *JobRow {


This moves into dbsqlc, and like #36, we can remove the duplicate that's in rivertest.

brandur · 2023-11-18T20:42:21Z

@bgentry Sorry I know these are going to be complex as hell to try and parse, but this is one change I think we should make before stabilizing the API so we have more runway for reconfiguring things down the line. Both the approaches in #36 and #37 seem plausible to me, with this one being the slightly more exotic/risky option, but with the smaller user-facing change.

bgentry · 2023-11-18T23:35:20Z

job.go

@@ -69,7 +68,7 @@ type JobRow struct {

 	// Errors is a set of errors that occurred when the job was worked, one for
 	// each attempt. Ordered from earliest error to the latest error.
-	Errors []AttemptError
+	Errors [][]byte


This seems like a fairly big downside and it's the reason I think #36 is the only viable option. I know we'll also be making use of metadata in the future and I think it's important that we keep these both as usable as possible.

Yeah it's definitely a sizable downside.

I played around with this a little more and one thing we could do there is rename the existing JobRow to JobRowRaw:

type JobRowRaw struct { // ID of the job. Generated as part of a Postgres sequence and generally // ascending in nature, but there may be gaps in it as transactions roll // back. ID int64 ...

And then make JobRow embed a raw row with unmarshaled Errors that shadows the raw errors:

type JobRow struct { *JobRowRaw Errors []AttemptError }

It does work pretty well actually, but ugh, the downside is that it unfortunately does still complicate the inheritance a little much because we now have JobRowRaw -> JobRow -> Job which is starting to get gnarly.

brandur · 2023-11-25T19:37:53Z

Closing this one as we went with #36 instead.

brandur commented Nov 18, 2023

View reviewed changes

brandur requested a review from bgentry November 18, 2023 20:40

brandur mentioned this pull request Nov 18, 2023

Push rescuer down into internal/maintenance + JobRow to rivertype #36

Merged

bgentry reviewed Nov 18, 2023

View reviewed changes

brandur closed this Nov 25, 2023

brandur deleted the brandur-struct-conversions branch November 25, 2023 19:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Demonstrate internal job row type using struct conversion #37

Demonstrate internal job row type using struct conversion #37

brandur commented Nov 18, 2023

brandur Nov 18, 2023

brandur Nov 18, 2023

brandur Nov 18, 2023

brandur commented Nov 18, 2023

bgentry Nov 18, 2023

brandur Nov 20, 2023

brandur commented Nov 25, 2023

Demonstrate internal job row type using struct conversion #37

Demonstrate internal job row type using struct conversion #37

Conversation

brandur commented Nov 18, 2023

brandur Nov 18, 2023

Choose a reason for hiding this comment

brandur Nov 18, 2023

Choose a reason for hiding this comment

brandur Nov 18, 2023

Choose a reason for hiding this comment

brandur commented Nov 18, 2023

bgentry Nov 18, 2023

Choose a reason for hiding this comment

brandur Nov 20, 2023

Choose a reason for hiding this comment

brandur commented Nov 25, 2023