-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Demonstrate internal job row type using struct conversion #37
Conversation
I wanted to throw this one up as a possible alternative to a public facing `rivertype` module in #36, in which we demonstrate an alternate strategy using struct conversions. The problems it aims to solve are the same: * A way to access core job types from within subpackages, allowing more modularity (e.g. `Rescuer` could live with other maintenance services). * Long term, a way of giving drivers access to job primitives without having to reference the top-level `river`. The advantage of this approach is there's fewer user-facing changes, and the user never has to interact with an alternate package. It works by doing a struct conversion as necessary when moving from an internal package to one that bubbles up to `river`. Go allows pure struct conversions as long as their fields are the same, but there are some limitations: * Substructs like `AttemptError` can't be converted, even if identical. * Custom types like `JobState` can't be converted, even if identical. We work around these issues by: * Keeping errors as a `[][]byte` instead of it being unmarshaled by pgx. A helper called `ErrorsUnmarshaled()` is added to the top-level `JobRow` to give easy access to unmarshaled errors. * `State` is changed to a basic string akin to functions in `http`. This is technically a little less type-safe, but practically speaking would make very little difference because all these string-based types can be converted to and from each other with no checks anyway. Also, users aren't expected to access job state all that often. When they do, they can still use `job.State == river.JobStateAvailable` because the constants have been made available as strings. I think this approach does also work reasonably well, although I worry a little bit that there may be some sharp edge in the future that ends up limiting us in some unexpected way. Also, having to use helpers like `ErrorsUnmarshaled()` is definitely a little bit worse. That said, ~no user-facing API changes, and has some marginal performance benefits: * Converting structs between each other should be ~free. * Errors probably aren't used that much, so by not unmarshaling them most of the time, we save on a lot of unnecessary JSON parsing. Unlike #36, I didn't push the rescuer down (just in case this is a total dead end), but this approach should make that possible.
@@ -809,7 +809,7 @@ func (c *Client[TTx]) distributeJobCompleterCallback(update jobcompleter.Complet | |||
c.statsNumJobs++ | |||
}() | |||
|
|||
c.distributeJob(jobRowFromInternal(update.Job), jobStatisticsFromInternal(update.JobStats)) | |||
c.distributeJob((*JobRow)(dbsqlc.JobRowFromInternal(update.Job)), jobStatisticsFromInternal(update.JobStats)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that all of these currently convert river.JobRow
<-> dbsqlc.RiverJob
, but in the future we'd expect dbsqlc
to get pushed entirely into the driver.
(dbsqlc
is pgx v5 specific, so if we were to support an alternate package, it'd need its own dbsqlc-like package. For example, if we supported pgx v4, we'd have a dbsqlc configured to emit pgx v4, and that'd live in the riverpgxv4
driver.)
So the driver would convert to and return a rivertype.JobRow
, and then all these conversion would become simple (*JobRow)(update.Job)
.
// JobRow contains the properties of a job that are persisted to the database. | ||
// Use of `Job[T]` will generally be preferred in user-facing code like worker | ||
// interfaces. | ||
type JobRow struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Identical to river.JobRow
so we're allowed to convert between them.
// there's no way to put a helper in a shared package that can produce one, | ||
// which is why we have this copy/pasta. There are some potential alternatives | ||
// to this, but none of them are great. | ||
func jobRowFromInternal(internal *dbsqlc.RiverJob) *JobRow { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This moves into dbsqlc
, and like #36, we can remove the duplicate that's in rivertest
.
@bgentry Sorry I know these are going to be complex as hell to try and parse, but this is one change I think we should make before stabilizing the API so we have more runway for reconfiguring things down the line. Both the approaches in #36 and #37 seem plausible to me, with this one being the slightly more exotic/risky option, but with the smaller user-facing change. |
@@ -69,7 +68,7 @@ type JobRow struct { | |||
|
|||
// Errors is a set of errors that occurred when the job was worked, one for | |||
// each attempt. Ordered from earliest error to the latest error. | |||
Errors []AttemptError | |||
Errors [][]byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a fairly big downside and it's the reason I think #36 is the only viable option. I know we'll also be making use of metadata in the future and I think it's important that we keep these both as usable as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's definitely a sizable downside.
I played around with this a little more and one thing we could do there is rename the existing JobRow
to JobRowRaw
:
type JobRowRaw struct {
// ID of the job. Generated as part of a Postgres sequence and generally
// ascending in nature, but there may be gaps in it as transactions roll
// back.
ID int64
...
And then make JobRow
embed a raw row with unmarshaled Errors
that shadows the raw errors:
type JobRow struct {
*JobRowRaw
Errors []AttemptError
}
It does work pretty well actually, but ugh, the downside is that it unfortunately does still complicate the inheritance a little much because we now have JobRowRaw
-> JobRow
-> Job
which is starting to get gnarly.
Closing this one as we went with #36 instead. |
I wanted to throw this one up as a possible alternative to a public
facing
rivertype
module in #36, in which we demonstrate an alternatestrategy using struct conversions. The problems it aims to solve are the
same:
A way to access core job types from within subpackages, allowing more
modularity (e.g.
Rescuer
could live with other maintenance services).Long term, a way of giving drivers access to job primitives without
having to reference the top-level
river
.The advantage of this approach is there's fewer user-facing changes, and
the user never has to interact with an alternate package.
It works by doing a struct conversion as necessary when moving from an
internal package to one that bubbles up to
river
. Go allows purestruct conversions as long as their fields are the same, but there are
some limitations:
Substructs like
AttemptError
can't be converted, even if identical.Custom types like
JobState
can't be converted, even if identical.We work around these issues by:
Keeping errors as a
[][]byte
instead of it being unmarshaled by pgx.A helper called
ErrorsUnmarshaled()
is added to the top-levelJobRow
to give easy access to unmarshaled errors.State
is changed to a basic string akin to functions inhttp
. Thisis technically a little less type-safe, but practically speaking would
make very little difference because all these string-based types can
be converted to and from each other with no checks anyway. Also, users
aren't expected to access job state all that often. When they do, they
can still use
job.State == river.JobStateAvailable
because theconstants have been made available as strings.
I think this approach does also work reasonably well, although I worry a
little bit that there may be some sharp edge in the future that ends up
limiting us in some unexpected way. Also, having to use helpers like
ErrorsUnmarshaled()
is definitely a little bit worse. That said, ~nouser-facing API changes, and has some marginal performance benefits:
Converting structs between each other should be ~free.
Errors probably aren't used that much, so by not unmarshaling them
most of the time, we save on a lot of unnecessary JSON parsing.
Unlike #36, I didn't push the rescuer down (just in case this is a total
dead end), but this approach should make that possible.