Not able to marshal a slice of large structs #235
The problem is that it generates a method receiver on the pointer type for FailureObject:
func (z *FailureObject) EncodeMsg(en *msgp.Writer) (err error) {
While for OkObject it generates a value receiver:
func (z OkObject) EncodeMsg(en *msgp.Writer) (err error) {
The reason is this piece of code: Lines 233 to 244 in 53e4ad1
Which was introduced in commit cd70faa. I don't understand where the 3 comes from or what the for loop is actually checking, but changing it to just:
case *Struct:
	return p.TypeName()
still passes all tests and fixes this bug. In theory all receivers for
@philhofer do you remember writing this code 3 years ago and why you did it this way? |
So, originally I wrote the code to always use pointer receivers, since there's sometimes a substantial performance penalty to copying the whole struct into the method. Then someone complained that they wanted to use value receivers. The middle ground that the code sits in right now is that it uses value receivers for small structs and pointer receivers for larger ones. Looking back on things, I should have just stuck with pointer receivers, because the behavior is not intuitive or well-documented. |
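The run-time failure described in this issue follows from Go's method-set rules: a method declared on a pointer receiver is not in the method set of the value type, so a struct value boxed in an interface does not satisfy an interface that needs that method. A minimal sketch, assuming hypothetical stand-in types (Encoder, Small, and Large are illustrative names, not part of msgp or its generated code):

```go
package main

import "fmt"

// Encoder is a hypothetical stand-in for an interface such as msgp.Encodable.
type Encoder interface {
	Encode() string
}

// Small has a value receiver, so both Small and *Small satisfy Encoder.
type Small struct{ A, B, C uint8 }

func (s Small) Encode() string { return "small" }

// Large has a pointer receiver, so only *Large satisfies Encoder.
type Large struct{ A, B, C, D uint8 }

func (l *Large) Encode() string { return "large" }

// canEncode reports whether the dynamic type stored in v implements Encoder.
// This mirrors a run-time type check on an element of an interface slice.
func canEncode(v any) bool {
	_, ok := v.(Encoder)
	return ok
}

func main() {
	items := []any{Small{}, &Small{}, Large{}, &Large{}}
	for _, it := range items {
		fmt.Printf("%T implements Encoder: %v\n", it, canEncode(it))
	}
}
```

Only the plain Large value fails the check, which matches the symptom of a struct working with up to 3 fields and failing once a fourth is added.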
I increased the value from 3 to 10 and now it works. Anything I can do to help? |
You likely don't want to be copying (or boxing) 80-byte structures. The simplest fix for you is:
You can cause your example to fail during type-checking (rather than at run-time) by making your slice of type |
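As a rough sketch of the type-checking idea, assuming the elements can be stored as pointers: when the slice's element type is an interface that names the required method, the compiler rejects any element that does not implement it, instead of the failure surfacing at run time. The Marshaler interface below is a local stand-in with the same shape as msgp.Marshaler, and Big with its toy encoding is hypothetical; none of this is generated msgp code.

```go
package main

import "fmt"

// Marshaler is a local stand-in mirroring the shape of msgp.Marshaler.
type Marshaler interface {
	MarshalMsg(b []byte) ([]byte, error)
}

// Big is a hypothetical 80-byte struct (ten 8-byte fields).
type Big struct{ F [10]uint64 }

// MarshalMsg uses a pointer receiver, so only *Big implements Marshaler.
// The encoding is a toy (one byte per field), not real MessagePack.
func (z *Big) MarshalMsg(b []byte) ([]byte, error) {
	for _, f := range z.F {
		b = append(b, byte(f))
	}
	return b, nil
}

// marshalAll appends the encoding of every element to a single buffer.
func marshalAll(list []Marshaler) ([]byte, error) {
	var out []byte
	var err error
	for _, m := range list {
		if out, err = m.MarshalMsg(out); err != nil {
			return out, err
		}
	}
	return out, nil
}

func main() {
	list := []Marshaler{&Big{}, &Big{}}
	// list = append(list, Big{}) // rejected at compile time: Big does not implement Marshaler
	out, err := marshalAll(list)
	fmt.Println(len(out), err)
}
```

Storing pointers also avoids copying the struct each time it is placed in the interface slice.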
I understand the first suggestion, but the second one isn't obvious to me (I read the wiki a couple of weekends ago actually looking for some detail on how to marshal a slice, and I don't remember reading anything like that, then I learned about slices of interfaces as a concept). Anyhow, I tried []msgp.Marshaler, and now I get a different error: I'm using it as: What am I doing wrong? As a suggestion, I think this info can be in the tips and tricks section of the wiki (unless you consider it too obvious). I would write it myself in order to help, but with less than 10 hours of experience, I don't consider it wise ;) |
You'll need |
I was really impressed by the editor I'm using (vscode) because it automatically detected the need for the import and added it. But still, I get that error message when I try to manually run: |
The full import is "github.com/tinylib/msgp/msgp" (note the second 'msgp'; the top-level one is the code generation tool, not a library). Your editor is probably selecting the wrong import. |
Also, keep in mind that "unresolved identifier" is just a warning produced by the tool when it encounters a type not defined in the input file, not an error. But you'll probably want to avoid generating methods for that type. |
Yes, the editor actually included the right one (two msgp). Here is a simplified example:
package main

import (
	"github.com/tinylib/msgp/msgp"
)

//go:generate msgp

type ListOfObjects []msgp.Marshaler

//msgp:tuple Obj
type Obj struct {
	A uint8
	B uint8
	C uint8
}

func AppendObj(list ListOfObjects, b uint8, c uint8) ListOfObjects {
	return append(list, Obj{0, b, c})
}

func main() {
	list := make(ListOfObjects, 0, 10)
	list = AppendObj(list, 0, 0)
	res, _ := list.MarshalMsg(nil)
}
The warning prints as usual, but if I run
|
Stick a (It's not possible to generate code for the |
@philhofer is this something you are still willing to change?
That actually takes only a couple of nanoseconds. I wrote a simple test: https://gist.github.com/erikdubbelboer/98c8a910b1151c6df0e1ecba56fffcec
As you can see the difference is negligible. |
After adding the ignore as per instructions, the warning disappears as expected, but when I try to compile I still get an error: Which I kind of understand as valid |
@philhofer what you want isn't possible. Since it's a slice you have to generate code for it. But the generated code assumes each element has the methods
The closest you can get is:
type ListType interface {
	msgp.Decodable
	msgp.Encodable
	msgp.Marshaler
	msgp.Unmarshaler
	msgp.Sizer
}
type ListOfObjects []ListType
But this results in a:
|
I suggest changing msgp to always use a value receiver just like json.RawMessage for example. |
@philhofer I updated my benchmark a bit: https://gist.github.com/erikdubbelboer/98c8a910b1151c6df0e1ecba56fffcec |
Why would generating code that passes structs by value cause any difference in escape analysis? Escape analysis should pretty much always conclude that the struct does not escape, since the generated code doesn't maintain a pointer anywhere into the structure... Your benchmark seems to show performance parity because of the compiler being clever enough to inline the calls. The generated code is rarely (never, I think) a candidate for inlining. I've always viewed passing structures as value receivers as a code smell, perhaps because I have written too much C. (But, also, perhaps because I see beginner Go programmers coming from Java do it too frequently because they don't really grok pointers.) So, I may be unreasonably biased. |
I looked into it more and I think you are right about it not affecting escape analysis. Escape analysis is only affected by pointer cycles and interfaces, neither of which is used in the generated code. So why don't you change the code to always pass by reference? I still think the current solution of switching receivers at 4 or more fields is not the best solution. It doesn't seem like the compiler is inlining the calls. I don't think passing structs by value is a code smell. I think in Go it is somewhat similar as to using |
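One way to take inlining out of the comparison is to mark both methods with the //go:noinline directive. The sketch below is not the benchmark from the linked gist; payload, sumValue, and sumPointer are hypothetical names, and the struct is sized at 80 bytes only to match the size mentioned earlier in the thread. Results will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// payload is a hypothetical 80-byte struct (ten uint64 fields).
type payload struct {
	a, b, c, d, e, f, g, h, i, j uint64
}

// sumValue copies the whole struct into the call.
//
//go:noinline
func (p payload) sumValue() uint64 {
	return p.a + p.b + p.c + p.d + p.e + p.f + p.g + p.h + p.i + p.j
}

// sumPointer passes only the address of the struct.
//
//go:noinline
func (p *payload) sumPointer() uint64 {
	return p.a + p.b + p.c + p.d + p.e + p.f + p.g + p.h + p.i + p.j
}

// sink keeps the results live so the loops are not optimized away.
var sink uint64

func benchValue(b *testing.B) {
	p := payload{a: 1, j: 2}
	for n := 0; n < b.N; n++ {
		sink += p.sumValue()
	}
}

func benchPointer(b *testing.B) {
	p := &payload{a: 1, j: 2}
	for n := 0; n < b.N; n++ {
		sink += p.sumPointer()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("value:  ", testing.Benchmark(benchValue))
	fmt.Println("pointer:", testing.Benchmark(benchPointer))
}
```

Comparing the two reported ns/op figures with and without the directive gives a rough idea of how much of any parity comes from inlining.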
Beware of the following:
|
We can implement a fairly generic wrapper that can be used for all types that are generated. There is a minimal amount of (AFAICT necessary) reflection, for creating new instances.
package arr

import (
	"errors"
	"math"
	"reflect"

	"github.com/tinylib/msgp/msgp"
)

// RoundTripper provides an interface for type roundtrip serialization.
type RoundTripper interface {
	msgp.Unmarshaler
	msgp.Marshaler
	msgp.Sizer
	msgp.Encodable
	msgp.Decodable
	comparable
}

// Array provides a wrapper for an underlying array of serializable objects.
type Array[T RoundTripper] struct {
	value []T
}

// Msgsize returns an upper bound on the encoded size of the array in bytes.
func (j *Array[T]) Msgsize() int {
	if j.value == nil {
		return msgp.NilSize
	}
	sz := msgp.ArrayHeaderSize
	for _, v := range j.value {
		sz += v.Msgsize()
	}
	return sz
}

// Value returns the underlying value.
// Regular append mechanics should be observed.
func (j *Array[T]) Value() []T {
	return j.value
}

// Append a value to the underlying array.
// The returned Array is always the same as the one called.
func (j *Array[T]) Append(v ...T) *Array[T] {
	if j.value == nil {
		j.value = make([]T, 0, len(v))
	}
	j.value = append(j.value, v...)
	return j
}

// Set the underlying value.
func (j *Array[T]) Set(val []T) {
	j.value = val
}

// MarshalMsg implements msgp.Marshaler
func (j *Array[T]) MarshalMsg(b []byte) (o []byte, err error) {
	if j.value == nil {
		return msgp.AppendNil(b), nil
	}
	if uint64(len(j.value)) > math.MaxUint32 {
		return b, errors.New("array: length of array exceeds math.MaxUint32")
	}
	b = msgp.AppendArrayHeader(b, uint32(len(j.value)))
	for _, v := range j.value {
		b, err = v.MarshalMsg(b)
		if err != nil {
			return b, err
		}
	}
	return b, err
}

// EncodeMsg implements msgp.Encodable
func (j *Array[T]) EncodeMsg(w *msgp.Writer) error {
	if j.value == nil {
		return w.WriteNil()
	}
	if uint64(len(j.value)) > math.MaxUint32 {
		return errors.New("array: length of array exceeds math.MaxUint32")
	}
	if err := w.WriteArrayHeader(uint32(len(j.value))); err != nil {
		return err
	}
	for _, v := range j.value {
		err := v.EncodeMsg(w)
		if err != nil {
			return err
		}
	}
	return nil
}

// UnmarshalMsg will unmarshal the value into a typed array.
// Nil values are supported.
func (j *Array[T]) UnmarshalMsg(bytes []byte) ([]byte, error) {
	if bytes, err := msgp.ReadNilBytes(bytes); err == nil {
		j.value = nil
		return bytes, nil
	}
	l, bytes, err := msgp.ReadArrayHeaderBytes(bytes)
	if err != nil {
		return bytes, err
	}
	if j.value == nil {
		j.value = make([]T, 0, l)
	} else {
		j.value = j.value[:0]
	}
	for i := uint32(0); i < l; i++ {
		v := newRT[T]()
		bytes, err = v.UnmarshalMsg(bytes)
		if err != nil {
			return bytes, err
		}
		j.value = append(j.value, v)
	}
	return bytes, nil
}

// DecodeMsg will decode the value into a typed array.
// Nil values are supported.
func (j *Array[T]) DecodeMsg(r *msgp.Reader) error {
	if err := r.ReadNil(); err == nil {
		j.value = nil
		return nil
	}
	l, err := r.ReadArrayHeader()
	if err != nil {
		return err
	}
	if j.value == nil {
		j.value = make([]T, 0, l)
	} else {
		j.value = j.value[:0]
	}
	for i := uint32(0); i < l; i++ {
		v := newRT[T]()
		err = v.DecodeMsg(r)
		if err != nil {
			return err
		}
		j.value = append(j.value, v)
	}
	return nil
}

// newRT will create a new RoundTripper, which is a pointer that points to a zero value element.
func newRT[T RoundTripper]() T {
	var t T
	// Use reflection to get the type of T
	ptrType := reflect.TypeOf(t)
	// T will always have a pointer type, so we create a new element.
	elemType := ptrType.Elem()
	// Create a new instance of T using reflect.New
	// This will create a pointer to the element.
	newValue := reflect.New(elemType)
	return newValue.Interface().(T)
}
Usage:
The only annoying part is that array elements must be pointers. |
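The pointer requirement comes from how newRT builds new elements: it takes the element type of T via reflection and allocates a fresh value of it. A self-contained sketch of that one step, using only the standard library (point and newPtr are hypothetical names, and the constraint is relaxed to any so the example runs without msgp):

```go
package main

import (
	"fmt"
	"reflect"
)

// point is a hypothetical element type used only for illustration.
type point struct{ X, Y int }

// newPtr assumes T is a pointer type such as *point. It allocates a zero
// value of the pointed-to type and returns a pointer to it.
// It panics if T is not a pointer type.
func newPtr[T any]() T {
	var zero T // a nil pointer when T is a pointer type
	elem := reflect.TypeOf(zero).Elem()
	return reflect.New(elem).Interface().(T)
}

func main() {
	p := newPtr[*point]()
	p.X = 7
	fmt.Println(p != nil, *p)
}
```

Each call returns a distinct, non-nil pointer, which is what lets the decode loops call UnmarshalMsg or DecodeMsg on a fresh element before appending it.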
I'm trying to marshal a "heterogeneous" slice as an experiment (one of interfaces), and I'm hitting a wall with it. If I fill it with structs of up to 3 fields, it will encode them fine. If I add a single struct with 4 fields, it fails at runtime, saying:
FailureObject being the problematic struct in my example to reproduce the problem, though I proved I can marshal an individual object of that type.
Note: I intentionally need the objects to be encoded as arrays, and that's why I'm adding the "tuple" on top.
I'm completely new to Go, so this is most probably my mistake. Anyhow, can you please take a look at the code:
If I'm assuming something here that is wrong, what would be the correct, cleanest, and most efficient way of encoding an array of arrays that can have different sizes and that are mapped to existing types?