Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field is set to the wrong value after serialize to file #171

Open
sn4k3 opened this issue Jun 9, 2021 · 14 comments · May be fixed by #214
Open

Field is set to the wrong value after serialize to file #171

sn4k3 opened this issue Jun 9, 2021 · 14 comments · May be fixed by #214

Comments

@sn4k3
Copy link

sn4k3 commented Jun 9, 2021

I'm having a problem where i set a field to be 6 and on file is written with 5
Here's the defenition:

            /// <summary>
            /// Gets the size of the printer model
            /// </summary>
            [FieldOrder(3)]
            [FieldEndianness(Endianness.Big)]
            public uint PrinterModelSize { get; set; } = 6;

            /// <summary>
            /// Gets the printer model
            /// </summary>
            [FieldOrder(4)]
            [FieldLength(nameof(PrinterModelSize))]
            [SerializeAs(SerializedType.TerminatedString)]
            public string PrinterModel
            {
                get => _printerModel;
                set
                {
                    _printerModel = value;
                    PrinterModelSize = string.IsNullOrEmpty(value) ? 0 : (uint)value.Length+1;
                }
            }

I place a breakpoint on both PrinterModelSize = string.IsNullOrEmpty(value) ? 0 : (uint)value.Length+1; and public uint PrinterModelSize { get; set; } = 6; Both shows 6 with a value of "CL-89" which is correct since it have a null terminated char (00)
When the file got written that field turn into 5. Does it redefine the value when serializing due the link with [FieldLength(nameof(PrinterModelSize))] ? If so is there a way to overcome this?

image

EDIT 1:
I added to attribute [FieldLength(nameof(PrinterModelSize), BindingMode = BindingMode.OneWay)], that way the size become 6 but string still not null terminated (no zero appended)

image

@jefffhaynes
Copy link
Owner

The serializer does not support non-trivial getters and setters so the body of your setter will not be used during deserialization. Maybe that's the problem?

@sn4k3
Copy link
Author

sn4k3 commented Jun 20, 2021

But the TerminatedString flag should append a 00 byte at serialization time right?

So far i have this working by using a byte array instead:

            /// <summary>
            /// Gets the size of the printer model
            /// </summary>
            [FieldOrder(3)]
            [FieldEndianness(Endianness.Big)]
            public uint PrinterModelSize { get; set; } = 6;

            [FieldOrder(4)]
            [FieldLength(nameof(PrinterModelSize))]
            public byte[] PrinterModelArray { get; set; }  = { 0x43, 0x4C, 0x2D, 0x38, 0x39, 0x0 };

            [Ignore]
            public string PrinterModel
            {
                get => Encoding.ASCII.GetString(PrinterModelArray).TrimEnd(char.MinValue);
                set
                {
                    PrinterModelArray = Encoding.ASCII.GetBytes(value + char.MinValue);
                    PrinterModelSize = (uint) PrinterModelArray.Length;
                }
            }

@hoshinokanade
Copy link

But the TerminatedString flag should append a 00 byte at serialization time right?

So far i have this working by using a byte array instead:

            /// <summary>
            /// Gets the size of the printer model
            /// </summary>
            [FieldOrder(3)]
            [FieldEndianness(Endianness.Big)]
            public uint PrinterModelSize { get; set; } = 6;

            [FieldOrder(4)]
            [FieldLength(nameof(PrinterModelSize))]
            public byte[] PrinterModelArray { get; set; }  = { 0x43, 0x4C, 0x2D, 0x38, 0x39, 0x0 };

            [Ignore]
            public string PrinterModel
            {
                get => Encoding.ASCII.GetString(PrinterModelArray).TrimEnd(char.MinValue);
                set
                {
                    PrinterModelArray = Encoding.ASCII.GetBytes(value + char.MinValue);
                    PrinterModelSize = (uint) PrinterModelArray.Length;
                }
            }

I have a similar usage case that a string is null-terminated at the same time prefixed with FieldLength. I looked up at the source code it seems when FieldLength is there, then null-termination won't apply. Similarly I have resorted to using a custom class.

@sn4k3
Copy link
Author

sn4k3 commented Jul 19, 2021

Similarly I have resorted to using a custom class.

Can you share your custom class to fix this?

@hoshinokanade
Copy link

Mine is not a lot better than yours indeed.

public class MyString
{
    [FieldOrder(0)]
    public byte SerializedLength { get; set;}

    [FieldOrder(1)]
    [FieldLength(nameof(SerializedLength))]
    public string SerializedValue { get; set; }

    [Ignore]
    public string Value
    {
        get => SerializedValue[0..^1];
        set => SerializedValue = value + char.MinValue;
    }
}

One concern is that you would always want to interact via Value property instead of SerializedValue (while it is necessary to keep that public, unfortunately). This shares the same downside with your class.

I didn't do benchmark on which is more performant. The main reason of why I didn't widely use it in my codebase is that this requires to new the MyString class everywhere. I say I can easily make use of code generator to assist me defining my binary classes but I yet to come up effective solution to seamlessly use ordinary string.

@sn4k3
Copy link
Author

sn4k3 commented Jul 20, 2021

Ah, i tought you created some sort of custom flag attribute to deal with this strings, for example: [FieldStringNullTerminated]
Nevertheless you definition look simpler than my 👍

@hoshinokanade
Copy link

hoshinokanade commented Jul 20, 2021

All these adds up to more level of indirections in order to more seamlessly to use primitives:

namespace MyTest
{
    public class MyTestClass
    {
        [FieldOrder(0)]
        public MyString Name { get; set; }

        [FieldOrder(1)]
        public MyList<MyString> Foods { get; set; }

        public static MyTestClass Make(string name, IReadOnlyList<string> foods)
        {
            // factory class to make my life easier
        }

        public void Deconstruct(out string name, out IReadOnlyList<string> foods)
        {
            // destructuring to save me from conversions
        }
    }
}

Usage:

        public async void TestSerialization()
        {
            var stream = new MemoryStream();
            var serializer = new BinarySerializer();
            var obj = MyTestClass.Make // with primitives
            (
                name: "ABC",
                foods: new List<string>
                {
                     "A" ,
                     "B" ,
                     "C" ,
                }
            );

            await serializer.SerializeAsync(stream, obj);
            var output = stream.ToArray();
            stream.Seek(0, SeekOrigin.Begin);

            var deserializedObj = await serializer.DeserializeAsync<MyTestClass>(stream);
            var (name, foods) = deserializedObj; // string, IReadOnlyList<string>, great!
        }

Along with that it is also less likely for the remaining part of code to accidentally use the SerializedValue in the MyString class above. Note that this method doesn't scale well when you have, say, tens of fields!

@jefffhaynes
Copy link
Owner

Sorry, just getting back to this. Have either of you tried a ValueConverter to solve this?

@sn4k3
Copy link
Author

sn4k3 commented Jan 4, 2022

Sorry, just getting back to this. Have either of you tried a ValueConverter to solve this?

Today i step again with a field of this type, so i tried to use a ValueConverter on FieldLength, it output the right length to the file field but the string would not respect the size and not output a extra 00 byte in the end of the string. Maybe i'm doing this wrong but i can't find a way to manipulate the string value on serialization time

Class:

    public class NullTerminatedLengthConverter : IValueConverter
    {
        // Read
        public object Convert(object value, object converterParameter, BinarySerializationContext context)
        {
            return value;
        }

        // Write
        public object ConvertBack(object value, object converterParameter, BinarySerializationContext context)
        {
            return System.Convert.ToUInt32(value) + 1;
        }
    }


        public sealed class SlicerInfoV3
        {
            [FieldOrder(0)] [FieldEndianness(Endianness.Big)] public uint SoftwareNameSize { get; set; } = (uint)About.SoftwareWithVersion.Length + 1;

            [FieldOrder(1)]
            [FieldLength(nameof(SoftwareNameSize), ConverterType = typeof(NullTerminatedLengthConverter))]
            [SerializeAs(SerializedType.TerminatedString)]
            public string SoftwareName { get; set; } = About.SoftwareWithVersion;

            [FieldOrder(2)] [FieldEndianness(Endianness.Big)] public uint MaterialNameSize { get; set; }

            [FieldOrder(3)] 
            [FieldLength(nameof(MaterialNameSize), ConverterType = typeof(NullTerminatedLengthConverter))] 
            [SerializeAs(SerializedType.TerminatedString)]
            public string MaterialName { get; set; }

            [FieldOrder(4)] public uint Unknown1 { get; set; }
            [FieldOrder(5)] public uint Unknown2 { get; set; }
            [FieldOrder(6)] public uint Unknown3 { get; set; }
            [FieldOrder(7)] public uint Unknown4 { get; set; }
            [FieldOrder(8)] public byte Unknown5 { get; set; } = 1;
            [FieldOrder(9)] public byte LightPWM { get; set; } = byte.MaxValue;
            [FieldOrder(10)] public ushort Unknown6 { get; set; } = 2;
            [FieldOrder(11)] public PageBreak PageBreak { get; set; } = new();
        }

010Editor_2022-01-04_04-40-00

Note that the SoftwareNameSize is correct now (16), and text is 15 length but there is a missing 0x00 byte on the end of the string. So i get 00 00 00 1A but i should get 00 00 00 00 1A just after the string

We need a simple way to keep this null terminated strings intact on both read and write time...
Something like a [SerializeAs(SerializedType.TerminatedStringAppendZero)] ?

@bevanweiss
Copy link
Contributor

Have you tried this as below?

            /// <summary>
            /// Gets the size of the printer model
            /// </summary>
            [FieldOrder(3)]
            [FieldEndianness(Endianness.Big)]
            public uint PrinterModelSize;

            /// <summary>
            /// Gets the printer model
            /// </summary>
            [FieldOrder(4)]
            [FieldLength(nameof(PrinterModelSize))]
            [SerializeAs(SerializedType.TerminatedString)]
            public string? PrinterModel;

I think the nullability of the PrinterModel is important here, to provide the PrinterModelSize=0 option.
Otherwise if it's a terminated string and it exists at all, then this is a minimum of 1 byte right (the '\0'). The only way to get rid of it is to say that the PrinterModel string may not exist at all, when the PrinterModelSize is 0.

It's possible that this won't actually work. But I'd argue in this situation it would be a bug in the BinarySerializer.
Since a null field MUST have a length of 0, nothing else makes sense for it (unless one wants to get truly abstract, and argue that ALL values are valid... since the length of something that doesn't exist is arbitrary).

@sn4k3
Copy link
Author

sn4k3 commented Apr 29, 2023

@bevanweiss the main problem is that SerializedType.TerminatedString is not appending a \0 at end of string while serializing.
Example:

test\0 (length=5) Deserialize ok to "test"
But when serialize back it is written as: test (length=4) without appending a null termination '\0'

I solved that by implementing a custom class NullTerminatedUintStringBigEndian which also uses nullable value

@bevanweiss
Copy link
Contributor

Ahh yes, I think I do recall this when I was looking into the Graph logic
https://github.com/jefffhaynes/BinarySerializer/blob/fa8181145d89c19d5f3d5070edd793407f6b1917/BinarySerializer/Graph/TypeGraph/TypeNode.cs#LL389-L403

I wasn't a fan of how it did this.
I personally think if it's a TerminatedString it should remain as this, but if a length constraint is applied then it should be NULL padded ('\0') to this length. If the string length (plus termination char) is greater than the length, it should be truncated with the last character being the termination.
i.e. if configured as:

[FieldLength(4)]
[SerializeAs(SerializeType.TerminatedString]]
public string LongString = "This string here";

it should result in a Serialized output of
'T', 'h', 'i', '\0'

Likewise if it's

[FieldLength(4)]
[SerializeAs(SerializeType.TerminatedString]]
public string LongString = "T";

it should result in a Serialized output of
'T', '\0', '\0', '\0'

@jefffhaynes Is there a particular reason for the current behaviour (where FieldLength removed the TerminatedString type)?
My thoughts are that the TerminatedString characteristic should remain. But it should just gain a 'size' parameter which is used for truncation / padding.
Would you accept a Pull Request to:
a. Add tests for this
b. Implement the revised behaviour mentioned? (this would change the current serialize/deserialize behaviour of potentially existing logic however... so unsure if there's any backward compatibility flag that might be desired).

@jefffhaynes
Copy link
Owner

I think I agree with your reasoning here. Probably just an oversight on my part when I first wrote it. Changing it would be a breaking change but probably not a significant one? 🤞

@bevanweiss
Copy link
Contributor

I think I agree with your reasoning here. Probably just an oversight on my part when I first wrote it. Changing it would be a breaking change but probably not a significant one? 🤞

I'll put together a PR for this (maybe this week.. maybe into next week).
I think there would be a pretty straight forward way to prevent it being a breaking change by applying some additional optional argument on the FieldLength attribute (RetainStringTermination?) that would default to false (current behaviour).

bevanweiss added a commit to bevanweiss/BinarySerializer that referenced this issue May 20, 2023
This allows to keep the existing TerminatedString behaviour when a field
length constraint is applied, where it essentially becomes a SizedString.
Whilst also allowing new behaviour that would keep the terminated string
aspect, but truncate / pad the terminated string around the length constraint

Test case shows non-standard terminator (\n or 0x0A), and non-standard
padding byte 0x0D just to prove that these work as expected also.

Possibly fixes jefffhaynes#171 (confirmation required)

Signed-off-by: Bevan Weiss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants