Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zcbor test cases not working #201

Open
nslowell opened this issue Jul 26, 2023 · 11 comments
Open

zcbor test cases not working #201

nslowell opened this issue Jul 26, 2023 · 11 comments

Comments

@nslowell
Copy link

nslowell commented Jul 26, 2023

I have successfully used https://github.com/NordicSemiconductor/zcbor for generating C code from CDDLs, but my CDDLs and their provided test cases are getting error messages when I try to run them through this tool.

Can you provide some clarification on why the errors are occurring? Is this tool missing support?

Example for https://github.com/NordicSemiconductor/zcbor/blob/main/tests/cases/corner_cases.cddl:

    -
 14 -     onetofourbytes: uint .size 0b1..0o4, ; Testing binary and octal
    -                                 ^^^^^^^ missing definition for rule b1..0o4
    -
 34 - TaggedUnion = #6.4321(bool) // #6.2345(uint)
    -                             ^^ expected rule identifier followed by an assignment token '=', '/=' or '//='
    -
 46 -     #6.0(tentothirtybytetstr: tstr .size 10..30),
    -          ^^^^^^^^^^^^^^^^^^^ invalid tag syntax
    -
 64 -     #6.10(boolval: bool),
    -           ^^^^^^^ invalid tag syntax
    -
 73 - Union = Group / MultiGroup / (3,4) / "\"hello\""
    -                                 ^ expected rule identifier followed by an assignment token '=', '/=' or '//='
    -
 77 -     union: (7=>uint) / (-8=>uint),
    -                    ^ missing closing delimiter
    -
 86 - Level1 = [Level2]
    -           ^^^^^^ missing definition for rule Level2
 87 - Level2 = [2**3Level3] ; Use ** here to test backwards compatibility.
    -             ^ invalid group entry syntax
    -
116 -     unabstractedunion1: (choice1: 1 // choice2: 2),
    -                                     ^^ invalid group entry syntax
    -
121 -     upto4nils: 0*0x04nil,
    -                       ^^ missing definition for rule il
    -
140 -     floats: *float,
    -             ^ invalid group entry syntax
    -
196 -     "r" : (result:result_code),
    -                              ^ missing closing delimiter
@rooooooooob
Copy link
Collaborator

It's not our code that is giving the errors, but the CDDL parsing library we use. I see you already made an issue there (anweiss/cddl/issues/199).

Could you provide your own CDDLs that are not working? That could be more helpful as they're probably more real-world and less edge cases (that file is called corner_cases.cddl.)

There are definitely things not supported by cddl-codegen though that would be able to parse with cddl.

Some things look a bit weird in that CDDL though:

For the line 64 one if you rewrite it a bit it should be accepted by us. e.g. change #6.10(boolval: bool) to boolval: #6.10(bool). I can't find any examples using the formatting there in the CDDL RFC, but I also can't find any using tagged members in general so maybe you can do it both ways.

For line 34 I'm confused as it uses group choice syntax // instead of type choice / but it doesn't seem to be declaring a group? At least the examples I've seen before would be inside of parenthesis marking it as a group e.g. how it's done in line 116.

@nslowell
Copy link
Author

nslowell commented Jul 27, 2023

I appreciate the quick feedback.

Here's a simpler CDDL based on what I'm trying to do:

CommandIDs =
    (Zero: 0) /
    (One: 1) /
    (Two: 2) /
    (Three: 3)

MyArray = [
    a: uint .le 3,
    b: uint .le 2,
    c: uint .le 1,
]

Info = [
    ? MyArray,
    ? myBStr: bstr .size 10
]

Command = [
    cmd: CommandIDs,
    myInt: uint,
    ? Info,
]

Response = [
    cmd: CommandIDs,
    myInt: uint,
    rsp: (Ack: 0) / (NAck: 1),
    ? Info
]

And here's the error:

error: parser errors
   - input:4:15
   -
 4 -     (Zero: 0) /
   -               ^ expected rule identifier followed by an assignment token '=', '/=' or '//='
   -
29 -     rsp: (Ack: 0) / (NAck: 1),
   -                 ^ missing closing delimiter

Error: "incremental parsing error"

I welcome any feedback. Again, this is working with zcbor but not with, I guess, the CDDL parsing library in use.

@rooooooooob
Copy link
Collaborator

What do you want the output generated to look like for CommandIDs? The way we handle basic groups like that we tend to create a struct with the fields you have there. If you instead remove the fields (they're all single constants anyway) and just make it a type choice on integers we can instead convert it into a simple C-style enum since all the fields of those basic groups are constants.

With these changes:

CommandIDs =
    0 ; @name Zero
  / 1 ; @name One
  / 2 ; @name Two
  / 3 ; @name Three

AckRsp =
    0 ; @name Ack
  / 1 ; @name NAck

Response = [
...
    rsp: AckRsp,
...
]

Those ; @name comments aren't required, it's just a way to tell the tool exactly what you want things to be called. If you don't add them they'll be called CommandIds0, CommandIds1, etc.

Unfortunately there are two unsupported things even after rewriting those:

  1. .le on numbers. We've only implemented it on array/map lengths since that's the only place we've needed it so far in our use cases - but it wouldn't be too hard to add into deserialization code. In the meantime if it's not completely necessary you could remove it and either not have the check or add it in afterwards by hand.

  2. We don't support deserialization for array-encoded structs with optional fields. The general case for this can be complicated, if not even totally ambiguous e.g. situations like [ ? uint, uint, ? (uint, text), ? text]. The situation you have there is actually something we have in one place in one of the Cardano CDDL specs that we use this tool to generate from so I might actually take a look at this today and see what I can do, at least in cases where it's totally non-ambiguous. In the meantime we still generate serialization code, we just don't implement the Deserialize trait.

e.g. after running the code I get after the fact:

Not generating Command::deserialize() - reasons:
	Array with optional field info: Info
Not generating Info::deserialize() - reasons:
	Array with optional field my_array: MyArray
	Array with optional field my_b_str: Vec<u8>
Not generating Response::deserialize() - reasons:
	field info: Info couldn't generate serialize
	Array with optional field info: Info

@nslowell
Copy link
Author

The resulting C code for CommandIDs is this:

struct CommandIDs_ {
    enum {
        _CommandIDs_Zero = 0,
        _CommandIDs_One = 1,
        _CommandIDs_Two = 2,
        _CommandIDs_Three = 3,
    } _CommandIDs_choice;
};

I don't see anything in the RFC about using "; @ label" but I do seem sentences like:
From 2.1.2

[A colon] is also the common way of naming elements of an array just for documentation; see Section 3.4.

This less clean solution works and produces basically the same C code (there's an extra underscore '_' for some reason):

Zero = 0
One = 1
Two = 2
Three = 3
CommandIDs = Zero / One / Two / Three

So, now I am running with this (removed all "labels")

Zero = 0
One = 1
Two = 2
Three = 3

CommandIDs =
    Zero /
    One /
    Two /
    Three

MyArray = [
    uint,
    uint,
    uint,
]

Info = [
    ? MyArray,
    ? bstr .size 10
]

Command = [
    CommandIDs,
    uint,
    ? Info,
]

Ack = 0
NAck = 1
Response = [
    CommandIDs,
    uint,
    Ack / NAck,
    ? Info
]

and am getting a crash:

-----------------------------------------
- Generating code...
------------------------------------
thread 'main' panicked at 'should not expose Fixed type in member, only needed for serializaiton: Fixed(Uint(0))', src/intermediate.rs:1426:31
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Any ideas?

Can you share what you ran with to get your "Not generating..." output?
I think the zcbor solution to an array-encoded struct with option fields is by adding "_present" booleans to the generated struct, since the order has to be maintained. I think Section 3.5.3 of the RFC helps explain things.

@rooooooooob
Copy link
Collaborator

@nslowell I got the array struct deserialization with optional fields working: #204

Note: regarding the .le on the integers we still generate without errors, they're just ignored. So the deserialization code looks something like:

let a = Ok(raw.unsigned_integer()? as u64)
    .map_err(|e: DeserializeError| e.annotate("a"))?;
let b = Ok(raw.unsigned_integer()? as u64)
    .map_err(|e: DeserializeError| e.annotate("b"))?;
let c = Ok(raw.unsigned_integer()? as u64)
    .map_err(|e: DeserializeError| e.annotate("c"))?;

without any checks.

So now with the above change to the CommandIds choice + Ack one and the PR I linked your code generates code, just without the .le checks for now. I might take a look later to see if it's a trivial change to add in a quick error check to deserialization code. If it is I can make a PR, if not it's not a high priority for us since we haven't needed to use it for our own generation purposes (outside of ones that just specifying rust primitives like uint .le 4294967295 and such, which we do support as a way to generate the code as u32, etc instead of u64).

@rooooooooob
Copy link
Collaborator

rooooooooob commented Jul 27, 2023

I don't see anything in the RFC about using "; @ label" but I do seem sentences like:

It's a non-standard thing. The RFC specifies comments as ; but they don't impact the underlying CBOR. We just use comments as a way to tell cddl-codegen certain things about how we want the generated output code. You could omit them and the underlying CBOR supported by the output generated code wouldn't change, just how the code looks like.

Here's the code I used using the default CLI options:

CommandIDs =
    0 ; @name Zero
  / 1 ; @name One
  / 2 ; @name Two
  / 3 ; @name Three

MyArray = [
    a: uint, ; .le 3,
    b: uint, ; .le 2,
    c: uint, ; .le 1,
]

Info = [
    ? MyArray,
    ? myBStr: bstr .size 10
]

Command = [
    cmd: CommandIDs,
    myInt: uint,
    ? Info,
]

AckRsp =
    0 ; @name Ack
  / 1 ; @name NAck

Response = [
    cmd: CommandIDs,
    myInt: uint,
    rsp: AckRsp,
    ? Info
]

Note: obviously if you run using master you won't get the Deserialize trait generated for some of those types, but if you run on that PR I linked the deserialization should be generated.

@rooooooooob
Copy link
Collaborator

[A colon] is also the common way of naming elements of an array just for documentation; see Section 3.4.

Yes, we do that in general. It's just that when you make a basic group like (zero: 0) our generator will then think you want a struct and it will generate pub struct Zero; pub struct One, etc then pub enum CommandIDs { Zero(Zero), One(One), ... }. If this is a common CDDL pattern for enums we could maybe add in a special case for it. The reason it's generating these structs is we just have the general group code generation which makes more sense when the fields aren't fixed e.g. vec3 = [x: uint, y: uint, z: uint] would generate pub struct Vec3 { pub x: u64, pub y: u64, pub z: u64 } etc.

@nslowell
Copy link
Author

nslowell commented Aug 4, 2023

Sorry the delay.

I ran with your exact CDDL and am getting a crash. See below. Any ideas?

> RUST_BACKTRACE=full ~/cddl-codegen/target/release/cddl-codegen --input Test2.cddl --output export
int
int
Switching from scope 'lib' to 'lib'
CommandIDs
MyArray
Info
Command
AckRsp
Response
_CDDL_CODEGEN_EXTERN_TYPE_
_CDDL_CODEGEN_RAW_BYTES_TYPE_


------------------------------------------
- Handling rule: lib:AckRsp
------------------------------------
AckRsp


------------------------------------------
- Handling rule: lib:CommandIDs
------------------------------------
CommandIDs


------------------------------------------
- Handling rule: lib:MyArray
------------------------------------
MyArray


------------------------------------------
- Handling rule: lib:Info
------------------------------------
Info
MyArray
MyArray


------------------------------------------
- Handling rule: lib:Command
------------------------------------
Command
CommandIDs
CommandIDs
Info
Info


------------------------------------------
- Handling rule: lib:Response
------------------------------------
Response
CommandIDs
CommandIDs
AckRsp
AckRsp
Info
Info


------------------------------------------
- Handling rule: lib:_CDDL_CODEGEN_EXTERN_TYPE_
------------------------------------
_CDDL_CODEGEN_EXTERN_TYPE_


------------------------------------------
- Handling rule: lib:_CDDL_CODEGEN_RAW_BYTES_TYPE_
------------------------------------
_CDDL_CODEGEN_RAW_BYTES_TYPE_

-----------------------------------------
- Generating code...
------------------------------------
thread 'main' panicked at 'can't read: static/serialization.rs', src/generation.rs:225:21
stack backtrace:
   0:     0x5641b7c903b1 - std::backtrace_rs::backtrace::libunwind::trace::h81b3ec7e24504ef6
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x5641b7c903b1 - std::backtrace_rs::backtrace::trace_unsynchronized::h06b108a8ac6d8321
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5641b7c903b1 - std::sys_common::backtrace::_print_fmt::hddd2a53a0a5c1518
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x5641b7c903b1 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h8286d7e4527fc2ef
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x5641b7cbb0cc - core::fmt::rt::Argument::fmt::h8fd89467fbfd7337
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/fmt/rt.rs:138:9
   5:     0x5641b7cbb0cc - core::fmt::write::h51943d696b01a399
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/fmt/mod.rs:1094:21
   6:     0x5641b7c8d68e - std::io::Write::write_fmt::h917c762025be867c
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/io/mod.rs:1714:15
   7:     0x5641b7c901c5 - std::sys_common::backtrace::_print::h80196d93ed8f3afc
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x5641b7c901c5 - std::sys_common::backtrace::print::hfc2f9057cab2ca91
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x5641b7c9195a - std::panicking::panic_hook_with_disk_dump::{{closure}}::h044c489edf02dba6
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:278:22
  10:     0x5641b7c915f3 - std::panicking::panic_hook_with_disk_dump::hc012686c97dee411
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:312:9
  11:     0x5641b7c91f5b - std::panicking::default_hook::h38859ccab48a30da
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:239:5
  12:     0x5641b7c91f5b - std::panicking::rust_panic_with_hook::hbec509e90f57ee70
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:729:13
  13:     0x5641b7c91e57 - std::panicking::begin_panic_handler::{{closure}}::h24e3657f27ebdc7b
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:621:13
  14:     0x5641b7c90816 - std::sys_common::backtrace::__rust_end_short_backtrace::h0fb71023354b550a
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:151:18
  15:     0x5641b7c91ba2 - rust_begin_unwind
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:617:5
  16:     0x5641b79be133 - core::panicking::panic_fmt::haa55128da9cd75d4
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:67:14
  17:     0x5641b79ece7c - cddl_codegen::generation::GenerationScope::export::he5da078ac6a96e1a
  18:     0x5641b7aa257f - cddl_codegen::main::h579e05b56747a76c
  19:     0x5641b7aa71d3 - std::sys_common::backtrace::__rust_begin_short_backtrace::hf7258d21fb51bae9
  20:     0x5641b7aac421 - std::rt::lang_start::{{closure}}::hd7b4de7351dc80e1
  21:     0x5641b7c86ffb - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h2240a7a2e9d7937c
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/ops/function.rs:284:13
  22:     0x5641b7c86ffb - std::panicking::try::do_call::h6d94044a2dac5948
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:524:40
  23:     0x5641b7c86ffb - std::panicking::try::hc16721a649b6347e
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:488:19
  24:     0x5641b7c86ffb - std::panic::catch_unwind::ha5ec842243b12570
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panic.rs:142:14
  25:     0x5641b7c86ffb - std::rt::lang_start_internal::{{closure}}::h718bb866ab67d475
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/rt.rs:148:48
  26:     0x5641b7c86ffb - std::panicking::try::do_call::hca57e22a5a3d68fb
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:524:40
  27:     0x5641b7c86ffb - std::panicking::try::hfd3cbded253620e6
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:488:19
  28:     0x5641b7c86ffb - std::panic::catch_unwind::h1fa2ff1a7cb410d5
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panic.rs:142:14
  29:     0x5641b7c86ffb - std::rt::lang_start_internal::h1dfc41da9ec083ca
                               at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/rt.rs:148:20
  30:     0x5641b7aa2c85 - main
  31:     0x7f4218c2a083 - __libc_start_main
                               at /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
  32:     0x5641b79be85e - _start
  33:                0x0 - <unknown>

@rooooooooob
Copy link
Collaborator

@nslowell I think it depends where you're running it from. So those files in /static/ aren't being located. Try running it from the cddl-codegen directory. Or pass in --static-dir=<location of /static/> so it knows where to find them.

@nslowell
Copy link
Author

nslowell commented Aug 7, 2023

cool, thanks. I was able to produce code with your sample you gave me.

So, now I've got this CDDL that's causing a different crash:

CommandIDs = (
    Zero: 0 /
    One: 1 /
    Two: 2 /
    Three: 3 /
    Four: 4 /
    Five: 5 /
    Six: 6 /
    Seven: 7
)

u32 = uint .size 4


Version = [
    major: u32,
    minor: u32,
    patch: u32,
]

Info = [
    ? Version,
    ? myStr1: bstr .size 10,
    ? myStr2: [*bstr],
    ? myStr3: [*bstr],
    ? myStr4: [*bstr],
    ? myInt: u32,
]

Command = [
    cmd: CommandIDs,
    value: u32,
    ? Info,
]

rsp = (Ack: 0 / NAck: 1)

Response = [
    cmd: CommandIDs,
    resp: rsp,
    ? Info,
]

backtrace:

-----------------------------------------
- Generating code...
------------------------------------
BytesList
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/intermediate.rs:1129:64
stack backtrace:
   0: rust_begin_unwind
             at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:617:5
   1: core::panicking::panic_fmt
             at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:117:5
   3: core::option::Option<T>::unwrap
             at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/option.rs:935:21
   4: cddl_codegen::intermediate::RustType::cbor_types
             at ./src/intermediate.rs:1129:39
   5: cddl_codegen::intermediate::EnumVariant::cbor_types
             at ./src/intermediate.rs:1946:46
   6: cddl_codegen::generation::generate_enum
             at ./src/generation.rs:6140:31
   7: cddl_codegen::generation::GenerationScope::generate_type_choices_from_variants
             at ./src/generation.rs:2947:9
   8: cddl_codegen::generation::GenerationScope::generate
             at ./src/generation.rs:625:25
   9: cddl_codegen::main
             at ./src/main.rs:152:5
  10: core::ops::function::FnOnce::call_once
             at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/ops/function.rs:250:5

@rooooooooob
Copy link
Collaborator

@nslowell If you rewrite those enums as mentioned earlier in the issue it won't panic e.g.:

CommandIDs =
   0 ; @name Zero
 / 1 ; @name One
 / 2 ; @name Two
 / 3 ; @name Three
 / 4 ; @name Four
 / 5 ; @name Five
 / 6 ; @name Six
 / 7 ; @name Seven

rsp = 0 ; @name Ack
    / 1 ; @name NAck

Unfortunately even with #204 we can't emit deserialization traits (serialization works though) because the optional array fields are ambiguous.

Look here:

    ? myStr2: [*bstr],
    ? myStr3: [*bstr],
    ? myStr4: [*bstr],

Let's say you're deserializing and you get two arrays of bytes. How will the code know if they should be myStr2 and myStr3? Or myStr2 and myStr4? Or myStr3 and myStr4?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants