Initial more managed #40

Closed

@roozbehid roozbehid commented Oct 12, 2021

Initial commit of more managed argument.

I am not expecting this to be merged any time soon.
But overall I think I am good with this version. Looks more CSharpish and much more concise.

I tried to separate the logic of more managed in different files, but I think I broke some parts that were related to const in regular mode.

Member

@lithiumtoast lithiumtoast left a comment

Looks more CSharpish and much more concise.

What do you mean?

public readonly int SizeOf;
public readonly int AlignOf;
public readonly int ArraySize;
public readonly struct CSharpType
Member

Is there a tab vs spaces thing going on here?

Author

Probably. I work on Windows and couldn't figure out how to work with analyzers and fix indents.

@lithiumtoast
Member

It is not immediately clear to me what is meant by making the code "more managed". Could you please elaborate, possibly with an example?

@roozbehid
Author

Yeah. I didn't know what to call this difference in behavior, so I used "more managed".
But here are some of the differences.

test.h

#include <stddef.h>

typedef int myint;

#define WINAPI 


typedef struct { 
	int field0; 
	int field1;
	char str_20[20];
	int arr_20[20];
	char* dynamic_szString;
	const char* const_sz_String;
	const char const_fixedstring30[30];
} MyStruct;

typedef MyStruct* MyStructPtr;
typedef MyStructPtr MyStructPtr2;

__declspec(dllexport) void function0(int a, MyStructPtr2 b);

__declspec(dllexport) int EnvAddRouterEx(int (*queryFunction)(void*, char*));
__declspec(dllexport) void function00(const int ar[10]);
__declspec(dllexport) void function1(const  char* a, MyStructPtr2 b);
__declspec(dllexport) void function2(const wchar_t* a, MyStructPtr2 b);
__declspec(dllexport) void function3(char* a, void* b, int* arr);
__declspec(dllexport) void function4(char** a, void* b, int* arr);
__declspec(dllexport) void function5(const char* a, void* b, int** arr);


typedef struct
{
    unsigned int    uiSize;
    void            *pUserData;
    unsigned int    uiVal;                              
    unsigned int    uiByteCount;                        
}   unused_struct ;

test.cs (generated with -mm)

public unsafe partial class test
{
    private const string LibraryName = "test";

    // Function @ test.h:21:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function0(int a, ref MyStruct b);

    // Function @ test.h:23:27 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern int EnvAddRouterEx(delegate* unmanaged<void*, CString8U, int> queryFunction);

    // Function @ test.h:24:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function00([In, MarshalAs(UnmanagedType.LPArray, SizeConst = 10)] int[] ar);

    // Function @ test.h:25:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function1([In, MarshalAs(UnmanagedType.LPUTF8Str)] string a, ref MyStruct b);

    // Function @ test.h:26:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function2([In, MarshalAs(UnmanagedType.LPWStr)] string a, ref MyStruct b);

    // Function @ test.h:27:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function3(CString8U a, IntPtr b, long* arr);

    // Function @ test.h:28:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function4(ref CString8U a, IntPtr b, long* arr);

    // Function @ test.h:29:28 (D:\test.h)
    [DllImport(LibraryName)]
    public static extern void function5([In, MarshalAs(UnmanagedType.LPUTF8Str)] string a, IntPtr b, ref long* arr);

    // Struct @ test.h:16:3 (D:\test.h)
    [StructLayout(LayoutKind.Explicit, Size = 160, Pack = 8)]
    public struct MyStruct
    {
        [FieldOffset(0)] // size = 4, padding = 0
        public int field0;

        [FieldOffset(4)] // size = 4, padding = 0
        public int field1;

        [FieldOffset(8)] // size = 20, padding = 0
        public fixed byte _str_20[20 / 1]; // char[20]

        public string str_20
        {
            get
            {
                fixed (MyStruct* @this = &this)
                {
                    var pointer = &@this->_str_20[0];
                    return Marshal.PtrToStringUTF8((IntPtr)pointer)!;
                }
            }
        }

        [FieldOffset(28)] // size = 80, padding = 4
        [MarshalAs(UnmanagedType.LPArray, SizeConst = 20)] public int[] arr_20; // int[20]

        [FieldOffset(112)] // size = 8, padding = 0
        public CString8U dynamic_szString;

        [FieldOffset(120)] // size = 8, padding = 0
        [In, MarshalAs(UnmanagedType.LPUTF8Str)] public string const_sz_String;

        [FieldOffset(128)] // size = 30, padding = 2
        [In, MarshalAs(UnmanagedType.LPUTF8Str, SizeConst = 30 / 1)] public string const_fixedstring30; // const char[30]
    }
}

test.cs (generated without -mm)

public static unsafe partial class test
{
    private const string LibraryName = "test";

    // Function @ test.h:21:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function0(int a, MyStructPtr2 b);

    // Function @ test.h:23:27 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern int EnvAddRouterEx(FnPtr_TEST_VoidPtr_CString8U_Int queryFunction);

    // Function @ test.h:24:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function00(int* ar);

    // Function @ test.h:25:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function1(CString8U a, MyStructPtr2 b);

    // Function @ test.h:26:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function2(CString16U a, MyStructPtr2 b);

    // Function @ test.h:27:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function3(CString8U a, void* b, long* arr);

    // Function @ test.h:28:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function4(CString8U* a, void* b, long* arr);

    // Function @ test.h:29:28 (d:\test.h)
    [DllImport(LibraryName)]
    public static extern void function5(CString8U a, void* b, long** arr);

    // FunctionPointer @ test.h:23:48 (d:\test.h)
    [StructLayout(LayoutKind.Sequential)]
    public struct FnPtr_TEST_VoidPtr_CString8U_Int
    {
        public delegate* unmanaged<void*, CString8U, int> Pointer;
    }

    // Struct @ test.h:16:3 (d:\test.h)
    [StructLayout(LayoutKind.Explicit, Size = 160, Pack = 8)]
    public struct MyStruct
    {
        [FieldOffset(0)] // size = 4, padding = 0
        public int field0;

        [FieldOffset(4)] // size = 4, padding = 0
        public int field1;

        [FieldOffset(8)] // size = 20, padding = 0
        public fixed byte _str_20[20 / 1]; // char[20]

        public string str_20
        {
            get
            {
                fixed (MyStruct* @this = &this)
                {
                    var pointer = &@this->_str_20[0];
                    var cString = new CString8U(pointer);
                    return Runtime.String8U(cString);
                }
            }
        }

        [FieldOffset(28)] // size = 80, padding = 4
        public fixed uint _arr_20[80 / 4]; // int[20]

        public Span<int> arr_20
        {
            get
            {
                fixed (MyStruct* @this = &this)
                {
                    var pointer = &@this->_arr_20[0];
                    var span = new Span<int>(pointer, 20);
                    return span;
                }
            }
        }

        [FieldOffset(112)] // size = 8, padding = 0
        public CString8U dynamic_szString;

        [FieldOffset(120)] // size = 8, padding = 0
        public CString8U const_sz_String;

        [FieldOffset(128)] // size = 30, padding = 2
        public fixed byte _const_fixedstring30[30 / 1]; // char[30]

        public string const_fixedstring30
        {
            get
            {
                fixed (MyStruct* @this = &this)
                {
                    var pointer = &@this->_const_fixedstring30[0];
                    var cString = new CString8U(pointer);
                    return Runtime.String8U(cString);
                }
            }
        }
    }

    // Typedef @ test.h:19:21 (d:\test.h)
    [StructLayout(LayoutKind.Explicit, Size = 8, Pack = 8)]
    public struct MyStructPtr2
    {
        [FieldOffset(0)] // size = 8, padding = 0
        public MyStructPtr Data;

        public static implicit operator MyStructPtr(MyStructPtr2 data) => data.Data;
        public static implicit operator MyStructPtr2(MyStructPtr data) => new() { Data = data };
    }

    // Typedef @ vcruntime.h:228:28 (C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include\vcruntime.h)
    [StructLayout(LayoutKind.Explicit, Size = 2, Pack = 2)]
    public struct wchar_t
    {
        [FieldOffset(0)] // size = 2, padding = 0
        public ushort Data;

        public static implicit operator ushort(wchar_t data) => data.Data;
        public static implicit operator wchar_t(ushort data) => new() { Data = data };
    }

    // Typedef @ test.h:18:19 (d:\test.h)
    [StructLayout(LayoutKind.Explicit, Size = 8, Pack = 8)]
    public struct MyStructPtr
    {
        [FieldOffset(0)] // size = 8, padding = 0
        public MyStruct* Data;

        public static implicit operator MyStruct*(MyStructPtr data) => data.Data;
        public static implicit operator MyStructPtr(MyStruct* data) => new() { Data = data };
    }
}
  • I don't know why clang removed that unused_struct. We should keep it or provide an option.
  • I still don't like the explicit layout on all structures. It means we have to have 2 different definitions for 32-bit and 64-bit, while with just sequential layout and the correct types we should be able to get around it. It would also mean less clutter from all the FieldOffset() attributes, etc.
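
The offsets in the generated MyStruct can be cross-checked from the C side. The following is only a sketch: it copies MyStruct from test.h and assumes a 4-byte int; the helper names first_pointer_offset and my_struct_size are made up for illustration. With 8-byte pointers the first pointer field lands at offset 112 and the struct is 160 bytes, matching the generated output above; with 4-byte pointers the same fields would land at 108 and the struct would shrink to 148 bytes, which is why one explicit layout cannot serve both targets.

```c
#include <assert.h>
#include <stddef.h>

/* Copy of MyStruct from test.h. */
typedef struct {
    int field0;
    int field1;
    char str_20[20];
    int arr_20[20];
    char *dynamic_szString;
    const char *const_sz_String;
    const char const_fixedstring30[30];
} MyStruct;

/* Fields before the first pointer do not depend on pointer width
   (assumption: int is 4 bytes, as on mainstream 32- and 64-bit ABIs). */
static_assert(offsetof(MyStruct, field1) == 4, "field1 offset");
static_assert(offsetof(MyStruct, str_20) == 8, "str_20 offset");
static_assert(offsetof(MyStruct, arr_20) == 28, "arr_20 offset");

/* Hypothetical helper: offset of the first pointer field.
   arr_20 ends at byte 108; an 8-byte pointer is aligned up to 112,
   a 4-byte pointer stays at 108. */
size_t first_pointer_offset(void) {
    return offsetof(MyStruct, dynamic_szString);
}

/* Hypothetical helper: total size including trailing padding. */
size_t my_struct_size(void) {
    return sizeof(MyStruct);
}
```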

@roozbehid
Author

roozbehid commented Oct 17, 2021

I also recently discovered Microsoft's converter: ClangSharpPInvokeGenerator.
This is what it generates from that file:

namespace myns
{
    public unsafe partial struct MyStruct
    {
        public int field0;

        public int field1;

        [NativeTypeName("char [20]")]
        public fixed sbyte str_20[20];

        [NativeTypeName("int [20]")]
        public fixed int arr_20[20];

        [NativeTypeName("char *")]
        public sbyte* dynamic_szString;

        [NativeTypeName("const char *")]
        public sbyte* const_sz_String;

        [NativeTypeName("const char [30]")]
        public fixed sbyte const_fixedstring30[30];
    }

    public unsafe partial struct unused_struct
    {
        [NativeTypeName("unsigned int")]
        public uint uiSize;

        public void* pUserData;

        [NativeTypeName("unsigned int")]
        public uint uiVal;

        [NativeTypeName("unsigned int")]
        public uint uiByteCount;
    }

    public static unsafe partial class Methods
    {
        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function0(int a, [NativeTypeName("MyStructPtr2")] MyStruct* b);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern int EnvAddRouterEx([NativeTypeName("int (*)(void *, char *)")] delegate* unmanaged[Cdecl]<void*, sbyte*, int> queryFunction);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function00([NativeTypeName("const int [10]")] int* ar);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function1([NativeTypeName("const char *")] sbyte* a, [NativeTypeName("MyStructPtr2")] MyStruct* b);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function2([NativeTypeName("const wchar_t *")] ushort* a, [NativeTypeName("MyStructPtr2")] MyStruct* b);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function3([NativeTypeName("char *")] sbyte* a, void* b, int* arr);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function4([NativeTypeName("char **")] sbyte** a, void* b, int* arr);

        [DllImport("", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true)]
        public static extern void function5([NativeTypeName("const char *")] sbyte* a, void* b, int** arr);
    }
}

Less lines compared to both of us.

@roozbehid
Author

Between all 3 of these I would still choose the "more managed" one :D
Its usage is just much easier. I don't think there is much compromise on speed either, and to me it feels more managed and CSharp-ish.

@lithiumtoast
Member

The thing about this project, C2CS, is that it was not intended to generate bindings that are necessarily "safe" or standard C#. Rather, the goal was to generate bindings that map exactly to the C functions, so that they can be called from C# with little or no overhead, including memory allocations. The context is that I created and used C2CS for interoperability with C libraries in real-time applications, specifically avoiding the garbage collector.

Now, the project you were previously using C2CS for was probably not real-time, at least not the same class of real-time application I was originally thinking of. For this type of application, you are okay with string marshalling, array marshalling, and other non-pass-through marshalling between C types and C# types that involves the garbage collector. Memory allocation and the speed of calling the C functions are apparently not much of a concern in these applications; what appears to be of larger value is an easy-to-use C# API that wraps the native C functions. Can you confirm this? If so, we can use this as the context for the "user story" ("epic" type in this case). #27

Now there is obviously some low-hanging fruit, such as using the ref keyword instead of pointers, which you already discovered. Some more low-hanging fruit is to use nint for void*. It would be good to subdivide the "epic"-level work into smaller pieces that can be accomplished in smaller pull requests. I'm okay with the command line option not fully working yet while this work is being done.

@lithiumtoast
Member

I don't know why clang removed that unused_struct. We should keep it or provide an option.

It's because of how I programmed the ast stage: a struct only makes it to the JSON if it appears in a function signature, or is reachable recursively from a type that does.
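
The rule described here amounts to a reachability walk over the type graph, starting from the types named in function signatures. The sketch below illustrates the idea only; TypeNode, MAX_DEPS and mark_reachable are hypothetical names and this is not C2CS's actual data model or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_DEPS = 4 };

/* Hypothetical node in a type graph (not C2CS's real representation). */
typedef struct TypeNode {
    const char *name;
    struct TypeNode *deps[MAX_DEPS]; /* types this type refers to; unused slots are NULL */
    bool reachable;                  /* set once the walk has visited this node */
} TypeNode;

/* Depth-first walk from a type that appears in a function signature.
   The reachable flag doubles as a visited marker, so cycles terminate. */
void mark_reachable(TypeNode *node) {
    if (node == NULL || node->reachable) {
        return;
    }
    node->reachable = true;
    for (size_t i = 0; i < MAX_DEPS && node->deps[i] != NULL; i++) {
        mark_reachable(node->deps[i]);
    }
}
```

With test.h, a walk started from MyStructPtr2 (the parameter type of function0) would visit MyStructPtr and then MyStruct, while unused_struct is never visited because no function signature leads to it.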

@lithiumtoast
Member

I still don't like the explicit layout on all structures. It means we have to have 2 different definitions for 32-bit and 64-bit, while with just sequential layout and the correct types we should be able to get around it. It would also mean less clutter from all the FieldOffset() attributes, etc.

You would have two different .dll files on Windows for the C library for 32-bit vs 64-bit. The generated bindings reflect that fact. It's just how native "unmanaged" libraries and programs work.

One alternative for structs would be to use Sequential layout, because the .NET runtime must not be able to reorganise or reorder the members of a struct. If the .NET runtime were to reorganise or reorder the members, calling a C function with a struct as a parameter could lead to a crash because the C# API would not match the C API.

However, Sequential layout cannot work for all use cases because:

  1. Unions are not possible. Some C APIs make use of unions in their structs. Explicit layout is the only way to have these C structs in C#.
  2. "Fixed size buffers". Some C APIs have fixed-size C-style array members whose element type is a custom type rather than one of the C# types allowed in fixed-size buffers: bool, byte, char, short, int, long, sbyte, ushort, uint, ulong, float, or double. The only way to express such a C struct in C# is to use one of the C# types just mentioned, sized to match the C array exactly, and use a function/property to "memory cast". However, to use fixed-size buffers in C# the exact size must be declared in advance at compile time, which makes Sequential layout impractical.
  3. Struct padding/packing for alignment. Depending on 32-bit vs 64-bit, structs could have extra padding bytes so that members are word aligned. Consider the following C struct members on 32-bit vs 64-bit:
char *p;
char c;
long x;

For 32-bit the layout would be like so:

char *p; // 4 bytes
char c; // 1 byte
char pad[3]; // padding of 3 bytes
long x; // 4 bytes

For 64-bit (on an LP64 platform such as Linux or macOS; on 64-bit Windows long is still 4 bytes, so the padding is 3 bytes) the layout would be like so:

char *p; // 8 bytes
char c; // 1 byte
char pad[7]; // padding of 7 bytes
long x; // 8 bytes
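
The padding does not have to be worked out by hand; the compiler can report it for the current target. A minimal sketch, wrapping the three members in a struct named sample and using two illustrative helpers (align_up and padding_after_c are assumptions, not part of any library):

```c
#include <assert.h>
#include <stddef.h>

/* The three members from the example, wrapped in a struct. */
struct sample {
    char *p;
    char c;
    long x;
};

/* c has alignment 1, so it sits directly after the pointer. */
static_assert(offsetof(struct sample, c) == sizeof(char *), "c follows p directly");

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Padding bytes the compiler inserted between c and x on this target:
   3 with a 4-byte long, 7 with an 8-byte long. */
size_t padding_after_c(void) {
    return offsetof(struct sample, x) - (offsetof(struct sample, c) + sizeof(char));
}
```

The offset of x is simply the end of c rounded up to the alignment of long, which is the quantity that changes between targets.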

Now the problem could be mitigated by using stdint.h types like int32_t and uint64_t, and I recommend all C developers do just that. But that's not always the case out in the wild. Clang will report the exact size based on the header file for the target computer architecture + operating system (ABI).
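
As a rough illustration of the stdint.h point, a struct built only from fixed-width members keeps the same member sizes on every target. FixedSample below is a made-up type, and the offsets in the comments assume the usual 4-byte alignment for int32_t; pointer members would still vary with the target, so this only helps for structs that avoid them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct using only fixed-width members (no pointers, no long). */
typedef struct {
    uint8_t c;      /* offset 0, followed by 3 padding bytes */
    int32_t x;      /* offset 4 (assuming 4-byte alignment for int32_t) */
    uint16_t flags; /* offset 8, followed by 2 bytes of trailing padding */
} FixedSample;

/* Exact-width types have the same size on every target that provides them. */
static_assert(sizeof(int32_t) == 4, "int32_t is exactly 4 bytes");
static_assert(sizeof(uint16_t) == 2, "uint16_t is exactly 2 bytes");

/* Hypothetical helper: total size including padding. */
size_t fixed_sample_size(void) {
    return sizeof(FixedSample);
}
```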

Now combine problems 2 and 3 together, and you have a mess. It's not much fun to check struct size/alignment by hand. I think it's reasonable though when it's automated. So, to be conservative, and in the interest of being clear and precise, I have all structs use explicit layout. Sequential would only really be of value if it was all or nothing, so that 32-bit and 64-bit bindings could be generated with one set of C# structs. Otherwise, a mix of Sequential and Explicit would not bring much value. It would only remove information which is already there.
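
To make the union case (problem 1) and the "automated" part concrete, here is a small sketch. Variant is a hypothetical tagged union of the kind many C APIs expose, and format_field_offset is an invented helper that prints an attribute line in the style of the generated code above; it is not how C2CS itself is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tagged union, a common pattern in C APIs. */
typedef struct {
    int kind;
    union {
        int as_int;
        float as_float;
        char as_bytes[8];
    } value;
} Variant;

/* All union members share one offset; only an explicit layout can say so. */
static_assert(offsetof(Variant, value.as_int) == offsetof(Variant, value.as_float),
              "union members overlap");
static_assert(offsetof(Variant, value.as_int) == offsetof(Variant, value.as_bytes),
              "union members overlap");

/* Hypothetical helper: writes a C#-style attribute line for one field into out.
   Returns snprintf's result: the length the full line needs, or negative on error. */
int format_field_offset(char *out, size_t cap, const char *name,
                        size_t offset, size_t size) {
    return snprintf(out, cap, "[FieldOffset(%zu)] // size = %zu, field = %s",
                    offset, size, name);
}
```

Calling format_field_offset once per member with offsetof and sizeof yields the same offset for as_int, as_float and as_bytes, which is exactly the information a Sequential layout would lose.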

@lithiumtoast
Member

Less lines compared to both of us.

Less lines is not always best. This point is moot when a human does not have to write the code manually and when the code is guaranteed to be correct via automation.

@lithiumtoast
Member

@roozbehid Thanks for the PR and bringing up the discussion. I captured the bulk of the idea in two issues:
