rework the POSIX API layer in the standard library #2380

Closed
andrewrk opened this issue Apr 29, 2019 · 17 comments
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.

@andrewrk
Member

Currently, we have these OS-specific APIs:

  • std.os.linux
  • std.os.windows
  • std.os.darwin
  • std.os.freebsd
  • std.os.netbsd
  • std.os.wasi

std.os.posix is defined like this:

pub const posix = switch (builtin.os) {
    Os.linux => linux,
    Os.macosx, Os.ios => darwin,
    Os.freebsd => freebsd,
    Os.netbsd => netbsd,
    Os.wasi => wasi,
    else => @compileError("Unsupported OS"),
};

So what we have is an identifier posix that is mapped directly to the OS-specific API for all platforms except Windows. This is problematic for several reasons:

  • Each OS-specific API has a different set of functions available.
  • Some functions take a different number of parameters, or incompatible ones.
  • It's quite possible to implement a POSIX API layer on top of Windows, at least for some things.
  • We have this std.os.posixFoo pattern, which is a code smell, because the functions are all prefixed with posix but cannot be in the posix namespace.
  • There is a need for "glue code" to work around kernel API footguns. For example, when the number of iovecs passed in one call is greater than some kernel-defined limit, the call has to be broken into multiple syscalls. Or when the number of bytes passed to the write syscall is greater than some number, it has to be broken into multiple syscalls.
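
The "glue code" idea in the last bullet can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the standard library's implementation: `write_all` and `MAX_WRITE_CHUNK` are hypothetical names, and the cap is set artificially low so the splitting is easy to exercise (on Linux a single write transfers at most 0x7ffff000 bytes).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-call cap, chosen small for illustration only.
 * On Linux, one write(2) call transfers at most 0x7ffff000 bytes. */
#define MAX_WRITE_CHUNK 4096

/* Writes all len bytes of buf to fd, issuing as many write syscalls as
 * needed. Returns 0 on success, or -1 with errno set by write(2). */
int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t chunk = len < MAX_WRITE_CHUNK ? len : MAX_WRITE_CHUNK;
        ssize_t n = write(fd, p, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any data was written: retry */
            return -1;
        }
        /* A short write is not an error; advance and keep going. */
        p += (size_t)n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller hands over the whole buffer once, for example `write_all(fd, data, data_len)`, and never sees the per-call limit.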

This led to 4a8c992 in which, for example, std.os.linux.fork surprisingly does not cause a fork syscall. Further, with the WASI ABI, it's clear that there is a separate "core" API layer as well as a "POSIX" API layer.

I propose to make the following changes:

  • Make the OS-specific files do exactly what they say. E.g. if you call std.os.linux.fork, that is going to do the fork syscall when you observe the program with strace, guaranteed.
  • Instead of std.os.posix mapping directly to OS-specific files, it will provide a "Zig-flavored POSIX" API on top of the native OS API. This is where kernel limitations will be worked around; fork might call clone on Linux, etc.
  • Functions matching the pattern std.os.posixFoo will be moved to std.os.posix.foo. In Zig-flavored POSIX, errno is not exposed; instead, actual Zig error unions and error sets are used. When not linking libc on Linux there will never be an errno variable, because the syscall return value contains the error code. However, for OS-specific APIs where the OS requires the use of libc, errno may be exposed, for example as std.os.darwin.errno().
  • There will even be a Zig-flavored POSIX API for Windows. Some functions may cause a compile error if the abstraction does not hold up, but for some things it will work fine.
@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. standard library This issue involves writing Zig code for the standard library. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Apr 29, 2019
@andrewrk andrewrk added this to the 0.5.0 milestone Apr 29, 2019
@daurnimator
Contributor

How far should the "posix" library go? Only include things specified by posix? What about APIs that almost all posix platforms provide?

I imagine the file structure will be:

std/os/linux.zig
...
std/os/freebsd.zig
std/posix.zig
std/posix/linux.zig
...
std/posix/freebsd.zig

Where std/posix.zig dispatches to the relevant OS-specific wrappers?

@andrewrk
Member Author

> How far should the "posix" library go? Only include things specified by posix?

It can have everything that would be convenient to abstract across operating systems at the posix layer. It can go beyond posix specifications. For example there can be a kqueue function. The goal is not to be compliant in any sense; more to provide an API at the same level of abstraction.

> What about APIs that almost all posix platforms provide?

They're in 👍

> I imagine the file structure will be:

That looks right. The std/posix/linux.zig etc. files might not be necessary, depending on how much code there ends up being in std/posix.zig.

@daurnimator
Contributor

> It can have everything that would be convenient to abstract across operating systems at the posix layer. It can go beyond posix specifications. For example there can be a kqueue function. The goal is not to be compliant in any sense; more to provide an API at the same level of abstraction.

I think using "posix" to mean anything other than what is actually defined as API/ABI in the posix standard itself is dangerous: what happens if POSIX gets updated and gains functions with the same name as whatever we define?

Perhaps we should name our module something else? And have a "posix" module only contain what is actually in posix? That way if your program only used the posix library, then you'd know that your zig program is portable to any OS that claims POSIX compliance (which of course is the whole point of POSIX).

@hryx
Contributor

hryx commented Apr 30, 2019

Hijacking the word "posix" does feel weird. Why not have cross-OS abstractions just live in std/os?

@mikdusan
Member

mikdusan commented Apr 30, 2019

If std.os.posix is to be convenient and tailored to Zig, that sounds very similar to std.os. In other words, why not just have std.os.kqueue instead of std.os.posix.kqueue?

My main question is what would differentiate std.os vs. std.os.posix? Is it to mean std.os.posix is intended to be "lower-level" API than std.os?

@andrewrk
Member Author

> Why not have cross-OS abstractions just live in std/os?

We do have those already, and those are the preferred cross-platform abstractions for general use. Most of the implementations of these use the posix API layer for non-windows. This abstraction layer exists today. If it's not called posix then it has to be called something. Right now it's awkwardly called std.os.posixFoo (where Foo is the posix function name). Have a look at std/os.zig to see what I mean. This proposal would strictly be an improvement, even if everybody agreed that "posix" was a bad name for this abstraction layer.

An example "posix" abstraction layer function is one that writes to a file descriptor. That's too low-level for std.os. Examples of std.os abstraction layer functions are std.os.File.open and std.os.File.write. The File struct is a cross-platform abstraction that removes the concept of file descriptors.
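
The two levels described here can be sketched in C as a thin struct layered over the descriptor-based call. This is only an analogy for the distinction, not Zig's File API: the `File` struct and the `file_*` functions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical high-level handle: callers never touch the descriptor. */
typedef struct {
    int fd;
} File;

/* Creates or truncates path for writing; returns 0 on success, -1 on error. */
int file_open_write(File *file, const char *path)
{
    assert(file != NULL && path != NULL);
    file->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    return file->fd < 0 ? -1 : 0;
}

/* Writes the whole buffer, hiding partial writes and EINTR from the caller. */
int file_write(File *file, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(file->fd, p, len); /* the low-level, fd-based layer */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += (size_t)n;
        len -= (size_t)n;
    }
    return 0;
}

/* Closes the file and invalidates the handle. */
void file_close(File *file)
{
    close(file->fd);
    file->fd = -1;
}
```

Code written against `File` contains no descriptor arithmetic, while the `write` call inside `file_write` is the kind of function the "posix" layer would provide.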

> what happens if POSIX gets updated and gains functions with the same name as whatever we define?

If POSIX got a sendmmsg then it would work perfectly. What other example are you thinking of that is problematic?

@daurnimator
Contributor

> If POSIX got a sendmmsg then it would work perfectly. What other example are you thinking of that is problematic?

There have been plenty of times in the past where things have gotten mixed up as they make their way into posix.
A recent example of an almost-posix API that I've had trouble with is pthread_setname_np. It is slightly different on every posix system; and each with their own caveats. See wahern/cqueues#208 (comment)
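
For readers unfamiliar with the pthread_setname_np differences, the following C sketch shows the per-platform variation a portable wrapper has to absorb. The wrapper name `set_current_thread_name` is hypothetical, and the branches reflect the commonly documented signatures on each system; check the local man page before relying on them.

```c
#define _GNU_SOURCE /* glibc only declares pthread_setname_np with this set */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#if defined(__FreeBSD__) || defined(__OpenBSD__)
#include <pthread_np.h>
#endif

/* Names the calling thread. Returns 0 on success or an errno-style code. */
int set_current_thread_name(const char *name)
{
    assert(name != NULL);
#if defined(__linux__)
    /* Takes a thread handle; names longer than 15 characters plus the
     * terminating NUL are rejected with ERANGE. */
    return pthread_setname_np(pthread_self(), name);
#elif defined(__APPLE__)
    /* No thread argument: only the calling thread can be named. */
    return pthread_setname_np(name);
#elif defined(__NetBSD__)
    /* printf-style format string plus a single argument. */
    return pthread_setname_np(pthread_self(), "%s", (void *)name);
#elif defined(__FreeBSD__) || defined(__OpenBSD__)
    /* Different spelling, and the function returns void. */
    pthread_set_name_np(pthread_self(), name);
    return 0;
#else
    (void)name;
    return ENOTSUP;
#endif
}
```

Four platforms need four call shapes for what looks like a single function, which is the kind of mix-up the comment is warning about.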

@hryx
Contributor

hryx commented Apr 30, 2019

> [std.os] are the preferred cross-platform abstractions for general use

Ah, so this new "posix" is more of an abstraction for the sake of stdlib internals? (But possibly also useful outside of that?)

> This abstraction layer exists today.

Right — sorry, I didn't quite get that before. Like this?
https://github.com/ziglang/zig/blob/master/std/os/time.zig#L31

I think it makes more sense to me with that explanation.

@andrewrk
Member Author

> Ah, so this new "posix" is more of an abstraction for the sake of stdlib internals? (But possibly also useful outside of that?)

Yes, precisely. std.os.posixSleep that you linked is a good example.

@bb010g

bb010g commented Apr 30, 2019

> An example "posix" abstraction layer function is to write to a file descriptor. That's too low-level for std.os.

Could it go in a std.os.low?

@andrewrk
Member Author

> Could it go in a std.os.low?

Alternate name suggestions are welcome. I think low is worse than posix on account of being too vague.

@jayschwa
Contributor

My suggestion:

  • Cross-platform, high-level abstractions live somewhere that is not std.os. Examples:
    • std.fs.File.open, not std.os.File.open
    • std.time.sleep, not std.os.sleep
  • std.os contains low-level abstractions and fulfills the niche that "Zig-flavored posix" seems to be right now.
  • std.os.foo contains thin wrappers for syscalls and functions specific to the Foo operating system.
    • std.os.posix (if it were to exist) is only wrappers for the POSIX API.

@Rocknest
Contributor

Rocknest commented May 3, 2019

Alternate name suggestion: std.os.pozig 🤔

@andrewrk
Member Author

andrewrk commented May 3, 2019

@jayschwa I like your idea - I think it does solve some problems. There are 2 things that make me hesitate:

  • One of the differentiating factors of Zig is that its standard library works in freestanding mode. Your suggestion goes in the direction of losing the distinction between what depends on an OS and what does not.

  • The "zig flavored posix" API layer will sometimes be the incorrect abstraction to use. I want it to be clear to programmers what the different abstractions are. Given only these and without reading any documentation, how do you know which one to use?

    • std.os.open
    • std.fs.File.openRead

    The answer is the second one, but I could easily see how someone would choose the first one.

@mikdusan
Member

mikdusan commented May 3, 2019

> I want it to be clear to programmers what the different abstractions are. Given only these and without reading any documentation, how do you know which one to use?
>
>   • std.os.open
>   • std.fs.File.openRead
>
> The answer is the second one, but I could easily see how someone would choose the first one.

Elevate std.os to the root namespace so that it is just os. It would no longer be part of Zig's standard abstraction, so there would be no more confusion.

@jayschwa
Contributor

> One of the differentiating factors of Zig is that its standard library works in freestanding mode. Your suggestion goes in the direction of losing the distinction between what depends on an OS and what does not.

I believe it would be better to group abstractions by functionality than by implementation. I might be straying too far from the original topic with the following hypothetical, but it illustrates what I mean.

Right now, the idea of a "file" in Zig is OS-specific, but consider that other modules might want to satisfy Zig's (hypothetical) interface for files (e.g. reading, writing, seeking) or filesystems (e.g. listing, opening, renaming, walking). Examples of implementing modules could be ipfs, zip, or jays-custom-filesystem for freestanding targets. Pseudo-code:

const os_fs = try std.os.file_system(); // some kind of `std.fs.FileSystem` interface
const zip_file = try os_fs.open("foo.zip"); // some kind of `std.fs.File` interface
const zip_fs = try zip.from_file(zip_file).file_system();
var zip_walk = zip_fs.walk(); // `var` because next() advances the iterator
while (zip_walk.next()) |path| {}

Note that my original suggestion (std.fs.File.open, not std.os.File.open) doesn't really fit anymore since the implementing module would need to have an entry point for satisfying the interface (std.os.file_system() in my example).

> The "zig flavored posix" API layer will sometimes be the incorrect abstraction to use. I want it to be clear to programmers what the different abstractions are. Given only these and without reading any documentation, how do you know which one to use?

Agreed that it would be difficult to know from name alone which is the high versus low-level one. Personally, if I needed a foo and I saw a std.foo and std.os.foo, I'd look at the former first, but I'd read the documentation no matter what.

@andrewrk andrewrk added the accepted This proposal is planned. label May 16, 2019
@andrewrk
Member Author

I think @jayschwa's use case outlined above is an important one, which I would like to continue discussing. However, for this particular issue, there is a pressing need to do this re-org as I originally outlined. This issue brings us closer to being able to solve Jay's use case even if it does not bring us all the way there, so I'm going to do it and then we can re-evaluate (1) naming and (2) where to go next. Finally, remember that we have the standard library audit (#1629) before 1.0.0 for yet another re-evaluation.

7 participants