Replies: 19 comments 73 replies
-
This isn't 'request changes', just discussion points; I'm going to separate comments so we get some reasonable reply threading :)
Naming: This is technically correct and matches the builtin, but pretty verbose; in most languages, 'exec' isn't a true POSIX-style 'replace the current process', but implicitly has the fork - or acts like it. It also is describing an implementation detail - right now, we're doing fork-and-exec, but we may wish to use |
Beta Was this translation helpful? Give feedback.
-
IMO the command should be separate to the arguments:
|
Beta Was this translation helpful? Give feedback.
-
I see in the interface that this terminates the process - how forceful is it? Does it do a normal What if the user doesn't close it? Does HHVM clean it up at some undetermined time later like it does with file descriptors? For FDs, this is currently done at the end of the request, but could be done in the future by garbage collectors (we intentionally do not have observable refcounting in FDs) |
Beta Was this translation helpful? Give feedback.
-
Why is this a simulation? if the
... which ties into: is this function needed? |
Beta Was this translation helpful? Give feedback.
-
I have no design feedback in this comment, just more context There's another two reasons: using the shell requires users to escape arguments.If the functions are defined as using the shell, this can't be done transparently - as if it is, everything should be escaped and there's no way of actually using the power of the shell: Argument escaping is also not well-defined or reliable. Most bash-like shells handle escaping in a similar-ish way, but it's not actually identical. Even bash vs zsh have observably different behaviors when variables are involved. PHP and HHVM do a 'best effort' here, but that's not great for a security-sensitive feature. using-shell can be implemented on top of not-using-shell, but not the other way aroundJust call ... but if the shell is being used for the primitives, given escaping isn't reliable, it's not possible to build something that reliably acts shell-less proc_openFor completeness: there is currently no non-private API in HHVM to spawn a subprocess that does not use a shell: |
Beta Was this translation helpful? Give feedback.
-
While at the C level is one true CWD per process, we currently fake a per-request/per-thread one in the implementation of the Hack builtins, which affects |
Beta Was this translation helpful? Give feedback.
-
if the current variables are
|
Beta Was this translation helpful? Give feedback.
-
process vs request again here; though it would need to be process as request out/error/in are not necessarily FDs. Process doesn't feel like a good default though:
IMO:
|
Beta Was this translation helpful? Give feedback.
-
STDIO/FDs, part 2:
|
Beta Was this translation helpful? Give feedback.
-
We generally consider bool arguments an antipattern and bad for readability - perhaps:
|
Beta Was this translation helpful? Give feedback.
-
I agree with the reasoning, but not the wording of this :) Perhaps: "yes, however it is rarely needed, often abused, and should be considered a 'code smell'." |
Beta Was this translation helpful? Give feedback.
-
I have mixed feelings here: in general, The current design would clearly be correct if there was a new |
Beta Was this translation helpful? Give feedback.
-
Switching this behavior based on if the command is part of the arguments vector or a separate argument is also a bit surprising to me.
I'm ~ 60% of the way to: we should have one consistent behavior for using or respecting PATH or not, and one consistent flag to explicitly change the behavior. If so, I'm ~ 70% on the default being to use $PATH to meet user expectations |
Beta Was this translation helpful? Give feedback.
-
We shouldn't count on this, and we should design the best API we can for the language we have, rather than the one we hope we'll have in the future :) If we knew for a fact we were never adding named parameters, would you still want this design? Builder objectsI'm not saying to switch to builder objects, but I think they could do with more consideration:
The last two could be addressed in a function-based approach with nullable enum classes - this would probably be best addressed with a convention like "all process arg enum classes have a 'DEFAULT' member - specifying The disadvantages of functions + enum classes are:
|
Beta Was this translation helpful? Give feedback.
-
So, new version (as of 2021-12-07):
-- Long-lived process high-level API: This is very close to 'the OS\ API, but with IO handles'; this seems very verbose, and is duplicating a lot of functionality that exists in IO\ (or needs to be in OS) with a slightly different framing. I'm not convinced it's the right approach, or that we need one (yet?) - but I don't have a better suggestion. IMO the best thing to do is leave it out entirely for now, then see what abstractions end up making the most sense when we try to use the other APIs in HHAST, FB WWW, docs.hhvm.com test runner (when talking to the typechecker), and that other users use in their projects. |
Beta Was this translation helpful? Give feedback.
-
I might have been misunderstanding this, and need to test: pretending that it's always implemented with fork and exec, does this do:
If it's if it's |
Beta Was this translation helpful? Give feedback.
-
To summarize how I see the current state: Low Level API
OS as a wholeShould it be a thinner layer? I don't think we agree here, but we don't need to to make progress on subprocesses. Being discussed here: hhvm/hsl#169 (comment) High level API:
|
Beta Was this translation helpful? Give feedback.
-
This to me feels like yet another low level API, not a high level API. a good example of high level sub-process API that come to mind:
However, these are the thing that the API missing in my opinion:
|
Beta Was this translation helpful? Give feedback.
-
another thing to point out somehow related to complexity, is that this is promoted as a replacement for PHP function shell_exec. however, the simplicity offered by shell_exec is much better, i suggest, aside from having the Process API, to also have this can act as a more forward replacement for |
Beta Was this translation helpful? Give feedback.
-
This proposal includes an API to create subprocess, as an alternative to existing PHP functions
shell_exec
,proc_open
, etc. The proposed subprocess API is consistent with HSL design and easy to be used safely, supporting asynchronous operators everywhere possible.Getting Started
To start a process, simply call
spawn
with avec
of the executable and its arguments. For example, the following code executes the commandcomposer update
and throws an exception if it failed.Note that
spawn
will return aProcess
, which could be used to monitor the process status. Internally, a PID is associated with theProcess
, and the user has the responsibility toclose
it, or letusing
statement clean it up with the help of thecloseWhenDisposed
idiom.API
Low level API
The low level API includes a thin wrapper of
posix_spawnp
using raw file handles.posix_spawn_file_action_
functions are replaced byvec<PosixSpawnFileAction>
, andposix_spawn_attr_
functions are replaced byPosixSpawnAttr
.High level API for long-lived sub-processes
High level API for short-lived sub-processes
Use cases
Short-lived processes
For short-lived process whose standard I/O can be used as a whole string, the high-level API
execute_and_read_output
can be used as an alternative toshell_exec
.For example, suppose we want to use
afinfo
to extract the data format information from an audio file, previously the task could be done with the help ofshell_exec
:Now, the same functionality can be implemented with the new
execute_and_read_output
API:Long-lived processes
execute_and_read_output_async
is not suitable for a long-lived process because it does not return until the process exited. Usespawn
for long-lived processes where the standard I/O can be handled at real time. The common pattern to handle standard I/O instantly is to create a pair of pipe stream, then redirect standard output or standard error to the write stream. For example, suppose we want to compile a big C++ project frommake
, which would take 1 hour to complete, the following example could log each line from the sub-process' stdout and stderr instantly:Design Considerations
Do we use shell?
No.
The PHP API
shell_exec
executes a command as a shell script for redirecting standard I/O and piping multiple processes. This approach is essentially letting the users write shell script code generators in Hack. As a result, it is easy to create insecure shell script with shell injection loopholes, because the Hack compiler cannot detect issues in the shell scripts. Also the common practice to use shell pipe operator to process texts, with the help of utilities likegrep
orsed
, is inefficient due to the overhead of creating unnecessary processes for simple text processing tasks.For comparison,
ProcessBuilder
in Java does not use shell, while theproc_open
function in PHP and thesubprocess
module in Python provide options to execute the command either via shell, or directly using system calls.The proposed proposed subprocess API is similar to
ProcessBuilder
, avoiding using implicitly use the shell, therefore, users do not have to escape and concatenate arguments into a shell script. It always requires avec<string>
to represent the command to execute.Is the user required to specify full path of the executable
No.
In Python, it is recommended to explicitly resolve the full path of the executable with the help of
shutil.which()
, for best portability between POSIX and Windows. However, explicit path resolution is not necessary in the proposed subprocess API because Hack supports only macOS and Linux. The proposed subprocess API will consider$command[0]
as the executable and internally useexecvpe
orexecvp
to resolve the full path from$PATH
.Will the new API support piping multiple processes?
Yes. However it is rarely needed, often abused, and should be considered a 'code smell'
The proposed subprocess API supports I/O redirection, therefore, it is possible to pipe processes by redirecting their stdin and stdout to a pair of pipe handles. However, most use cases, if not all, of shell pipe operators should be able to be refactored to Hack code manipulating a single process. The usage of text processing utilities, including
grep
andsed
, should be replaced by functions inHH\Lib\Str
andHH\Lib\Regex
.Is there any pipe file handle internally created?
Not in
spawn
.The
proc_open
function in PHP and thesubprocess
module in Python supports either preexisting file handles, or string constants as the instruction to let the API create new file handles, whileProcessBuilder
in Java can only redirect I/O to newly created files or pipes.In contrast, the new
spawn
API only accepts preexisting file handles, requiring the user to create file handles and pass to the newly created process, because this approach is more concise, more orthogonal and more consistent.How does the proposed subprocess API interoperate with other HSL APIs
The proposed subprocess API conforms HSL IO Key Design Decisions. The Hack class
IO\WriteFDHandle
andIO\WriteFDHandle
are accepted as parameters, instead of raw file handles or PHP resources. TheProcess
interface also extendsIO\CloseableHandle
, supporting thecloseWhenDisposed
idiom.The proposed subprocess API is as async as possible. Unlike
pcntl_waitpid
in PHP, we providegetExitCodeAsync
to wait for the exit code asynchronously.How to specify the process creation options?
The
proc_open
function in PHP and thesubprocess
module in Python accepts parameters with default values for process creation options, whileProcessBuilder
in Java is implemented as the builder design pattern, maybe because Java does not support optional parameters.The proposed subprocess API is similar to PHP or Python, accepting process creation options as parameters. It does not accept options in a
shape
, because we consider options inshape
as a workaround of lacking the language feature of named parameters. Hopefully Hack would support named parameters in the future, making it easier to specify optional parameters.Beta Was this translation helpful? Give feedback.
All reactions