Design: Process abstraction

Current

At the moment, there is no real process abstraction in the omake sources, though there is some very interesting shell functionality.

There are two ways of launching processes:

  • as a command pipeline for executing rules
  • from the omake DSL, for getting the output of a command: $(shell ...)

Commands may be written in omake itself, or may be external commands. At the moment, omake guarantees parallel execution of pipeline commands, and that the commands are really connected with pipes (generally no temp files, except for a few hardcoded special cases, like reading in dependency files). This is no problem when the command is an external command. When the command is an internal command, omake has to spawn a second runner: a forked subprocess on Unix, and a second thread on Win32.
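To make the current pipe-based behavior concrete, here is a minimal Unix-only OCaml sketch of wiring external commands together with pipes via Unix.create_process. The name run_pipeline is illustrative only, not the actual omake API:

```ocaml
(* Minimal Unix-only sketch: connect external commands with pipes.
   run_pipeline is an illustrative name, not the actual omake API. *)
let run_pipeline (cmds : string array list) : int list =
  let rec spawn stdin_fd = function
    | [] -> []
    | [argv] ->
        (* Last stage inherits the caller's stdout. *)
        let pid =
          Unix.create_process argv.(0) argv stdin_fd Unix.stdout Unix.stderr in
        if stdin_fd <> Unix.stdin then Unix.close stdin_fd;
        [pid]
    | argv :: rest ->
        let rd, wr = Unix.pipe () in
        let pid =
          Unix.create_process argv.(0) argv stdin_fd wr Unix.stderr in
        (* The parent must close its copies so that EOF propagates. *)
        if stdin_fd <> Unix.stdin then Unix.close stdin_fd;
        Unix.close wr;
        pid :: spawn rd rest
  in
  spawn Unix.stdin cmds

let () =
  run_pipeline [ [| "echo"; "one two three" |]; [| "wc"; "-w" |] ]
  |> List.iter (fun pid -> ignore (Unix.waitpid [] pid))
```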

Also, the pipelines are not necessarily executed on the local host. There can be remote command relays (for distributing the build load to several hosts).

Next

Generally, I'd like to simplify things a lot. I don't think omake profits much from real parallelism, so let's drop it. The complications of parallelism: on Unix, forking the omake process is a very expensive operation, in particular when omake has already allocated a lot of memory. On Win32, "fork" is emulated with threads, but this emulation is cumbersome and error-prone. The alternative is to resort to temporary files where needed. Analysis of this:

  • An external-only pipeline ext1 | ext2 | ... | extN can still run in parallel with the caller. The commands are connected with pipes, and so are the input and output of the whole pipeline with the caller. No difference from what is done at the moment.
  • When there is a single internal command as part of the pipeline ext1 | int1 | ... | extN, we suspend the execution of the caller and run this internal command instead (in the omake process). This internal command is connected with pipes to the external commands, and runs in parallel to them. Any output produced by the pipeline is diverted into a temp file. If the input of the pipeline is not redirected, we are done. If the input of the pipeline is redirected, we handle the whole thing as if the pipeline were of the form caller | ext1 | int1 | ... | extN. The caller code is done when the file handle writing to the pipeline is closed; this triggers the execution of the rest of the pipeline. (=> We need the ability to catch the event of closing file handles; see the close-event sketch after this list.)
  • When there are several internal commands ext1 | int1 | ... | int2 | ... | intM | extN, assume that the caller never writes into the pipeline (because we can transform this case into a longer pipeline caller | ...). We break the pipeline up into sequential pieces consisting of at most one internal command and any number of external commands. Piece k writes into a temporary file, and piece k+1 reads from this temp file (see the decomposition sketch after this list).
  • When an internal command intk is running, it is possible that this command invokes another sub-pipeline. This means our process model needs to be hierarchical.
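
The decomposition sketch below shows one way the splitting could look, assuming we simply cut the pipeline after each internal command. The command type and the function names are hypothetical, not the actual omake representation:

```ocaml
(* Hypothetical sketch of the decomposition: cut a mixed pipeline into
   sequential pieces with at most one internal command each; adjacent
   pieces would be connected by a temp file. *)
type command =
  | External of string array                         (* argv *)
  | Internal of (in_channel -> out_channel -> unit)  (* runs in-process *)

(* Cut after each internal command; any partition with at most one
   internal command per piece satisfies the invariant. *)
let split_pieces (pipeline : command list) : command list list =
  let rec go piece = function
    | [] -> (match piece with [] -> [] | _ -> [List.rev piece])
    | (Internal _ as c) :: rest -> List.rev (c :: piece) :: go [] rest
    | (External _ as c) :: rest -> go (c :: piece) rest
  in
  go [] pipeline

let () =
  let upcase ic oc =
    try
      while true do
        output_string oc (String.uppercase_ascii (input_line ic));
        output_char oc '\n'
      done
    with End_of_file -> ()
  in
  let pipeline =
    [ External [| "cat"; "input.txt" |];
      Internal upcase;
      External [| "wc"; "-l" |] ]
  in
  (* -> 2 pieces, [ext; int] and [ext], linked by one temp file *)
  Printf.printf "%d sequential pieces\n" (List.length (split_pieces pipeline))
```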
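And for the close-event requirement noted above, a minimal sketch of a handle wrapper whose close triggers a continuation. The type managed_fd and its fields are hypothetical names, not an existing omake facility:

```ocaml
(* Hypothetical sketch of "catching the event of closing file handles":
   wrap the descriptor with a callback that fires on close, so closing
   the caller's write end can trigger the rest of the pipeline. *)
type managed_fd = {
  fd : Unix.file_descr;
  mutable closed : bool;
  on_close : unit -> unit;  (* e.g. start the remaining pipeline pieces *)
}

let close_managed m =
  if not m.closed then begin
    m.closed <- true;
    Unix.close m.fd;
    m.on_close ()
  end
```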