Now for something completely different. In the spiritual vein of One Night in Rio: Vacation photos from Plan9 and AWK for multimedia, here is a tool that is the link that ties almost all the projects within the Arcan umbrella together into one – and one we have been building towards for a depressing number of years and tens of thousands of hours.
Pipeworld (github link) combines ‘dataflow‘ programming (like Excel or Userland) with a Zooming-Tiling User Interface (ZUI). It builds dynamic pipelines similarly to (Ultimate Plumber), and leverages most of the gamut of Arcan features — from terminal emulator liberated CLIs and TUIs to dynamic network transparency. It follows many of the principles for a Diverging Desktop Future, particularly towards the idea of making clients simpler and more composable by focusing on interactive “breadboarding” data exchange.
There is a whole lot of ground to cover, so let’s get started.
The following is a video (youtube) that joins all the clips in this article together, but you will likely understand more of what is going on by reading through the sections.
Shortlinks to the individual sections are as follows:
- Zoomable UI
- Dataflow Computing
- Terminal, CLI and Pipelines
- Advanced Networking, Sharing and Debugging
- Trajectory and Future
Zoomable (Tiling) UI
The core is based around ‘cells’ of various types. These naturally tile in two dimensions as rows and columns. The first cell on each row determines the default behaviour of the row, and moving selection around will ‘pan’ the view to fit the current cell.
Each row has a scale factor, and the same goes for the cell. This means that different sets of cells can have different sizes (‘heterogenous zooming’) with different post-processing based on magnification and so on. Some cells can switch contents based on magnification, while others stay blissfully ignorant.
In the following short clip you see some of this in play, using a combination of keybindings as well as mouse gestures. This might make some feel a bit nauseous as this is not something that generalises well to groups of observers (we have solutions for that too) — but it is a different effect when it is your interaction that initiates and control the “zoom”.
The heterogeneous zooming allows for a large number of cells to be visible at the same time. Even when scaled down, client state such as notifications and alerts can still be communicated through decoration colours and animations.
Befitting of tiling workflows, everything can be keyboard controlled. Just like in Durden, mouse gestures, popup menus, keyboards and exotic input devices simply map to writes into file-system like paths.
When- or why- would tens to hundreds of simultaneous windows be useful? Some examples from my day to day would be monitoring, controlling and automating virtual machines,
botnets remote shell connections, video surveillance, ticketing systems and so on.
Dataflow Computing and the Expression Cell
Each cell has a type – with the default one being ‘expression’, where you type in expressions in a minimalistic programming language. The result of that expression mutates the type- or the contents- of the cell.
Expressions are processed differently based on which scope they are operating in – with the current scopes being:
- Expression – Arithmetic, string processing and other ‘basic types’ operations.
- Cell – Used for event handlers, changing, annotating or otherwise modifying visual or window management behaviour of the current cell.
- System – Global configuration, shutdown, idle timers and so on.
- Factory – Producing new cells or mutating existing ones.
In the following clip you can see some basic arithmetic; changing numeric representation; string processing functions and the ‘cell’ scope as a popup to tag a cell with a reference identifier which is then used in another expression.
In a sense this is 3 different kinds of command-line interfaces wrapped into one. These can modify, import from- and export to- cells of other types. This is where things gets more interesting and powerful.
The following video is an example where I first copy the contents of a file to the clipboard of a terminal (the somewhat unknown OSC;52 sequence). I then create an expression cell where I set its contents to a text message. I then start vim in the first terminal and then run the cell-scope expression:
type_keyboard(concat(a1.clipboard, b1), “fast”)
This reads as ‘send the clipboard content of the A1 cell and the content A2 as simulated keypresses into the currently selected cell using a moderately fast typing speed model. From the perspective of vim itself, this looks just like someone typing on a keyboard.
There are many more cell types than just terminals, command lines and expressions. Video playback; capture devices; images; native Arcan clients, such as our Qemu backend or Xorg fork and many more. In this clip you can see some of those – including Pipeworld running pipeworld. You can also see the current state of autocompletion and expression helper.
In this video I first open a capture device (web camera). I then spawn a terminal and copy the contents of a shader (GPU processing program) to its clipboard. This gets compiled and assigned the name ‘red‘. Lastly I sample a frame from the capture device using this shader.
What all of this means practically is that we can gather measurements, trigger inputs and stitch audio, video and generic data streams from clients together in a fashion that more resembles breadboards as used in electronics, and sequencers as used in sound production, than traditional desktop or terminal work.
Terminal, CLI and Pipelines
Lets go back to the command-line for a bit, as we have yet to poke at pipelines. In this clip I run the system scope function: pipe(“find /usr”, “grep –line-buffered share”)
A lot of magic is happening inside of this one. Each client is spawned separately and suspended. Then their runtime stdin/stdout pipes gets remapped based on the current pipeline before the chain is activated. Note that when I reset the cell representing the “find /usr” cell, the grep one remains intact and unaware that find was actually killed off and re-executed.
The API is lacking somewhat still, but technically there is nothing blocking the ‘pipes’ from being a mix between terminal emulators (isatty() == true), stdin/stdout redirected normal processes (isatty() == false) and fully graphical ones.
Typing # into an expression cell gives us a dusty old terminal emulator. We can also add some command to run afterwards, like #find /tmp. ‘Resetting’ a cell means re-evaluating the expression or command it represents.
In the following clip I first list the contents of /tmp and then setup a timer that will reset the cell ~every second. This is indicated by the coloured flash. Then I spawn a normal terminal and create a file in tmp, and you can see it appearing. To show that the ‘terminal’ behind the first command is still alive, I also swap out its colour scheme and have it resize to fit its contents.
Typing ! into an expression cell switches it into a CLI one. It uses a special mode in afsrv_terminal (the terminal that comes with Arcan) where no terminal protocols are emulated, no ‘pty devices’ are needed, and the CLI is native. As such we are no longer restricted to running a shell like bash that hides (“premature composition”) the identity, state and inputs/outputs of its children and commands.
Note that each discrete command becomes its own window, and the cell itself dictates the layout and scale of new and old command outputs.
Individual commands can be re-executed, clients run simultaneously and states such as clipboards are kept separate. Clients can switch to being graphical or embed media elements; they can announce their default keybindings; handle dynamic open and save from cut/paste and drag/drop; runtime redirect stdio; they can spawn sub-windows that attach on the same logical row and much more.
Advanced Networking, Sharing and Debugging
In the following clip you can see how quickly I go from a cell with external content and a debugger attached to the process associated with it, with lots of other options for process manipulation and inspection. Protections like YAMA are kept enabled, yet there is no sudo gdb -p mess going on. To learn more about this, see the article on Leveraging the Display Server to improve Debugging.
Lets go deeper. In the following clip I have set ‘arcan-dbgcapture’ as the kernel ‘core_pattern’ handler. Whenever a process crashes, the collector grabs crash information and the underlying executable. These gets wrapped around a Textual-UI with some options of what to do with the information. I then add a ‘listen’ cell that exposes a connection point for this TUI (‘crash’ in the video). Anything that connects to this point gets added to the row that this cell controls. To show it off, I run a simple binary that just crashes outright a few times, and you can see how the collections appear.
Since it is built using arcan-shmif, these connection points can be routed over a network – so swapping that ‘crash’ point out with something like a12://my.machine, embed into your base VM image, distribute to your fuzzing cluster or CI and enjoy network transparent debugging.
Moving on to the networking path and streaming/sharing. The following clip shows me migrating a terminal and qemu cell from a workstation to my laptop via the A12 network protocol. Note that the cell colours and font size change automatically as it is the target a client is presenting on that controls the current look and feel.
(The slight delay for the qemu window is a bug in the qemu Arcan display backend that does not properly re-submit a frame on migration, so it does not appear until the next time the guest updated its output).
Almost all cells can have an ‘encoder’ attached to them. This is simply a one-way composition transform that will convert the contents to some other format. A very practical example is recording to a video file or RTMP host, or something more obscure like OCR for pixels to text conversion.
In the following clip I set a cell encoder that act as a VNC server and then connect to that using the built-in VNC client that is part of Mac OS X. I then destroy the encoder and assigns a new one to a terminal. The OS X VNC viewer reconnects automatically, and you can see that input also works over the return path.
A cell type that blends well with this is ‘composition’ which puts the contents of multiple cells together using some layouting algorithm. The following clip shows that being used.
Attach an encoder to the composition cell and you have compartmented partial desktop sharing, another potent building block for interesting collaboration tools.
Trajectory and Future
These projects are not entirely disjunct. Pipewold has been written in such a way that the dataflow and window management can be integrated as tools in these two other environments so that you can mix and match – have Pipeworld be a pulldown HUD in Durden or 360 degrees programmable layers in Safespaces with 3D data actually looking that way.
The analysis and statistics tools that are part of Senseye will join in here, along with other security/reverse engineering projects I have around here.
Accessibility will be one major target for this project. The zoomable nature helps a bit, but much more interesting is the data-oriented workflow; with it comes the ability to logically address / route and treat clients as multi-representation interactive ‘data sources’ with typed inputs and outputs rather than mere opaque box-trees with prematurely composed (mixed contents) pixels and rigid ‘drag and drop’ as main data exchange. With programmable text-to-speech and OCR already available to any Arcan application, when combined with the logically
Another major target is collaboration. Since we can dynamically redirect, splice, compose and transform clients in a network friendly way, new collaboration tools emerge organically from the pieces that are already present.
Where we need much more work is at the edges of client and device compatibility, i.e. modify the bridge tools to provide translations to non-native clients. A direct and simple example is taking our Xorg fork, Xarcan, and intercept ‘screen reading’ requests and substitute for whatever we route to it at the moment – as well as exposing composed cell output as capture devices over v4l2-loopback and so on.
I can editorialise about this for hours, and although the clips here show some of what is already in here, there is so much more already existing and — much more to be done.