phlog

Source code for my blog/gemlog. It used to be on gopher, hence the name
git clone http://shtanton.xyz/git/repo/phlog

commit a6d5c0553351b671453a15e9558cf64c9b9a2d9b
parent 8f3b764f9479431bc865a780a8dd9c561955b1dd
Author: Charlie Stanton <charlie@shtanton.xyz>
Date:   Tue,  7 Sep 2021 22:04:18 +0100

Completely change the piping post to go in the other direction. Long live pipes!!!

Diffstat:
M posts/better_than_stdio.gmi | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 79 insertions(+), 15 deletions(-)

diff --git a/posts/better_than_stdio.gmi b/posts/better_than_stdio.gmi
@@ -1,24 +1,88 @@
-I recently wrote a blog post about DSLs and also was getting frustrated by the restrictiveness of linux pipes.
+I was writing a post where I would design a language that would be better than the shell for running a bunch of processes in a pipeline with data flowing between them. It turns out that you can do a lot more with the a POSIX shell than I had realised. This post does assume a familiarity with the linux shell and C, but I will try to explain everything I do.
 
-For quite a while I've been bothered by how restrictive using stdio with pipes is for doing complex jobs in the shell. Suppose I have an irc client which takes input and sends it as irc messages, and any received irc messages it sends as output. At first glance this seems like a good fit for stdin and stdout. Suppose that I have some irc bot that takes messages as input and outputs responses to those messages, already we can no longer use the pipes as the system of processes is cyclical, but perhaps we also want our script to forward its input to the irc client so the user can still send messages while they are running the bot? This also can't be done with pipes.
+My gripes with pipes in the shell were as follows:
+- Only for passing string data delimited by newlines.
+- Can't do cyclical data passing.
+- Data streams can't be combined or duplicated.
+- Each process only has 1 input and output stream (stdin and stdout).
+- Only supports message passing style i.e. no shared memory or semaphores.
 
-Data flow can reach this desired level of flexibility if we introduce fifos and tee, fifos allow 2 processes to combine their outputs and
+I will now dissect and destroy my own arguments against the shell, hopefully saving you the effort of designing a DSL and then scrapping it like I did.
 
-What the shell does well.
+## Passing any data I want
 
-I thought these could be combined in a new exciting way with a DSL for connecting executables together in a more versatile way than stdio would allow.
-tee is good
+I was used to writing C that looked like this (only with way better style), as this is the way stdio is used most commonly in my experience:
 
-Goals:
-- Make non-linear pipelines simple and easy
-- Support message passing, don't worry about semaphores and shared memory
-- Do as little else as possible
+main.c
+```
+#include <unistd.h>
 
-Elaborate on these 3 ideas, why not use fifos?
+int main() {
+    write(1, "hello world", 11);
+    return 0;
+}
+```
 
-POSIX IPC!!!! Based on System V IPC
+receiver.c
+```
+#include <unistd.h>
+#include <stdio.h>
 
-2 examples demonstrating non-linear and message passing superiority to stdio
+int main() {
+    char buf[80];
+    read(0, &buf, 80);
+    printf("%s\n", buf);
+    return 0;
+}
+```
 
-Potential problems:
-- Debugging (keep stderr?)
+These can be compiled into executables and run like so:
+```
+./main | ./receiver
+```
+which will output "hello world" to the shell. This is because the data ("hello world") in main is "piped" into receiver which reads it and then displays it. However, there really is nothing about this that stops us sending other data types over the pipe:
+
+main.c
+```
+#include <unistd.h>
+#include <string.h>
+
+#include "types.h"
+
+int main() {
+    struct Person charlie;
+    strcpy(charlie.first_name, "Charlie");
+    strcpy(charlie.last_name, "Stanton");
+    write(1, &charlie, sizeof(struct Person));
+    return 0;
+}
+```
+
+receiver.c
+```
+#include <unistd.h>
+#include <stdio.h>
+
+#include "types.h"
+
+int main() {
+    struct Person charlie;
+    read(0, &charlie, sizeof(struct Person));
+    printf("%s %s\n", charlie.first_name, charlie.last_name);
+    return 0;
+}
+```
+
+types.h
+```
+struct Person {
+    char first_name[30];
+    char last_name[30];
+};
+```
+
+Which can be compiled and run to output "Charlie Stanton" even though the data passed through the pipe wasn't a string! Just because basically every Unix utility uses string and splits the data by line when reading it, it doesn't mean we have to. There's no reason why we can't use the pipe for a live video/audio stream, cap'n proto, data coming from some input device or anything else we want. Strings separated by newlines is quite a strong convention, but I think it's safe to completely ignore when convenient.
+
+## Cyclical data passing
+
+Suppose I want the output of one process feeding into the input of another and vice versa. Perhaps my irc bot is both receiving and sending messages through my irc client. I wasn't able to find a perfect solution to this, but we have a very good friend called the FIFO who can help us out greatly.