I wrote a browser
No, this ain’t clickbait. I really wrote a browser. For KaiOS, to start with. Now, you might ask: isn’t it pointless because KaiOS already is a browser?
Well, yes, KaiOS (at least its user-facing part) is a browser by itself. But it can’t talk Gopher. And my browser can.
Of course, the entire idea of the Kopher project is not that I’d hope someone who has a KaiOS phone would also happen to know how to install non-store apps onto it and be interested in lighter alternatives to the Web all at the same time. I also don’t use any of the KaiOS phones listed on the project page as my daily driver now, although I do consider going back to the Crosscall Core-S4 from time to time. I created Kopher purely out of self-challenge (to create a fully functional browser from scratch within 3 to 5 evenings) and for fun and amusement. And the byproduct of this process is a lot of new stuff I learned (or relearned) about JS and KaiOS development specifics, the evolution of my own UI/UX preferences, the Gopher protocol itself and the whole philosophy behind it.
Let me begin my story with the fact that this development didn’t start with the Kopher name and wasn’t even meant to be a Gopher client, but rather a Gemini one. Yes, that Gemini. I really loved the idea of a single request line and a single response header line. I really loved the Gemtext format because it is even more parseable and less ambiguous than Markdown. All this was something I really considered being able to implement within a very reasonable timeframe, to achieve great performance even on the slowest KaiOS devices and to have a lot of fun in the process. But then, when almost everything was ready (or so I thought), something really unexpected happened. Well, of course, I should have expected that but I didn’t because, after all this time since my KaiOS activism heyday, I had already forgotten how big of a pile of garbage this platform was. Yep, there turned out to be absolutely no way to ignore SSL certificate errors when using MozTCPSockets with the useSecureTransport flag. And that’s Gemini we’re talking about, where the TOFU (trust on first use) model with self-signed certificates is not only permitted but actually encouraged in the specification. I asked you-know-who and he confirmed we cannot bypass the errors, and he also dropped some hate on the Gemini protocol itself. Oh well… The only viable alternative would be using some elephant-sized TLS-in-JS implementation like Forge with plain MozTCPSockets, but I rejected that route right away because it would be outright slow and stupid. That’s why I immediately decided to repurpose my browser draft to something where I surely wouldn’t have such troubles: the Gopher protocol.
When talking about Gopher though, it’s quite debatable what to consider “a protocol”. Because if we omit the Gophermap format (and that really is just a format, like Gemtext, Markdown, XML or HTML), the protocol itself doesn’t consist of anything meaningful and can be described with a single sentence. Here it is. The client connects to the server on a TCP port (usually 70), sends a single CRLF-terminated selector string and reads the response data from the socket until the server closes the connection. That’s actually it. That’s the whole protocol. Connect-request-response. It probably can’t get any simpler than that. Everything else described in RFC 1436 is a set of conventions to make clients and servers behave in a more-or-less standard and usable way. But the protocol itself doesn’t enforce any logic: all the logic is built into servers and even more so into clients. It’s entirely up to the server to figure out that a tab-separated selector string consists of the actual selector and the search query (the tab character here is not unlike the ? sign on the Web and in Gemini), and it’s entirely up to the client to figure out what to do next with the requested content. In fact, the greatest and simultaneously the weirdest thing about Gopher responses is that they are completely headerless: here, unlike HTTP or even Gemini, the server returns the response body only. Whatever type it is. On one hand, this allows us to directly save the response to a file with no additional processing necessary; on the other hand, this requires us to know the file/resource type before we make an actual request for it. And in order to do this, our client needs some kind of source of truth.
And this is where Gophermaps come into play. Again, it’s just some convention everyone agreed upon: if the selector string is empty, the server just returns its root Gophermap and the client is expected to parse it. Something akin to index.html on the Web. The map itself (or, as they also call it, the “Gopher menu”) is just a plain TSV text where each line contains from 1 to 4 tab-separated string fields. These fields are: type+description, selector, host and port. Type is the first character of the first field, the rest of it is the description. The selector field is optional: if omitted and there are no other fields, the description value is used as the selector. The host and port fields are also optional: if omitted or empty, the host/port of the machine where you requested the Gophermap from are used. Now, depending on the type character value, the client may either display a message (i for information messages, 3 for error messages) from the rest of the description field, or generate a link to the resource referenced in that line, and the type character usually gives enough information on what to do next: 0 means a plain text file, 1 means another Gophermap, 5 means an archive file, 7 means a full-text search query (just a fancy term for “ask the user for an input and then append it after the tab character to the selector string in the actual request, and then treat the response as type 1”), 9 means a generic binary file, g means a GIF image, I means any image, d means a non-text document, s means a sound file, ; means a video file. There also is a newer h type, which instructs the client to look for the URL: keyword in the selector field and build a link to whatever non-Gopher URL is found after this keyword. Any other types are so obsolete it’s fairly safe for the client to ignore them entirely or treat such resources as plain text. Modern Gopher servers can also serve plain text lines with no tabulation characters in them in their generated Gophermaps. Such lines are usually expected to be treated as i type lines and displayed as information messages.
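To make the field rules above more tangible, here is a minimal Bash sketch of splitting a single Gophermap line. The gm_split name and the GM_* variables are hypothetical and used only for illustration. It assumes the client-side convention of treating tab-less lines as i lines (rather than reusing the description as the selector), and it splits on tabs by hand because read with a tab in IFS would silently collapse empty fields.

```bash
#!/usr/bin/env bash
# gm_split (hypothetical helper): split one Gophermap line into its fields.
# args: Gophermap line, current host, current port
# sets: GM_TYPE GM_DESC GM_SEL GM_HOST GM_PORT
gm_split() {
  local line=${1%$'\r'} rest fields=()   # drop the trailing CR, if any
  rest=$line
  # split on every single tab, keeping empty fields intact
  while [[ $rest == *$'\t'* ]]; do
    fields+=("${rest%%$'\t'*}")
    rest=${rest#*$'\t'}
  done
  fields+=("$rest")
  if (( ${#fields[@]} == 1 )); then
    # no tabs at all: treat the whole line as an information message
    GM_TYPE=i GM_DESC=$line
  else
    GM_TYPE=${fields[0]:0:1} GM_DESC=${fields[0]:1}
  fi
  GM_SEL=${fields[1]-}
  GM_HOST=${fields[2]:-$2}   # empty or missing host: fall back to the current one
  GM_PORT=${fields[3]:-$3}   # same for the port
}
```

For example, calling gm_split $'1Phlog\t/phlog' gopher.example.org 70 (a made-up entry) leaves GM_TYPE set to 1, GM_SEL set to /phlog, and the host and port inherited from the second and third arguments.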
Well… Did you notice that I have just described the entire Gopher specification to you in two paragraphs? This is, in fact, everything you need to know to start implementing your own Gopherspace browser in the programming language of your choice. Of course, there also is such a thing as the gopher:// URL scheme, but that’s how you describe Gopher resources to the outside world, and the internal link format within your browser can be much more straightforward. For example, my hi01379.js engine (used in Kopher and loosely named after the main supported Gophermap resource types) has the hi:[type]|[selector]|[host]|[port] internal link format that’s much easier to parse (and to generate from a Gophermap entry) than the gopher://[host]:[port]/[type][selector] format, although Kopher can understand the latter as well. But once again, these are just implementation details that affect nothing in terms of interacting with real-life Gopher servers and navigating through the Gopherspace. You have a simple-as-a-stick request-response protocol, you have the agreed-upon Gophermap format with a set of rules on how to process its lines, and that’s it. From that point onwards, everything is up to you and your imagination.
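The two link formats mentioned above can be converted into each other with nothing but parameter expansion. The sketch below is written in Bash rather than JS and is not taken from hi01379.js: the url2hi and hi2url names are hypothetical, the default port of 70 and the default type of 1 for a bare host are my assumptions, and selectors containing a | character are not handled.

```bash
#!/usr/bin/env bash
# url2hi (hypothetical): gopher://host[:port]/[type][selector] -> hi:type|selector|host|port
url2hi() {
  local url=${1#gopher://} hostport path='' host port=70 type sel
  hostport=${url%%/*}
  if [[ $url == */* ]]; then path=${url#*/}; fi
  host=${hostport%%:*}
  if [[ $hostport == *:* ]]; then port=${hostport##*:}; fi
  type=${path:0:1}
  sel=${path:1}
  if [[ -z $type ]]; then type=1; fi   # assumption: a bare host means its root menu
  printf 'hi:%s|%s|%s|%s\n' "$type" "$sel" "$host" "$port"
}

# hi2url (hypothetical): the reverse conversion
hi2url() {
  local type sel host port
  # '|' is not IFS whitespace, so empty fields survive the split
  IFS='|' read -r type sel host port <<< "${1#hi:}"
  printf 'gopher://%s:%s/%s%s\n' "$host" "$port" "$type" "$sel"
}
```

Here url2hi gopher://gopher.floodgap.com:70/1/world prints hi:1|/world|gopher.floodgap.com|70, and feeding that string to hi2url gives the original URL back.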
And so I thought, and created a new browsing reality in my own microcosmos. Kopher allowed me to combine the best practices of WAP browser UX from the past. In fact, the native KaiOS browser or Mocor’s Opera Mini started feeling absolutely slow and cumbersome to navigate when I realized how it really could be done, had it been tailored for specific text-only content with links that are each on their own line. After all, we easily have 20 keys at our disposal (12 digit keys + 5 D-pad keys + 2 soft keys + the Call key, anything else already is device-specific), and it would be silly not to utilize their full potential. First, I let the four arrow keys do what they are best suited to do: visual scrolling, and the Center key would click on whatever link is currently focused. Then, I dedicated the two keys directly below the D-Pad, the 2 and 5, to jumping between the links (wrapping around the page if needed and centering the screen on the link if possible): 2 focuses on the previous link, and 5 focuses on the next one. The two keys even below that, 8 and 0, perform the visual jumping to the beginning and the end of the page, and the keys around them, 7 and 9, perform the same on the horizontal axis. After all, a lot of text in the Gopherspace is preformatted to a much larger width than the mobile screen allows. This is why I also decided to implement an optional line wrapping mode which can be toggled with the 6 key, for easy legibility of normal text. For the same legibility purposes, I implemented a quick dark/light theme switch on the 4 key. Now, what browser can exist without the Back, Forward and Refresh buttons? For Back and Forward, I chose the remaining 1 and 3 digits (with Back also duplicated on the right soft key, per the feature phone UI tradition), and for Refresh, the Call key was allocated. To enter the Gopher address, the left soft key was thus the most logical choice. And then, the remaining * and # keys were used in the fashion of Opera Mini for two-key combos to create a complete 10-bookmark system with the ability to set the homepage as well and quickly return to it. Finally, the last combo, # #, was used to display the information popup about the client version and the currently open page. Also, there’s no need to have a single address bar where all the information wouldn’t fit on the screen anyway. The top bar displays the hostname only, and the bottom bar displays the response status or the type and selector of the currently viewed resource.
Is any of this design specific to Gopher? No. I’d gladly use the same controls layout and the same UI features for Gemini or any other protocol I’ll manage to implement. They are feature phone-friendly, finger-friendly and performance-friendly (= consuming less energy, just like Gopher itself). And I really enjoy browsing Gopherspace on the Crosscall using Kopher just as much (if not more) as on my nettop using Lagrange. Some might argue that such UI doesn’t meet any modern design guidelines, as it doesn’t even have the softkeys signed. But come on, the user knows that LSK always is “Enter address”, RSK is “Back” and CSK is clicking the current link, why should we have labels taking up the precious screen real estate? By the way, this is the exact reason the softkey labels, when a page was opened, were only shown on demand or hidden altogether in the early-days WAP browsers, especially in Siemens and Nokia phones. I don’t think having 240x320 pixels resolution makes the situation radically different for this matter. Especially when a part of it already is eaten by KaiOS system status bar itself, and making the app fullscreen hiding the clock and other indicators really doesn’t look like a solution for anything but games.
The UI, however, doesn’t end there. Remember that besides text resources, we also have binaries to handle. And these are the implementation details where the real devil was hidden. Because once more, I had to fight the platform itself to do the simplest things. I’m not even ranting about having to switch to ArrayBuffers for MozTCPSockets while binary strings had been working just fine for me for all these years. Maybe I was just lucky, maybe KaiOS 2.5.4 performs some internal conversions that ruin everything - who knows, but obviously strings are not consistent between versions while ArrayBuffers are. At the end of the day, I was able to download binary blobs of any size that KaiOS can physically handle. The main issue I encountered afterwards was how to actually save them. I tried using MozActivities but they only handle a very limited number of types. I tried using window.open() (the same as for opening the images for viewing) on the blob object URLs but they don’t allow specifying the file name to download them as (and the anchor download attribute trick obviously doesn’t work in the app context the same way it works in a normal browser). Moreover, any application/octet-stream blob is automatically treated by KaiOS as something having a .exe suffix, which is absolute nonsense and outright heresy. So, I ended up having to use the B2G Storage API, but it has its own quirks that I even had to describe in the project’s README so that nobody gets any surprises. Like, if you don’t use an SD card or didn’t set the SD storage as the default media storage in the KaiOS settings, you won’t be able to see your downloaded files with the stock file manager, only via MTP or some third party app like Explorer. Or via Gallery/Music/Video, if they are images/sound/video files. Great, right? And if that’s not enough, the Sigma S3500 sKai phone has a very quirky 2.5.1.1 version with some buggy panel implementation that made me actually rework my HTML layout and introduce the "chrome":{"statusbar":"overlap"} directive into the manifest, and then retest the app everywhere else to ensure its appearance was now consistent. All this posed a real challenge to my motivation, but I finally exhaled when everything had been solved, and now I can focus on more important things around this and other projects.
Anyway, although I managed to fit a fully working Gopher client for KaiOS into less than 500 SLOC of HTML+CSS+JS, all this still goes to show how much effort was spent on making it work correctly on this particular mobile system, as opposed to the effort of implementing the protocol and Gophermap processing per se. And because the codebase, while small and reusable, still contains a lot of clutter between the person reading the code and the actual Gopher implementation, I decided to go even further to demonstrate how simple and human-scale Gopher really is. Let’s implement a Gopher browser in pure… Bash!
Now, before we even start, let me first clarify what I mean by “pure Bash”:
- absolutely no external network helper software like telnet, nc, curl, aria2c etc - only Bash’s /dev/tcp/[host]/[port] pseudo-devices must be used;
- absolutely no external scripting like awk, perl etc;
- also, no sed or grep, as Bash itself has the basic string search and parsing functionality;
- external but essential POSIX commands such as cat or mv are allowed as long as there is no viable builtin alternative to them;
- where possible, Bash builtins are preferred to the external standard shell commands of the same meaning (e.g. [[ is preferred to [), and overall bashisms are not only allowed but encouraged if they make the code more compact.
And in order to be a completed proof-of-concept, our browser must at least be able to:
- fetch any resource from a Gopher server;
- download all the binary resources directly to the user’s home or current working directory;
- display plain text resources in a readable format;
- display Gophermaps in a readable and navigable format;
- display external (non-Gopher) URLs inside Gophermaps in a readable format;
- allow the user to jump between the links in a Gophermap and click on them (not necessarily using a mouse - Enter key will do);
- prompt the user for some input when navigating to a 7-type resource and pass this input accordingly;
- allow the user to return back from non-navigable (plaintext) resources at least to the closest Gophermap the resource was called from;
- accept the starting host, port, selector and resource type from the command line.
With that said, let’s begin building our client (naturally, I’m calling it Bopher). Fetching a resource (the protocol itself) is the most straightforward thing to implement here and the function to do this actually takes four lines of Bash:
gophetch() { # args: host, port, selector
However, to make things a little more robust and comfortable for us in the future, let’s first change our file descriptor to 4 (because e.g. macOS can crash our process if we use 3 for some reason) and introduce the fourth “input” parameter that, if present, will be appended to the selector right away after a tab character. This will make our gophetch function a little longer but still very small:
gophetch() { # args: host, port, selector[, input]
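What follows is not the original listing but a minimal sketch of what such a function could look like, assuming only what is described above: a /dev/tcp pseudo-device opened on file descriptor 4, a CRLF-terminated selector, an optional tab-separated input, and cat to drain the socket. The gopher_request helper is a hypothetical addition that just makes the request line easy to inspect on its own.

```bash
#!/usr/bin/env bash
# gopher_request (hypothetical helper): print the raw request line.
# args: selector[, input]
gopher_request() {
  local sel=$1
  if [[ $# -gt 1 && -n $2 ]]; then
    sel+=$'\t'$2          # the search input follows the selector after a tab
  fi
  printf '%s\r\n' "$sel"  # every request is terminated with CRLF
}

gophetch() { # args: host, port, selector[, input]
  exec 4<>"/dev/tcp/$1/$2" || return 1   # open a read/write TCP socket on fd 4
  gopher_request "$3" "${4-}" >&4        # send the request
  cat <&4                                # read until the server closes the connection
  exec 4>&-                              # release the descriptor
}
```

With these definitions, something like gophetch gopher.floodgap.com 70 '' should print the root menu of that server, and redirecting the output to a file is all it takes to save a binary resource.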
What we do with the output though, is none of our concern just yet. We only know that this output can be binary (in which case it will go straight to a file we specify) or plaintext, which, in turn, can be a Gophermap (to be processed and then displayed) or non-Gophermap (to be displayed as is). Now, how do we process a Gophermap? For every Gophermap line, we must determine two things based on its type: first, how to display it, up to the point whether it is clickable or not, second, if it is clickable, what to actually do when the user clicks on it. That’s why, after also flattening out all the idiosyncrasies with optional fields and non-standard line types, we convert the Gophermap into a much more rigid structure I called the action menu. Every line of it (further referred to as AM line) consists of exactly five tab-separated components: action, description, host, port and selector string, where action is a single letter that unambiguously determines both how to display the line and what to do when the user clicks on it to navigate further. For the purposes of current project, I decided to define five action types:
- E: echo (display) the description field, ignore the rest and don’t make the line clickable/navigable;
- P: generate a link, display the contents on click;
- D: generate a link, download the contents on click (take the name from the selector’s last part);
- M: generate a link, parse the contents on click as a Gophermap and display a new action menu;
- I: generate a link, prompt the user for an input on click, add it to the selector (tab-separated), parse the response as a Gophermap and display a new action menu.
Every Gopher resource type in existence boils down to one of these five actions. Rinse and repeat for every line in the Gophermap, and you have something to already start building your UI upon. Here is the function to convert every line of Gophermap to a corresponding action menu line in relation to the current hostname and port number:
# styling used in the TUI
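As an illustration of this conversion (again a sketch, not the original listing), here is one way to turn a Gophermap line into an AM line. The type-to-action mapping follows the five actions listed above. Two choices are my own assumptions: h lines become non-clickable E lines with the external URL appended to the description, and any unknown type is also shown as a plain E line.

```bash
#!/usr/bin/env bash
gmparse() { # args: Gophermap line, current host, current port; output: AM line
  local line=${1%$'\r'} rest fields=() type desc sel host port action
  rest=$line
  while [[ $rest == *$'\t'* ]]; do       # split on single tabs, keep empty fields
    fields+=("${rest%%$'\t'*}")
    rest=${rest#*$'\t'}
  done
  fields+=("$rest")
  if (( ${#fields[@]} == 1 )); then      # tab-less line: plain information message
    type=i desc=$line
  else
    type=${fields[0]:0:1} desc=${fields[0]:1}
  fi
  sel=${fields[1]-} host=${fields[2]:-$2} port=${fields[3]:-$3}
  case $type in
    i|3) action=E ;;                     # information and error messages
    0) action=P ;;                       # plain text: print on click
    1) action=M ;;                       # another Gophermap
    7) action=I ;;                       # search: ask for input first
    5|9|g|I|d|s|';') action=D ;;         # binaries of all kinds: download
    h) action=E; desc+=" [${sel#URL:}]" ;;  # assumption: show the URL, no click
    *) action=E ;;                       # assumption: unknown types are just shown
  esac
  printf '%s\t%s\t%s\t%s\t%s\n' "$action" "$desc" "$host" "$port" "$sel"
}
```

For instance, gmparse $'7Search Veronica\t/v2/vs' gopher.example.org 70 (a made-up entry) yields an I line pointing at gopher.example.org, port 70, selector /v2/vs.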
Notice how this function only handles a single Gophermap line and not all the lines at once. There is a reason for that, but we’ll get to it later. Now, we need to define two functions to work with an action menu entry: a displayer and a clicker. Let’s start with the clicker, the function that performs the action when the user clicks on the entry. To be honest, the hardest part about this was to decide what this function actually should return in each case. And I decided for nothing better than… new AM lines. Yes, in fact, the entire UI is going to be built on top of the same line format, even the plaintext output will be converted to it. This allows us to unify the controls logic we’ll get to very soon.
amclick() { # args: AM line, output: AM line(s)
Now you know why we only need to parse a Gophermap one line at a time. As you can already see, the E AM line type is a bit relaxed, as we aren’t going to check for any other field except the description when displaying it. Speaking of which, let’s write a displayer function:
amdisplay() { # args: AM line, output: displayed line
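A displayer along these lines can be very small. The sketch below is not the original function: the bracketed markers are hypothetical stand-ins for whatever styling the real TUI uses, and only the action and the description are ever looked at, which is what allows E lines to be relaxed.

```bash
#!/usr/bin/env bash
amdisplay() { # args: AM line, output: displayed line
  local action=${1%%$'\t'*} rest=${1#*$'\t'}
  local desc=${rest%%$'\t'*}     # works even if nothing follows the description
  case $action in
    E) printf '      %s\n' "$desc" ;;   # plain message, indented to align with links
    P) printf '[TXT] %s\n' "$desc" ;;
    D) printf '[BIN] %s\n' "$desc" ;;
    M) printf '[DIR] %s\n' "$desc" ;;
    I) printf '[ASK] %s\n' "$desc" ;;
  esac
}
```

So an M line with the description Phlog would be shown as [DIR] Phlog, while an E line is just indented by the width of the marker.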
These four functions (gophetch(), gmparse(), amclick() and amdisplay()) actually are everything we need to start gluing the UI together. To actually create a navigable UI from what we already have, let’s introduce the concept of a screen buffer. It is a globally accessible Bash array of every AM line of the currently open document. The client will render the visible portion of it line by line, using the amdisplay() function and some additional logic. The second array we need is a navigation vector. It’s an array of the numbers of AM lines that have a non-E entry type, i.e. represent clickable links. Finally, we need a global integer variable to keep track of which link is currently focused, which would hold an index in the navigation vector. This might sound complicated but let’s look at the code of loading an actual resource into these arrays:
declare -a SCREENBUF # screen buffer array placeholder
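To show how the two arrays relate to each other, here is a minimal sketch that fills them from AM lines arriving on standard input. It is not the original loader: the amfill and NAVINDEX names are hypothetical, and the real function additionally performs the click action and triggers a redraw.

```bash
#!/usr/bin/env bash
declare -a SCREENBUF=()    # every AM line of the currently open document
declare -a NAVIGABLES=()   # indices of SCREENBUF entries that are clickable
declare -i NAVINDEX=0      # hypothetical name: index of the focused link in NAVIGABLES

amfill() { # reads AM lines from stdin into the global arrays
  local line
  SCREENBUF=() NAVIGABLES=() NAVINDEX=0
  # the extra test keeps a last line that has no trailing newline
  while IFS= read -r line || [[ -n $line ]]; do
    SCREENBUF+=("$line")
    if [[ ${line%%$'\t'*} != E ]]; then
      NAVIGABLES+=($(( ${#SCREENBUF[@]} - 1 )))   # remember where the link lives
    fi
  done
}
```

Note that the function has to be fed through a redirection such as amfill < <(some_command) rather than through a pipe, because a pipe would run it in a subshell and the arrays would be lost.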
Yeah, that simple. Again, we’re still using an AM line as the starting point here. Note that this function also calls the jumplink() function we haven’t written yet. Therefore, now is the time to look at how link navigation would be implemented here:
jumplink() { # args: delta (usually 1 or -1)
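A sketch of this kind of link jumping, under stated assumptions, might look as follows. It is not the original code: NAVINDEX, FOCUSLINE, SCROLLPOS and ROWS are hypothetical global names, the scrolling adjustment simply pulls the focused line back into the visible window, and amrender is only called if it has been defined elsewhere.

```bash
#!/usr/bin/env bash
declare -a NAVIGABLES=()          # indices of clickable lines in the screen buffer
declare -i NAVINDEX=0             # position within NAVIGABLES
declare -i FOCUSLINE=-1           # screen buffer line to highlight
declare -i SCROLLPOS=0 ROWS=24    # first visible line and terminal height

jumplink() { # args: delta (usually 1 or -1, 0 just re-renders)
  local -i n=${#NAVIGABLES[@]}
  if (( n > 0 )); then
    # double modulo keeps the result non-negative, so -1 wraps to the last link
    NAVINDEX=$(( ((NAVINDEX + ${1:-0}) % n + n) % n ))
    FOCUSLINE=${NAVIGABLES[NAVINDEX]}
    if (( FOCUSLINE < SCROLLPOS )); then
      SCROLLPOS=$FOCUSLINE                       # link is above the window
    elif (( FOCUSLINE > SCROLLPOS + ROWS - 2 )); then
      SCROLLPOS=$(( FOCUSLINE - ROWS + 2 ))      # link is below the window
    fi
  fi
  if declare -F amrender >/dev/null; then amrender; fi
}
```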
This function accepts a single parameter, delta, which can be 0, 1, -1 or virtually any other integer number, however these three values are the most practical for browsing purposes (to re-render the current position, jump to the next or previous link relative to the one we’re currently focusing on). Using the wonders of Bash arithmetic builtins, this function calculates the next index in the NAVIGABLES array and also updates the actual index of the line to focus upon rendering, adjusts the scrolling position and finally calls the amrender() function that will do all the heavy lifting for us. Now, scrolling handling is also very easy using Bash’s own arithmetic and array API:
scroll() { # args: delta (usually 1 or -1)
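For illustration, scrolling can be reduced to clamped integer arithmetic on a single position variable. This is only a sketch with hypothetical global names (SCROLLPOS, ROWS), and it assumes that the last terminal row stays unused, so ROWS - 1 lines are visible at once.

```bash
#!/usr/bin/env bash
declare -a SCREENBUF=()           # AM lines of the current document
declare -i SCROLLPOS=0 ROWS=24    # first visible line and terminal height

scroll() { # args: delta (usually 1 or -1)
  # the furthest we may scroll so that the last line is still on screen
  local -i max=$(( ${#SCREENBUF[@]} - (ROWS - 1) ))
  if (( max < 0 )); then max=0; fi
  SCROLLPOS=$(( SCROLLPOS + ${1:-0} ))
  if (( SCROLLPOS < 0 )); then SCROLLPOS=0; fi
  if (( SCROLLPOS > max )); then SCROLLPOS=$max; fi
  if declare -F amrender >/dev/null; then amrender; fi
}
```

Page Up and Page Down then become nothing more than scroll calls with a delta of minus or plus the number of visible rows.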
Note that our scrolling and focusing model doesn’t account for the existence of wrapped text lines. For this prototype, we just assume there are none. Which is fine for our prototype in Bash, because nowadays terminals are usually much wider than the Gopher content preformatted on the server side for them. Now, everything is actually set for us to define our rendering method:
amrender() {
Here, more explanation might be required about what’s really going on. First, we refresh the current terminal dimensions (although we are more interested in row count only at this point). Then, we determine the lower bound of the screen buffer contents we’re actually going to render. Then, we add the total amount of rows minus 2 (because the lowermost row is always going to be unused) to this position and get the upper rendering bound (which, of course, also can’t exceed the screen buffer size itself). Then, we do our best to clear the terminal screen and put the cursor to the upper left corner. And then, for every line within our determined boundaries, we either run our amdisplay() function directly or, if it’s a link that also happens to be currently focused, prepend it with a special terminal attribute that swaps the background and foreground colors in order to make a nice selection appearance. Finally, we reset the cursor to the first character of the last line on the screen. Additionally, and I don’t know whether it actually works correctly, we bind this rendering function to the SIGWINCH signal to react to terminal window changes automatically and not just on the user interaction. All this happens purely using Bash internals, the only external command being the POSIX-compliant stty.
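The rendering steps described above can be sketched roughly like this. It is an approximation rather than the original function: SCROLLPOS and FOCUSLINE are hypothetical global names, a 24x80 terminal is assumed whenever stty cannot report the size, and the description field is printed as is when no amdisplay function has been defined.

```bash
#!/usr/bin/env bash
declare -a SCREENBUF=()             # AM lines of the current document
declare -i SCROLLPOS=0 FOCUSLINE=-1 # first visible line, highlighted line

amrender() {
  local size rows i upper line text
  size=$(stty size 2>/dev/null) || size='24 80'  # "rows cols"; fallback without a tty
  rows=${size% *}
  upper=$(( SCROLLPOS + rows - 2 ))              # the lowermost row stays unused
  if (( upper >= ${#SCREENBUF[@]} )); then upper=$(( ${#SCREENBUF[@]} - 1 )); fi
  printf '\e[2J\e[H'                             # clear the screen, cursor to top left
  for (( i = SCROLLPOS; i <= upper; i++ )); do
    line=${SCREENBUF[i]}
    if declare -F amdisplay >/dev/null; then
      text=$(amdisplay "$line")
    else
      text=${line#*$'\t'}; text=${text%%$'\t'*}  # just the description field
    fi
    if (( i == FOCUSLINE )); then
      printf '\e[7m%s\e[0m\n' "$text"            # reverse video for the focused link
    else
      printf '%s\n' "$text"
    fi
  done
  printf '\e[%d;1H' "$rows"                      # park the cursor on the last row
}

trap amrender WINCH   # redraw when the terminal window is resized
```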
To ensure no accidental click logic is run when a document doesn’t have any links per se, let’s wrap our loader into a clicklink function:
clicklink() { # click the currently focused link
Now, the user interaction part is just as fun as I had expected. First, we grab the command line parameters and default them to some sensible values somewhere at the beginning of our script:
START_HOST=$1
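Defaulting positional parameters is a one-liner per variable thanks to the ${N:-default} expansion. In the sketch below, the parse_start_args wrapper is hypothetical, the argument order follows the requirements list (host, port, selector, type), and the default values themselves (the Floodgap host, port 70, an empty selector and type 1) are my guesses at what “sensible” could mean here.

```bash
#!/usr/bin/env bash
parse_start_args() { # args: [host [port [selector [type]]]]
  START_HOST=${1:-gopher.floodgap.com}  # assumed default host
  START_PORT=${2:-70}                   # the usual Gopher port
  START_SEL=${3:-}                      # empty selector means the root menu
  START_TYPE=${4:-1}                    # the root is expected to be a Gophermap
}

parse_start_args "$@"
```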
Then, at the start of the actual execution, we need to convert this data to a valid AM line to proceed with loading. The easiest way to do this would be to first shape a virtual Gophermap line (sans the description as we don’t need one) and then use the gmparse() and amload() functions accordingly, also resetting the scroll position as necessary:
START_GMLINE="$(printf "%s\t%s\t%s\t%s" "$START_TYPE" "$START_SEL" "$START_HOST" "$START_PORT")"
So, our buffers are populated and all the initial rendering is done. Now, we can start the main input loop. It uses the ability of Bash’s read builtin to read exactly N characters from standard input at a time. Since our renderer always returns the visual cursor to the last line of the terminal, all the input will be done there. For escape sequences, we capture two more characters as needed to process arrow keys. Some keys also generate a third character, ~, but it can be automatically dropped at the next loop iteration. At the bare minimum, we can define keys for line scrolling (Arrow up/down and k/j in the Vim fashion), jumping between the links (w/s), clicking on a link (Enter/Return key), going back from plaintext to the closest Gophermap (b), and quitting the browser (q). For my own convenience, I have also added Page Up/Page Down functionality duplicated on the h/l keys as well. Let’s see what we’ve got:
cmdbuf='' # allocate command buffer
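The key-reading part of such a loop can be sketched as a small function that turns raw input into symbolic key names. This is not the original loop: the readkey name and the returned words are hypothetical, and the escape sequences assumed here (ESC [ A and ESC [ B for the arrows, ESC [ 5 ~ and ESC [ 6 ~ for Page Up and Page Down) are the common xterm-style ones.

```bash
#!/usr/bin/env bash
readkey() { # reads one key from stdin, prints a symbolic name
  local key rest=''
  IFS= read -rsn1 key || return 1          # fails on end of input
  if [[ $key == $'\e' ]]; then
    # an escape sequence: grab the next two characters, if they arrive at once
    IFS= read -rsn2 -t 0.05 rest || true
    case $rest in
      '[A') echo up ;;
      '[B') echo down ;;
      '[5') echo pgup ;;                   # the trailing ~ is read on the next call
      '[6') echo pgdn ;;
      *) echo esc ;;
    esac
    return 0
  fi
  case $key in
    '') echo enter ;;                      # read strips the newline delimiter
    k) echo up ;;
    j) echo down ;;
    h) echo pgup ;;
    l) echo pgdn ;;
    w) echo prev ;;
    s) echo next ;;
    b) echo back ;;
    q) echo quit ;;
    *) echo other ;;                       # includes the stray ~ mentioned above
  esac
}
```

A main loop could then be as simple as while key=$(readkey); do case $key in ... esac; done, dispatching each name to the scrolling, jumping and clicking functions.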
And… that’s it. That’s the entire Gopher browser written from scratch in under 185 source lines of pure Bash code, with the only external dependencies being the cat and stty commands. This browser should work in Bash 4.2 and up (I only tested it in 5.1), and it allows you to do everything outlined in our MVP requirements above: browse and navigate Gophermaps, make search requests, view plain text documents, download binaries, return from a non-navigable document to the closest Gophermap available. It’s in no way optimal (I’m not a Bash expert by any means) but it works for me and I actually prefer browsing with it to using Lagrange or other clients as well. The complete bopher.sh source code (as implemented in this post + additional comments) is available right here and you can try it out yourself if you want to. Just start it with, for instance, bash bopher.sh gopher.floodgap.com and be amazed at what the Gopherspace actually has to offer.
And so, I started this post with the statement “I wrote a browser”, and ended up writing a second one within several hours. Gopher is really fun. And this journey surely will continue.