Resolving FileSystems

This kinda relates to the web OS project, but it can be read by itself as well...

What is a resolving filesystem?

Well, I don't have the specs yet, but it's basically a special kind of synthetic filesystem I came up with for my OS project. It's a way to interact with servers over network protocols with just file I/O operations.

It's primarily a type of filesystem and I will write some sort of informal specification later, but there is also a reference implementation, which will be part of the OS.

The implementation consists of a Go module, which developers can easily combine with any kind of network protocol they please... Well... I don't have any way to handle continuous connections at the moment, but that might get added at some point as well. I'm currently planning to implement the following protocols:

Why Go?

For the filesystem part, I want to use FUSE. I have only used FUSE in Python before, but I didn't want to write the servers in Python. Mainly because I don't want to have multiple Python instances running all the time, mostly doing nothing, and secondarily, I just don't enjoy coding in Python all that much. Like, it's not the worst, but I'd rather not.

I decided to go for something 'mid-level'. Compiled, statically typed, but still with garbage collection. Go seemed to be supported well enough (it actually has its own implementation of FUSE, not just bindings) and it just generally seems like a nice language.

Design

For now, let's imagine having some form of web rfs mounted at '/mnt/web/'. If you 'ls' it, it won't show much, just a ':c' directory, which stands for ':config'. In the config, there might be multiple files, but there will always be a 'flush' file. The config is used to give instructions to the filesystem itself. Flush tells it to flush cached data.

Now, say I want to see my homepage. I will write:

cat /mnt/web/ctrl-c.club/~de_alchmst:

This will fetch my homepage via HTTP GET and return its contents. If the page doesn't exist, the read returns an I/O error, so that a missing page can be told apart from an empty one.

After this, there will still be only ':c' visible in '/mnt/web/', but the file is cached for some time.

This is one of the important design choices of resolving filesystems. Directories and files only exist once they are requested. Even with web crawlers, it is impossible to get a list of all resources on a web server, because some might just not be linked to from anywhere. Web crawling also takes time, and RFS is not really supposed to be a browser, but an application API, so we expect the user to already know what resources they need.

Another fun thing is the trailing colon. This is a mark of the limitations of my time (that being Linux). In Linux, a file can either be a directory, or not. You cannot read from a directory, nor can you access files in a non-directory.

When you resolve a path, each element is asked for information separately, so I have no way of knowing whether I should report a given part of the path as a directory or not. For those reasons, the user must help a bit by providing this information for me.

You can still access directories whose names end in a colon, but you will have to use the percent escape syntax.
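
The per-element decision could look roughly like this. It is only a sketch: 'classify' is a hypothetical name, and it assumes that the percent escape works like it does in URLs, so a literal colon is written as '%3A'.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classify decides how a single path element should be reported.
// Assumptions: a trailing ':' marks a file, everything else is a
// directory, and percent escapes follow URL rules ("%3A" is ':').
func classify(element string) (name string, isFile bool, err error) {
	isFile = strings.HasSuffix(element, ":")
	raw := strings.TrimSuffix(element, ":")
	name, err = url.PathUnescape(raw)
	return name, isFile, err
}

func main() {
	for _, el := range []string{"ctrl-c.club", "~de_alchmst:", "odd%3A"} {
		name, isFile, err := classify(el)
		if err != nil {
			fmt.Println("bad escape in", el)
			continue
		}
		fmt.Printf("%q file=%v\n", name, isFile)
	}
}
```

Because the escaped colon is only decoded after the marker check, 'odd%3A' stays a directory named 'odd:'.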

Now, say you want to send some data to a server. You would do:

echo '{"hello":"world"}' > /mnt/web/:json/api.example.com:
cat < /mnt/web/:json/api.example.com:

There are a few things to notice. First, after writing to a resource, you can get the response the next time (and only the next time) you read from it. Responses are bound to the source PID, so you cannot just use a plain cat; the reading must be done by the same process that did the writing.

The next interesting part is the ':json'. Any leading directories starting with a colon (except for ':c') are treated as modifiers given to the request. In this case, we are specifying to use the 'text/json' MIME type.

Modifiers can be stacked and their order generally doesn't matter. Note however, that data is cached based on the entire path, including the specific order of modifiers.

This can be abused a bit with the ':nop' modifier. This modifier does nothing, but can be used to differentiate multiple parallel requests to the same resource done by the same process.
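
To make the modifier rules concrete, here is a rough sketch of how a path relative to the mount point could be split. 'splitModifiers' is a hypothetical function, not taken from the reference implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModifiers separates the leading ':'-prefixed elements (except the
// reserved ":c") from the rest of a path relative to the mount point.
func splitModifiers(rel string) (mods []string, rest string) {
	parts := strings.Split(strings.Trim(rel, "/"), "/")
	i := 0
	for i < len(parts) && strings.HasPrefix(parts[i], ":") && parts[i] != ":c" {
		mods = append(mods, parts[i])
		i++
	}
	return mods, strings.Join(parts[i:], "/")
}

func main() {
	// Same modifiers, different order: the request is the same,
	// but each full path is its own cache key.
	for _, rel := range []string{
		":json/:nop/api.example.com:",
		":nop/:json/api.example.com:",
	} {
		mods, rest := splitModifiers(rel)
		fmt.Printf("key=%q mods=%v resource=%q\n", rel, mods, rest)
	}
}
```

Using the unsplit path as the cache key is exactly what makes ':nop' useful: it changes the key without changing the request.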

One last thing to note. All the requests are coming from the same source, so it works best with stateless APIs. Usage with stateful APIs should work, but only if it's used by just one process at a time.

So yea, those are basically resolving filesystems from the user perspective.

Go

This being my first Go project, I have thoughts.

Go is nice. It's easy to learn, at least with my knowledge level, and it mostly just works. It's relatively minimal, but not to the extreme. No funny syntax to remember, no BS, just a language to get the job done.

Packages are handled in an interesting way. You just import the library from the code, including the URL of its repo, and Go takes care of the rest. Unless there is some system I don't know of, it can cause a lot of problems if you decide to migrate to a different git host, but it might be handled somehow, IDK.

There is even a way to handle packages in development via workspaces, so that's nice.

Errors are handled via multiple return values. I think that it's way better than any form of try-catch. This might be subjective, but at least for me, this design leads me more towards handling the error, instead of just letting it happen.
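
For readers who haven't seen it, the pattern looks like this. The function is invented for illustration ('parseTTL' does not exist in the RFS code):

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseTTL turns a string into a TTL value. The error is an ordinary
// second return value, so the caller has to decide what to do with it.
func parseTTL(s string) (int64, error) {
	ttl, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid TTL %q: %w", s, err)
	}
	if ttl < 0 {
		return 0, errors.New("TTL must not be negative")
	}
	return ttl, nil
}

func main() {
	for _, input := range []string{"60", "soon", "-1"} {
		ttl, err := parseTTL(input)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ttl:", ttl)
	}
}
```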

Given that Go is derived from Pascal (well, it's derived from Limbo, which is a mix of Pascal and C), I would expect it to have a complex type system. And it delivered.

You cannot define range types, which is a bit sad, but whatever. What you do get, tho, are interfaces. Interfaces are the closest you get to OOP in Go, but I think they might be even better than traditional OOP (more research is still needed, however).

You can define methods on any type you declare yourself, including types based on primitives. Then, you can define an interface, which is basically a set of required methods for a type. Then, you can use that interface as an argument type, to which you can pass any type that implements said interface.

This is nice from a library development perspective, as you can let users define their own types.

Interfaces are also a nice way to pass functions as an argument to a library or something.

One last thing to mention is that I finally had a need to use generics. I never needed them before, probably because all the situations where I would have needed them happened to me in a dynamically typed language. Anyways, Go has them, and they work about how you'd expect. Here is an example file implementing the cache logic. I have normal 'entries', which are 'map[string]*struct', and 'pidEntries', which use a struct containing a PID and a string as the key.

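package rfs

import (
  "time"
)

var (
  // Entries live for about 5 minutes. The cache is swept every
  // CacheFlushTimeout, and DefaultTTL counts sweeps, not seconds.
  CacheFlushTimeout = 5 * time.Second
  DefaultTTL int64 = 5 * 60 / 5
)

// cacheFlushing runs forever and ages both caches once per sweep.
func cacheFlushing() {
  for {
    time.Sleep(CacheFlushTimeout)
    flushStep(entries)
    flushStep(pidEntries)
  }
}

// flushAll drops every entry from the path cache.
func flushAll() {
  for path := range entries {
    delete(entries, path)
  }
}

// flushStep ages one cache by a single sweep. It is generic over the key
// type, so the same code serves both entries and pidEntries.
func flushStep[K comparable](ent map[K]*pathEntry) {
  for key, e := range ent {
    // Requests still in flight are never evicted.
    if e.Status == entryStatusProcessing {
      continue
    }

    if e.TTL <= 0 {
      delete(ent, key)
    } else {
      e.TTL -= 1
    }
  }
}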
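
The file above leans on definitions from elsewhere in the package. To try the generic function on its own, something like the following would do. The 'pathEntry' fields, the 'pidKey' struct and the status constants here are guesses for illustration, not the actual definitions:

```go
package main

import "fmt"

type entryStatus int

const (
	entryStatusReady entryStatus = iota
	entryStatusProcessing
)

// pathEntry is a stand-in holding only the fields flushStep touches.
type pathEntry struct {
	Status entryStatus
	TTL    int64
}

// pidKey is an assumed shape for the per-process cache key.
type pidKey struct {
	PID  uint32
	Path string
}

// flushStep works for any map whose key type is comparable.
func flushStep[K comparable](ent map[K]*pathEntry) {
	for key, e := range ent {
		if e.Status == entryStatusProcessing {
			continue
		}
		if e.TTL <= 0 {
			delete(ent, key) // deleting during range is allowed in Go
		} else {
			e.TTL--
		}
	}
}

func main() {
	entries := map[string]*pathEntry{
		"fresh":   {TTL: 2},
		"expired": {TTL: 0},
	}
	pidEntries := map[pidKey]*pathEntry{
		{PID: 42, Path: "busy"}: {Status: entryStatusProcessing, TTL: 0},
	}

	flushStep(entries)    // K is inferred as string
	flushStep(pidEntries) // K is inferred as pidKey

	fmt.Println(len(entries), entries["fresh"].TTL, len(pidEntries)) // prints "1 1 1"
}
```

A struct works as a map key as long as all of its fields are comparable, which is why the PID and path pair needs no extra code.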

Also yea, I guess the 'var', 'const' and 'import' blocks, also inspired by Pascal, should be mentioned. But unlike Pascal, you use the ':=' operator to declare a new variable and assign to it, with the type inferred automatically, all at once. That one is also nice.

Go generally stripped a lot of the additional Pascal syntax.

One flaw of Go is that it's one of those languages that don't like unused symbols. This is especially annoying when you want to add a few test prints, so you have to import 'fmt', but then you want to remove them again, so you also need to remove 'fmt'. Oh, well... Maybe I should finally learn some form of debugger, tho I'm not sure how nicely it would play with the FUSE library. (bazil.org/fuse BTW)
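
Two common workarounds, for what it's worth: the builtin 'println' needs no import at all (it writes to standard error), and assigning something from a package to the blank identifier keeps the import alive while the prints come and go. The 'double' function is just a placeholder:

```go
package main

import "fmt"

// Keeps the fmt import in use even when every debug print is removed.
var _ = fmt.Sprintf

func double(n int) int {
	// Builtin println: no import needed, output goes to stderr.
	println("double called with", n)
	return n * 2
}

func main() {
	double(21)
}
```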

(repop)

nil

I don't have any public repo yet. I'm quite close to finishing (about one b-log), so I'll just release it then all at once, including the library, specification and gopher and web RFSes.