GlitterFS, the synthetic FUSE filesystem

What is a 'synthetic filesystem'?

Well, it is a filesystem containing synthetic files. Not very helpful, is it? Think '/dev/' and '/proc/'.

These are filesystems that are mounted somewhere in your system and they do contain files, but they are not files in your traditional sense. Instead of being a bunch of data somewhere on your disk, they are representations of your hardware devices and running processes. You can still interact with them via your normal file operations and I think that makes them very convenient.
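To make that concrete, here is a small Python sketch that reads a synthetic file with nothing but ordinary file operations. It assumes a Linux-style '/proc'; the helper name 'read_proc_status' is made up for this example.

```python
import os


def read_proc_status(path="/proc/self/status"):
    """Parse 'Key: value' lines from a procfs status file into a dict."""
    if not os.path.exists(path):
        # No procfs on this system; nothing to read.
        return {}
    fields = {}
    # Plain open() and iteration: no special API is needed for synthetic files.
    with open(path, encoding="utf-8") as status:
        for line in status:
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    info = read_proc_status()
    print(info.get("Name"), info.get("Pid"))
```

The kernel makes up the contents at read time, yet from Python's side it is just a text file.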

I like this idea and I want to play with it for a bit.

The plan

So a while ago, I found a website that can convert text to glitter GIFs (pronounced gifs) and I made a simple web app that would use this to generate glitter text for you. I think it would be cool to have a filesystem where you create a file and, every time you read from it, you get a different glitter rendering of its name.

As far as synthetic filesystems go, there are two popular ways to make them. For multiplatform network filesystems, there is the 9p protocol (well, the 'p' already stands for 'protocol', but whatever). It can also be used locally, but on *nix machines, FUSE is more popular for local filesystems. You might know FUSE as the thing you use for mounting more obscure filesystems, but it's also well suited for synthetic ones.

The implementation

Source can be found here.

To run this program, you will need the 'fusepy' library. I chose Python because I wanted a high-level language and I knew that Python has a working FUSE library that should be up to date and well documented.

The first thing I did was ask AI to generate a working FS in Python. I don't usually use AI to write code in my hobby projects, but it's very good for learning new things and writing examples.

'fusepy' provides an 'Operations' class that you can inherit from. It contains a bunch of methods for file operations, but by default most of them just raise an error. How you represent files is up to you; this implementation keeps one dictionary for file data and one for file info, both keyed by file path.

Remember, that directories are also files. This is *nix after all.

Many of these file IO operations are meant to be called repeatedly. 'readdir' is written as a generator, which is nice I guess.
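A minimal sketch of that layout might look like the following. It is not the actual GlitterFS source: the class name 'MemoryFS' and the attribute names are hypothetical, and the fallback at the top exists only so the sketch can be tried without fusepy or libfuse installed.

```python
import errno
import stat
import time

try:
    from fuse import FuseOSError, Operations
except (ImportError, OSError):
    # Stand-ins so the sketch can be exercised without fusepy/libfuse.
    Operations = object

    class FuseOSError(OSError):
        pass


class MemoryFS(Operations):
    """Hypothetical minimal FS: two dicts keyed by path."""

    def __init__(self):
        now = time.time()
        # File info (what getattr returns), keyed by path. "/" is a file too.
        self.info = {"/": dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2,
                               st_size=0, st_ctime=now, st_mtime=now,
                               st_atime=now)}
        # File contents, keyed by path.
        self.data = {}

    def getattr(self, path, fh=None):
        if path not in self.info:
            raise FuseOSError(errno.ENOENT)
        return self.info[path]

    def readdir(self, path, fh):
        # Written as a generator: yield one entry name at a time.
        yield "."
        yield ".."
        for name in self.info:
            if name != "/":
                yield name[1:]  # flat FS: everything lives directly under "/"

    def create(self, path, mode, fi=None):
        now = time.time()
        self.info[path] = dict(st_mode=stat.S_IFREG | mode, st_nlink=1,
                               st_size=0, st_ctime=now, st_mtime=now,
                               st_atime=now)
        self.data[path] = b""
        return 0  # file handle; this sketch does not track handles
```

With fusepy installed, something like 'FUSE(MemoryFS(), "/some/mountpoint", foreground=True)' should mount it, where the mount point is a placeholder.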

The one method that I'm interested in is 'read'.

My initial plan was simple: Get an image from the server and return it.

The website I used in the original webapp is glittertextonline.com. It didn't seem to work when I tried using it now tho, so I used gigaglitters.com instead.

Both of these websites run off of forms, which means you can use them as APIs. First, open 'Developer Tools' by pressing F12. Now you can see the source of the webpage and find the form. I recommend using the small icon in the top left corner to select an element in the form and have your browser locate it for you.

Anyways, now you can see where the form submits to and what its elements are called. Here we can see that gigaglitters.com sends generation requests to the 'procesing.php' page. We can also see that not all elements need to be filled out, so you can make a valid request using the following URI:

https://www.gigaglitters.com/procesing.php?text=testing
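If you would rather build that URI in code, a sketch using only the standard library could look like this. The function name 'glitter_request_url' is made up, and it assumes the site accepts standard form encoding (spaces as '+') in the 'text' parameter.

```python
from urllib.parse import urlencode

# Endpoint taken from the form's action; the spelling "procesing" is the site's.
GENERATE_URL = "https://www.gigaglitters.com/procesing.php"


def glitter_request_url(text):
    """Return the GET URL that asks the site to render `text`."""
    # urlencode escapes spaces and other reserved characters in the value.
    return GENERATE_URL + "?" + urlencode({"text": text})


print(glitter_request_url("testing"))
# https://www.gigaglitters.com/procesing.php?text=testing
```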

Now we want to retrieve the image from the server. You can get the HTML via the curl(1) command. (Don't think that not doing BSDs anymore stops me from linking the manpage!)

The request redirects you back to 'glitter.php', so you have to use the '-L' or '--location' flag:

$ curl -L 'https://www.gigaglitters.com/procesing.php?text=testing'

Now, we want to find the image with the glitter and get its filename. I have found that it's the only '<img>' tag that is immediately followed by 'src' and points to a GIF in the 'created' directory. I can filter it out using sed(1) like so:

$ curl -L 'https://www.gigaglitters.com/procesing.php?text=testing' 2>/dev/null \
       | sed -nE 's/.*<img src=\/created\/(.*\.gif)>.*/\1/p'

The '-E' flag enables extended regex. The '-n' flag suppresses sed's default printing of every line, and the 'p' flag at the end of the substitution prints only the lines where it matched. '\1' refers to the first capture group (the part in parentheses).
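For comparison, roughly the same extraction can be sketched with Python's 're' module. The names here are made up and the sample markup is invented to match the tag shape described above; unlike the sed version, the capture is non-greedy, so it stops at the first '.gif>'.

```python
import re

# Capture the file name of a GIF under /created/, as the sed expression does.
IMG_PATTERN = re.compile(r"<img src=/created/(.*?\.gif)>")


def extract_image_name(html):
    """Return the generated GIF's file name, or None if no tag matches."""
    match = IMG_PATTERN.search(html)
    return match.group(1) if match else None


# Made-up sample markup, not a real response from the site.
sample = "<p>Result:</p><img src=/created/abc123.gif><img src=/logo.png>"
print(extract_image_name(sample))  # abc123.gif
```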

Now you can retrieve the file using curl, but since the response is binary rather than text, you must specify '--output' or '-o' with a file name to store the output in. If you still want to use stdout, you can give '-' as the file name.

$ curl --output - 'https://www.gigaglitters.com/created/<image-name>'

I can get the file contents in python using the 'subprocess' module like so:

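import os
import subprocess
from urllib.parse import quote

# Percent-encode the name so spaces, quotes etc. are safe in the URL
filename = quote(os.path.basename(path))

# Request the glitter page and pull the generated GIF's name out of the HTML
imgname = subprocess.run(
    "curl -L " +
    f"'https://www.gigaglitters.com/procesing.php?text={filename}' " +
    r"| sed -nE 's/.*<img src=\/created\/(.*\.gif)>.*/\1/p'",
    shell=True, capture_output=True).stdout.decode('utf-8').strip()

# Download the GIF itself; stdout holds the raw image bytes
out = subprocess.run(
    f"curl --output - 'https://www.gigaglitters.com/created/{imgname}'",
    shell=True, capture_output=True).stdout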

'.stdout' is a 'bytes' object, so I have to decode the image name to text like so; the image itself stays as raw bytes. Note that I use an r-string for the sed part, so I don't have to escape all the backslashes.

You might think that now you just return this file and it's over, right? Well, not really. You see, many programs read file data in parts. This is why the 'read' method takes 'size' and 'offset'.
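A sketch of what honoring 'size' and 'offset' means, assuming the whole file is already held in memory as bytes. The helper name 'read_slice' and the placeholder data are made up for this example.

```python
def read_slice(data, size, offset):
    """Return at most `size` bytes of `data`, starting at `offset`."""
    # Slicing past the end yields a short (or empty) result, which is how
    # a reader learns it has reached end of file.
    return data[offset:offset + size]


image = b"GIF89a" + bytes(100)  # placeholder bytes standing in for a real GIF
chunks = []
offset = 0
while True:
    chunk = read_slice(image, 32, offset)
    if not chunk:
        break
    chunks.append(chunk)
    offset += len(chunk)

print(len(chunks), b"".join(chunks) == image)  # 4 True
```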

Ok, so I have to load the file only on 0 'offset' or new 'fh' (file handle) and all is good, right?

Well, kinda. Some programs like feh or kitty's 'icat' kitten work, even tho feh takes its time, but other programs like qView go crazy.

You see, if I print 'offset', 'size' and 'fh' on each 'read' call, we learn something scary. When feh loads an image, it opens it once, reads it twice from the start, then closes it, opens it again and reads it once more. qView is even more crazy. It first reads the file a bunch of times, and then it keeps reading it in case it changes, all with new file handles, so the image changes a LOT.

This all makes my idea of generating the file each time an app opens it not very practical, so I only generate the image once, when the file is created. This means that file creation takes some time, but reading after that is fast.
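The generate-once approach can be sketched like this. 'GlitterCache' and 'fake_generate' are hypothetical names, and the generator is a stand-in for the network request so the example runs offline.

```python
import os


class GlitterCache:
    """Hypothetical helper: render once at creation, serve reads from memory."""

    def __init__(self, generate):
        # `generate` is any callable mapping a text to image bytes, for
        # example a wrapper around the curl calls shown earlier.
        self.generate = generate
        self.images = {}

    def create(self, path):
        # The slow part happens exactly once, when the file is created.
        self.images[path] = self.generate(os.path.basename(path))

    def read(self, path, size, offset):
        # Every later read, from any file handle, is just a slice.
        return self.images[path][offset:offset + size]


calls = []


def fake_generate(text):
    # Stand-in for the network request so the sketch runs offline.
    calls.append(text)
    return b"GIF89a:" + text.encode("utf-8")


cache = GlitterCache(fake_generate)
cache.create("/hello")
first = cache.read("/hello", 6, 0)
second = cache.read("/hello", 6, 0)  # a viewer re-reading from the start
print(first, second, len(calls))  # b'GIF89a' b'GIF89a' 1
```

However many times a viewer reopens the file, the generator runs only once per created file, so every read sees the same image.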

What did we learn?

Filesystems are complicated, image viewers work in mysterious ways and Python works well for simple local synthetic filesystems.

I would like to look into networked 9p filesystems next.

Fuse seems to be installed on ^C, so that could lead to some fun...