JPEG HTML Shell Polyglot

What is a polyglot?

Merriam-Webster defines "polyglot" as "one who is polyglot", which is not all that useful. It also defines it as "speaking or writing several languages", which is a bit better. Another definition is "composed of elements from different languages", which, if you add 'programming', is quite a good description of what I'll be doing today.

Ok, now seriously. A polyglot file is a file that passes as multiple valid file types at the same time.

The file I've crafted can be found here.

When you open it in an image viewer, it opens as an image. When you run it with 'sh 9mem.html', it runs a simple script (note that a plain './9mem.html' does not work). When you open it in a browser, it not only loads an HTML page, but also includes itself as an image.

But how does it work?

You start with an image file. Images might feel scary, as they are not encoded in a human-readable format, but they are still just a bunch of bytes in a specific format.

The nice thing about common image formats is that they:

are mostly image data, which can be changed without breaking the file (at least functionally)
are terminated by a certain byte sequence (in JPEG, the end-of-image marker FF D9), after which you can put arbitrary data

One nice script that plays with this is stegstract, which extracts data appended to images. That is not enough for a polyglot, though.
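To make the "data after the terminator" idea concrete, here is a minimal Python sketch that splits a file at the JPEG end-of-image marker. It is not how stegstract is implemented; the function name `split_trailing_data` is made up for illustration, and it naively uses the last FF D9 in the file, which gives the wrong cut if the appended data itself contains those two bytes.

```python
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_trailing_data(data: bytes) -> tuple[bytes, bytes]:
    """Split raw file bytes into (image, trailing data) at the last EOI marker."""
    end = data.rfind(EOI)
    if end == -1:
        raise ValueError("no JPEG end-of-image marker found")
    cut = end + len(EOI)
    return data[:cut], data[cut:]


# Tiny stand-in for a real file: start marker, filler, end marker, appended text.
fake = b"\xff\xd8" + b"\x00" * 8 + EOI + b"hello"
image, extra = split_trailing_data(fake)
print(extra)  # prints b'hello'
```

For a real file you would read the bytes with `open(path, "rb").read()` and pass them in.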

First, you need to be able to edit the image file. Most decent text editors, such as neovim, can open binary files just fine. If yours can't, just try different ones; one will work eventually. (Neo)vim will probably try to render the data as UTF-8 characters. You can reopen the file in ASCII with ':e ++enc=ascii'. This is useful, as you don't want to add or remove bytes from the image data, just alter them. If everything else fails, just use a hex editor.

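In a JPEG, the first six bytes should not be touched (they are usually followed by 'JFIF'). In a JFIF file these are the start-of-image marker (FF D8), the APP0 marker (FF E0) and a two-byte segment length. If you're unsure, just replace bytes one at a time, checking after each change that the image still opens and looks good.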
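If you would rather check those leading bytes than guess, a short Python sketch like the one below can decode them. It assumes a JFIF-style file whose first segment is APP0; photos straight from a camera often start with an EXIF (APP1, FF E1) segment instead, in which case `app0_ok` will be false. The function name `describe_jpeg_header` is just an illustrative choice.

```python
import struct


def describe_jpeg_header(data: bytes) -> dict:
    """Decode the first ten bytes of a (hopefully) JFIF-style JPEG."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    # Three big-endian 16-bit values: SOI marker, APP0 marker, segment length.
    soi, app0, length = struct.unpack(">HHH", data[:6])
    return {
        "soi_ok": soi == 0xFFD8,
        "app0_ok": app0 == 0xFFE0,
        "segment_length": length,
        "identifier": data[6:10],  # b"JFIF" in a typical file
    }


# First bytes of a typical JFIF file, written out by hand for the example.
sample = b"\xff\xd8\xff\xe0\x00\x10" + b"JFIF\x00\x01\x01"
info = describe_jpeg_header(sample)
print(info)
```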

These immutable bytes will stay there. The shell will usually skip past them and the browser will display them as text, but they can be removed with JavaScript. The rest of the image will be commented out. You will want to overwrite the data that follows them like this:

('X' represents existing image data)

XXXXXX
<<\EOF
<!-- XXXXXXXXXXXXXXXXXXXXXXXXXXXXX...

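The '<<\EOF' requires its own line. It starts a here-document with no command attached, which essentially works as a shell comment: the shell reads and discards everything up to the next line consisting solely of 'EOF'. The backslash quotes the delimiter, so nothing in between gets expanded. If the image data happens to contain such a line, just use a different delimiter. '<!--' is just an HTML comment.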
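The same overwrite can be done programmatically instead of in an editor. The Python sketch below keeps the first six bytes, writes the here-document start and the HTML comment opener over the bytes that follow, and leaves the file length unchanged. The name `patch_header` and the two sanity checks are my own additions, not part of the original recipe, and whether a given viewer tolerates the overwritten bytes is still something to confirm by opening the result.

```python
PRESERVED = 6  # leading bytes that are left alone


def patch_header(data: bytes, delimiter: bytes = b"EOF") -> bytes:
    """Overwrite the bytes after the preserved header, keeping the length."""
    prefix = b"\n<<\\" + delimiter + b"\n<!-- "
    start, end = PRESERVED, PRESERVED + len(prefix)
    if len(data) < end:
        raise ValueError("file is too short to patch")
    rest = data[end:]
    # Sanity checks, not exhaustive: a line that is exactly the delimiter
    # would end the here-document early, and "-->" would end the HTML comment.
    if b"\n" + delimiter + b"\n" in rest:
        raise ValueError("delimiter occurs on its own line; pick another one")
    if b"-->" in rest:
        raise ValueError("image data contains '-->'")
    return data[:start] + prefix + rest


# Hand-made stand-in for a small JFIF file.
sample = (b"\xff\xd8\xff\xe0\x00\x10" + b"JFIF\x00" + b"\x00" * 9
          + b"\x11" * 32 + b"\xff\xd9")
patched = patch_header(sample)
print(patched[:19])
```

With the default delimiter the prefix is 13 bytes long, so bytes 6 through 18 of the file are replaced and everything from byte 19 on is untouched.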

Now, the image should still look good.

Now go to the end of the file. Congratulations, you can write whatever you want here! Start by ending the HTML comment with '-->'. Then put all your HTML here. It will get a bit mangled when rendering, but it should mostly work.

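After the HTML, open another HTML comment and close the shell one by putting 'EOF' (or whatever delimiter you chose) on a line of its own. Then write your shell script and end it with '# -->', which the shell treats as a comment and the browser as the end of the HTML comment.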
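The layout of this tail is easy to get wrong by hand, so here is a small Python sketch that assembles it in the order described above. The helper name `build_tail` and the two guard checks are assumptions made for the example; the HTML and shell snippets are placeholders.

```python
def build_tail(html: str, shell: str, delimiter: str = "EOF") -> bytes:
    """Assemble the part of the polyglot that follows the image data."""
    if delimiter in html.splitlines():
        raise ValueError("HTML contains the delimiter on a line of its own")
    if "-->" in shell:
        raise ValueError("shell part would end the HTML comment early")
    parts = [
        "-->\n",             # close the HTML comment hiding the image data
        html.rstrip("\n"),   # the visible page
        "\n<!--\n",          # reopen an HTML comment to hide the shell part
        delimiter + "\n",    # delimiter alone on a line ends the here-document
        shell.rstrip("\n"),  # the actual script
        "\n# -->\n",         # a shell comment that also closes the HTML comment
    ]
    return "".join(parts).encode("utf-8")


tail = build_tail("<h1>hi</h1>", "echo hello")
print(tail.decode())
```

Appending these bytes to the edited image gives the complete file.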

Now rename the file to have the '.html' extension. It should be runnable as a shell script, viewable in an image viewer, and should open as a page in the web browser.

There are a few more things you can do with the HTML, though. The first is removing the leading immutable bytes, which end up in the document.

<body id="body">
  <script>
    const b = document.getElementById("body")
    b.removeChild(b.firstChild)
  </script>
</body>

The byte sequence ends up as text at the beginning of '<body>', so you just have to remove its first child.

You can also display the image stored within the file by pointing an '<img>' tag at the file itself.

<img src="./<filename>">

And now you have a cool polyglot image to flex on your friends with. Neat!