~::BAST::~

 λ λ    λ λ
(ΘωΘ)  (ΦωΦ)

 λ λ    λ λ
[ΘωΘ]  [ΦωΦ]

repo, current commit: 5a04652800222c1f8236e9137e5257371b3647e1

It was just a matter of time until I decided to make my own language (ex:forth doesn't count). But why now? Well, I just kinda decided to start learning OCaml on the side, but then I learned it's supposed to be real good for writing compilers, so...

The language

The final product is not yet fully decided on, but my current notes on the language are available here.

The original idea was to just make a fun little language optimised for the aesthetic value of the final source code, since I like pretty-looking code and not many languages optimise for that, so there isn't all that much competition. As I was writing down my ideas, however, some interesting syntax patterns and possibilities came up, so I think it might even turn out at least a bit decent.

The main 'out-there' idea is 'binding'. Basically, various parts of code can have further modifiers glued to them with ':'. This means that you can easily modify operators, function arguments, etc., all with unified syntax.

As I really enjoyed coding cart-0 of melancholy in WASM-4, I decided to primarily target wasm. Since I don't know any wasm, or how language internals work in general, I decided to compile to another language with similar features. I chose MoonBit for that. I mean, if some languages can compile to JavaScript or Lua, then I can surely compile my hobby language to MoonBit. I, however, don't really know any MoonBit either, so yes, I'm learning OCaml and MoonBit at the same time, while using them to create a whole new language. Fun!

As far as features go, you really have to read the notes for that, but I can say that it mixes ideas from both the ALGOL and LISP families of languages. So basically it's a language is what I'm saying.

Tooling

Well, outside of the already mentioned MoonBit, I'm using ocamllex for lexing and Menhir for parsing, as they seem to be the current standard.

The basics

As I wouldn't know where to even start, I began by asking Claude to generate a simple demo that compiles a very simple language to Python, with plenty of comments in the code so that I'd know what's going on. This was originally meant just as reference code to learn from, but as there wasn't even that much code, and it was all comprehensible, I just took it as a base and continued on top of it.

For those who might not know how compilation works, here are the basics:

You get the source code as a string.

You throw the source code at the lexer, which turns it into a list of tokens. Tokens represent specific pieces of the code, such as 'if', '{', or ';'. Some tokens, such as strings or variable names, can also hold values.

You throw your tokens at a parser, which turns them into an Abstract Syntax Tree (AST). The AST represents the structure of the actual program: expressions, loops, etc. ASTs are often written out as S-expressions, so if you've ever worked in a LISP of some sort and you ignore macros for a moment, you were basically writing the AST directly. (kind of)

Lastly, you throw your AST at the code generator, which, well, generates code. In my case MoonBit.
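
To make the three stages concrete, here is a minimal hand-written sketch in OCaml. It is not code from the repo: the toy language (integers, '+', '*', parentheses), the token and AST names, and the output format are all made up for illustration, and the "target language" is just a fully parenthesised expression string rather than MoonBit.

```ocaml
(* Tokens: the lexer's output. INT carries a value, the others do not. *)
type token = INT of int | PLUS | STAR | LPAREN | RPAREN | EOF

(* AST: the parser's output. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

(* Stage 1: source string -> token list. *)
let lex (src : string) : token list =
  let n = String.length src in
  let rec go i acc =
    if i >= n then List.rev (EOF :: acc)
    else
      match src.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc
      | '+' -> go (i + 1) (PLUS :: acc)
      | '*' -> go (i + 1) (STAR :: acc)
      | '(' -> go (i + 1) (LPAREN :: acc)
      | ')' -> go (i + 1) (RPAREN :: acc)
      | '0' .. '9' ->
        (* Consume a run of digits and store its value in the token. *)
        let j = ref i in
        while !j < n && src.[!j] >= '0' && src.[!j] <= '9' do incr j done;
        go !j (INT (int_of_string (String.sub src i (!j - i))) :: acc)
      | c -> failwith (Printf.sprintf "unexpected character %c" c)
  in
  go 0 []

(* Stage 2: token list -> AST, by recursive descent. Grammar:
     expr   ::= term ('+' term)*
     term   ::= factor ('*' factor)*
     factor ::= INT | '(' expr ')'            *)
let parse (tokens : token list) : expr =
  let rec expr ts =
    let lhs, rest = term ts in
    expr_tail lhs rest
  and expr_tail lhs = function
    | PLUS :: ts ->
      let rhs, rest = term ts in
      expr_tail (Add (lhs, rhs)) rest
    | ts -> (lhs, ts)
  and term ts =
    let lhs, rest = factor ts in
    term_tail lhs rest
  and term_tail lhs = function
    | STAR :: ts ->
      let rhs, rest = factor ts in
      term_tail (Mul (lhs, rhs)) rest
    | ts -> (lhs, ts)
  and factor = function
    | INT n :: ts -> (Int n, ts)
    | LPAREN :: ts ->
      (match expr ts with
       | e, RPAREN :: rest -> (e, rest)
       | _ -> failwith "expected ')'")
    | _ -> failwith "expected a number or '('"
  in
  match expr tokens with
  | e, [ EOF ] -> e
  | _ -> failwith "trailing tokens"

(* The same AST written out as an S-expression. *)
let rec sexp = function
  | Int n -> string_of_int n
  | Add (a, b) -> Printf.sprintf "(+ %s %s)" (sexp a) (sexp b)
  | Mul (a, b) -> Printf.sprintf "(* %s %s)" (sexp a) (sexp b)

(* Stage 3: AST -> target source text. *)
let rec gen = function
  | Int n -> string_of_int n
  | Add (a, b) -> Printf.sprintf "(%s + %s)" (gen a) (gen b)
  | Mul (a, b) -> Printf.sprintf "(%s * %s)" (gen a) (gen b)

let compile (src : string) : string = gen (parse (lex src))

let () =
  print_endline (sexp (parse (lex "1 + 2 * (3 + 4)")));
  (* prints (+ 1 (* 2 (+ 3 4))) *)
  print_endline (compile "1 + 2 * (3 + 4)")
  (* prints (1 + (2 * (3 + 4))) *)
```

With ocamllex and Menhir, the hand-written lex and parse functions above are roughly what gets generated from the rule files, so only the token and AST types and the code generator need to be written by hand.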

If this all sounds complicated, you are not alone. That's why ocamllex and Menhir exist. I don't have to care about the actual implementation. I just write a bunch of regex rules for the lexer and token matching rules for the parser and I'm mostly done. It's actually rather fun, highly recommended.

Progress so far

Well, I have variables, which can currently only be numbers, or nil. I have logic operators. I have function calls (but not definitions), and I'm currently working on operator modifiers. They are mostly done, but I'm having trouble with prebinding them to variables. Oh, well, shit happens.

I also highly recommend writing tests. Dune (the OCaml build system) has nice test integration, and it's real nice to know when a new feature breaks something. I've already been saved by the tests multiple times. Could not have done it without them.

Some might wonder how I managed to trick MoonBit, a statically typed language, into supporting dynamic types. Well, that's quite easy: enums!

enum Value {
  Nil
  Num(Double)
  Str(String)
  Boo(Bool)
  Arr(Array[Value])
  Fun((Array[Value]) -> Value, Int) // func, arity
  Cons(Value, Value)
  Err
} derive(Show)

struct Var {
  name: String
  mut val: Value
} derive(Show)

MoonBit truly is a higher-level functional language, which makes development very easy. (so far)

Operators are implemented as functions, which are type-checked with some fun wrapping that utilises MoonBit's pattern matching. As can be seen from the enum above, arbitrary arity is handled by every function just taking a single array of values as its argument, with the expected arity stored next to it. Quite a nice solution if you ask me.
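
The same trick can be sketched in OCaml, since its variants work much like MoonBit's enums. This is only an illustration of the idea, not the wrapper code from the repo: the type below mirrors the Value enum, while num_binop and call are hypothetical helper names, and returning Err on a type or arity mismatch is an assumption about how errors might be handled.

```ocaml
(* OCaml mirror of the MoonBit Value enum:
   one static type that covers every runtime shape. *)
type value =
  | Nil
  | Num of float
  | Str of string
  | Boo of bool
  | Arr of value array
  | Fun of (value array -> value) * int (* func, arity *)
  | Cons of value * value
  | Err

(* Hypothetical wrapper: lift a plain float operation into a Fun value.
   The pattern match is the runtime type check; anything that is not
   exactly two numbers yields Err. *)
let num_binop (op : float -> float -> float) : value =
  Fun
    ( (function
        | [| Num a; Num b |] -> Num (op a b)
        | _ -> Err),
      2 )

let add = num_binop ( +. )
let mul = num_binop ( *. )

(* Hypothetical call helper: the callee must be a Fun and the number of
   arguments must match its stored arity, otherwise the result is Err. *)
let call (f : value) (args : value array) : value =
  match f with
  | Fun (fn, arity) when Array.length args = arity -> fn args
  | _ -> Err

let () =
  match call add [| Num 1.0; Num 2.0 |] with
  | Num x -> Printf.printf "%g\n" x (* prints 3 *)
  | _ -> print_endline "error"
```

Because every function has the same static type (an array of values in, one value out), the code generator never has to know a function's real signature; all the checking happens at runtime inside the wrappers.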

I originally expected binding to allow for some universal implementation, but it's more of a syntax thing, and it still needs to be parsed on a case-by-case basis. It does look nice, however.

Conclusion

To be honest, I'm not even entirely sure what I want to talk about here. It's not meant to be an ocamllex/Menhir tutorial. There are plenty of those, and Claude also seems quite competent in this regard. It's not meant to be a language creation guide either; I have no idea what I'm doing.

I probably just want to inform all the hypothetical RSS subscribers that I'm still alive and I'm doing a language, which might even get usable at some point. Once I have some progress to show off or some knowledge to write about, I'll write another b-log. Until then, stay subscribed.