langdev — musicmatzes blog

My functional programming language hacking

March 18, 2023

I asked people whether I should post about my experimental #langdev ... And they asked me to do it, so I'll do it.

Disclaimer: First and foremost I am doing this for FUN!!! There's no “I have a groundbreaking idea” thing here. I want to write my own programming language because I like programming, not because I can make “the next big thing” or anything like that. Also I don't actually want to reinvent stuff. Using libraries for things like a borrow checker is in my opinion better than writing one myself. This is a hobby and nothing more. The end result of this whole thing will never be void even if I cannot get a working MVP out of my efforts because the whole propose of this is having fun.

Another disclaimer: in this article, I will say a few things that might not be 100% technically correct (or even be complete BS) especially with regards to the Haskell language. I am making no claim for technical correctness. If you stumble upon something that is not technically correct, just remind yourself of the context of this article. And especially or the disclaimer above.

Link to the repository: github.com/vunk-lang/vunk-lang.

Where I come from

So where am I coming from? For those who don't know, I started writing #Rust in 2015 and never looked back. Rust is the perfect programming language right now, in my opinion. It's semantics, especially the borrow checker, but also how it makes you think, just clicked with me and I feel extremely productive when writing Rust code. Of course, what Rust brought to the table is nothing new in terms of programming language research. But how it brought it to the table definitely is. Because the language is only one bit, its ecosystem and community are as important or even more important.

And because I want to continue program Rust and was in need for something to do, I wanted to start another hobby project.

The next big point in the “why” question is that I am highly “functional curious”. By that I mean that I am really interested in the functional programming paradigm, though I really have to underline that I am not interested from the mathematical side of things, not at all. In fact, I have little knowledge of all these mathematical things... I could probably not even explain lambda calculus to someone (don't tell the Professors where I studied, although I am not sure they could either).

I am more interested in functional programming from the point of programmer experience. One thing I really learned, or rather engraved into my heart, while getting more and more experience with Rust is that mutable access on memory is bad. Not inherently bad, but bad nonetheless. And yes, that former sentence does not miss a “multiple” before the “mutable”, I really mean that modifying memory in place is bad, although of course it is absolutely necessary. Rust abstracts that danger away pretty nicely via its borrowing mechanisms. Either you can access that piece of memory mutably from one piece of code, or nonmutable from multiple. If you need mutable access from multiple locations, you must use synchronization primitives (modulo unsafe, but let's keep that out of our heads for now).

When I talk about functional programming languages mean Haskell because that's the only functional language I kind of learned (or rather tried to learn – I think I understood the bits, yes also Functors, Applicatives, Monoids and Monads, but never actually used it for something meaningful, which I think is required if you want to say you “learned” a language). From my experience, the mutable access problem fades in (pure) functional languages. Not because it is somehow solved in these languages, but rather because it is of less concern. In Haskell there's no mutation in place (IIRC, remember disclaimer 2), but the language makes you think that things get copied all the time, and the runtime optimizes everything nicely so that you don't have to care too much.

And if we think that thought a bit further, we see that the “mutable access” problem I've been talking about above is nothing more than side effects. Side effects are bad. Rust makes them explicit, which helps a lot, of course. Haskell makes them also explicit, although completely different than Rust (from a programmers perspective).

These are the technical points where I am coming from. The next point is something else entirely: motivation. I have problems with motivation. And by that I do not mean problems as in “someone asked me to do something and now I am just slacking”, no not at all. If I feel like I have a responsibility to do something, I will do it and I will do it to the best of my abilities! If I have an appointment, I will be there ten minutes early. If I have to do the dishes, I will do them and although I rather use new unused ones, I won't re-use already used ones, if you get what I mean. If I get a task on my day job, I will of course do it in reasonable time and not slack off. No “my code is compiling” XKCD on my watch!

By problems with motivation I rather mean things like “I should try implementing a tool for X” and then my brain does the thinking part, thinks it all through and models all necessary abstractions... But then my body cannot get off the couch to actually write that code down. Another instance is reading blog articles. If I find something in my RSS reader where I really am interested in from the heading and catchline, I often fail to read the actual article. I think these examples stem from the same kind of problems with motivation, at least they feel equal to me.

And I actually do not mean that I am too lazy to get my brain into thinking mode, or in case of programming, search for all necessary dependencies, set up the CI or stuff like that – I actually enjoy these bits of work quite a lot – no, I really mean writing the code that implements my idea (and you can actually see that by looking at my GitHub profile).

I have also to note that I think a lot of people have problems like the one I just failed to describe decently (IMO).

But when I started thinking about implementing a compiler for a language I made up, I had this strange tingling in the back of my head that kept coming back. And so far it hasn't gone away, which I also think I must give attribution for to the nice ecosystem of Rust. Previous attempts at programming stuff often times failed because I failed to find decent libraries for implementing my idea!

So, I have to take advantage of that motivation right now and just get that code written, right?

Where I want to go

With all the above in mind, I thought: Why can't we have a language like Rust, with all the bits to do low level programming (and by that I mean interfacing with C libraries easily, not writing a Microcontroller OS), with the borrow semantics, control over references and lifetimes and such, but as a purely functional language? And that's essentially where I want to go.

My idea in one quote would be “What if Rust was purely functional?”

I started writing some example code files how I wanted the language to look (and you can find them in the repository) and then started to look for how I can get them into a compiled form that can be executed.

I first thought of implementing a VM or bytecode interpreter for that, but soon got the idea that LLVM is there and should be used, especially because I want to be able to interface with a C ABI. Now that I heard of HVM, I am also curious whether we cannot have two backends, one compiling to a binary using LLVM and another compiling to HVM bytecode. Not sure whether this would be possible at all, but still an interesting idea.

Because the language should be low level, the programmer should be able to distinguish between pass-by-value and pass-by-reference (just like in Rust of course). There should also be the possibility to do pointer arithmetic, although the same as with Rust should apply: you need to need unsafe for that.

I am not yet sure whether I want to have a macro system like with Rust. I did not think too much about it yet, but there might be the possibility to cover the needs that are fulfilled with macros in Rust with higher order functions. Maybe.

Another idea that I have is that function composition and currying/partial function application should be possible.

The last bit I did not yet think about at all is whether Monads should be the way to abstract side effects or Effects. I learned about effects just a few days or weeks ago and did not yet fully understand the implications. Although I am not sure whether I understood the implications of Monads either. Again, I do not have a mathematical background!

I think I do not have to mention that good (Rust-style or Elm-style) error messages and such are also a goal, do I?

Current state

As of today, I have a working prototype of a lexer/tokenizer and the first bits of a parser implementation. The parser is still missing bits for defining types and enums and the implementation for defining functions is also not ready yet. But we'll get there.

The syntax does not yet have the concept of unsafe, which I will need at some point, because interaction with C code is of course unsafe.

Let me give you an example of code now, but be warned: This could be outdated by tomorrow!!!

use Std.Ops.Add;

answer: I8;
answer = 41;

add A T: (A, A) -> A
    where A: Add T;
add = (a: A, b: A) -> a + b;

addOne A: (A) -> A
    where A: Add I8;
addOne = add 1

main = Std.println $ addOne answer

Let's go though that example line by line.

In the first line, we import something from the “Ops” module from standard library “Std”. What we import is “Add”. It is written uppercase because it is either a type, an enum or a Trait. Modules are also written uppercase and they map directly to directories on the filesystem (also written uppercase). Modules have to be declared before they can be used (pub mod Helpers; mod Private helpers;, not in the example). Functions are always lowercase, although I am not yet sure whether I want to go with CamelCase, pascalCase or snake_case.

Next in the example, we declare something called answer to be of type I8, an 8 bit integer. After that we define it. The declaration could be omitted though, as the type is clear here and can be inferred by the compiler.

Next, we declare and then define a function addI8. That function is generic over A and T where A has to implement the Add trait we imported earlier over some unbounded T. add is then defined to be a function with two parameters of that type A and returns also an instance of that type. It's implementation should now be self describing.

Note that we have to write down the function declaration because then function is generic. If the function is not generic or the types can be figured out by the complier, we can omit the declaration. That would be true for add = (a: I8) -> a + 1, and possibly also for the above implementation of addOne, although I am not yet 100% sure about this one.

addOne from above is now declared and defined using partial function application. The compiler should be smart enough to figure out that the expression add 1 results in a function like this:

add' A T: (A) -> A
    where A: Add T;
add' = (b: A) -> 1 + b;

Last we define the main function to use the println function from Std (without prior importing) to print the result of applying addOne to answer.

Note that the $ character here that you might know from Haskell is only syntax sugar for parentheses, nothing more.

For types, I am even thinking about having impl blocks like in Rust, and them being syntax sugar for free standing functions:

type Person =
    { name: Str
    }

impl Foo = 
    { getName = (&self) -> &self.name;
    }

# above function is equal to
getName = (person: &Person) -> &person.name;

Lifetimes in above example are inferred of course.

Trait notation and implementation of traits on types would work the same way syntax-wise.

If you're curious for the bits that are not in these examples, you can always browse the code examples in the repository. Examples are actually ran through the lexer and parser, so they have to be up to date.

Closing thoughts

This is an experiment. An experiment in what I am able to do, but also an experiment in how to keep me motivated. I hope it works out. But i don't know whether it will or whether I lose interest tomorrow. Let's hope I don't.

And just to note, because it might come up: the question of “Why not {Roc, Elm, Elixir, SomeOtherLesserKnownFPLanguageFromGithub}” is answered in the “Where I come from” section, if you read close enough!

If you want to contribute, which I would like to see, please keep in mind that I am learning things here. I am not a functional programming language expert, I have no mathematical background. If you contribute, an explanation of what you're trying to accomplish and how is required, because I won't learn things otherwise. I value programmer experience and simplicity more than mathematical elegance or such. So be prepared to explain linear types, effects, HKT, etc to me if you want to contribute code for these things!

That said, you're welcome to send patches! Just ping me on mastodon or write a short issue on GitHub about your idea and then start hacking. I am normally fast in responding, so if you just open an issue like “Hey I want to add infix functions” (which we do not have right now), that'd be great for me to know so I can give you feedback on your idea fast! Although I am not sure whether I want to have infix functions.

Comments on all bits in this article are warmly welcome! writefreely has no option for responding or even getting notified of replies, so please post comments with @musicmatze@social.linux.pizza mentioned if you want to comment via mastodon. Email for a more private conversation is fine as well, of course!