musicmatzes blog

Rust

I am happy to announce that I am officially a (part-time/side-job) #freelancer for #rust #rustlang and #nixos #nix now!

You noticed the recent spike of Rust in your timelines? That's because Rust is the best thing since sliced bread! But Rust is not easy to learn!

That's where I come in! I am the guy you want for the time after the training of your developers! Because no training makes you a rockstar #rustlang programmer!

#hireme if you want to try out Rust in one of your projects to help your developers succeed! Whether it is #codereview, #consulting for your Rust experiments or for actual #softwaredevelopment, I can help your team get up to speed with #rustlang!

You also heard about #nix or #nixos and want to try it out? Let me help you explore the ecosystem of functional package management to speed up your CI and development workflow and make your deployments reproducible!


For the readers of my blog: Above is clearly an advertisement. I won't post more ads on my blog, so be assured that this blog won't transform into an advertising machine!

I asked people whether I should post about my experimental #langdev ... And they asked me to do it, so I'll do it.

Disclaimer: First and foremost I am doing this for FUN!!! There's no “I have a groundbreaking idea” thing here. I want to write my own programming language because I like programming, not because I can make “the next big thing” or anything like that. Also I don't actually want to reinvent stuff. Using libraries for things like a borrow checker is, in my opinion, better than writing one myself. This is a hobby and nothing more. The end result of this whole thing will never be void, even if I cannot get a working MVP out of my efforts, because the whole purpose of this is having fun.

Another disclaimer: in this article, I will say a few things that might not be 100% technically correct (or even be complete BS), especially with regards to the Haskell language. I am making no claim of technical correctness. If you stumble upon something that is not technically correct, just remind yourself of the context of this article. And especially of the disclaimer above.

Link to the repository: github.com/vunk-lang/vunk-lang.

Where I come from

So where am I coming from? For those who don't know, I started writing #Rust in 2015 and never looked back. Rust is the perfect programming language right now, in my opinion. Its semantics, especially the borrow checker, but also how it makes you think, just clicked with me and I feel extremely productive when writing Rust code. Of course, what Rust brought to the table is nothing new in terms of programming language research. But how it brought it to the table definitely is. The language is only one part; its ecosystem and community are just as important, or even more so.

And because I want to continue programming Rust and was in need of something to do, I wanted to start another hobby project.

The next big point in the “why” question is that I am highly “functional curious”. By that I mean that I am really interested in the functional programming paradigm, though I really have to underline that I am not interested from the mathematical side of things, not at all. In fact, I have little knowledge of all these mathematical things... I could probably not even explain lambda calculus to someone (don't tell the Professors where I studied, although I am not sure they could either).

I am more interested in functional programming from the point of programmer experience. One thing I really learned, or rather engraved into my heart, while getting more and more experience with Rust is that mutable access to memory is bad. Not inherently bad, but bad nonetheless. And yes, that former sentence does not miss a “multiple” before the “mutable”, I really mean that modifying memory in place is bad, although of course it is absolutely necessary. Rust abstracts that danger away pretty nicely via its borrowing mechanisms. Either you can access that piece of memory mutably from one piece of code, or non-mutably from multiple. If you need mutable access from multiple locations, you must use synchronization primitives (modulo unsafe, but let's keep that out of our heads for now).
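To make that concrete, here is a minimal Rust sketch of that rule (nothing project-specific, just the standard borrow rules plus a Mutex for the synchronized case):

use std::sync::Mutex;

fn main() {
    let mut n = 0;

    {
        let exclusive = &mut n; // exactly one mutable borrow...
        *exclusive += 1;
        // let shared = &n;     // ...a shared borrow here would not compile
    }

    let a = &n;
    let b = &n; // multiple shared (read-only) borrows are fine
    println!("{} {}", a, b);

    // Mutable access from multiple places needs a synchronization primitive:
    let m = Mutex::new(0);
    *m.lock().unwrap() += 1;
}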

When I talk about functional programming languages, I mean Haskell, because that's the only functional language I kind of learned (or rather tried to learn – I think I understood the bits, yes also Functors, Applicatives, Monoids and Monads, but never actually used it for something meaningful, which I think is required if you want to say you “learned” a language). From my experience, the mutable access problem fades in (pure) functional languages. Not because it is somehow solved in these languages, but rather because it is of less concern. In Haskell there's no mutation in place (IIRC, remember disclaimer 2), but the language makes you think that things get copied all the time, and the runtime optimizes everything nicely so that you don't have to care too much.

And if we think that thought a bit further, we see that the “mutable access” problem I've been talking about above is nothing more than side effects. Side effects are bad. Rust makes them explicit, which helps a lot, of course. Haskell makes them also explicit, although completely different than Rust (from a programmers perspective).

These are the technical points where I am coming from. The next point is something else entirely: motivation. I have problems with motivation. And by that I do not mean problems as in “someone asked me to do something and now I am just slacking”, no not at all. If I feel like I have a responsibility to do something, I will do it and I will do it to the best of my abilities! If I have an appointment, I will be there ten minutes early. If I have to do the dishes, I will do them and although I rather use new unused ones, I won't re-use already used ones, if you get what I mean. If I get a task on my day job, I will of course do it in reasonable time and not slack off. No “my code is compiling” XKCD on my watch!

By problems with motivation I rather mean things like “I should try implementing a tool for X” and then my brain does the thinking part, thinks it all through and models all necessary abstractions... But then my body cannot get off the couch to actually write that code down. Another instance is reading blog articles. If I find something in my RSS reader that I am really interested in, judging from the heading and catchline, I often fail to read the actual article. I think these examples stem from the same kind of problems with motivation, at least they feel the same to me.

And I actually do not mean that I am too lazy to get my brain into thinking mode, or in case of programming, search for all necessary dependencies, set up the CI or stuff like that – I actually enjoy these bits of work quite a lot – no, I really mean writing the code that implements my idea (and you can actually see that by looking at my GitHub profile).

I also have to note that I think a lot of people have problems like the one I just failed to describe decently (IMO).

But when I started thinking about implementing a compiler for a language I made up, I had this strange tingling in the back of my head that kept coming back. And so far it hasn't gone away, which I think I must also attribute to the nice ecosystem of Rust. Previous attempts at programming projects oftentimes failed because I could not find decent libraries for implementing my idea!

So, I have to take advantage of that motivation right now and just get that code written, right?

Where I want to go

With all the above in mind, I thought: Why can't we have a language like Rust, with all the bits to do low level programming (and by that I mean interfacing with C libraries easily, not writing a Microcontroller OS), with the borrow semantics, control over references and lifetimes and such, but as a purely functional language? And that's essentially where I want to go.

My idea in one quote would be “What if Rust was purely functional?”

I started writing some example code files showing how I wanted the language to look (you can find them in the repository) and then started to look into how I could get them into a compiled form that can be executed.

I first thought of implementing a VM or bytecode interpreter for that, but soon got the idea that LLVM is there and should be used, especially because I want to be able to interface with a C ABI. Now that I heard of HVM, I am also curious whether we cannot have two backends, one compiling to a binary using LLVM and another compiling to HVM bytecode. Not sure whether this would be possible at all, but still an interesting idea.

Because the language should be low level, the programmer should be able to distinguish between pass-by-value and pass-by-reference (just like in Rust of course). There should also be the possibility to do pointer arithmetic, although the same as with Rust should apply: you need unsafe for that.
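For reference, this is roughly what that rule looks like in today's Rust – a minimal sketch, not code from the repository: the pointer arithmetic itself is available, but dereferencing the result requires an unsafe block.

fn second_element(xs: &[u8]) -> u8 {
    let ptr = xs.as_ptr();
    // Safety: the caller must guarantee that xs has at least two elements.
    unsafe { *ptr.add(1) }
}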

I am not yet sure whether I want to have a macro system like with Rust. I did not think too much about it yet, but there might be the possibility to cover the needs that are fulfilled with macros in Rust with higher order functions. Maybe.

Another idea that I have is that function composition and currying/partial function application should be possible.
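For contrast, in today's Rust that kind of partial application has to be spelled out as a closure; a minimal sketch (again, not code from the repository, just plain Rust):

fn add(a: i8, b: i8) -> i8 {
    a + b
}

fn main() {
    // "add 1" written out by hand as a closure
    let add_one = |b| add(1, b);
    println!("{}", add_one(41)); // prints 42
}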

The last bit I did not think about at all yet is whether Monads or Effects should be the way to abstract side effects. I learned about effects just a few days or weeks ago and did not yet fully understand the implications. Although I am not sure whether I understood the implications of Monads either. Again, I do not have a mathematical background!

I think I do not have to mention that good (Rust-style or Elm-style) error messages and such are also a goal, do I?

Current state

As of today, I have a working prototype of a lexer/tokenizer and the first bits of a parser implementation. The parser is still missing bits for defining types and enums and the implementation for defining functions is also not ready yet. But we'll get there.

The syntax does not yet have the concept of unsafe, which I will need at some point, because interaction with C code is of course unsafe.

Let me give you an example of code now, but be warned: This could be outdated by tomorrow!!!

use Std.Ops.Add;

answer: I8;
answer = 41;

add A T: (A, A) -> A
    where A: Add T;
add = (a: A, b: A) -> a + b;

addOne A: (A) -> A
    where A: Add I8;
addOne = add 1

main = Std.println $ addOne answer

Let's go through that example line by line.

In the first line, we import something from the “Ops” module from standard library “Std”. What we import is “Add”. It is written uppercase because it is either a type, an enum or a Trait. Modules are also written uppercase and they map directly to directories on the filesystem (also written uppercase). Modules have to be declared before they can be used (pub mod Helpers; mod PrivateHelpers;, not in the example). Functions are always lowercase, although I am not yet sure whether I want to go with camelCase, PascalCase or snake_case.

Next in the example, we declare something called answer to be of type I8, an 8 bit integer. After that we define it. The declaration could be omitted though, as the type is clear here and can be inferred by the compiler.

Next, we declare and then define a function add. That function is generic over A and T, where A has to implement the Add trait we imported earlier over some unbounded T. add is then defined to be a function with two parameters of that type A which also returns an instance of that type. Its implementation should now be self-describing.

Note that we have to write down the function declaration because the function is generic. If the function is not generic or the types can be figured out by the compiler, we can omit the declaration. That would be true for add = (a: I8) -> a + 1, and possibly also for the above implementation of addOne, although I am not yet 100% sure about this one.

addOne from above is now declared and defined using partial function application. The compiler should be smart enough to figure out that the expression add 1 results in a function like this:

add' A T: (A) -> A
    where A: Add T;
add' = (b: A) -> 1 + b;

Last, we define the main function to use the println function from Std (without importing it first) to print the result of applying addOne to answer.

Note that the $ character here that you might know from Haskell is only syntax sugar for parentheses, nothing more.

For types, I am even thinking about having impl blocks like in Rust, and them being syntax sugar for free standing functions:

type Person =
    { name: Str
    }

impl Person = 
    { getName = (&self) -> &self.name;
    }

# above function is equal to
getName = (person: &Person) -> &person.name;

Lifetimes in the above example are inferred, of course.

Trait notation and implementation of traits on types would work the same way syntax-wise.

If you're curious about the bits that are not in these examples, you can always browse the code examples in the repository. The examples are actually run through the lexer and parser, so they have to be up to date.

Closing thoughts

This is an experiment. An experiment in what I am able to do, but also an experiment in how to keep me motivated. I hope it works out. But I don't know whether it will, or whether I will lose interest tomorrow. Let's hope I don't.

And just to note, because it might come up: the question of “Why not {Roc, Elm, Elixir, SomeOtherLesserKnownFPLanguageFromGithub}” is answered in the “Where I come from” section, if you read close enough!

If you want to contribute, which I would like to see, please keep in mind that I am learning things here. I am not a functional programming language expert, I have no mathematical background. If you contribute, an explanation of what you're trying to accomplish and how is required, because I won't learn things otherwise. I value programmer experience and simplicity more than mathematical elegance or such. So be prepared to explain linear types, effects, HKT, etc to me if you want to contribute code for these things!

That said, you're welcome to send patches! Just ping me on mastodon or write a short issue on GitHub about your idea and then start hacking. I am normally fast in responding, so if you just open an issue like “Hey I want to add infix functions” (which we do not have right now), that'd be great for me to know so I can give you feedback on your idea fast! Although I am not sure whether I want to have infix functions.

Comments on all bits in this article are warmly welcome! writefreely has no option for responding or even getting notified of replies, so please post comments with @musicmatze@social.linux.pizza mentioned if you want to comment via mastodon. Email for a more private conversation is fine as well, of course!

I have been playing with functional languages a few times now. I have tried Haskell, Elm and Elixir so far, but also played a bit with a LISP-like language that is implemented in Rust for being easily embeddable into Rust programs.

Lately, I also played with “roc” (https://www.roc-lang.org/), which is a new approach for a functional programming language that can compile to binary (using LLVM) that also has some new ideas.

One of these ideas, that I quite like to be honest, is the idea of “platforms”. A developer has to write a minimal platform the program they want to write runs on. This platform is nothing more and nothing less than the interface between roc and the outside world – which is stateful! So the platform deals with allocation and deallocation, talks to the operating system to, for example, read from the commandline or handle incoming network traffic. The platform has to be very minimal, and must ensure that there is no state leaked through to the roc program!

roc itself has a syntax that is rather nice, although I think there can be improvements. The idea of the authors is that roc should be easy to use, like Elixir or Elm, spark joy in the programmer and still be fast. In fact, it has been shown (in a very unscientific benchmark IIRC, but that's more due to the fact that roc is in very early development than due to other reasons) that roc can outperform imperative languages in some cases! Leveraging the power of LLVM is one of the reasons for that, but also that roc enables performance optimizations that are not possible or very hard in other circumstances.

To conclude, I am really looking forward to roc and hope that it will be more “you can play with this”-ready than it is today. Right now, having to compile everything from source all the time (roc is written in Rust) and having only very few pieces of documentation available makes playing with roc hard. I hope that will improve soon.

To quote myself:

So my ideal #functional #language #programminglanguage would be a pure functional language with traits/interfaces, ergonomics like #ruby or #elm or #elixir, performance like #rust because it compiles to binary with #llvm, with the “platform” approach #roc #roclang currently has, so one can provide minimal interfaces to the OS ...

Which would make it a perfect language for #containers and #microservices but also #CLI tooling and even websites via #wasm (if it can compile to wasm)!

... and I think roc can be that language!


Please note that roc is not released as an open-source codebase at the time of writing this article, but it will be as soon as the author thinks it is time. I wrote “roc” in lowercase letters in this article because – and only because – I think that's how it is intended.


“Thoughts” is (will be) a weekly roll-up of my mastodon feed with some notable thoughts collected into a long-form blog post. “Long form” is relative here, as I will only expand a little on some selected subjects, not write screens and screens of text on each of these subjects.

If you think I toot too much to follow, this is an alternative to follow some of my thoughts.


This week (2021-05-29 – 2021-06-04) I got very angry about this “you need an app for this” bullshit and some things died.

App madness

I am very angry (german) about every other service forcing me to install an app of some sort or another on my devices. This time it was my insurance that wanted me to install either a desktop application (Windows and Mac only) or a Smartphone app (Android or iOS only) just to update my bank account or address. How mad can they possibly be?

I reported them to #digitalcourage.

github actions

I probably said it before, but the more I play with #githubactions, the more I like it.

This is mainly due to the fact that the features are well-designed. You can make dependent jobs and you can even boot up #docker containers if your application has to be tested against, for example, a database or some other service. I know that #github will never #opensource this, that's why I hope someone implements it as a FLOSS alternative!

Awesome Rust

It always amazes me how good the #rust standard library actually is. I was able to solve an issue in my codebase with a two-line patch that would have been way more complex in another language!

Dying things

First audacity, now stackoverflow (german) died.

What's next? I really hope we can develop alternative FLOSS platforms. For audacity, there are several alternative tools around that one can switch to. For stackoverflow, not so much. Especially because the software is only one part, the other part is the data. There are dumps of stackoverflow somewhere on the internet (I'm not linking because I don't know how legal these dumps actually are), so maybe someone can implement a FLOSS alternative (please make it federated or distributed) and import that dataset?

This would be awesome!

Recently, I voiced my discomfort... no, let's face it: my anger with people who cannot obey the simplest commit message rules:

Why can't people obey these simple #commit message rules?

  • Use an uppercase letter for the start of your subject line
  • EXPLAIN what you did, not “Fixes”

#git

(toot)

This really bothers me. I (co-)maintain a few crates in the #rust ecosystem. There are contributions rolling in every other week and I love that, because it makes me happy to see that other people care about the same things that I care about. Still, I am constantly asking people to rewrite commit messages or clean up their branches because they did strange things – for example merging the master branch instead of rebasing their pull request to fix merge conflicts. And sometimes they even change things in this merge commit, making a review utterly impossible!

Most of the time I do not bother if people just don't capitalize the first letter of their commit message, but it bothers me to no end, still. That's why I teach others to write proper commit messages when I teach them how to use #git, and I really try to be a pain in their ... youknowwhat, so they are annoyed by me telling them “No, rephrase that!” all the time!

I am not angry if people fail to use git trailers the right way (and yes, these are kernel commit conventions. Does not mean they cannot be applied to other workflows as well)! These rules are, of course, not carved into stone. Still, it is a matter of good behaviour in the community to give attribution to people involved in the process of applying the patch (using “Signed-off-by”, “Acked-by” or “Reviewed-by”), crafting the patch (using “Co-authored-by”, “Suggested-by”, “Signed-off-by”) or others (“CC”, “Reported-by”, ...).

I hope I don't have to repeat that commit messages like “Fixes” or “Refactor” are bullshit!

How to NOT do better

There are projects out there that try to make you a better committer. Most known is conventional commits.

I don't like these things at all. “Why?” you may ask? The answer to that is really simple: It makes you think less about what you've done, and – and that's probably the worst thing – it gives you the ability to auto-generate a changelog from your commit logs. But commit logs are not changelogs. Commit logs are logs of the steps of how your software was developed. A changelog is a list of things your users need to know about when upgrading from one version to another. They don't need to know the steps that were taken to provide new features, fix bugs or refactor your codebase, they need to know about what changes for them, how using the product has changed!

Luckily I have managed to stay away from projects using conventional commits.

How to do better

There are tons and tons of guides out there how to write proper git commit messages. I leave searching for them as a task for the reader here (one thing I want to link here is Drew DeVault's article on a disciplined git workflow). The very basics are:

  • The subject line must not exceed 50 characters
  • The subject line should be capitalized and must not end in a period (really, who on earth would end it with a period? I mean... do you end your email subjects with a period?)
  • The subject line must be written in imperative mood (“Fix”, not “Fixed” / “Fixes” etc.)
  • The body copy must be wrapped at 72 columns
  • The body copy must only contain explanations as to what and why, never how. The latter belongs in documentation and implementation.
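A made-up commit message following these rules (and using some of the trailers mentioned earlier) could look like this – the names and content are purely illustrative:

Fix off-by-one error in playlist pagination

The offset calculation started at 1, which skipped the first entry of
every page. Start counting at 0 instead and add a regression test for
the first page.

Signed-off-by: Jane Doe <jane@example.com>
Reviewed-by: John Roe <john@example.com>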

There are, of course, tons of great examples out there. And because people get annoyed if I tell them that the best examples can be found in the linux kernel community (“These people are GODs, I cannot compare to them” – why not?), I can only show you some less GODish commits (by me)!

Have a look at my

  • contribution to shiplift. The commit message is not that long and the change is atomic. The subject line is a short summary what the change is about, the body explains why this is/was done. The trailer notes that I submitted this according to the developercertificate.
  • contribution to config-rs. Nobody said that the commit subject has to explain all the things – as long as there is reasonable explanation in the body, the subject can be “Simplify implementation” (like here).
  • commit bringing order to the galaxy. There can be the occasional joke, of course!

These are all rather short commit messages for simple patches. Longer messages with more explanations also exist in my projects! For example in the butido project there are changes like this one, or this one or even this very long one. Or, to go crazy, this enormous one here.

These commits have one thing in common: They explain why things were done.

And you can do that too! One really simple idea that is worth trying out is not to use the -m flag of git-commit at all. This way you are presented with your favourite editor and can pause for a moment to think about what to write.

Don't be that guy that appears on the front page of commitlogsfromlastnight.com!

Today, I wrote a mastodon bot.

Shut up and show me the code!

Here you go.

The idea

My idea was to write a bot that fetches the latest master from a git repository, counts some commits and then posts a message to mastodon about what it counted.

Because I always complain about people pushing to the master branch of a big community git repository directly, I decided that this would be a perfect fit.

(Whether pushing to master directly is okay and when it is not okay to do this is another topic and I won't discuss this here)

The dependencies

Well, because I didn't want to implement everything myself, I started pulling in some dependencies:

  • log and env_logger for logging
  • structopt, toml, serde and config for argument parsing and config reading
  • anyhow because I don't care too much about error handling. It just has to work
  • getset for a bit cleaner code (not strictly necessary, tbh)
  • handlebars for templating the status message that will be posted
  • elefren as mastodon API crate
  • git2 for working with the git repository which the bot posts about

The Plan

How the bot should work was rather clear from the outset. First of all, it shouldn't be an always-running process. I wanted it to be as simple as possible, thus triggering it via a systemd timer should suffice. Next, it should only fetch the latest commits, so it should be able to work on a working clone of a repository. This way, we don't need another clone of a potentially huge repository on our disk. The path of the repository should of course not be hardcoded, and neither should the “upstream” remote name or the “master” branch name (because you might want to track a “release-xyz” branch or because “master” was renamed to something else).

Also, it should be configurable how many hours of commits should be checked. Maybe the user wants to run this bot once a day, maybe once a week. Both are possible, of course. But if the user runs it once a day, they want to check only the commits of the last 24 hours. If they run it once a week, the last 168 hours would be more appropriate.

The message that gets posted should also not be hardcoded, but a template where the variables the bot counted are available.

All of the above goes into the configuration file the bot reads (which can be set via the --config option on the bot's CLI).

The configuration struct for the setup described above is rather trivial, as is the CLI setup.
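To give an impression, here is a sketch of what that configuration struct could look like, based purely on the getters used further down in this article (the real definition lives in the repository and may differ in details):

use std::path::PathBuf;

use getset::Getters;
use serde::Deserialize;

#[derive(Debug, Deserialize, Getters)]
#[getset(get = "pub")]
pub struct Conf {
    /// Path to the working clone of the repository
    repository_path: PathBuf,
    /// Name of the remote to fetch from, e.g. "origin"
    origin_remote_name: String,
    /// Name of the branch to count commits on, e.g. "master"
    master_branch_name: String,
    /// How many hours of history to consider
    hours_to_check: i64,
    /// Path to the file holding the mastodon credentials
    mastodon_data: PathBuf,
    /// ISO 639-1 code for the language of the posted status
    status_language: String,
    /// Handlebars template for the status text
    status_template: String,
}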

The setup

The first thing the bot has to do, after initializing the logger, is to read the commandline and the configuration, which is a no-brainer, too:

fn main() -> Result<()> {
    env_logger::init();
    log::debug!("Logger initialized");

    let opts = Opts::from_args_safe()?;
    let config: Conf = {
        let mut config = ::config::Config::default();

        config
            .merge(::config::File::from(opts.config().to_path_buf()))?
            .merge(::config::Environment::with_prefix("COMBOT"))?;
        config.try_into()?
    };
    let mastodon_data: elefren::Data = toml::de::from_str(&std::fs::read_to_string(config.mastodon_data())?)?;

The mastodon data is read from a configuration file that is different from the main configuration file, because it may contain sensitive data and if a user wants to put their configuration of the bot into a (public?) git repository, they might not want to include this data. That's why I opted for another file here, its format is described in the configuration example file (next to the setting where the file actually is).

Next, the mastodon client has to be setup and the repository has to be opened:

    let client = elefren::Mastodon::from(mastodon_data);
    let status_language = elefren::Language::from_639_1(config.status_language())
        .ok_or_else(|| anyhow!("Could not parse status language code: {}", config.status_language()))?;
    log::debug!("config parsed");

    let repo = git2::Repository::open(config.repository_path())?;
    log::debug!("Repo opened successfully");

which is rather trivial, too.

The Calculations

Then, we fetch the appropriate remote branch and count the commits:

    let _ = fetch_main_remote(&repo, &config)?;
    log::debug!("Main branch fetched successfully");

    let (commits, merges, nonmerges) = count_commits_on_main_branch(&repo, &config)?;
    log::debug!("Counted commits successfully");

    log::info!("Commits    = {}", commits);
    log::info!("Merges     = {}", merges);
    log::info!("Non-Merges = {}", nonmerges);

The functions called in this snippet will be described later on. Just consider them working for now, and let's move on to the status posting part of the bot now.

First of all, we use the variables to compute the status message using the template from the configuration file.

    {
        let status_text = {
            let mut hb = handlebars::Handlebars::new();
            hb.register_template_string("status", config.status_template())?;
            let mut data = std::collections::BTreeMap::new();
            data.insert("commits", commits);
            data.insert("merges", merges);
            data.insert("nonmerges", nonmerges);
            hb.render("status", &data)?
        };

Handlebars is a perfect fit for that job, as it is rather trivial to use while still being a very powerful templating language. The user could, for example, even add some conditions to their template, like if there are no commits at all, the status message could just say “I'm a lonely bot, because nobody commits to master these days...” or something like that.

Next, we build the status object we pass to mastodon, and post it.

        let status = elefren::StatusBuilder::new()
            .status(status_text)
            .language(status_language)
            .build()
            .expect("Failed to build status");

        let status = client.new_status(status)
            .expect("Failed to post status");
        if let Some(url) = status.url.as_ref() {
            log::info!("Status posted: {}", url);
        } else {
            log::info!("Status posted, no url");
        }
        log::debug!("New status = {:?}", status);
    }

    Ok(())
} // main()

Some logging is added as well, of course.

And that's the whole main function!

Fetching the repository

But we are not done yet. First of all, we need the function that fetches the remote repository.

Thanks to the git2 library, this part is rather trivial to implement as well:

fn fetch_main_remote(repo: &git2::Repository, config: &Conf) -> Result<()> {
    log::debug!("Fetch: {} / {}", config.origin_remote_name(), config.master_branch_name());
    repo.find_remote(config.origin_remote_name())?
        .fetch(&[config.master_branch_name()], None, None)
        .map_err(Error::from)
}

Here we have a function that takes a reference to the repository as well as a reference to our Conf object. We then, after some logging, find the appropriate remote in our repository and simply call fetch for it. In case of Err(_), we map that to our anyhow::Error type and return it, because the caller should handle that.

Counting the commits

Counting the commits is the last part we need to implement.

fn count_commits_on_main_branch(repo: &git2::Repository, config: &Conf) -> Result<(usize, usize, usize)> {

The function, like the fetch_main_remote function, takes a reference to the repository as well as a reference to the Conf object of our program. It returns, in case of success, a tuple with three elements. I did not add strong typing here, because the codebase is rather small (less than 160 lines overall), so there's no need to be very explicit about the types here.

Just keep in mind that the first of the three values is the number of all commits, the second is the number of merges and the last is the number of non-merges.

That also means:

tuple.0 = tuple.1 + tuple.2

Next, let's have a variable that holds the branch name with the remote, as we're used to from git itself (this is later required for git2). Also, we need to calculate the lowest timestamp we consider. Because our configuration file specifies this in hours rather than seconds, we simply multiply by 60 * 60 here.

    let branchname = format!("{}/{}", config.origin_remote_name(), config.master_branch_name());
    let minimum_time_epoch = chrono::offset::Local::now().timestamp() - (config.hours_to_check() * 60 * 60);

    log::debug!("Branch to count     : {}", branchname);
    log::debug!("Earliest commit time: {:?}", minimum_time_epoch);

Next, we need to instruct git2 to create a Revwalk object for us:

    let revwalk_start = repo
        .find_branch(&branchname, git2::BranchType::Remote)?
        .get()
        .peel_to_commit()?
        .id();

    log::debug!("Starting at: {}", revwalk_start);

    let mut rw = repo.revwalk()?;
    rw.simplify_first_parent()?;
    rw.push(revwalk_start)?;

That can be used to iterate over the history of a branch, starting at a certain commit. But before we can do that, we need to actually find that commit, which is the first part of the above snippet. Then, we create a Revwalk object, configure it to consider only the first parent (because that's what we care about) and push the rev to start walking from it.

The last bit of the function implements the actual counting.

    let mut commits = 0;
    let mut merges = 0;
    let mut nonmerges = 0;

    for rev in rw {
        let rev = rev?;
        let commit = repo.find_commit(rev)?;
        log::trace!("Found commit: {:?}", commit);

        if commit.time().seconds() < minimum_time_epoch {
            log::trace!("Commit too old, stopping iteration");
            break;
        }
        commits += 1;

        let is_merge = commit.parent_ids().count() > 1;
        log::trace!("Merge: {:?}", is_merge);

        if is_merge {
            merges += 1;
        } else {
            nonmerges += 1;
        }
    }

    log::trace!("Ready iterating");
    Ok((commits, merges, nonmerges))
}

This is done the simple way, without making use of the excellent iterator API. First, we create our variables for counting and then we use the Revwalk object and iterate over it. For each rev, we unwrap it using the ? operator and then ask the repo to give us the corresponding commit. We then check whether the time of the commit is before our minimum time and if it is, we abort the iteration. If it is not, we continue and count the commit. We then check whether the commit has more than one parent, because that is what makes a commit a merge commit, and increase the appropriate variable.

Last but not least, we return our findings to the caller.
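Just to illustrate the remark about the iterator API: the same counting could be sketched with combinators roughly like below. Note that this variant collects the whole history up front and only then cuts it off at the minimum timestamp, which is exactly why the explicit loop with its early break was used instead.

    let commits: Vec<_> = rw
        .collect::<Result<Vec<_>, _>>()?      // unwrap all revs first...
        .into_iter()
        .map(|rev| repo.find_commit(rev))
        .collect::<Result<Vec<_>, _>>()?;     // ...and then all commits

    let (merges, nonmerges): (Vec<_>, Vec<_>) = commits
        .into_iter()
        .take_while(|c| c.time().seconds() >= minimum_time_epoch)
        .partition(|c| c.parent_ids().count() > 1);

    Ok((merges.len() + nonmerges.len(), merges.len(), nonmerges.len()))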

Conclusion

And this is it! It was a rather nice journey to implement this bot. There isn't too much that can fail here; some calculations might wrap and result in incorrect numbers. Possibly a clippy run would find some things that could be improved, of course (feel free to submit patches).

If you want to run this bot on your own instance and for your own repositories, make sure to check the README file first. Also, feel free to ask questions about this bot and of course, you're welcome to send patches (make sure to --signoff your commits).

And now, enjoy the first post of the bot.

tags: #mastodon #bot #rust

Today, I challenged myself to write a prometheus exporter for MPD in Rust.

Shut up and show me the code!

Here you go and here you go for submitting patches.

The challenge

I recently started monitoring my server with prometheus and grafana. I am in no way a professional user of these pieces of software, but I slowly got everything up and running. I learned about timeseries databases at university, so the basic concept of prometheus was not new to me. Grafana was, though. I then started learning about prometheus exporters and how they work, managed to set up node exporters for all my devices and imported their metrics into a nice grafana dashboard I downloaded from the official website.

I figured that writing an exporter would help me understand the whole thing even better. So what would be better than exporting music data to my prometheus and plotting it with grafana? Especially because my nickname online is “musicmatze”, right?

So I started writing a prometheus exporter for MPD. And because my language of choice is Rust, I wrote it in Rust. Rust has good libraries available for everything I needed to do to export basic MPD metrics to prometheus and even a prometheus exporter library exists!

The libraries I decided to use

Note that this article was written using prometheus-mpd-exporter v0.1.0 of the prometheus-mpd-exporter code. The current codebase might differ, but this was the first working implementation.

So, the scope of my idea was set. Of course, I needed a library to talk to my music player daemon. And because I would essentially be writing a kind of web server, an async library would be better. Thankfully, async_mpd exists.

Next, I needed a prometheus helper library. The examples in this library work with hyper. I was not able to implement my idea with hyper though (because of some weird borrowing error), but thankfully, actix-web worked just fine.

Besides that I used a bunch of convenience libraries:

  • anyhow and thiserror for error handling
  • env_logger and log for logging
  • structopt for CLI parsing
  • getset, parse-display and itertools to be able to write less code

The first implementation

The first implementation took me about four hours to write, because I had to understand the actix-web infrastructure first (and because I tried it with hyper in the first place, which did not work for about three of those four hours).

The boilerplate of the program includes

  • Defining an ApplicationError type for easy passing-around of errors that happen during the runtime of the program
  • Defining an Opt as a commandline interface definition using structopt

#[actix_web::main]
async fn main() -> Result<(), ApplicationError> {
    let _ = env_logger::init();
    log::info!("Starting...");
    let opt = Opt::from_args();

    let prometheus_bind_addr = format!("{}:{}", opt.bind_addr, opt.bind_port);
    let mpd_connect_string = format!("{}:{}", opt.mpd_server_addr, opt.mpd_server_port);

The main() function then sets up the logging and parses the commandline arguments. Thanks to env_logger and structopt, that's easy. The main() function also acts as the actix_web::main function and is async because of that. It also returns a Result<(), ApplicationError>, so I can easily fail during the setup phase of the program.

Next, I needed to setup the connection to MPD and wrap that in a Mutex, so it can be shared between request handlers.

    log::debug!("Connecting to MPD = {}", mpd_connect_string);
    let mpd = async_mpd::MpdClient::new(&*mpd_connect_string)
        .await
        .map(Mutex::new)?;

    let mpd = web::Data::new(mpd);

And then setup the HttpServer instance for actix-web, and run it.

    HttpServer::new(move || {
        App::new()
            .app_data(mpd.clone()) // add shared state
            .wrap(middleware::Logger::default())
            .route("/", web::get().to(index))
            .route("/metrics", web::get().to(metrics))
    })
    .bind(prometheus_bind_addr)?
    .run()
    .await
    .map_err(ApplicationError::from)
} // end of main()

Now comes the fun part, though. The connection to MPD is already set up; in the above snippet, I add routes to the HttpServer for a basic index endpoint as well as for the /metrics endpoint prometheus fetches the metrics from.

Lets have a look at the index handler first, to get a basic understanding of how it works:

async fn index(_: web::Data<Mutex<MpdClient>>, _: HttpRequest) -> impl Responder {
    HttpResponse::build(StatusCode::OK)
        .content_type("text/text; charset=utf-8")
        .body(String::from("Running"))
}

This function gets called every time someone accesses the service without specifying an endpoint, for example curl localhost:9123 would result in this function being called.

Here, I can get the web::Data<Mutex<MpdClient>> object instance that actix-web handles for us as well as a HttpRequest object to get information about the request itself. Because I don't need this data here, the variables are not bound (_). I added them to be able to extend this function later on easily.

I return a simple 200 (that's the StatusCode::OK here) with a simple Running body. curling would result in a simple response:

$ curl 127.0.0.1:9123
Running

Now, lets have a look at the /metrics endpoint. First of all, the signature of the function is the same:

async fn metrics(mpd_data: web::Data<Mutex<MpdClient>>, _: HttpRequest) -> impl Responder {
    match metrics_handler(mpd_data).await {
        Ok(text) => {
            HttpResponse::build(StatusCode::OK)
                .content_type("text/text; charset=utf-8")
                .body(text)
        }

        Err(e) => {
            HttpResponse::build(StatusCode::INTERNAL_SERVER_ERROR)
                .content_type("text/text; charset=utf-8")
                .body(format!("{}", e))
        }
    }
}

but here, we bind the mpd client object to mpd_data, because we want to actually use that object. We then call a function metrics_handler() with that object, wait for the result (because that function itself is async, too), and match the result. If the result is Ok(_), we get the result text and return a 200 with the text as the body. If the result is an error, which means that fetching the data from MPD somehow resulted in an error, we return an internal server error (500) and the error message as body of the response.

Now, to the metrics_handler() function, which is where the real work happens.

async fn metrics_handler(mpd_data: web::Data<Mutex<MpdClient>>) -> Result<String, ApplicationError> {
    let mut mpd = mpd_data.lock().unwrap();
    let stats = mpd.stats().await?;

    let instance = String::new(); // TODO

First of all, we extract the actual MpdClient object from the web::Data<Mutex<_>> wrapper. Then, we ask MPD to get some stats() and wait for the result.

After that, we create a variable we don't fill yet, which we later push in the release without solving the “TODO” marker and when we blog about what we did, we feel ashamed about it.

Next, we create Metric objects for each metric we record from MPD and render all of them into one big String object.

    let res = vec![
        Metric::new("mpd_uptime"      , stats.uptime      , "The uptime of mpd", &instance).into_metric()?,
        Metric::new("mpd_playtime"    , stats.playtime    , "The playtime of the current playlist", &instance).into_metric()?,
        Metric::new("mpd_artists"     , stats.artists     , "The number of artists", &instance).into_metric()?,
        Metric::new("mpd_albums"      , stats.albums      , "The number of albums", &instance).into_metric()?,
        Metric::new("mpd_songs"       , stats.songs       , "The number of songs", &instance).into_metric()?,
        Metric::new("mpd_db_playtime" , stats.db_playtime , "The database playtime", &instance).into_metric()?,
        Metric::new("mpd_db_update"   , stats.db_update   , "The updates of the database", &instance).into_metric()?,
    ]
    .into_iter()
    .map(|m| {
        m.render()
    })
    .join("\n");

    log::debug!("res = {}", res);
    Ok(res)
}

Lastly, we return that String object from our handler implementation.

The Metric object implementation is my own; we'll focus on that now. It helps a bit with the prometheus_exporter_base API interface.

But first, I need to explain the Metric type:

pub struct Metric<'a, T: IntoNumMetric> {
    name: &'static str,
    value: T,
    description: &'static str,
    instance: &'a str,
}

The Metric type is a type that holds a name for a metric, its value and some description (and the aforementioned irrelevant instance). But because the metrics we collect can be of different types (for example an 8-bit unsigned integer u8 or a 32-bit unsigned integer u32), I made that type generic over the value type. The type of the metric value must implement the IntoNumMetric trait, though. That trait is a simple helper trait:

use num_traits::Num;
pub trait IntoNumMetric {
    type Output: Num + Display + Debug;

    fn into_num_metric(self) -> Self::Output;
}

And I implemented it for std::time::Duration, u8, u32 and i32 – the implementation itself is trivial and I won't show it here.
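Just to give an idea, the implementation for u32 might look something like this (the u8 and i32 ones would follow the same pattern; the std::time::Duration one presumably converts to a number of seconds first):

impl IntoNumMetric for u32 {
    type Output = u32;

    fn into_num_metric(self) -> Self::Output {
        self
    }
}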

Now, I was able to implement the Metric::into_metric() function shown above:

impl<'a, T: IntoNumMetric + Debug> Metric<'a, T> {
    // Metric::new() implementation, hidden here

    pub fn into_metric<'b>(self) -> Result<PrometheusMetric<'b>> {
        let instance = PrometheusInstance::new()
            .with_label("instance", self.instance)
            .with_value(self.value.into_num_metric())
            .with_current_timestamp()
            .map_err(Error::from)?;

        let mut m = PrometheusMetric::new(self.name, MetricType::Counter, self.description);
        m.render_and_append_instance(&instance);
        Ok(m)
    }
}

This function is used for converting a Metric object into the appropriate PrometheusMetric object from prometheus_exporter_base.

The implementation is, of course, also generic over the type the Metric object holds. A PrometheusInstance is created, a label “instance” is added (empty, you know why... :–( ). Then, the value is added to that instance using the conversion from the IntoNumMetric trait. The current timestamp is added as well, or an error is returned if that fails.

Last but not least, a new PrometheusMetric object is created with the appropriate name and description, and the instance is rendered to it.

And that's it!

Deploying

The code is there now. But of course, I still needed to deploy this to my hosts and make it available in my prometheus and grafana instances.

Because I use NixOS, I wrote a nix package definition and a nix service definition for it, making the endpoint available to my prometheus instance via my wireguard network.

After that, I was able to add queries to my grafana instance, for example:

mpd_db_playtime / 60 / 60 / 24

to display the DB playtime of an instance of my MPD in days.

I'm not yet very proficient in grafana and the query language, and the service implementation is rather minimal, so there aren't that many metrics yet.

Either way, it works!

A basic dashboard for MPD stats

Next steps and closing words

The next steps are quite simple. First of all, I want to make more stats available to prometheus. Right now, only the basic statistics of the database are exported.

The async_mpd crate makes a lot of other status information available.

Also, I want to get better with grafana queries and make some nice-looking graphs for my dashboard.

Either way, that challenge took me longer than I anticipated in the first place (“I can hack this in 15 minutes” – famous last words)! But it was fun nonetheless!

The outcome of this little journey is on crates.io and I will also submit a PR to nixpkgs to make it available there, too.

If you want to contribute to the sourcecode, which I encourage you to do, feel free to send me patches!

tags: #prometheus #grafana #rust #mpd #music

Holy crap, I haven't written on my blog for a long time. And I almost missed that the Rust community asked for blog posts about Rust in 2021 - but I am in time I guess, so here it goes.

Most Rustaceans won't agree with this blog post, I guess. But I also think that's fine, because that's the whole point of the Blog-Post-For-The-Roadmap thing, right? Asking people for different opinions and starting a constructive discussion about the topic. I also must say that I haven't read a single one of the other Rust-2021 Blog posts just yet.

I also think this will be rather short, but I hope I express my feelings in the best way possible for you all to understand.

Don't Change!

I got into Rust at about Rust 1.5.0. After the first half of 2020, I felt like January was years ago, so I feel like Rust 1.5.0 was in another lifetime. So much happened this year, and still, so little was accomplished by me and my friends. The world turned upside down, essentially.

Rust changed a lot between 1.5.0 and the current compiler I have installed on my system:

$ rustc --version
rustc 1.46.0 (04488afe3 2020-08-24)

The RELEASES.md file is a whopping 9167 lines long. We got cargo workspaces, we got awesome things like the ? operator (which I definitely was not a friend of in the beginning), we got associated constants, incremental compilation, impl Trait, we got const functions and most importantly we got async/await.

Frankly, that's where I started to struggle to keep up. I definitely see the value in async/await, what it actually enables us to do with Rust, and how to do it. But I just couldn't keep up with the change anymore. It was too much. I couldn't cope with learning all these new things just as they arrived. To this day, I struggle to write a simple program with async/await if there are too many iterators involved. I don't know where my actual problems are, because I cannot see through the whole concept enough to understand what I am doing wrong.

The last five years were full of change. Good change, of course. But this last (almost) year just drowned me. Too much to handle.

My hope is that Rust does not change anymore when it comes to features. I see that there is a lot of demand for const generics, especially by the embedded community. I understand why. I hope it doesn't have any impact on me as a commandline-program-writing Rustacean.

But...

But. There's always a “but”, isn't there?

I still have high hopes for some things concerning Rust. But they do not at all have to do with the language Rust, but the environment around it. As stated before, I'm a commandline-program writing person. I do not write web services (yet?), I do not write embedded stuff, I do not write high-performance/performance-critical stuff.

Essentially: I write programs in Rust that others would write in Python, Ruby or Node. I write them in Rust because I am a lazy programmer, because I do not care enough. I write Rust, because the compiler YELLS at me to get it right. If I would do the same thing in Ruby, my go-to-language for everything below 100LOC, I would get myself into a wheelchair because I would use every footgun available.

Deep inside, I'm a bad programmer and rustc forces me to be a good one.

That was a bit of a rant, I hope you're still with me. The paragraph above was for you to understand where I come from. I don't care if my program runs in 1 second or 10, because the domain I write for does not care most of the time. But what is important to me, is that I actually can write my programs. Often, I cannot. And that's simply because of one thing:

Libraries are missing.

Those who know me knew that this was coming. Libraries for domains that I care about are missing. That is calendar (icalendar) reading/writing, vcard reading/writing, email reading/writing (the format, not the networking stuff), ... There are already libraries out there for these things, although they are far from being complete, usable or even correct. Writing a simple TUI MUA for notmuch is a pain right now, because parsing email is really hard in Rust, and there are no high-level libraries available. The “mail” crate ecosystem is closest, but they do not yet have a parser.

There are, I am sure, more things in this part of the ecosystem (that is libraries for basic formats) where Rust could shine, but does not yet.

My request for Rust in 2021 is: Make things shiny. Make them available, make them work, make them correct, make them nice to use (e.g. parsing mails into tokens and handing them to me is okay, but having a high-level interface is much nicer).


To sum it up in one sentence:

Don't change rust itself, but improve the library ecosystem.

That's my hope for Rust 2021. Thank you for having me in this awesome community and thank you for reading.

tags: #rust #programming

Finally, I managed to implement a proof of concept of serde-select. But let's start at the beginning.

The Problem

The problem I tried to solve with this crate is rather simple: You need to be able to get values from a serde-compatible document (e.g. toml, json, yaml, ...) but you don't know the full schema of the document at compile time of your crate.

The origin of the idea of serde-select was when I first started working on my imag project, where a lot of separate crates coexist in one ecosystem, but all of them should be configured in one big configuration file. Of course I did not want to have one central crate just for defining the schema, especially since a user might not want to use all functionality from the ecosystem, thus not having a “full” configuration file, but only the parts they need.

So I started writing “toml-query”, a crate which lets the programmer query a toml::Value with a “path”. For example:

[calendar]
list_format = "{{lpad 5 i}} | {{abbrev 5 uid}} | {{summary}} | {{location}}"
show_format = """
{{i}} - {{uid}}
"""

[ref]
[ref.basepathes]
music = "/home/user/music"
contacts = "/home/user/contacts"
calendars = "/home/user/calendars"

The document looks like this, but in the program code we only need calendar and its sub-values. So we can do

let r = document.read("calendar.list_format");

in the code and get a Result<Option<&'document Value>> value back.

toml-query evolved over time, now featuring more flexibility by implementing “Partials”, as I call them. These are structs that are Serialize + Deserialize and have a path attached to them, so deserializing the partial document is possible right away:

let r: Result<Option<CalendarConfig>, _> = document.read_partial::<CalendarConfig>();

where CalendarConfig: Serialize + Deserialize + Debug + toml_query::Partial (see here).
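A hedged sketch of such a partial, with field names taken from the [calendar] table shown above: on top of the derives, the type additionally needs an implementation of toml-query's Partial trait that ties it to the “calendar” path (see the linked documentation for the exact shape of that trait).

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct CalendarConfig {
    list_format: String,
    show_format: String,
}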

The evolution

toml-query works perfectly fine and I use it in my other projects a lot. It is fast and easy to use. Its error reporting is nice.

But an idea formed in the back of my head and I could not stop thinking about it.

Can toml-query be generalized to work with all formats serde can handle?

So I started to experiment with a more general implementation: serde-select was born.

And today I managed to get the first bits working.

Meet serde-select

serde-select implements a “read” functionality for both JSON and TOML, depending on what features you enable. The inner implementation of the resolve-algorithm is agnostic of the actual format.

For a quick overview of how to use the crate right now, have a look at the tests for toml, for example.

I strongly advise against using this crate, though. It is only an experiment for now and shouldn't be used in production code. Nevertheless, I published the first preview on crates.io.

tags: #rust #programming

The call for blogs was just issued a few days ago – and here I am writing about my biggest pains this year... because that's what the call for blogs basically is for me... I write down my pains with Rust and hope things get better slowly next year.

Don't misunderstand what I want to say here though: Rust is awesome, has an awesome community, awesome tooling, awesome everything... well not completely (because otherwise I wouldn't have to write this article, right?), but almost.

Pain #0: Libraries

Pain Number Zero (because that's where computer programmers start to count) is the library ecosystem. “What?” you say? “Rust is known for a really good library ecosystem although the language is not even five years old (counting from 1.0.0)“, you might say! And of course you're right... but not in my domain, unfortunately.

Rust has excellent libraries for developing web service backends as well as frontends, gaming engines and games, and of course microcontroller stuff. But these are not my domains. My domain is commandline user stuff. My domain might become TUI applications or even GUI applications in the future. My domain is data formats, especially icalendar and vcard, because I write journal applications, calendar applications, contact management stuff, todo applications and diary tooling and even Email processing/handling stuff and possibly even a CLI/TUI mail reader – of course I'm talking about imag here, the text-based, commandline personal information management suite I've been developing for over four years now.

The tooling in this domain is not nonexistent, no way! But, despite the efforts some people in this awesome community started, the number and especially the quality of those libraries is nowhere near as satisfying as the support for other domains. No offense to the library authors of course! It is not their fault at all. It is just that only a few people have started initiatives in this direction yet. I try to contribute! I am actually working on a high-level libical frontend which I started to extract from khaleesi, a work by two wonderful people which includes a wrapper around libical and libical-sys.

But there could be so much more and better support for these things! I can only do so much. So I call out: Help developing libraries for these standards! Especially help developing high-level Rust libraries for these things, because handling mime as a way to work with emails is just the beginning. Parsing mail into something that can be worked with on a high level in Rust would be a wonderful goal for 2020. And of course all the other domains!

I remember from my days with Ruby that code could be written at a high level when working with mail and other such formats. Lets have that in Rust!

Lets have world-class support for handling data formats at a high level, so we can write “Speaking code” like it would be plaintext.

Pain #1 – CLI

My next pain is frontends. What I mean by that is CLI, TUI and GUI frontends, not WUI (web user interface) frontends. But I'll break this down into several sections here, so let's talk about CLI first...

One thing here is of course the wonderful clap crate. Lets make clap v3.0 happen next year! It would be a huge step forward!

But this is only one minor pain point, because clap is already a wonderful thing. Lets also make the interactive commandline user interface story better!

I remember that the people from the Node community have commandline applications that you can use interactively that just amaze me because they are so comfy to use (and I'm not even talking about TUI applications here, just interactive CLI apps)! I think we can have this in Rust!

Lets have the best libraries to implement interactive commandline applications!

Pain #2 – TUI

TUI is the next thing I want to point out. Short disclaimer though: I never wrote a TUI app, but I certainly plan to do so. Maybe not in 2020, but after that imag should get a TUI interface at some point. And for that, of course, I would love to have a headache after thinking about which library or framework to choose.

Right now, there's cursive – and holy swearword this thing looks amazing! But there could be so much more, still! Of course there are already a few extension crates out there:

(btw: @deinstapel you're a hero – they implemented half of the crates above!)

But I bet there could be more... I could think of an embedded terminal for cursive, I can think of an editor-view for embedding vim or another TUI editor into a cursive application, I could think of an editor-like view embedding the Xi editor...

Lets make Rust the go-to choice for writing TUI applications!

Pain #3 – GUI

And of course, the GUI domain. I could write up a long text here, or just point you to other Rust-2020 articles that expand on the topic... but I don't. Why? Because I never implemented a GUI application, I don't see myself implementing one in the near (or even far) future (at least not for imag) and so I don't take the liberty to reiterate what others said more eloquently: The Rust ecosystem for writing GUI applications is not good.

Lets improve our GUI-writing experience!

Summary

All in all, I hope that 2020 will be the year of the Rust Language as application language. We have awesome tooling and frameworks available for web stuff, the game domain is expanding constantly and low-level programming is possible and done out there all the time.

Writing applications in Rust is not yet as awesome as it could be, though. So my hopes, dreams and wishes are...

Lets make Rusts high-level application writing experience the best out there!

tags: #rust