musicmatzes blog

The following piece was written months ago, right after the “We Are Developers World Conference” took place in Berlin. So far nobody has reached out to me for feedback on the conference, and I am not expecting that will ever happen, so I am going to publish this (unedited) here on my blog instead.

There is at least one swearword in here, for those of you who care about that.

Here we go...


I was really astonished by how badly such a conference can be organized, sorry to say.

First of all, the badges were not even checked at the entrance. The person there was completely overwhelmed and couldn't even look at badges while people were streaming in. There was also too little space to actually check badges. At least 10 people and 5 queues (with a queuing system) would have been a good start. The way it was “organized”, I could have just walked in while the person there looked somewhere else. I also heard that you could simply walk into the “premium area” that was reserved for expensive tickets. I didn't try, but given the other checks were basically nonexistent, I can imagine how easy it was.

The keynote talk was another issue. I am very sorry for the condition of Sir Tim Berners-Lee and I wish him all the best and hope he gets better soon. In my opinion he shouldn't speak at conferences until he gets better. He was barely understandable, although I consider my English rather good. Other people I was talking to didn't understand a single word. Also, his presentation was basically an “I want to sell my product” show, which is not what a keynote should be like, is it?

Next, the conference food was way too expensive. I am not sure how much control the organizers have over that. But 20 euros for a burger sounds like a joke. I went into the city to get food.

Next, the venue is too small for the amount of people. The hallways were way too crowded. Possibly an issue of a conference that grew too fast.

The venue map is really a joke, isn't it? I did not even figure out how many halls there actually were. Please provide a sane map, with a top-down view, and also provide it printed out. There were like 10 locations where I found the map, but at least 50 would have been necessary. Also, the map never showed the current position, which was a huge problem.

The app is something completely unnecessary as well. There are enough conference apps out there: for scheduling there's Giggity, for chat there are open protocols like Matrix. No need to dump money/effort into an app that I have to uninstall again right after the event! Yes, I found that it is also available as a website and I used that instead. Just wanted to give feedback about that.

The seats in the main hall/stage are an insult. I wasn't able to check the other seats because I wasn't able to catch one.

To the exhibitors: 90% of them had slides/booths that showed a lot of buzzwords, but no content. If you're doing “advanced cloud enterprise data connectivity”, that tells me exactly NOTHING about what you do. I could imagine that this buzzword bingo is normal for exhibitors at such conferences. That doesn't make it sane anyway.

All of the exhibitors were some form of “cloud” company. No other industries as far as I could tell. Except maybe the car manufacturers. That's sad.

Also, Vodafone should be punished for this fucking robot that blocked the hallways all the time.

In general I would ask exhibitors to send engineers to a developer conference rather than sales people. What good is it if I cannot even ask technical questions? All the people I talked to at the exhibition booths were a disappointment! Maybe I just got the wrong people, I don't know.

I also found it very annoying and intrusive that people just took photos without asking, and even the people that seemed to be from the press or similar didn't bother to ask. I am used to more professionalism in that regard.

The conclusion for me is that I urge my company to invest the money for the ticket to WeAreDevelopers rather into tickets to Chaos Communication Congress, which has a ten times higher return on investment. I'd urge the organizers to attend at least one Chaos Communication Congress. There are no exhibitors there (at least there weren't the last few times), but there's a huge opportunity to learn how an IT conference should be organized!

I recently encountered an issue with cargo check vs. cargo build. The former did succeed on my codebase, but the latter did not.

“Wait, what?”, you may ask. “Isn't check supposed to be build, but without actually creating the binary?“, you may say. Yes, it is. But in fact check and build have some differences which caused the latter to fail while the check succeeded.

Here's why.

Reproducing the issue

The first thing I tried was to reduce the issue to a minimal working (or rather “breaking”) example. I was able to reduce my example to the following piece of code:

struct Check;
impl Check {
    const CHECK: () = assert!(1 == 2);
}

fn main() {
    let _ = Check::CHECK;
}

The above piece of code declares a struct named “Check” and implements an associated constant on that type. The associated constant is of type empty tuple (which compiles to nothing). Its right-hand side is an assert macro call which asserts that 1 is equal to 2.

Further down, the code example implements the main function where it assigns an anonymous variable to the value of the associated constant of the Check type.

This example is really simple, and obviously this should not compile. The associated constant should be evaluated at compiletime and make the compiler yell at me, because after years of research, scientists have found out that 1 is indeed not equal to 2.

The problem

Of course I reported the problem upstream – turns out it was known and my report was a duplicate (first report).

The problem here is that the associated constant gets evaluated only at monomorphization time. And cargo check does not do that step during checking of the codebase – although one might argue that it should, because users would expect it to catch such compilation errors.
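The lazy evaluation becomes more obvious with a generic type: the constant below cannot even be evaluated before the compiler knows the concrete type parameter, and that only happens at monomorphization time, which cargo check skips. This is just a sketch along the same lines as the example above; I have not verified this exact snippet.

struct Check<T>(std::marker::PhantomData<T>);

impl<T> Check<T> {
    // This can only be evaluated once T is known, i.e. at monomorphization
    // time, which `cargo check` never reaches.
    const CHECK: () = assert!(std::mem::size_of::<T>() != 0);
}

fn main() {
    // size_of::<()>() is 0, so evaluating this constant fails the assertion,
    // but only when the code is actually built.
    let _ = Check::<()>::CHECK;
}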

The solution

The solution for this issue seems to be pending, as the bug in the rust compiler is not yet resolved. Ralf Jung suggested that we could fake monomorphization and ask the compiler to emit metadata during cargo check (if I understood the comment correctly). This way, the issue could potentially be found.

I did not test this though.

Takeaway

The takeaway for me here is that CI should never just cargo check the code. There must be at least one cargo build in your CI setup that builds all of the code.

Besides that I learned about the awesome stuff you can build with const assertions. But that's only slightly related here.


I wrote the following post almost in one sitting. After powering through this, I do not find the energy to review it and fear that I will never post it because of that. That's also why the “Final thoughts” paragraph is rather short.

I don't want it to bit-rot in my editor, so here it goes...


Lately I have started again thinking about writing a distributed social network application. I've been thinking about this since 2017 on and off, but back then the available libraries for Rust were unusable for me, at least it was too much hassle for a free time side project.

But lately I started to think about it again, and some toots I have sent out to the world got some responses that made me think harder.

So let me (again) tell you about that idea.

The idea

The outcome I am targeting is the following: you can start up a GUI application on your desktop where you can post text, audio, video, much like you can with mastodon. The difference is that you don't need to be connected to a server or even to the internet at all. The application is an interface to a fully distributed network, where you can connect directly to your followers or the people you follow.

The application is focused on “macro blogging”, so more like diaspora or Lemmy/kbin, not so much like mastodon (user-interface wise), and tree-style discussions are supported (speaking of the interface/UI/UX).

You can boot that application while being air-gapped, you can boot it on several devices and post to the same account (which you can't with for example scuttlebutt, which is a “distributed social network” app).

That sounds not so fancy? Well, the UI and UX shouldn't be “fancy” in that regard at all. The underlying technology should be, though. So let's talk about it.

Core implementation

It is always hard to describe an implementation of something in text. This is no exception.

First of all, let's have a look at the actors in the system I think of.

Actor: user

A user wants to post text, images, videos, maybe polls, calendars, etc. to their profile. They want to tag their content. Maybe they want to have different “streams” in their profile. Maybe they even want several profiles, but that's not very important since it would only be an implementation detail.

They want to be able to post to that profile from multiple devices. Maybe they have two notebooks, maybe they have another workstation, maybe a tablet, a smartphone and maybe (just maybe) they also have a server from where they want to post via a bot, all to the same profile.

They want to be able to reply to posts from other users, they want to boost them (republish), like them, hide them, block other users, follow other users to see their content.

Maybe they also want to follow a single “stream” of one user, maybe they want to follow single tags.

Actor: discover server

Because of the distributed nature of the system (as we see further down this article), users cannot simply search for a user “TheAwesomeCat” and get their profile. Each profile lives on the devices of the user of the profile, at first.

To solve this issue, there should be “discover server” instances. These instances can be hosted by power users or organizations. These servers do nothing but “announce” profiles they know to connected instances of the application. Users can then decide (via interaction or configuration) whether to fetch these profiles.

Actor: Pin servers

Content lives, at first, only on the devices of the users that post it. If a user follows a profile, they automatically replicate the content that was posted on that profile, which helps distributing that content.

Of course, some might not want to replicate content, which is totally fine and should be configurable (per followed profile, actually). Reposting content would be an option to ensure content is replicated.

Content replication by users might or might not be enough. To improve on this, there should/could be “pin servers”, which are always online and which a user can tell to pin their content for a decent amount of time. That would help distribution of the content. There could be multiple or a whole network of such servers. They are not really federated, but are more or less distributed as well.

These servers could be run by power users or organizations as well. Power users can choose to pin only their own content or also that of friends, or even more content from other users they select or are even paid by.

Data Structures / The “Model”

The above might be a bit confusing because we haven't yet talked about how the system would work from a technical point.

That was on purpose, because I wanted to talk about this top-down rather than bottom-up, since I have better experience describing it this way.

Let's now look at the data structures.

There are two data types that we have to concern ourselves with in this system. The first one is, obviously, the “Profile” which a user posts to.

Profiles

A profile would be a DAG and Merkle tree. Think about a profile like a git repository. A user posts to their profile by creating an entry in their “repository” (much like a commit in git). Binary content like audio or video would also be possible. “Commits” would contain metadata about the post, such as time, maybe location, tags, mime types of the content, etc.

In detail, there would be three “layers” in that repository. The lowest would be the actual content: text blobs, audio, video, files. The middle layer would be the “metadata” layer, pointing to the content (if there is any; as we'll see later there could be posts without content). The uppermost layer would be nothing more than a DAG of objects that link to their parent objects (multiple, as we need merges in such a system, or none for the first object in a profile) and contain some minimal metadata about the format (basically just a version number).

The uppermost layer may also contain a list of public keys of the instances that post to that profile. This way, discovering a full profile from just one object in the uppermost layer is possible. For size reasons, that list of public keys may also be embedded in the “metadata” layer.
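To make the three layers a bit more concrete, here is a rough sketch in Rust of what these objects could contain. The Cid and PublicKey types are just placeholders for whatever content-addressing and key format would actually be used; none of this is settled in any way.

/// Placeholder for a content address (hash) pointing to another object.
struct Cid([u8; 32]);

/// Placeholder for the public key of a posting device.
struct PublicKey(Vec<u8>);

/// Uppermost layer: a tiny DAG object.
struct DagNode {
    version: u8,                         // minimal format metadata
    parents: Vec<Cid>,                   // empty for the first object, two or more for merges
    metadata: Cid,                       // link into the metadata layer
    device_keys: Option<Vec<PublicKey>>, // optionally, the keys of the posting devices
}

/// Middle layer: metadata about a single post.
struct Metadata {
    timestamp: u64,
    tags: Vec<String>,
    mime_type: String,
    content: Option<Cid>,                // None for posts without content
    in_reply_to: Option<Cid>,            // set if this post is a reply
}

/// Lowest layer: the actual content (text blobs, audio, video, files).
struct Content(Vec<u8>);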

The objects from the uppermost layer are really small and users could potentially cache these for other users. As they do not contain much data, but are vital to discovering profiles, they should be highly available in such a system.

Loading a profile of another user would now mean that the application would fetch the newest object on the uppermost layer and then traverse the DAG until it hits its root node. As an optimization, “pack nodes” could be added, which refer to a larger number of ancestor nodes, to speed up traversal. That would make deletion of stuff more complicated though. More on that later.

Device authentication

If a user wants to post from multiple devices, the application must be able to trust other devices.

If a user posts from device A, and device B sees that new post on the DAG, device B could potentially fast-forward its HEAD to the new node. But it must be able to verify that this new node is actually from a device that is owned by the user.

For such a thing, public-private key crypto exists and solves the problem just fine. Each node (in the uppermost layer) could get a signature and there we are. There could also be another way of solving this, by having the two devices communicate directly (off the DAG), which is something we talk about later.
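As a sketch (reusing the placeholder types from above, with a hypothetical verify function standing in for the actual signature scheme), the fast-forward check could look like this:

struct Signature(Vec<u8>);

impl PublicKey {
    /// Placeholder: a real implementation would do e.g. an ed25519 verification here.
    fn verify(&self, _node_bytes: &[u8], _signature: &Signature) -> bool {
        unimplemented!("stand-in for real signature verification")
    }
}

/// Device B only fast-forwards to a new head if the node was signed by one of
/// the device keys listed in the profile.
fn is_trusted_head(node_bytes: &[u8], signature: &Signature, device_keys: &[PublicKey]) -> bool {
    device_keys.iter().any(|key| key.verify(node_bytes, signature))
}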

Multi Device Functionality / Merging

To be able to use multiple devices, the application must be able to merge diverging DAGs.

If a user has two (or more) devices that are not connected, but posts on both, their profile DAG advances on both devices and effectively leads to a diverging “HEAD” (in git-speak). For such a scenario, merges must be possible.

Because we're not actually merging content here, creating a merge is trivial. A new node has to be added to the profile, referring to the current two head nodes.

The concerning part here is that the two devices must agree on which one executes the merge; the other device must follow. Such problems are solved with consensus algorithms. As we're talking about a low number of actors here (if a user has more than 10 devices they post from at the same time, that's a lot, but not for a consensus mechanism), it shouldn't be that much of a problem.

But it is important that the merge is cheap, so it is vital that the objects that represent the DAG are lightweight.
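In terms of the sketch from above, a merge would be nothing more than creating one new, small uppermost-layer object:

/// A merge is just a new DAG object whose parents are the two diverged heads.
/// `Cid` and `DagNode` are the placeholder types from the sketch above.
fn merge_heads(head_a: Cid, head_b: Cid, merge_metadata: Cid) -> DagNode {
    DagNode {
        version: 1,
        parents: vec![head_a, head_b],
        metadata: merge_metadata, // could point to an (empty) "merge" metadata object
        device_keys: None,
    }
}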

Reposts

Reposting content would be a rather simple matter. The “metadata layer” we talked about earlier would simply note that the content it points to is actually a repost. It would also point to the metadata object (or even the DAG object that points to it), for discoverability.

Comments on posts

Commenting on a post is a bit more complicated. Well, the actual comment is not, but making that comment visible is.

First of all, a comment would just be another post to the profile DAG, with an entry in its metadata linking to the post it replies to (which of course could itself be a reply).

Consider a system where Alice, Bob and Clara post content, and Bob and Clara follow Alice. Bob and Clara don't know of each other in that system. Now if Alice posts something and Bob publishes a comment on that post, that does not mean that Clara sees that reply.

Also, Alice may not necessarily see Bob's post! But Bob's device(s) should tell the device of Alice that there has been a reply (see “Gossipping Applications” below). Alice is now able to configure her device for different behaviours if a reply is encountered:

  1. Replies are not allowed, the device ignores the reply
  2. Replies are allowed, Alice's device posts a new object to her DAG that notes the reply on her profile
    1. And the device also replicates (for a configurable amount of time) the reply content
    2. The device does not replicate the content

That setting could even be per post, and there could even be a note in the “metadata” object telling followers what would happen with replies (although that is then set in stone, and I don't really like that, because this configuration should be mutable).
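The options listed above could be modelled as a small (per-profile or even per-post) setting; the names here are made up purely for illustration:

/// How an instance handles incoming replies to the own posts.
enum ReplyPolicy {
    /// Option 1: replies are not allowed, the device ignores them.
    Ignore,
    /// Option 2: the device posts a new object to the own DAG noting the reply.
    Announce {
        /// Options 2.1/2.2: whether to also replicate the reply content,
        /// and if so, for how long (in days).
        replicate_for_days: Option<u32>,
    },
}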

Replies on replies would need to be propagated to “upstream” as well, so that trees of replies are possible. They are not necessary from a technical standpoint, because if an instance loads a profile, finds “reply metadata nodes” and loads the profile that posted the replies, it would encounter replies on that reply as well (yeeees I know that sounds confusing).

As an optimization, replies should “bubble up” the chain until they hit the original post. This would give the author of the original post the opportunity to moderate the replies on their post, but it wouldn't give them the possibility to deny people speaking. They can only reduce the replication and visibility of replies.

This is also why metadata objects and uppermost layer objects must be designed to be small! Consider this: Alice publishes a post. Bob and Clara reply. And on each of these replies, 10 other people reply. If Alice has her device configured to re-announce all replies, and replies to replies, that would mean that 23 posts are now reflected in her profile, amounting to 46 objects:

1 Post Alice made -> 1 DAG object and 1 metadata object = 2 objects
+ 1 Reply from Bob -> 2 objects
+ 1 Reply from Clara -> 2 objects
+ 10 Replies from Bob's followers -> 20 objects (plus 20 objects in Bob's profile)
+ 10 Replies from Clara's followers -> 20 objects (plus 20 objects in Clara's profile)
= 46 objects in the profile DAG of Alice

As more replies are posted further down the tree of replies, more objects land in Alice's profile DAG. That's why this has to be optimized and configurable.

Gossipping Applications

The second data structure we need is some form of gossipping protocol. Instances of the application, especially instances which are posting to the same profile, must be able to communicate directly with each other. Not only for the DAG merges we had a look at above, but also for announcing their latest state.

Also, a gossipping protocol can be used to tell other users about the own profile, about profiles one instance “knows” and so on.

We did not yet talk about how an instance can find the current HEAD of other instances. That's exactly what the gossip protocol is (also) for: each instance periodically sends information about itself to that channel.

I considered a system like IPNS for that, but that seems to be not the right tool for the job, especially when we talk about updating that state every other second. Gossipping seems to be a better solution to this problem.

Such a gossipping protocol should basically be a global channel application instances send information into:

  • “Hey, I am <Profile ID>”
  • “My current HEAD is <Post ID>”
  • “Hey, I know these instances: <list of Profile IDs>”
  • “I just posted <Post ID>”
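The messages above could be modelled as a small enum that instances periodically publish to a shared channel (for example a pubsub topic). Again, this is only an illustration, not a wire format:

/// Gossip messages, using the `Cid` placeholder from the profile sketch.
enum GossipMessage {
    /// “Hey, I am <Profile ID>”
    Announce { profile: Cid },
    /// “My current HEAD is <Post ID>”
    CurrentHead { profile: Cid, head: Cid },
    /// “Hey, I know these instances: <list of Profile IDs>”
    KnownInstances { profiles: Vec<Cid> },
    /// “I just posted <Post ID>”
    NewPost { profile: Cid, post: Cid },
}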

Configuration sync

Another concern, especially in scenarios where a user has multiple devices, is synchronization of configuration.

Configuration is not only what color scheme they prefer for their application instance, but also which profiles they “follow”, which ones they “block” and so on. This data has to be synchronized between devices. That's not too complicated, as this can easily be solved by CRDTs. We do not have to concern ourselves with preservation of history here, only the current state is of relevance.
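As a toy example of what “solved by CRDTs” means here: a last-writer-wins map is one of the simplest CRDTs, and merging two device states with it gives the same result no matter in which order the devices sync. A real implementation would probably reach for an existing CRDT library instead of hand-rolling this:

use std::collections::HashMap;

/// A minimal last-writer-wins map: each key keeps the value with the newest timestamp.
#[derive(Default)]
struct LwwMap {
    entries: HashMap<String, (u64, String)>, // key -> (timestamp, value)
}

impl LwwMap {
    fn set(&mut self, key: &str, value: &str, timestamp: u64) {
        let entry = self.entries.entry(key.to_string()).or_insert((0, String::new()));
        if timestamp >= entry.0 {
            *entry = (timestamp, value.to_string());
        }
    }

    /// Merging keeps the newest value per key, independent of sync order.
    fn merge(&mut self, other: &LwwMap) {
        for (key, (timestamp, value)) in &other.entries {
            self.set(key, value, *timestamp);
        }
    }
}

fn main() {
    let mut laptop = LwwMap::default();
    let mut phone = LwwMap::default();

    laptop.set("follow:TheAwesomeCat", "true", 10);
    phone.set("follow:TheAwesomeCat", "false", 12); // unfollowed later on the phone

    laptop.merge(&phone);
    assert_eq!(laptop.entries["follow:TheAwesomeCat"].1, "false");
}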

Network of Trust

We already talked about having public-private key crypto in there for device authentication. Having public-private key crypto could also serve to build a network of trust.

Users might want to make sure that profiles they follow are actually humans. A network of trust would be ideal for that. Each post (or each node on the uppermost layer) should be signed with a private key anyways. Users might want to sign other users' keys, so that they can build a network of trust.

A lot has been written about networks of trust already, so I won't lose any more words about them here. I think they are a viable tool that should be explored for this application.

Non-concerns

I've written a lot about what the system would look like, and some may think about certain problems that they can imagine such a system to have.

Here I want to address some of them.

Deleting content

First of all: deleting content is not of importance to me, although technically somewhat possible. To make that clear: If devices replicate content that is essentially content-addressed, there's no way to fully delete content from the network. That is the same as with putting a git repository open-source. If someone has a copy, there's just no way to remove it anymore.

It would be possible, though, to re-write the full profile chain and remove a single piece of content. As other instances may still have the “old” version of the profile DAG, they may still see the content, but devices that newly discover the profile may never see it.

Moderation

Moderation is “possible”. I've put that in quotes because it is and is not at the same time. So first of all, everyone can post everything and nobody can forbid them to.

But users can (and should) configure their instances regarding what to replicate and what not. Replies to posts can be replicated for other followers to see, or not be replicated to make discoverability of them harder. But nobody can prevent someone from posting a reply to a post to their own profile.

For spam and malice, block lists could potentially exist, and the “discover servers” could even offer them as a feature. I am not really a fan of that, but I still want to mention that it is possible.

What I think would be a more valuable alternative would be a network of trust, essentially resulting in some form of reputation system. If I sign another user's profile as “is trustworthy for posting content, not spam” for example, my followers would see that and could themselves trust the other user's profile better.

A spammer could still create hundreds of profiles and create a network of trusted accounts among them. I think there's no perfect solution to this problem anyways, but I believe that a network of trust is a really good step towards a solution.

Chat

(Real-Time-)Chat is not a concern I have here. I think Matrix is a solution to that problem, and I also think that Matrix will become fully distributed and Matrix servers will go away at some point.

Encrypted communication

Encrypted (one-to-one or one-to-many) communication could be possible in such a system, given that public-private key crypto would be part of the system anyways.

That said, though, it is not really a good fit, because messages in such communication would still be posted to the profile and be potentially replicated forever, meaning that if crypto breaks some time in the future, the communication wouldn't be private anymore.

Of course that's true for all online communication that's encrypted. I am not sure whether I would implement such a thing, but it is possible.

Technology

Now that I've written a lot about what the ideas for the social network application are, I also have to write a bit about the technology I have in mind for implementing that application.

As said, I've been thinking a long time (since 2017) about this, and I have refined my idea several times during that time. The core of the idea is still the same though: there should be no need for servers (the server software mentioned above is simply an optimization and not really necessary technically speaking), a user may post from several devices to the same profile and they should be able to post content without being connected to the internet.

IPFS

IPFS was the technology that sparked that idea in the first place. IPFS and especially IPLD were the pieces of technology that I thought could provide the functionality that would be necessary for implementing the application.

Also, IPFS/IPLD would be a “common denominator”, and linking to git repositories or other data that is served via IPFS (think of Wikipedia; IIRC there has been a project to put Wikipedia on IPFS) would easily be possible.

But IPFS never had a decent Rust implementation or client libraries. All libraries that were available in my research were too complicated to use or too bare-bones to be a possible solution to the problems I am facing.

libp2p

I more or less recently discovered libp2p and it seems to be very complex to use, but also powerful enough to solve all the problems. And, and that's really important, there's a decent Rust implementation that seems to be on-par feature-wise.

Developing the application on libp2p is certainly involved, and a good plan must be made before starting such a project. Also, I think I would need extensive help from others implementing the networking stuff, as I really do not comprehend where to even start!

Application UI

I am mainly thinking about implementing a GUI application here. I think there also should be CLI tooling, especially for power users, but it is not my main target here. The CLI tooling should not be for actually posting and “production use” but rather as a “plumbing toolbox” for inspecting profiles and instance state.

For GUI technology I am not really happy with the toolkits/frameworks I have used so far in the Rust world, except for Iced. I am not a fan of the react-style of framework and rather fancy elm-style frameworks (hence Iced).

I also thought about trying Qt for that application, but given that it is mostly C++ only, and I really don't want to use C++ ever, it is not really an option either.

One framework that I did not yet have the pleasure of trying out is Slint. From what I read online and on its website, it seems to be a really valuable alternative and a good framework. I will read more about it and try it out.

Final thoughts

I know this might be a major undertaking... I am not sure I am able to do all the things listed here – especially because my motivation drops as soon as I fail to understand docs or see ways to implement certain things.

Still, I think this whole idea is worth exploring and implementing.


Feel free to contact me via mastodon or mail about this topic, of course!

On the Unix Friends and User Campus Kamp 2023 I held a talk called “101 Kommandozeilen-Werkzeuge” (“101 Commandline Tools”), where I showed the audience a number (actually more than 101) of commandline tools for Linux that are either cool, helpful or in any other way nice to know of. For the Linux-Day in Tübingen, I added some more tools to that presentation just this morning.

The range of tools is rather huge, from CLI utilities for processing structured data to full-fledged TUI applications for finance tracking, everything is included.

Did I say everything?

Well, not completely everything!

There's already five dozen!

There's probably five dozen different TUI applications for system monitoring. The most prominent is probably htop, which is a huge improvement over top (YMMV, just my opinion here). Then, there's also bottom, tiptop, glances, bashtop, bpytop, btop, gtop and probably another dozen more of these. All of these bring something new to the table. Configurable layout, better graphics,... whatever.

The next obvious thing is terminal file managers. When I started learning about Linux and especially the commandline, there was “Midnight Commander”. Later, there was also ranger, which is (again IMHO) a great improvement over “Midnight Commander”. Today, there is fff, nnn, vifm, fm, lf, felix, joshuto and probably another dozen of the same. Guess what? They all look the same, do the same, and have approximately the same feature set – although of course I would think that ranger still has the most features.

Another thing is text editors. I know this is a notoriously dangerous topic among Linux folks and commandline enthusiasts. I am myself guilty of evangelizing neovim at every opportunity, and I do it mostly for the lolz, but not only. I honestly think it is the perfect text editor and I cannot for the life of me understand how anyone would ever want to work with these abominations called “IDE”s – or even a browser just for text editing.

But that's another topic.

In the beginning there was vi (yes, Linux greybeards, nowadays we can ignore that there was in fact a time before vi). Then there was vim, which was an improvement (pun intended). For a rather long time now there is neovim (“long” in terms of computer-science time calculation). Then there's also emacs. There's nano and there's probably five more that I do not know of that are similarly old and in use on some machines – I heard that Torvalds himself uses a patched version of microemacs or something like that.

I don't mind all of the above with regards to text editors. But today, there's also amp, helix, kakoune, micro, slap, turbo, zee, pepper, lapce, zed, glyph, hired, kibi, xi, ox – and I do not even include atom, vscode and all the other “big” “gui” editors here, let alone the fifteen different neovim GUI frontends. There's even a wikipedia list about text editors that is remarkably long – but doesn't even list the ones I listed above.

Seriously, almost none of these bring anything new to the table with regards to text editing. And what I explicitly do not want to say here is “just reach for (neo)vim”! No, I don't want to say this. I see that some of these improve the status quo of what older editors bring to the table – for example (IIRC) xi was the one that brought rope data structures and improved performance of text editing under the hood. But that's not what I mean – what I mean is the five dozen text editors that are just a clone of one another without any improvement to what should be their job: text editing.

Instead, make something new!

Now I have been bitching about people investing their (probably free) time in something that they love doing. Yep, I am the asshole now here.

I don't really mind that – me expressing my thoughts about people wasting their time here. What I want to express, though, is that I am really sad about so much engineering energy and creativity being lost on something that does not bring us forward as developers and Linux commandline enthusiasts in terms of tooling.

There are so many great tools out there that are absolutely amazing and bring new concepts for doing things – one of my favorite examples is fselect, which one can use to query the filesystem with SQL queries. You may think that this is a shit way of interacting with your files – but that's not the point I want to make! This tool is something that combines two things in a creative way that maybe someone has thought of before, but nobody implemented it. So it brings something entirely new to the table. And that's what I want to see out there. Not the seventieth iteration of a TUI file manager!

I am happy to announce that I am officially a (part-time/side-job) #freelancer for #rust #rustlang and #nixos #nix now!

You noticed the recent spike of Rust in your timelines? That's because Rust is the best thing since sliced bread! But Rust is not easy to learn!

That's where I come in! I am the guy you want for the time after the training of your developers! Because no training makes you a rockstar #rustlang programmer!

#hireme if you want to try out Rust in one of your projects to help your developers succeed! Whether it is #codereview, #consulting for your Rust experiments or for actual #softwaredevelopment, I can help your team get up to speed with #rustlang!

You also heard about #nix or #nixos and want to try it out? Let me help you explore the ecosystem of functional package management to speed up your CI and development workflow and make your deployments reproducible!


For the readers of my blog: Above is clearly an advertisement. I won't post more ads on my blog, so be assured that this blog won't transform into an advertising machine!

I asked people whether I should post about my experimental #langdev ... And they asked me to do it, so I'll do it.

Disclaimer: First and foremost I am doing this for FUN!!! There's no “I have a groundbreaking idea” thing here. I want to write my own programming language because I like programming, not because I can make “the next big thing” or anything like that. Also, I don't actually want to reinvent stuff. Using libraries for things like a borrow checker is in my opinion better than writing one myself. This is a hobby and nothing more. The end result of this whole thing will never be void even if I cannot get a working MVP out of my efforts, because the whole purpose of this is having fun.

Another disclaimer: in this article, I will say a few things that might not be 100% technically correct (or even be complete BS), especially with regards to the Haskell language. I am making no claim for technical correctness. If you stumble upon something that is not technically correct, just remind yourself of the context of this article. And especially of the disclaimer above.

Link to the repository: github.com/vunk-lang/vunk-lang.

Where I come from

So where am I coming from? For those who don't know, I started writing #Rust in 2015 and never looked back. Rust is the perfect programming language right now, in my opinion. Its semantics, especially the borrow checker, but also how it makes you think, just clicked with me and I feel extremely productive when writing Rust code. Of course, what Rust brought to the table is nothing new in terms of programming language research. But how it brought it to the table definitely is. Because the language is only one bit, its ecosystem and community are as important or even more important.

And because I want to continue programming Rust and was in need of something to do, I wanted to start another hobby project.

The next big point in the “why” question is that I am highly “functional curious”. By that I mean that I am really interested in the functional programming paradigm, though I really have to underline that I am not interested from the mathematical side of things, not at all. In fact, I have little knowledge of all these mathematical things... I could probably not even explain lambda calculus to someone (don't tell the Professors where I studied, although I am not sure they could either).

I am more interested in functional programming from the point of programmer experience. One thing I really learned, or rather engraved into my heart, while getting more and more experience with Rust is that mutable access on memory is bad. Not inherently bad, but bad nonetheless. And yes, that former sentence does not miss a “multiple” before the “mutable”, I really mean that modifying memory in place is bad, although of course it is absolutely necessary. Rust abstracts that danger away pretty nicely via its borrowing mechanisms. Either you can access that piece of memory mutably from one piece of code, or immutably from multiple. If you need mutable access from multiple locations, you must use synchronization primitives (modulo unsafe, but let's keep that out of our heads for now).
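To make that concrete (nothing new for Rust programmers, just illustrating the rule I mean):

fn main() {
    let mut numbers = vec![1, 2, 3];

    let a = &numbers;
    let b = &numbers; // any number of shared (read-only) borrows is fine
    println!("{a:?} {b:?}");

    let m = &mut numbers; // but only one mutable borrow at a time,
    m.push(4);            // and no shared borrows may be live alongside it
    // println!("{a:?}"); // uncommenting this line would not compile
}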

When I talk about functional programming languages, I mean Haskell, because that's the only functional language I kind of learned (or rather tried to learn – I think I understood the bits, yes, also Functors, Applicatives, Monoids and Monads, but never actually used it for something meaningful, which I think is required if you want to say you “learned” a language). From my experience, the mutable access problem fades in (pure) functional languages. Not because it is somehow solved in these languages, but rather because it is of less concern. In Haskell there's no mutation in place (IIRC, remember disclaimer 2), but the language makes you think that things get copied all the time, and the runtime optimizes everything nicely so that you don't have to care too much.

And if we think that thought a bit further, we see that the “mutable access” problem I've been talking about above is nothing more than side effects. Side effects are bad. Rust makes them explicit, which helps a lot, of course. Haskell makes them also explicit, although completely different than Rust (from a programmers perspective).

These are the technical points where I am coming from. The next point is something else entirely: motivation. I have problems with motivation. And by that I do not mean problems as in “someone asked me to do something and now I am just slacking”, no not at all. If I feel like I have a responsibility to do something, I will do it and I will do it to the best of my abilities! If I have an appointment, I will be there ten minutes early. If I have to do the dishes, I will do them and although I rather use new unused ones, I won't re-use already used ones, if you get what I mean. If I get a task on my day job, I will of course do it in reasonable time and not slack off. No “my code is compiling” XKCD on my watch!

By problems with motivation I rather mean things like “I should try implementing a tool for X”, and then my brain does the thinking part, thinks it all through and models all necessary abstractions... But then my body cannot get off the couch to actually write that code down. Another instance is reading blog articles. If I find something in my RSS reader that I am really interested in judging from the heading and catchline, I often fail to read the actual article. I think these examples stem from the same kind of problems with motivation, at least they feel equal to me.

And I actually do not mean that I am too lazy to get my brain into thinking mode, or in case of programming, search for all necessary dependencies, set up the CI or stuff like that – I actually enjoy these bits of work quite a lot – no, I really mean writing the code that implements my idea (and you can actually see that by looking at my GitHub profile).

I also have to note that I think a lot of people have problems like the one I just failed to describe decently (IMO).

But when I started thinking about implementing a compiler for a language I made up, I had this strange tingling in the back of my head that kept coming back. And so far it hasn't gone away, which I also think I must give attribution for to the nice ecosystem of Rust. Previous attempts at programming stuff often times failed because I failed to find decent libraries for implementing my idea!

So, I have to take advantage of that motivation right now and just get that code written, right?

Where I want to go

With all the above in mind, I thought: Why can't we have a language like Rust, with all the bits to do low level programming (and by that I mean interfacing with C libraries easily, not writing a Microcontroller OS), with the borrow semantics, control over references and lifetimes and such, but as a purely functional language? And that's essentially where I want to go.

My idea in one quote would be “What if Rust was purely functional?”

I started writing some example code files how I wanted the language to look (and you can find them in the repository) and then started to look for how I can get them into a compiled form that can be executed.

I first thought of implementing a VM or bytecode interpreter for that, but soon got the idea that LLVM is there and should be used, especially because I want to be able to interface with a C ABI. Now that I heard of HVM, I am also curious whether we cannot have two backends, one compiling to a binary using LLVM and another compiling to HVM bytecode. Not sure whether this would be possible at all, but still an interesting idea.

Because the language should be low level, the programmer should be able to distinguish between pass-by-value and pass-by-reference (just like in Rust of course). There should also be the possibility to do pointer arithmetic, although the same as with Rust should apply: you need unsafe for that.

I am not yet sure whether I want to have a macro system like with Rust. I did not think too much about it yet, but there might be the possibility to cover the needs that are fulfilled with macros in Rust with higher order functions. Maybe.

Another idea that I have is that function composition and currying/partial function application should be possible.

The last bit I did not yet think about at all is whether Monads should be the way to abstract side effects or Effects. I learned about effects just a few days or weeks ago and did not yet fully understand the implications. Although I am not sure whether I understood the implications of Monads either. Again, I do not have a mathematical background!

I think I do not have to mention that good (Rust-style or Elm-style) error messages and such are also a goal, do I?

Current state

As of today, I have a working prototype of a lexer/tokenizer and the first bits of a parser implementation. The parser is still missing bits for defining types and enums and the implementation for defining functions is also not ready yet. But we'll get there.

The syntax does not yet have the concept of unsafe, which I will need at some point, because interaction with C code is of course unsafe.

Let me give you an example of code now, but be warned: This could be outdated by tomorrow!!!

use Std.Ops.Add;

answer: I8;
answer = 41;

add A T: (A, A) -> A
    where A: Add T;
add = (a: A, b: A) -> a + b;

addOne A: (A) -> A
    where A: Add I8;
addOne = add 1

main = Std.println $ addOne answer

Let's go though that example line by line.

In the first line, we import something from the “Ops” module from standard library “Std”. What we import is “Add”. It is written uppercase because it is either a type, an enum or a Trait. Modules are also written uppercase and they map directly to directories on the filesystem (also written uppercase). Modules have to be declared before they can be used (pub mod Helpers; mod PrivateHelpers;, not in the example). Functions are always lowercase, although I am not yet sure whether I want to go with camelCase, PascalCase or snake_case.

Next in the example, we declare something called answer to be of type I8, an 8 bit integer. After that we define it. The declaration could be omitted though, as the type is clear here and can be inferred by the compiler.

Next, we declare and then define a function add. That function is generic over A and T, where A has to implement the Add trait we imported earlier over some unbounded T. add is then defined to be a function with two parameters of that type A which also returns an instance of that type. Its implementation should now be self-describing.

Note that we have to write down the function declaration because the function is generic. If the function is not generic or the types can be figured out by the compiler, we can omit the declaration. That would be true for add = (a: I8) -> a + 1, and possibly also for the above implementation of addOne, although I am not yet 100% sure about this one.

addOne from above is now declared and defined using partial function application. The compiler should be smart enough to figure out that the expression add 1 results in a function like this:

add' A T: (A) -> A
    where A: Add T;
add' = (b: A) -> 1 + b;

Lastly, we define the main function to use the println function from Std (without prior importing) to print the result of applying addOne to answer.

Note that the $ character here that you might know from Haskell is only syntax sugar for parentheses, nothing more.

For types, I am even thinking about having impl blocks like in Rust, and them being syntax sugar for free standing functions:

type Person =
    { name: Str
    }

impl Person = 
    { getName = (&self) -> &self.name;
    }

# above function is equal to
getName = (person: &Person) -> &person.name;

Lifetimes in above example are inferred of course.

Trait notation and implementation of traits on types would work the same way syntax-wise.

If you're curious about the bits that are not in these examples, you can always browse the code examples in the repository. Examples are actually run through the lexer and parser, so they have to be up to date.

Closing thoughts

This is an experiment. An experiment in what I am able to do, but also an experiment in how to keep me motivated. I hope it works out. But I don't know whether it will or whether I lose interest tomorrow. Let's hope I don't.

And just to note, because it might come up: the question of “Why not {Roc, Elm, Elixir, SomeOtherLesserKnownFPLanguageFromGithub}” is answered in the “Where I come from” section, if you read close enough!

If you want to contribute, which I would like to see, please keep in mind that I am learning things here. I am not a functional programming language expert, I have no mathematical background. If you contribute, an explanation of what you're trying to accomplish and how is required, because I won't learn things otherwise. I value programmer experience and simplicity more than mathematical elegance or such. So be prepared to explain linear types, effects, HKT, etc to me if you want to contribute code for these things!

That said, you're welcome to send patches! Just ping me on mastodon or write a short issue on GitHub about your idea and then start hacking. I am normally fast in responding, so if you just open an issue like “Hey I want to add infix functions” (which we do not have right now), that'd be great for me to know so I can give you feedback on your idea fast! Although I am not sure whether I want to have infix functions.

Comments on all bits in this article are warmly welcome! writefreely has no option for responding or even getting notified of replies, so please post comments with @musicmatze@social.linux.pizza mentioned if you want to comment via mastodon. Email for a more private conversation is fine as well, of course!

As some people know, I am the maintainer of the config-rs crate. Right now, I am in the process of re-thinking the implementation and features of config-rs and am writing a “new generation” experiment, very creatively named config-rs-ng.

The config-rs-ng project is, as said, an experiment right now. If this experiment succeeds, it will replace the implementation of the config-rs crate.

Disclaimer: Don't use config-rs-ng in production code. Read the “Timeline” section below to understand why.

The Why

You might wonder why I am doing this. There are several reasons, but the main one is the following: The config-rs implementation does what I call eager merging of the configuration sources. To understand what that means, one must understand that the config-rs crate has the feature of “layering” configuration sources. A user of the crate can read different sources for their configuration and merge them so that former sources are shadowed by later sources. The implementation of the config-rs crate, upon instantiation of the configuration object, merges all config sources. That means the actual information about where a specific value comes from is lost.

And that's an issue for several things: First of all, knowing where a configuration value comes from might be handy. If a user of the library wants to tell their users that a configuration value must be changed, they cannot tell the user where that configuration value must be changed. Also, re-loading parts of the configuration is not easily possible. Reloading the configuration would always mean that the whole set of configuration sources must be reloaded, to construct the full configuration object.

With the new implementation, the configuration sources are known, as the “merged object” is actually never constructed within the implementation of the crate. A user might opt into constructing a merged object if they wish, but they don't have to. The new implementation does not store one single configuration object anymore, but a list of sources that themselves store trees of their values.

This new approach might not be as memory efficient as the former approach, but we get several very nice benefits from that.

First of all, telling a user of an application that some value in the configuration must be changed can now happen with the information where that value is. Also, reloading parts of the configuration is now way less painful.
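To illustrate the difference (this is not the actual config-rs or config-rs-ng API, just a rough sketch of the idea): with eager merging, only the merged map survives; with layered sources, a lookup can always report which source a value came from.

use std::collections::HashMap;

/// Eager merging: all sources are merged into one tree at construction time,
/// and the origin of each value is lost afterwards.
struct MergedConfig {
    values: HashMap<String, String>,
}

/// Layered sources: the sources themselves are kept around; lookups walk the
/// layers from last to first, so the origin of each value is known.
struct LayeredConfig {
    layers: Vec<(String, HashMap<String, String>)>, // (source name, values)
}

impl LayeredConfig {
    /// Returns the value for `key` together with the name of the source it came from.
    fn get(&self, key: &str) -> Option<(&str, &str)> {
        self.layers
            .iter()
            .rev()
            .find_map(|(source, values)| values.get(key).map(|v| (source.as_str(), v.as_str())))
    }
}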

Current Featureset

The current feature set of the “ng efforts” is nowhere near what config-rs offers right now. Only TOML and JSON are implemented as format backends, very minimal async support is implemented, the API is still a bit noisy, and the derive macro only supports structs, and only structs with named fields.

Still, this could be the first moment I would call the codebase something like a “minimal viable product”. More or less, at least.

How you can participate

What I would like to ask the Rust community is: Please have a look. Tell me, preferably via issues, but also via mail or other channels:

  • what you think is missing (at least for a MVP)
  • what could be improved and how
  • do you think that the current implementation is a good way to go
  • what would you do differently
  • what you don't understand

Of course, pull requests are even more appreciated than issues ;–)

The obvious things I know, which you don't have to tell me:

  • More backends (TOML and JSON are nice, but we might also want YAML, KDL, INI,... we also should think hard about environment-variable support, although I am a bit hesitant because in config-rs that is a feature that is half-working and half-broken and I think is also rather hard to get right)
  • Feature parity with config-rs in terms of deserializing to custom types and other API functionality
  • Documentation
  • Writing configuration. This is a rather hard topic, because we actually don't implement configuration formats, and writing configuration back must be format-preserving and all. I have that topic on my radar, but currently this is no priority for me (patches still welcome, of course)

If you want to dig a bit deeper, you are also welcome to read the vision document. If you can come up with ideas how to test the requirements listed there, that'd be highly appreciated as well!

Of course, patches for functionality and especially for feature-parity with config-rs are very welcome as well!

Timeline

The timeline for the config-rs-ng crate is: There is none. If and when the crate will become something that I feel comfortable with marking as “ready” (whatever that means) to replace the current config-rs implementation, is not decided at all. It is ready when it is ready, so to speak.

Also, I do not plan to release the config-rs-ng project as a dedicated crate on crates.io, because I don't want to mark this project as “official”/“official replacement of config-rs”! It is an experiment, and if I think it is on a good way towards feature-parity with config-rs and I feel comfortable with it (and the feedback is not entirely negative), I will think about replacing the config-rs implementation with it.

Last weekend I attended the RustNation23 conference in London. We were asked by the organizers to write blog posts about our experience. So here we go.

Traveling to London

I flew in from Stuttgart, Germany on Thursday. The flight was mid-day, which was really nice because I wasn't too tired on the flight, but also didn't arrive too late in London, so hanging out with a colleague who also attended and visiting one of the famous London pubs was possible.

The Conference

The conference took place at “The Brewery”, which was the perfect location for such an event! The really nice and tasty snacks before, between and after the talks kept me awake but not too stuffed! And that was dearly needed, because the talks were of exceptional quality! The keynotes were absolutely awesome! Seeing Jon Gjengset live and in action was a particularly great experience, but also the other speakers, who, by the way, came from all around the world, were absolutely awesome!

The staff at the location took great care of everyone! There was always enough water available and the drinks after the conference (the “socializing” part) were also really good.

Another day in London

My company allowed me (or rather us) to stay an extra day in London, which we really enjoyed. We did a long walking trip through the city and visited Buckingham Palace, Victoria Station, Westminster and Big Ben, Leicester Square, Soho, Piccadilly Circus, Canary Wharf, Jubilee Park, Kings Cross Station, Camden Town, The Regent's Park and the Sherlock Holmes Museum (although we didn't enter). That was almost 30 km of walking through London in just one day.

Flying Home

On Sunday, I flew back to Stuttgart and finally fell back into my own bed again. It was dearly needed, as my feet were wrecked, my brain was overloaded and I was tired from just being in London. A lot to process, really!

Conclusion

London was great. I am really grateful that my company let me go there, experience RustNation23 and gave me an extra day to visit London. I cannot wait for the next Rust conference (which will probably be EuroRust 23 for me). If the Speakers are only half as awesome as the ones at RustNation, it will be worth it a hundred times!

I've been writing cargo-changelog lately and already published the first version (0.1.0) on crates.io.

Here I want to write down some thoughts on why I wrote this tool and what assumptions it makes. This should of course not serve as documentation of the tool, but simply as a collection of thoughts that I can refer to.

Where

Changelog management is hard. Not because it is particularly difficult to do, but because nobody really wants to do it in the first place. Especially because there's no established “place” where it should be done.

Some tools want the programmer to write commits which can serve as changelogs. I wrote about that before. It puts a burden on the programmer, who does not want to concern themselves with whether a change is user-facing or whether it impacts the user at all. That's not their job, after all! Not in an open source setting and especially not in a commercial environment. They're hired for working on the software and that's all they should do!

In an open-source world, the programmer of a feature may even contribute changelog entries, because they know that the change will have a certain impact on users when released. But the keyword in the prior sentence is “may”. They are not required to do so and should never be. Opensource projects suffer from having too few contributors. Of course, there are big open source projects out there, like kubernetes, tokio, django, Rust, TensorFlow or, of course, the Linux kernel. These projects do not have that issue, but I feel comfortable in assuming that these are the top 1%. Most opensource projects have one or two contributors or, if lucky, see maybe ten to fifteen regular contributors. If such a project loses only one contributor, that has significant impact on the overall project. Thus, making contributors happy is somewhat of a key concern. Making them responsible for adding changelog entries to their changes may not be the best way of making them happy.

Thus, I think, changelogs should be managed by the maintainer or someone in the project that wants to dedicate themselves to that task. The contributors should only do what they do best: Produce code and deliver features, fixing bugs, etc.

Under that presumption, putting changelogs within a commit is not a particularly good idea. It does not matter whether we're talking about commit formats like conventional commits here or about git-trailers for categorizing commits. After all, if a contributor categorizes the commit in the wrong way, they would need to rewrite the commit, even though the code they changed may be optimal. That's a serious hassle.

That leaves us only with producing the changelog entry outside of the actual commits that introduce the change.

The idea may then be to add the changelog entry in a dedicated commit, but still within the pull request that introduces the relevant change. That sounds good at first, but quickly falls apart because of a simple issue: Merging this may not be possible. The changelog entry that lands in a CHANGELOG.md file normally gets appended in some form or another. Whether that is a simple append to the section for the upcoming version of the software, or to a sub-section “Bugfixes”/“Features”/... does not matter, it is still an append. If someone else produced a change to that same section, we quickly run into merge conflicts. Needing a pull request to be rebased just because the changelog entry does not merge is a serious slow-down in progress for the whole project. That should never happen!

After establishing the last point, we see that producing the changelog outside of the commits that introduce a change, as well as outside of the pull request that introduces the change, has a number of benefits for the overall pace of the project. Having someone dedicated to producing the changelog, instead of burdening the programmers with it, also benefits the whole project, not only in pace but also in developer happiness.

The above points do not mean that a programmer who feels dedicated shouldn't be able to produce a changelog for their contribution! Of course they should be enabled to produce that changelog! But they should not have to concern themselves with mergeability!

Also, producing changelogs should not slow down the project pace. After all, adding changelogs to a project is still a contribution. It should be as easy as producing code. It should not suffer from merge conflicts if two or more contributors add a changelog for different changes.

How

With all that in mind, I came up with a simple scheme. It turns out that other projects exist that follow a similar scheme – so I cannot take any credit for that. I still opted to start cargo-changelog because these already existing tools do, of course, not integrate with cargo, as they were written for other ecosystems.

So the general idea here is that we do not produce one large CHANGELOG.md file, but record changes in individual files, called “fragments”. These fragments get put into a special place in the repository: .changelogs/unreleased/. The filename of each fragment is produced simply from a timestamp. That ensures that adding two fragments from two different pull requests will almost certainly not result in a merge conflict.

A fragment contains two parts: a header with structured data, and free-form text. The structured data is encoded in YAML or TOML (although normally such tools opt for YAML, and cargo-changelog does so as well).

I thought long and hard about what structured data should be recorded here. It turned out: I don't know, and of course I shouldn't decide this. So what I did was implement a scheme where the user can define what structured data they want to record! Each project can, in the .changelog.toml file, which serves as the configuration file for cargo-changelog, define what structured data they want to record, whether a data entry is optional, and whether it has a default value. When generating a new fragment, cargo-changelog can either present the user with an interactive questionnaire to fill in that data, or open the user's $EDITOR where they can edit that structured-data header themselves (or both).

Structured data may be the pull-request number that introduced the particular change, a classification of that change (“Bugfix”/“Feature”/“Misc”/... whatever the project defines in the .changelog.toml configuration file) or, if desired, a “Short description” of the change.

The free-form text of the fragment can be used to document the change in a human-readable way. Currently, no format is enforced here, so whether one uses Markdown, reStructuredText, or something totally different is entirely up to the user (although cargo-changelog generates .md files for the fragments).
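To make this a bit more tangible, here is what a fragment could look like. Everything about it is purely illustrative: the exact filename format, the header delimiter, the field names and their values are all assumptions of mine here, since which fields exist depends entirely on what the project configured in .changelog.toml – and normally cargo-changelog generates this file for you anyway. I'm only writing it by hand to show the shape of the thing:

# purely illustrative: write a fragment by hand to show its shape
# (normally cargo-changelog generates this file; field names are made up)
cat > .changelogs/unreleased/$(date +%s).md <<'EOF'
---
type: "Bugfix"
pull_request: 1234
subject: "Fix off-by-one error in pagination"
---

The pagination logic skipped the last entry of every page. This is
fixed now; no action is required from users.
EOF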

When a release comes up

As soon as the software is about to be released, the “unreleased” fragments should be consolidated. cargo-changelog helps with that by providing a command that moves all fragments from .changelogs/unreleased/* to .changelogs/x.y.z/ (where x.y.z is of course the next release version, determined either by asking cargo or by letting the user specify it).
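In effect, that consolidation is nothing more than a file move inside the repository. Leaving aside the exact cargo-changelog invocation (which I won't spell out here – check the tool's help for that), what the command does for you boils down to something like this, with 0.2.0 as an example version:

# what the consolidation step boils down to, shown with plain git
# (0.2.0 is just an example version; cargo-changelog does this for you)
mkdir .changelogs/0.2.0
git mv .changelogs/unreleased/* .changelogs/0.2.0/
git commit -m "Consolidate changelog fragments for 0.2.0"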

One crucial idea here was that the release would be done on a dedicated release branch. Of course the tool does not enforce or demand this in any way, but it gives you the option of doing that without running into issues down the road.

So if the release branch gets branched off of the master branch, the person responsible for making the release would issue the cargo-changelog command for consolidating the unreleased fragments and then commit the moved files. After that, they would issue the cargo-changelog command for generating the CHANGELOG.md file. That file is always generated and never touched manually. There's no need to do so: changing changelog entries after the fact (for example if a typo was found) happens in the fragment files.

Of course, the CHANGELOG.md file should also appear on the master branch of the project! Cherry-picking the commits that consolidated the unreleased fragments, as well as the one that generated the changelog file, simply works, even if master progressed with new changelog fragments in the meantime!
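Put together, a release could look roughly like the following. I'm deliberately not spelling out the cargo-changelog subcommands (the commented steps), because the exact names are not the point here – the interesting part is which branch each step happens on:

# branch off for the release (again, 0.2.0 is just an example version)
git checkout -b release-0.2.0 master

# ... consolidate the unreleased fragments here and commit the move
#     (as sketched above) ...

# ... generate CHANGELOG.md with cargo-changelog here ...
git add CHANGELOG.md
git commit -m "Generate CHANGELOG.md for 0.2.0"

# back on master, cherry-pick both commits; this applies cleanly even
# if master gained new unreleased fragments in the meantime
git checkout master
git cherry-pick master..release-0.2.0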

Changelog generation

In the previous section I wrote that the CHANGELOG.md file is generated and never edited manually. Still, the user may want to add some custom text at the end of the changelog file, or use a custom ordering of their changes – maybe they want to list bugfixes first and features second? Or they want only the short description of each individual changelog fragment to be displayed, with the long-form text residing in a <details>-enclosed part, so that a reader gets a quick overview when the file is rendered!

That's why CHANGELOG.md files are generated from a template file. That template resides in .changelogs/template.md (that path, like everything else with cargo-changelog, can be configured). The template file uses Handlebars templating and can be tweaked as required. In the current version of cargo-changelog, some minimal helpers are installed with the templating engine to sort the released versions, group changes by “type”, and do some basic text handling. More will follow, of course.

Metadata crawling

Another feature that cargo-changelog has is metadata crawling. One may want to fill a header field by issuing some command and using that command's output as the field's value. cargo-changelog can call arbitrary commands for doing exactly that: each header field can have a “crawler” configured, which is the command to issue. These commands may even be interactive programs, like a script that uses skim (or its more popular counterpart fzf) for interacting with the user.
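To give an idea of what such a crawler could be (this is not something cargo-changelog ships, just an illustration of the kind of command one might configure): a tiny shell script that lets the user pick a change type via skim and prints the selection, which would then become the header field's value.

#!/bin/sh
# illustrative crawler script: ask the user to pick a change type
# via skim; the selected line is printed to stdout
printf '%s\n' "Bugfix" "Feature" "Misc" | sk --prompt "change type> "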

To sum up

To sum up, these are my thoughts and notes on changelog management with cargo-changelog. Of course, most of this is tailored towards open source projects (and – if someone noticed – also towards an always-green-master strategy. I may write a blog article about that as well).

cargo-changelog is at version 0.1.0 and certainly not feature-complete yet. It is a first rough implementation of my ideas and it seems to work great so far, although it is not battle-tested at all! I am eager to try it out in the near future and extend and improve it as needed. One can see the tool in action in the history of the repository of the tool itself!

And as always: contributions are welcome!

At the end of last year, I published the article “I hate conventional commits”. I received a lot of good feedback on this article and it was even mentioned in a podcast (German) – thanks a lot for that!

Lately, I have also grown a decent amount of hate for squash merges. I figured that I could also write an article on that, so I can link to it in discussions I have about the subject.

What are squash merges

When proposing changes to a software project that is hosted on a forge like GitHub, GitLab or Gitea, the author of that changeset opens a pull request (in GitLab it is named “merge request”, but I'll stick to the former here). That pull request is, when it is approved by the maintainer of the project, merged. This normally happens via a click on the “Merge” button in the web interface of the forge (although it does not have to).

GitHub offers different methods for merging pull requests. The “normal” way of merging a pull request is by creating a merge commit between the base branch (for example “master”) and the pull-request branch. This is equivalent to git merge <branch> on the command line.

Another method would be the so-called “rebase and merge” method, which rebases the pull-request branch onto the target branch and merges it after that. The rationale here is that if the pull request gets rebased before it gets merged, it is “up to date” with the target branch when it is merged. There are also two variants of that method, one where a merge commit is created after the rebase and one where the target branch is just fast-forwarded (git merge --ff-only) to the pull-request branch. I find these two methods problematic as well, but that's not what we're here for.
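For reference (and assuming a pull-request branch literally called pr-branch), the local equivalents of those two variants would look roughly like this:

git checkout pr-branch
git rebase master
git checkout master

# variant one: create a merge commit on top of the rebased branch
git merge --no-ff pr-branch

# variant two: just fast-forward the target branch instead
git merge --ff-only pr-branch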

The third method, and the one I want to talk about here, is the “squash merge”. When a pull request is “merged” this way by the maintainer of a project, all commits that are in the pull-request branch are put into a single commit and all commit messages are joined together. This commit is then applied directly to the target branch. The (approximate) git commands for doing this would be:

git checkout pr-branch
# collect all commit messages, oldest first, into one file
git log --reverse master..pr-branch --format="%s%n%b" > /tmp/message
git rebase master
# drop the individual commits, but keep their combined changes staged
git reset --soft master
git commit -a --file /tmp/message
git checkout master
git merge --ff-only pr-branch

Implications of squash merges

What I want to highlight here is what squash merging implies.

First of all, squash merging implies that the diff a pull-request branch introduces is put into a single commit. It does not matter whether the pull-request branch contained one commit or a hundred commits, the end result is always one commit with one diff and one message.

That is also the second thing a squash merge implies: there is only one message (even though it is crafted by simply joining multiple messages together) for the whole diff the pull request introduced.

Signatures created with GPG or some other method are destroyed in that process.

Why I hate this

You can probably already smell why I loathe this. By squashing together the individual changes a pull request introduced, one loses so much information! Consider a pull request that took 10 commits to refactor something. Carefully crafted commit messages explaining why things were changed the way they were changed. A very detailed analysis in one commit message of why a certain change is needed to further refactor a piece of code somewhere else in the next commit. Maybe even performance characteristics written down in the commit message!

All this is basically lost as soon as the pull request is squashed. The end result is a huge diff with a huge message, where the individual parts of the commit message could potentially be associated with the right parts of the diff. Could be. But the effort to take apart the huge commit is just lost time, and maybe a huge undertaking, that would be completely unnecessary if the changes had not been introduced to the “master” branch via a squash merge in the first place.

One might argue that the commits are still there, in the web interface of the forge. Yes, they might be. But git is an offline tool; I should be able to see these things without having to use a browser. I should be able to tell my editor “give me the commit message for this line here, because I want to see why it is written the way it is” and my editor should then give me that information. If it opens an enormous squashed commit, I'll just rage-quit! Because now I have to review a commit that might contain thousands of lines of changes, and search through its message for why that one line I care about was changed.
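To illustrate the “offline” point: even without any editor integration, plain git can answer “why does this line look the way it does” – provided the history consists of reasonably small commits with real messages. The file name and line numbers below are of course arbitrary.

# show the commits (and their full messages) that touched lines 40-60
git log -L 40,60:src/lib.rs

# or: find the commit that last touched line 42, then read it with git show
git blame -L 42,42 src/lib.rs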

I really am hesitant to link an example here. Mostly because blaming someone who doesn't know better does not yield anything valuable and is just destructive. But let me assure you: I've seen projects that do this and it is just ridiculous! If you come across a change that touched 2000 lines of code and has a commit message that is 500 lines of “Change”, “Fix things” and “refactor code”, you might as well go back to the old SVN days where we had things like “Check-In #1234 from 2022-03-04”. We can do better than that!

How to do better

So, you might think that the above is all valid and sane. But now you want to know how things could be improved. And, to be honest, it is totally trivial!

First of all, let me briefly talk about responsibilities. I feel like the idea of squashing all changes in a pull request comes from the “I have to clean things up before I merge” attitude of maintainers. The idea here being that they take the pull request and squash it, so that things are “clean” on the master branch. But that premise is totally wrong. The maintainer of a project (especially in open source, but in my opinion also in “not open source”) is never responsible for cleaning up a contributor's work. After all, it is a pull request. The contributor asks the maintainer to take changes. The contributor is the person that wants something to be changed in the project. Therefore it is the duty of the contributor to bring the changes into a form where the maintainer accepts them. And that obviously includes a clean commit history!

I reckon, though, that some contributors just do not care about committing their changes cleanly and with decent commit messages. In my opinion, a maintainer should just not take such patches – I certainly have rejected patches because of a badly written commit history. There's always the option for the maintainer to take the patches to a new branch and rewrite the commit messages; for example, I once did this with nice changes that were just committed badly. It is, though, not the responsibility of the maintainer to do this.

Another option, which I quite like, is that a project introduces commit linting (but obviously not conventional commits). Commit linting can be used (for example by implementing a CI job with gitlint) to ensure that commit messages have signed-off-by lines, do not contain swearwords, have a decent length, and more. It is a nice and easy way of automating this and working towards decent commits.
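As a rough sketch of how that could look with gitlint – treat the rule names and options as an illustration and check gitlint's documentation for the real list – a small .gitlint file in the repository plus one command in CI is already enough:

# illustrative .gitlint configuration (rule names/options: see gitlint docs)
cat > .gitlint <<'EOF'
[title-max-length]
line-length=72

[title-must-not-contain-word]
words=wip,fixup,squash
EOF

# in CI: lint every commit the pull-request branch adds on top of master
gitlint --commits origin/master..HEAD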

All of this helps with improving the commit messages and therefore the change history of pull requests. But of course, squash merging must still be disabled/forbidden!

In my opinion, reviewing commit messages should be part of every normal code review. The GitHub web interface does not particularly support that, because one has to click through several pages until the actual commit is shown. That's why I like to fetch the pull requests from GitHub (git fetch REMOTE pull/PR_NUMBER/head) and review them commit-by-commit on my local machine (git log $(git merge-base master FETCH_HEAD)..FETCH_HEAD).
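Spelled out, that local review flow looks roughly like this (REMOTE and PR_NUMBER are placeholders, just like in the text above):

# fetch the pull-request head; it ends up in FETCH_HEAD, no local branch needed
git fetch REMOTE pull/PR_NUMBER/head

# show only the commits the pull request adds, one by one, including patches
git log -p $(git merge-base master FETCH_HEAD)..FETCH_HEAD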

To sum up...

To sum up, don't enable squash merging in your repository configuration! Disable it, in fact! It hurts your project more than it provides value (because it doesn't provide any value)! It is a disrespectful and destructive operation that minimizes the value your project receives via pull requests.

I, for one, will stop contributing to projects if they squash merge.