musicmatzes blog

git

Recently, I voiced my discomfort... no, let's face it: my anger with people who cannot obey the simplest commit message rules:

Why can't people obey these simple #commit message rules?

  • Use an uppercase letter for the start of your subject line
  • EXPLAIN what you did, not “Fixes”

#git

(toot)

This really bothers me. I (co-)maintain a few crates in the #rust ecosystem. There are contributions rolling in every other week and I love that, because it makes me happy to see that other people care about the same things that I care about. Still, I am constantly asking people to rewrite commit messages or clean up their branches because they did strange things – for example merge the master branch instead of rebasing their pull-request to fix merge conflicts. And sometimes even change things in this merge commit, making a review utterly impossible!

Most of the time I do not bother if people just don't capitalize the first letter of their commit message, but it bothers me to no end, still. That's why I teach others to write proper commit messages when I teach them how to use #git, and I really try to be a pain in their ... youknowwhat, so they are annoyed by me telling them “No, rephrase that!” all the time!

I am not angry if people fail to use git trailers the right way (and yes, these are kernel commit conventions. Does not mean they cannot be applied to other workflows as well)! These rules are, of course, not carved into stone. Still, it is a matter of good behaviour in the community to give attribution to people involved in the process of applying the patch (using “Signed-off-by”, “Acked-by” or “Reviewed-by”), crafting the patch (using “Co-authored-by”, “Suggested-by”, “Signed-off-by”) or others (“CC”, “Reported-by”, ...).
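To make the trailer idea concrete, here is a minimal Python sketch that pulls trailers out of a commit message. It is a simplification I am assuming for illustration (git's own `git interpret-trailers --parse` is more lenient about what counts as a trailer block), and the names and addresses in the usage example are made up.

```python
def parse_trailers(message: str) -> list[tuple[str, str]]:
    """Return (token, value) pairs from the last paragraph of a commit message.

    Simplified rule (an assumption of this sketch): the last paragraph only
    counts as a trailer block if every line looks like "Token: value".
    """
    paragraphs = message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return []  # subject only, so there is no room for trailers
    trailers = []
    for line in paragraphs[-1].splitlines():
        token, separator, value = line.partition(": ")
        if not separator or " " in token:
            return []  # the last paragraph is prose, not a trailer block
        trailers.append((token, value.strip()))
    return trailers


example = (
    "Fix crash on empty input\n"
    "\n"
    "The parser assumed that there is at least one line.\n"
    "\n"
    "Reported-by: Jane Doe <jane@example.org>\n"
    "Signed-off-by: John Roe <john@example.org>\n"
)
for token, value in parse_trailers(example):
    print(token, "->", value)
```

A message laid out like `example` gives attribution to the reporter and records the sign-off without cluttering the explanation in the body.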

I hope I don't have to repeat that commit messages like “Fixes” or “Refactor” are bullshit!

How to NOT do better

There are projects out there that try to make you a better committer. The best known is conventional commits.

I don't like these things at all. “Why?” you may ask. The answer to that is really simple: It makes you think less about what you've done and, probably the worst thing, it gives you the ability to auto-generate a changelog from your commit logs. But commit logs are not changelogs. Commit logs are logs of the steps in which your software was developed. A changelog is a list of things your users need to know about when upgrading from one version to another. They don't need to know the steps that were taken to provide new features, fix bugs or refactor your codebase, they need to know about what changes for them, how using the product has changed!

Luckily I have managed to stay away from projects using conventional commits.

How to do better

There are tons and tons of guides out there on how to write proper git commit messages. I leave searching for them as a task for the reader here (one thing I want to link here is Drew DeVault's article on a disciplined git workflow). The very basics are:

  • The subject line must not exceed 50 characters
  • The subject line should be capitalized and must not end in a period (really, who on earth would end it with a period? I mean... do you end your email subjects with a period?)
  • The subject line must be written in imperative mood (“Fix”, not “Fixed” / “Fixes” etc.)
  • The body copy must be wrapped at 72 columns
  • The body copy must only contain explanations as to what and why, never how. The latter belongs in documentation and implementation.
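The mechanical parts of these rules can be checked automatically. The following Python sketch is only an illustration under my own assumptions: the function name is made up, and the blank line between subject and body is a git convention that is not in the list above. Imperative mood and "what and why, never how" cannot be checked this way and stay the committer's job.

```python
def check_commit_message(message: str) -> list[str]:
    """Return the rule violations found in a commit message (empty if none)."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    if not subject:
        return ["Subject line is empty"]

    problems = []
    if len(subject) > 50:
        problems.append("Subject line exceeds 50 characters")
    if not subject[0].isupper():
        problems.append("Subject line is not capitalized")
    if subject.endswith("."):
        problems.append("Subject line ends in a period")
    # Assumption: subject and body are separated by one blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("Missing blank line between subject and body")
    # Body lines start at line 3 of the message.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > 72:
            problems.append(f"Body line {number} exceeds 72 columns")
    return problems


print(check_commit_message("fixes."))
print(check_commit_message("Fix typo in README\n\nThe word was misspelled."))
```

Something like this could run in a `commit-msg` hook, but it only catches formal mistakes. A message can pass every check and still explain nothing.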

There are, of course, tons of great examples out there. And because people get annoyed if I tell them that the best examples can be found in the Linux kernel community (“These people are GODs, I cannot compare to them” – why not?), I can only show you some less GODish commits (by me)!

Have a look at my

  • contribution to shiplift. The commit message is not that long and the change is atomic. The subject line is a short summary of what the change is about, the body explains why this is/was done. The trailer notes that I submitted this according to the developercertificate.
  • contribution to config-rs. Nobody said that the commit subject has to explain all the things – as long as there is reasonable explanation in the body, the subject can be “Simplify implementation” (like here).
  • commit bringing order to the galaxy. There can be the occasional joke, of course!

These are all rather short commit messages for simple patches. Longer messages with more explanations also exist in my projects! For example in the butido project there are changes like this one, or this one or even this very long one. Or, to go crazy, this enormous one here.

These commits have one thing in common: They explain why things were done.

And you can do that too! One really simple idea that is worth trying out is not to use the -m flag of git-commit at all. This way you are presented with your favourite editor and can pause for a moment to think about what to write.

Don't be that guy that appears on the front page of commitlogsfromlastnight.com!

I love the Rust language. And I love the library ecosystem the Rust community provides.

But let me start at the beginning.

libgitdit

In 2016, a friend of mine and I developed a library for distributed issue tracking with git and a commandline frontend for that library. The project, git-dit, got a rather good portion of attention on hackernews, reddit and of course on github. Ranking at about 350 stars on github, this is the most successful piece of software I (co-)authored so far.

Development sleeps right now, which is really unfortunate. There are a number of unresolved issues, although the software is usable today.

A failed thesis

My friend and I proposed a bachelor's thesis at our university for writing a web-viewer for git-dit. Because we're both master's students, we also offered to supervise this thesis.

In the last semester, the thesis was assigned and I was happy when it started. Today (or rather a few days ago) I was not that happy anymore, because I got the end-result. I was able to compile it (yay), but after starting it, I was not even able to open the web page because I did not know which port to use.

After looking at the source, I was able to figure that part out. Unfortunately, everything I tried to do via the web frontend failed (as in nothing happened). I was not able to see any issues or anything else. Only viewing the git configuration was possible – but that's the least thing I cared about.

So I figured: How hard can it be? If a bachelor student has half a year of time... it must be hard? No, I guess not.

Let's do that!

So I started my own git-dit-web viewer. And I tracked the time it took to implement it.

I mean, how hard can it be? I am not a web-dev at all, I have zero experience with Rust web frameworks, I never touched one. I have no experience with CSS (only the few bits I used for this blog) and of course I also have no experience with JS.

But how hard can it be?

Turns out: It is not hard at all. I'm proud to present the first prototype, after 11 hours of implementation time:

$ time sum :week git-dit-wui

Wk  Date       Day Tags           Start      End    Time    Total
--- ---------- --- ----------- -------- -------- ------- --------
W11 2018-03-15 Thu git-dit-wui 14:00:00 16:43:35 2:43:35
                   git-dit-wui 20:31:00 23:52:04 3:21:04  6:04:39
W11 2018-03-16 Fri git-dit-wui 11:23:39 14:12:56 2:49:17
                   git-dit-wui 15:45:16 17:58:47 2:13:31  5:02:48

                                                         11:07:27

What does not work yet

Of course, this is only the first prototype. The following things do not work yet:

  • Filtering issues for open/closed or other metadata
  • Showing issues which were opened by one specific author
  • Show messages as tree (currently linear by timestamp)
  • Graph for issues-per-date (nice-to-have)
  • Showing commits
  • Automatically detecting git hashes in messages and linking to the appropriate issue/commit
  • Abbreviating git hashes in messages and everywhere else
  • Configuration
    • Port
    • Repository path
    • Readonly/RW
  • Error handling: if things go wrong, we should show an error page rather than nothing
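The two hash-related items on this list (detecting git hashes in messages and abbreviating them) could look roughly like the Python sketch below. This is not code from the project, which is written in Rust. The URL prefix and the function name are assumptions, and a real implementation would ask the repository whether the object exists, because a plain regex also matches any other hexadecimal word of seven or more characters.

```python
import re

# 7 to 40 lowercase hex digits standing alone as a word (SHA-1 style hashes).
HASH_RE = re.compile(r"\b[0-9a-f]{7,40}\b")


def link_hashes(text: str, url_prefix: str = "/commit/") -> str:
    """Replace hash-like words with markdown links showing a 7-character abbreviation."""

    def replace(match: re.Match) -> str:
        full = match.group(0)
        return f"[{full[:7]}]({url_prefix}{full})"

    return HASH_RE.sub(replace, text)


print(link_hashes("Fixed in 0123456789abcdef0123456789abcdef01234567, I think."))
```

Since messages are rendered as markdown anyway, emitting a markdown link before rendering keeps the change small.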

What does work

But some things also work, of course:

  • Messages are rendered as markdown
  • Listing all issues (with some metadata)
  • Showing an issue (including replies)
  • Showing single messages
  • Landing page with statistics about issues

And it looks rather good (thanks to the bulma CSS framework) despite me being a CLI-only guy without web-dev experience.

Screenshots

Some screenshots showing the issues in the git-dit repository.

  • The landing page
  • The issue listing page
  • Showing a single issue

Conclusion

Of course I open-sourced the code on github and licensed it as AGPL-3.0.

So it can be done. I'm not quite sure what the student did in the 6 months he had for implementing this.

tags: #network #open-source #software #git

So I started developing an importer for importing github issues into git-dit – the distributed issue tracker built on git.

Turns out it works well, though some things are not yet implemented:

  • Wrapping of text. This is difficult because quotations are wrapped, but the quotation character is not prepended to the new line – which results in broken format
  • Importing only issues. Right now, PRs are imported ... which is not exactly what we want. I really hope I can figure this out to actually attach PR comments to the actual commits of the PR. This would be really nice. Issues shall be imported without parent (orphaned) like git-dit wants it.
  • Mapping of github handles to real names and email addresses.
  • Mapping github labels to git-dit “trailers”.
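The wrapping problem from the first item can be sketched in Python with the standard `textwrap` module: detect the quotation prefix of a line and repeat it on every line the wrapping produces. This is only an illustration of the idea, not the importer's code. The function name is made up, and each input line is wrapped on its own instead of being reflowed as a paragraph.

```python
import textwrap


def wrap_message(text: str, width: int = 72) -> str:
    """Wrap each line to `width` columns, repeating its "> " quote prefix."""
    wrapped = []
    for line in text.splitlines():
        rest = line.lstrip()
        # Collect the quotation markers at the start of the line, if any.
        prefix = ""
        while rest.startswith(">"):
            prefix += "> "
            rest = rest[1:].lstrip()
        if not rest:
            wrapped.append(prefix.rstrip())  # empty (possibly quoted) line
            continue
        wrapped.append(
            textwrap.fill(
                rest,
                width=width,
                initial_indent=prefix,
                subsequent_indent=prefix,
            )
        )
    return "\n".join(wrapped)


print(wrap_message("> " + "a rather long quoted sentence " * 5))
```

Because the prefix is passed as both `initial_indent` and `subsequent_indent`, continuation lines keep their quotation character and the format no longer breaks.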

Have a look at my importer tool here (Just be told: This is WIP and shouldn't be used right now)!

Or at git-dit itself here (I am co-author).

tags: #tools #software #rust #open-source #git #github

This post was written during my trip through Iceland and published much later than it was written.

When writing my last entry, I argued that we need decentralized issue tracking.

Here's why.

Why these things must be decentralized

Issue tracking, bug tracking and of course pull request tracking (which could all be named “issue tracking”, btw) must, in my opinion, be decentralized. That's not only because I think offline-first should be the way to go, even today in our always-connected and always-on(line) world. It's also because of redundancy and failure safety. Linus Torvalds himself once said:

I don't do backups, I publish and let the world mirror it

(paraphrased)

And that's true. We should not need to do backups, we should share all data, in my opinion. Even absolutely personal data. Yes, you read that right, me, the Facebook/Google/Twitter/Whatsapp/whatever free living guy tells you that all data need to be shared. But I also tell you: Never ever unencrypted! And I do not mean transport encryption, I mean real encryption. Unbreakable is not possible, but at least theoretically-unbreakable for at least 250 years should do. If the data is shared in a decentralized way, like IPFS or Filecoin try to do, we (almost) can be sure that if our hard drive fails, we don't lose the data. And of course, you can still do backups.

Now let's get back to topic. If we decentralize issue tracking, we can make sure that issues are around somewhere. If github closes down tomorrow, thousands, if not millions, of open source projects lose their issues. And that's not only current issues, but also the history of issue tracking, which also means data on how a bug was closed, how a bug should be closed, what features are implemented why or how and these things. If your self-hosted solution loses data, like gitlab did not long ago on their gitlab.com hosted instance, data is gone forever. If we decentralize these things, more instances have to fail to bring the whole project down.

There's actually a simple probability argument about these things (this is not Amdahl's law, which concerns the speedup of parallel programs): The more instances a distributed system has, the more likely it is that one instance is dead right now, but at the same time, the less likely it is that the whole system is dead. And this is not a linear decrease, it is exponential, as long as the instances fail independently. That means that with 10, 15 or 20 instances you can be practically sure that your stuff is alive somewhere if your instance fails.
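The claim can be made concrete with a few lines of Python. The sketch assumes that every instance is down with the same probability and that failures are independent. Both are simplifications, and the 10% used here is a made-up number purely for illustration.

```python
def probability_all_down(p_down: float, instances: int) -> float:
    """Probability that every one of `instances` independent mirrors is down."""
    return p_down ** instances


def probability_any_down(p_down: float, instances: int) -> float:
    """Probability that at least one of `instances` independent mirrors is down."""
    return 1 - (1 - p_down) ** instances


# Hypothetical value: each mirror is unavailable 10% of the time.
for n in (1, 5, 10, 20):
    print(n, probability_any_down(0.1, n), probability_all_down(0.1, n))
```

The first probability grows towards 1 with every added instance, while the second shrinks by a factor of `p_down` each time, which is the exponential decrease.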

Now think of projects with many contributors. Maybe not as big as the kernel, which has an exceptionally big community. Think of communities like the neovim one. The tmux project. The GNU Hurd (have you Hurd about that one?) or such projects. If in one of these projects 30 or 40 developers are actively maintaining things, their repositories will never die. And if the repository holds the issue history as well, we get backup safety there for free. How awesome is that?

I guess I made my point.

tags: #open-source #programming #software #tools #git #github

This post was written during my trip through Iceland and published much later than it was written.

Almost all toolchains out there do a CI-first approach. There are clearly benefits for that, but maybe that's not what one wants with their self-hosted solution for their OSS Projects?

Here I try to summarize some thoughts.

CI first

CI first is convenient, of course. Take github and Travis as an example. If a PR fails to build, one does not even have to review it. If a change makes some tests fail, either the test suite has to be adapted or the code has to be fixed to get the test working again. But as long as things fail, a maintainer does not necessarily have to have a look.

The disadvantage is, of course, that resources need to be there to compile and test the code all the time. But that's not that much of an issue, because hardware and energy are cheap and developer time is not.

Review first

Review first keeps the number of compile- and test-runs low, as only code that is basically agreed upon gets tested. However, it increases the effort a maintainer has to invest into the project.

Review first is basically cheap. If you think of an OSS hobby project, that might be a good idea, especially if your limited funding keeps you from renting or buying good hardware where running hundreds or even thousands of compile jobs per month can be done at decent speed.

What I'd do

I'm thinking of this subject in the context of moving away from github with one of my projects (imag, of course). Because of my limited resources (the website and repository are hosted on Uberspace, which is perfect for that), I cannot run a lot of CI jobs. I don't even know whether running CI jobs on this host is allowed at all. If not, I'd probably rent a server somewhere and if that is the case, I'd do CI-first and integrate that into the Uberspace-hosted site. That way I'd even be able to run more things I would like to run on a server for my personal needs. But if CI is allowed on Uberspace (I really have to ask them), I'd probably go for Review-first and invest the money I save into my Uberspace account.

tags: #open-source #programming #software #tools #git #github

This post was written during my trip through Iceland and published much later than it was written.

From my last article on whether to move away from github with imag, you saw that I'm heavily thinking about this subject. In this post I want to summarize what I think we need for a completely self-hosted open source programming toolchain.

Issue tracking

Do we actually need this? For example, does the kernel do these things? Bug tracking – yes. But Issue tracking as in planning what to implement next? Maybe the companies that contribute to the kernel, internally, but not the kernel community as a whole (AFAIK).

Of course the kernel is a bad example in this case, because of its size, the size of the community around it and all these things. And other smaller projects use issue tracking for planning, for example the nixos community (which is still fairly big) or the Archlinux community (though I'm not sure whether they are doing these things only over mailinglists or via a dedicated forum at archlinux.org).

Bug tracking

Bug tracking should be done on a per-bug basis. I think this is a very particular problem that can be easily solved with a mailing list. As soon as a bug is found, it is posted to the mailing list and discussion and patches are added to the list thread until the issue is solved.

Pull request tracking

With github, a contributor automatically has a web-accessible repository. But for the most part it is sufficient if the patches are sent via an email-patch workflow, which is how many great projects work. Having web-accessible repositories available is just a convenience github introduced and now everybody expects.

I think pull requests (or rather patchsets) are tracked no matter how they are submitted. If you open a PR on github, patches are tracked just as well as with mailing lists. Indeed I even think that mailing lists are way better for tracking and discussion, as one can start a discussion on each individual patch. That's not really possible with github. Also, the tree-shape one can get into when discussing a patch is a major point where mailing lists are way better than github.

CI

Continuous Integration is a thing where solutions like gitlab or github shine. They easily integrate with repositories, are free and result in better and tested code (normally). I do not know of any bigger open source project that does not use some form of CI. Even the kernel is tested (though not by the kernel community directly but rather companies like Intel or Redhat, as far as I know).

A CI solution, though, is rather simple to implement (but I'm sure it is not easy to get it right). Read my expectations below.

How I would like it

Issue and bug tracking should be based on plain text, which means that one should be able to integrate a mailing list into the bug tracking system. Fortunately, there is such an effort named git-dit but it is not usable yet. Well, it is usable, but has neither email integration nor a web interface (for viewing). This is, of course, unfortunate. Also, there is no way to import existing issues from (for example) github. And that's important, of course.

For pull request/patch management, there's patchwork. I've never worked with it, but as far as I can see it works nicely and could be used. But I would prefer to use git-dit for this, too.

I would love to have a CI tool that works on a git-push-based model. For example you install a post-receive hook in your repository on your server, and as soon as there is a new push, the hook verifies some things and then starts to build the project from a script, which preferably lives in the repository itself. One step further, the tool would create a RAM-disk, clone the just pushed branch into it (so we have a fresh clone) and build things there. Even one step further, the tool would create a new container (think of systemd-nspawn) and trigger the build there. That would ensure that the build does not depend on some global system state.

This, of course, also has some security implications. That's why I would only build branches where the newest (the latest) commit is signed with a certain GPG key. It's a really easy thing to do and because of GPG and git itself, one can be sure that only certain people can trigger a build (which is only execution of a shell script, so you see that this has some implications). Another idea would be to rely on gitolite, which has ssh authentication. This would be even easier, as no validation would be necessary on our side.
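A rough Python sketch of such a hook, under a number of assumptions: the build script path `./ci/build.sh` is hypothetical, a temporary directory stands in for the RAM-disk (it only is one if the temp directory lives on a tmpfs), and the container step is left out. `git verify-commit` accepts any signature the hook user's GPG keyring can verify, so restricting builds to "a certain key" means keeping only that key in the keyring.

```python
#!/usr/bin/env python3
"""Sketch of a post-receive hook: check signature, clone fresh, run build script."""
import os
import subprocess
import tempfile

BUILD_SCRIPT = "./ci/build.sh"  # hypothetical script living in the repository


def parse_updates(stdin_text):
    """Parse the 'old new ref' lines git passes to post-receive on stdin.

    Returns (commit, branch) pairs for pushed branches; deleted refs and
    non-branch refs (tags, notes) are skipped.
    """
    updates = []
    for line in stdin_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        _old, new, ref = parts
        if set(new) == {"0"}:  # all-zero hash: the ref was deleted
            continue
        if ref.startswith("refs/heads/"):
            updates.append((new, ref[len("refs/heads/"):]))
    return updates


def is_signed(commit):
    """True if `git verify-commit` accepts the signature on `commit`."""
    result = subprocess.run(["git", "verify-commit", commit], capture_output=True)
    return result.returncode == 0


def build(repo_path, branch):
    """Clone `branch` into a fresh temporary directory and run the build script."""
    # git sets GIT_DIR for hooks; drop it so commands inside the clone
    # use the clone's own .git directory.
    env = {k: v for k, v in os.environ.items() if k != "GIT_DIR"}
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(
            ["git", "clone", "--quiet", "--branch", branch, repo_path, workdir],
            check=True,
            env=env,
        )
        try:
            return subprocess.run([BUILD_SCRIPT], cwd=workdir, env=env).returncode
        except OSError:
            return 1  # build script missing or not executable


def main(stdin_text):
    repo = os.getcwd()  # hooks run inside the repository that received the push
    for commit, branch in parse_updates(stdin_text):
        if not is_signed(commit):
            print(f"{branch}: newest commit has no accepted signature, not building")
            continue
        status = build(repo, branch)
        print(f"{branch}: build {'succeeded' if status == 0 else 'failed'}")

# Installed as hooks/post-receive, the script would end with:
#     import sys; main(sys.stdin.read())
```

Only the newest commit of each pushed branch is checked, as described above. Whatever `print` writes in a post-receive hook is shown to the person pushing, which is a cheap first feedback channel.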

The results of the build should be mailed to the author/committer of the build commit.

And yes, now that I wrote these things down I see that we have such a tool already: drone.

That's it!

That's actually it. We don't need more than that for a toolchain for developing open source software with self hosted solutions. Decentralized issue/PR tracking, a decent CI toolchain, git and here we go.

tags: #open-source #programming #software #tools #git #github

This post was published on both my personal website and imag-pim.org.

I've been thinking about closing contributions to imag for about two months. Here I want to explain why I think about this step and why I am tending into the direction of a “yes, let's do that”.

github is awesome

First of all: github is awesome. It gives you absolutely great tools to build a repository and finally also an open source community around your codebase. It works flawlessly, I have never experienced any issues with pull request tracking, issue tracking, milestone management, merging, external tool integration (in my case and in the case of imag only Travis CI) or any other tool github offers. It just works, which is awesome.

But. There's always a but. Github has issues as well. From time to time there are outages, I wrote about them before. Yet, I came to the conclusion that github does really really well for the time being. So the outages at github are not the point why I am thinking of moving imag away from github.

Current state of imag

It is the state of imag. Don't get me wrong, imag is awesome and gets better every day. Either way, it is still not in a state where I would use it in production. And I have been developing it for almost two years now. That's a really long time frame for an open source project that is, in majority, only developed by one person. Sure, there are a few hundred commits from others, but right now (the last time I checked the numbers) more than 95% of the commits and the code were written by me.

Imag really should get into a state where I would use it myself before making it accessible (contribution wise) to the public, in my opinion. Developing it more “closed” seems like a good idea for me to get it into shape, therefore.

Closing down

What do I mean by “closing development”, though? I do not intend to make imag closed source or hiding the code from the public, that's for sure. What I mean by closing development is that I would move development off of github and do it only on my own site imag-pim.org. The code will be openly accessible via the cgit web interface, still. Even contributions will be possible, via patch mails or, if a contributor wants to, via a git repository on the site. Just the entry gets a bit harder, which – I like to believe – keeps away casual contributors and only attracts long-term contributors.

The disadvantages

Of course I'm losing the power of the open source community at github. Is this a good thing or a bad thing? I honestly don't know. On the one hand it would lessen the burden on my shoulders with community management (which is not much right now), issue management and pull request management. On the other hand I would lose tools like travis-ci and others, which work flawlessly and are a real improvement for the development process.

The conclusion

I don't really have one. If there were a way to include Travis into a self-hosted repository as well as some possibility for issue management (git-dit isn't ready in this regard, yet, because one cannot extract issues from github just yet), I would switch immediately. But there isn't. And that's keeping me away from moving off of github (vendor lock in at its finest, right?).

I guess I will experiment with a dedicated issue repository with git-dit and check how the cgit web interface works with it, and if it seems to be good enough I will test how it can be integrated (manually) with emails and a mailing list. If things work out smoothly enough, I will go down this road.

What I don't want to do is to integrate the issue repository in the code repository. I will have a dedicated repository for issues only, I guess. On the other hand, that makes things complicated with pull request handling, because one cannot comment on PRs or close issues with PRs. That's really unfortunate, IMO. Maybe imag will become the first project which heavily uses git-dit. Importing the existing issues from github would be really nice for that, indeed. Maybe I'll find a way to script the import functionality. As I want a complete move, I do not have to keep the issue tracking mechanisms (git-dit and github issues) in sync, so at least I do not have this problem (which is a hard one on its own).

tags: #open-source #programming #software #tools #git #github

This article is a cry for a feature which is long overdue in KDE, in my humble opinion: Syncable, user readable (as in plain-text) configuration files.

But let me start explaining where I come from.

I started with Linux when I was 17 years old. At the time I ran a Kubuntu 9.04 (if I remember correctly) with KDE 3. I disliked Gnome because it looked silly to me (no offense, Gnome or Gnome people). So it was simply aesthetics which made me use KDE. Before switching to Linux I had only experienced the cruel world of Microsoft Windows, XP at the time. When I got a new Notebook for my Birthday, I got Vista and two days later I had this Linux thing installed (which friends of mine kept talking about). Naturally, I was blown away by it.

After some time I got rather comfortable using this black box with the green text on it – the terminal finally had me! Soon, I launched a full blown KDE 3 (themed hackerlike in dark colors) to start a few terminals, open vim and hack away. Then, the same friend who suggested Linux told me about “tiling window managers”. wmii it was shortly after.

A long journey began. After some troubles with Ubuntu 12.04 I switched to Archlinux, later from wmii to dwm and after that to i3 which I kept for a few years.

In 2015 I learned about NixOS, switched to it at the beginning of 2016 and in late 2016 I reevaluated my desktop and settled with XFCE.

And here's the thing: I wanted KDE, but it lacked one critical feature: being able to sync its configuration between my machines.

I own a desktop machine with three 1080p Monitors and two Notebooks – a Thinkpad X220 and an old Toshiba Satellite (barely used), if that matters at all.

There are things in the KDE configuration files which shouldn't be there, mainly state information and temporary values, making the whole thing a pain in the ass if you want to git add your configuration files to git push them to your repository somewhere on the web and git pull it on another machine.

Apparently the story is not that much better with XFCE, but at least some configuration files (like keyboard shortcuts, window decoration configuration and menu-bar configuration) can be git added and synced between machines. And it works for me, even with two such different machines. Some things have to be re-done on each machine, but the effort is pretty low.

But with KDE (I tried Plasma 5), it was pure PITA. Not a single configuration file was left untouched: values got reordered, state information was put in there and so on and so forth.

And I am rather amazed with KDE and really want to play with it, because I think (and that's not just an attempt to kiss up to you KDE guys) that KDE is the future of Linux on the desktop! Maybe it is not for the Hacker kid from next door, but for the normal Linux User, like my Mom, your Sister or the old non-techy guy from next door, KDE is simple enough to understand and powerful enough to get stuff done efficiently without getting in your way too much.

So here's my request:

Please, KDE Project, make your configuration files human readable, editor-friendly and syncable (via tools like git). That means that there are no temporary values and no state information in the configuration files. That means that configuration files do not get shuffled when a user alters a configuration via a GUI tool. That means that the format of the configuration files does not get automatically changed.

The benefits are clear, and there are no drawbacks for KDE as a project (as far as I can see) because parsing and processing the configuration files will be done either way. Maybe it even reduces the complexity of some things in your codebase?


A word on “why don't you submit this to the KDE project as a request”: I am not involved with KDE at all. I don't even know where the documentation, bug tracker, mailinglist(s), ... are located. I don't know where to file these things, whether I have to register at some forum or whatever.

If someone could point me to a place where nice people will discuss these things with me, feel free to send me an email and guide me to this place!

tags: #open-source #software #tools #git #desktop #linux

Right now, github shows you this:

github is down

And these things will happen more frequently in future, I assure you!

In the first half of 2017, we already had 3 major service outages, 3 minor service outages and 21 other problems. Yes, indeed, this is very good service quality. It really is. Anyways, it is not optimal. Github advertises itself with 22M developers, 59M repositories and 117k businesses world-wide, which is a freakin' lot.

I really like github, it is a great platform for the open-source community and individual projects and developers.

But. There's a but.

Github will not scale endlessly. It will vanish at some point. Maybe not in 2 years, maybe not in 5 or even 10 years. But at some point, a competitor will step up and get bigger and better than github. Or it will be taken down by someone. Who knows.

But when that happens we, as a community, have to be prepared. And that's why we need distributed issue tracking like we implemented with git-dit.

Yes, it is unfortunate that git-dit itself is hosted on github. And we do not have the issues from github in the repository, yet, as there is no mapper available. But we will get there.

I won't go into detail on how git-dit works here, there's a talk from GPN17 on youtube about that (in German), where you can learn about these things.

With git-dit, we won't be tied to github and if github vanishes from one day to the next, we would be able to continue development seamlessly because the issues are mirrored into the repository itself. In fact, we won't even need github in the first place, because the repository itself would contain everything which is needed for development.

But we are not there yet.

If you're feeling brave, you're more than welcome to try out git-dit or contribute to the codebase.

tags: #git #github #open-source #software #tools

imag just got a website, a mailinglist, a git repository setup and an IRC channel.

I had wanted to set up a website for imag for months already, and I finally got to it. I actually developed the website offline and it was almost done, and now it is online.

I wrote the website using nanoc, a static site generator. I used nanoc before and I also contributed some code to nanoc, so I already knew what to do and how to do it. I wrote a minimal theme for the website, I wanted to have it plain-text-ish, as imag itself is a plain text tool.

I managed to register an IRC channel on the freenode irc network, which is awesome. travis was set up to post to this IRC channel if a build succeeds or fails which is really convenient as well.

Of course we also have a mailinglist now. One can register there to contribute patches via mail and to ask questions. Of course there's not that much on the mailinglist yet, as we do not have a community around imag, yet. Anyways, there's the possibility to build one now, which is awesome, I guess!

Background

So what's running in the background here?

I registered a space at uberspace where also this very website is hosted, set up a gitolite and a cgit webfrontend for the git repositories.

The mailinglist is run by ezmlm, which was written by djb, and setting it up on an uberspace is very well documented.

The domain was registered on united-domains.de.

Costs

Well, I pay these things from my own money (I can make some money this summer working at my university, so that's not a big problem).

Currently, I pay 19 Euro per Year for the domain and 2,50 Euro per Month for the Uberspace, but I will increase this as soon as there are more contributors on the git hosting or on the mailinglist (as soon as my setup causes actual workload on their servers) or as soon as I have a job, whichever comes first.

So that makes it 49 Euro per year with the current setup, which is affordable for me. As soon as I increase the monthly fee for uberspace (I will go to 5 Euro if I make my own money and there are no contributors, and more if there are contributors), this will cost me 100 Euro per year if I give uberspace 6.75 Euro per month. Still not much, I guess.
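The arithmetic behind these numbers, as a tiny Python sketch (the function name is made up; the prices are the ones from the text):

```python
def yearly_cost(domain_per_year: float, hosting_per_month: float) -> float:
    """Total yearly cost in Euro: the domain plus twelve months of hosting."""
    return domain_per_year + 12 * hosting_per_month


print(yearly_cost(19, 2.50))  # current setup: 49.0
print(yearly_cost(19, 5.00))  # with 5 Euro per month: 79.0
print(yearly_cost(19, 6.75))  # with 6.75 Euro per month: 100.0
```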

As soon as I have to pay more than 100 Euro per year I guess I will add a “support this project” button on the website ... or something like this. Well, we'll see...

tags: #linux #open source #programming #rust #software #tools #imag #git #mailinglists #network #social