musicmatzes blog

github

I deleted the repositories I own(ed) on #github.

After it became clear that Microsoft GitHub Copilot was trained using open source and GPLed work, keeping my (both public and private) github repositories is just wrong. So I deleted all of them, except for the forks I contribute to and maintain (for example config-rs and shiplift).

I hope that others will follow suit and delete their repositories as well. I can understand if people don't mind the vendor lock-in that “discussions”, “actions” and other features have created for them. But this (copilot) is pure abuse of free software codebases.

The “extend” phase is over, we're in the “extinguish” phase!

(me, on mastodon)

It might be legal for github to do this (IANAL), but nonetheless it is more than just a bad move. If their ToS allows this and we, as a community, cannot act upon this because we agreed to these terms, the only sensible thing to do is to move our development away from github to some more open and less abusive services. I'm a big fan of sourcehut, of course, but there are others, most prominently codeberg and of course, self-hosting.

Self-hosting and email patches

I am self-hosting all my sources on git.beyermatthi.as plus I host some of these projects on my sourcehut account for more visibility. I take patches via mail for all my repositories.

If you plan on moving away from github, learning how to send patches via mail, and of course also how to accept patches via mail, is a valuable skill that you will benefit from! Just make sure to use plain text email instead of HTML email.

There are tons and tons of tutorials out there on how to work with email patches. Just go and read them, it will make you a better developer. Even if you then go to one of the other code forges and don't need the skill, you will start to understand why git works the way it works!
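To make the "plain text, not HTML" point concrete, here is a minimal sketch in Python. In practice `git format-patch` and `git send-email` produce and send such mails for you; the helper names `build_patch_mail` and `is_plain_text` and the example addresses are made up for this illustration.

```python
from email import message_from_string
from email.message import EmailMessage


def build_patch_mail(sender, recipient, subject, diff_text):
    """Build a single-part, plain-text mail that carries a diff as its body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[PATCH] {subject}"
    # set_content() with a str creates a text/plain body by default.
    msg.set_content(diff_text)
    return msg


def is_plain_text(raw_mail):
    """Return True if no part of the raw mail is text/html."""
    msg = message_from_string(raw_mail)
    return all(part.get_content_type() != "text/html" for part in msg.walk())


mail = build_patch_mail(
    "alice@example.org",          # hypothetical sender
    "maintainer@example.org",     # hypothetical recipient
    "Fix typo in README",
    "--- a/README\n+++ b/README\n@@ -1 +1 @@\n-teh\n+the\n",
)
print(mail.get_content_type())
```

A maintainer could run a check like `is_plain_text` on incoming mail before trying to apply it, since HTML mails tend to mangle whitespace in diffs.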

I have been using git for over a decade now, over eight years in open source and over two years professionally (so my whole professional career so far), and it is the one tool I cannot exist without! It increases the speed I develop code with by an order of magnitude!

If you don't want email...

If you don't want an email-based workflow for your git repositories, which I can understand (although not approve), and you want a shiny web-interface with all the bells and whistles, you can still go to codeberg (or gitlab.com, fwiw) or self-host one of the great tools that are already out there.

For example, you can self-host gitlab or gitea as a bit more lightweight alternative. Both of them feature issues/discussions and a nice web-UI.

If you don't want to collaborate but just put your code out there, you can use cgit (which is really not hard to host) plus, optionally, some gitolite if you want to host repositories for others as well.

With nginx as reverse-proxy and some mild rate-limiting because web-crawlers are still a thing, you can even host this on a $1-VPS instance somewhere (I'm not recommending any service here because this would be advertisement). I'd even say that a Raspberry Pi can handle hundreds of repositories with cgit and nginx as reverse proxy. I did not test this, though I'm fairly sure because git is very well optimized and cgit is written in C (hence the name), so there's only a very minimal footprint!


“Thoughts” is (will be) a weekly roll-up of my mastodon feed with some notable thoughts collected into a long-form blog post. “Long form” is relative here, as I will only expand a little on some selected subjects, not write screens and screens of text on each of these subjects.

If you think I toot too much to follow, this is an alternative to follow some of my thoughts.


This week (2021-05-29 – 2021-06-04) I got very angry about this “you need an app for this” bullshit and some things died.

App madness

I am very angry (german) about every other service forcing me to install an app of some sort or another on my devices. This time it was my insurance that wanted me to install either a desktop application (Windows and Mac only) or a Smartphone app (Android or iOS only) just to update my bank account or address. How mad can they possibly be?

I reported them to #digitalcourage.

github actions

I probably said it before, but the more I play with #githubactions, the more I like it.

This is mainly due to the fact that the features are well-designed. You can make dependent jobs and you can even boot up #docker containers, if your application has to be tested against, for example, a database or some other service. I know that #github will never #opensource this, that's why I hope someone implements it as a FLOSS alternative!

Awesome Rust

It always amazes me how good the #rust standard library actually is. I was able to solve an issue in my codebase with a two-line patch that would have been way more complex in another language!

Dying things

First audacity, now stackoverflow (german) died.

What's next? I really hope we can develop alternative FLOSS platforms. For audacity, there are several alternative tools around that one can switch to. For stackoverflow, not so much. Especially because the software is only one part, the other part is the data. There are dumps of stackoverflow somewhere on the internet (I'm not linking because I don't know how legal these dumps actually are), so maybe someone can implement a FLOSS alternative (please make it federated or distributed) and import that dataset?

This would be awesome!




This week (2021-05-08 – 2021-05-14) I thought about vendor lock-in once more and played a bit with my raspberry.

Github and Vendor-Lock

My discontent with #github continues, but I have to admit that github-actions is very nice and I like it more every time I have to work with it. I'm not sure whether this plays into the vendor lock-in thing mentioned earlier.

I also voiced my discomfort... no, let's face it: my anger with people that cannot obey the simplest commit message rules.

Sensor stuff

I was able to boot my old Raspberry Pi (1) with raspbian (german) (unfortunately #nixos failed to boot and I don't know why), which makes my plans for sensors in my flat (last week, german) a little easier. It won't be able to run prometheus and grafana because 512MB of RAM is just not enough for these, I guess. Still, I am one step further towards the goal!

Movies

I watched “The Covenant” and asked the fediverse whether there's a database of movies with spiders, so I can avoid them. This would in fact be a really helpful database, and I am sad that no such thing exists. I am not actually arachnophobic, but I really don't like them and prefer to watch movies without any spiders or similar creatures (crabs, scorpions, ...).

Idiotic shopping

Well, I had some encounters (german) with overly stressed workers at my local grocery store this week.

The thread linked above tells you everything, I guess. I am still baffled how idiotic some people in our society behave, even though this is not the worst kind of human being running around these days (especially if you consider covid-denialists and so on).

I am known for not being the biggest fan of #github anymore, especially since #Microsoft acquired them for a shitload of money. But when I recently learned about the “Suggested Change” feature, I lost any belief in them. How badly can one mess up a feature?

So, the “Suggested Change”-Feature, as I call it, provides a way to request changes on a pull request. A user can suggest a snippet of code being altered in a way they think is appropriate, by selecting the line from the diff in the PR that needs to be changed, and providing some replacement.

That replacement then is suggested to all who have write access on the source branch (most of the time, also maintainers of the target repository because of a checked-by-default option in PRs) and can be applied by them.

That's everything. There can be discussion on the change, of course, but that's the whole feature. It's even somewhat useful! But the way GitHub implemented this is just a load of pure shit.

The first thing is: they make you write change suggestions for code in a Markdown editor! I mean... it's not like code editors in web browsers are a new thing, or even an uncommon thing. But GitHub thinks otherwise and you're completely left alone, with a non-monospaced font, to figure out how many spaces (or tabs, anyone?) you need on your own! GitHub does not care! You want to fix indentation of the code in there? Haha! Good luck with that! Oh, you accidentally suggested trailing whitespace! Well, GitHub cannot help you with that, because they don't know what a code editor is! In fact, your change suggestion is actually a markdown formatted comment with the diff being a markdown code block. What the hell?

Had enough already?

Next thing: you cannot provide a sensible change description, elsewhere known as commit message. You've probably never heard of that, GitHub, have you? Well, that's not entirely true though: The person who accepts your suggested change can. Yep, that's right! Not the author of the diff provides the commit message, but the committer. Nontrivial changes with “Update” as message anyone?

But even worse is that github actually thinks that suggested changes should not even be patches. How full of shit can they be? They implemented a feature to suggest changes on a pull request and these changes are NOT patches. There is no patch you can git-fetch, nothing you can git-cherry-pick or even git-merge on your own machine. Everything you can do is go to the website, click the “Apply suggested change” button, which creates new commits on your PR branch and then fetch your own PR branch. There's no way to fetch the changes beforehand and review them locally, using your favorite tooling. This is the known Embrace-Extend-Extinguish bullshit that Microsoft pulled for years!

My suggestion: If you can, run away from GitHub as fast as you can. This ship will sink at some point, either because the community recognizes how badly they are messing things up, or because Microsoft makes the whole thing into some real enterprise: slow, complicated to use and only with paid access. If you cannot, for whatever reason, leave GitHub at this point, I suggest you gradually move away from it: make use of other git hosting providers, learn how to use alternatives, learn how to contribute via email and/or even roll your own git hoster – with gitolite and cgit it is almost trivial, and hosters that allow you to deploy such software exist – I'd suggest you have a look at uberspace for a really good and reasonably priced one (I am not and never have been affiliated with/paid by them for saying/writing this).

How it could have been

You might ask how such a feature would have been implemented properly. Well, given the circumstance that GitHub is a web service and users are wanted on the platform for as long as possible, I would have implemented this as follows:

  • If you want to suggest changes you get a monospace-ready web-based code editor with syntax highlighting and maybe even a minimal autocompletion feature. The editor boots with your cursor at the position you initially clicked on in the changeset you try to alter.
  • You annotate your suggested change with your own commit message, or optionally use the “fixup! ” commit message prefix that can later be used in a git rebase --autosquash.
  • Once you're done adding your suggestions to the diff in the PR, you submit all your individual patches and you get a branch that builds on top of the PR branch, for example named github.com/<your username>/<your repo clone>/<PR branch name>/suggestions-<N>.
  • The PR author gets notified about the suggested changes and can git-pull them from your fork properly, review them locally and push them to their PR if they see fit or filter out what they don't like.
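To illustrate what the autosquash step buys you, here is a simplified model in Python of how commits whose subject starts with `fixup! ` get sorted next to the commit they amend. This is only a sketch: real git also matches on commit hashes and handles `squash! ` and `amend! `, and the function name `autosquash_order` is made up here. Unmatched fixups are simply appended as normal picks, which is a simplification of this sketch.

```python
FIXUP_PREFIX = "fixup! "


def autosquash_order(subjects):
    """Return (action, subject) pairs, roughly like an autosquashed rebase todo list.

    `subjects` are commit subjects, oldest first.
    """
    regular = [s for s in subjects if not s.startswith(FIXUP_PREFIX)]
    fixups = [s for s in subjects if s.startswith(FIXUP_PREFIX)]
    unmatched = list(fixups)
    todo = []
    for subject in regular:
        todo.append(("pick", subject))
        for fixup in fixups:
            target = fixup[len(FIXUP_PREFIX):]
            # A fixup belongs to the first commit whose subject starts with its target.
            if fixup in unmatched and subject.startswith(target):
                todo.append(("fixup", fixup))
                unmatched.remove(fixup)
    # Simplification: fixups without a matching commit stay ordinary commits.
    todo.extend(("pick", fixup) for fixup in unmatched)
    return todo


history = ["Add parser", "Add tests", "fixup! Add parser"]
for action, subject in autosquash_order(history):
    print(action, subject)
```

The suggested change travels as a normal commit, so the PR author can fetch it, review it locally and let the rebase fold it into the right place.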

Everyone would be totally happy with that. For your dummy-users, you could have buttons and knobs to do the whole thing. Still, your power-users would be satisfied because they have the power to use their own infrastructure and tooling.

But once again, GitHub fails to deliver.

So I started developing an importer for importing github issues into git-dit – the distributed issue tracker built on git.

Turns out it works well, though some things are not yet implemented:

  • Wrapping of text. This is difficult because quotations are wrapped, but the quotation character is not prepended to the new line – which results in broken format
  • Importing only issues. Right now, PRs are imported ... which is not exactly what we want. I really hope I can figure this out to actually attach PR comments to the actual commits of the PR. This would be really nice. Issues shall be imported without parent (orphaned) like git-dit wants it.
  • Mapping of github handles to real names and email addresses.
  • Mapping github labels to git-dit “trailers”.
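The wrapping problem from the first item can be sketched in a few lines of Python (the importer itself is a separate tool; this is not its code). The idea is to strip the quote markers, wrap the remaining text, and prepend the markers to every resulting line. The function name `wrap_quoted` and the default width of 72 are assumptions for this example, and code blocks or lists inside comments are not handled.

```python
import textwrap


def wrap_quoted(text, width=72):
    """Wrap each line to `width`, repeating its '>' quote prefix on every new line."""
    wrapped = []
    for line in text.splitlines():
        rest = line.lstrip()
        depth = 0
        # Count leading quote markers such as ">" or "> >".
        while rest.startswith(">"):
            depth += 1
            rest = rest[1:].lstrip(" ")
        prefix = ">" * depth + " " if depth else ""
        if not rest:
            # Keep empty (possibly quoted) lines as paragraph separators.
            wrapped.append(prefix.rstrip())
            continue
        wrapped.extend(
            textwrap.wrap(rest, width=width,
                          initial_indent=prefix, subsequent_indent=prefix)
        )
    return "\n".join(wrapped)


print(wrap_quoted("> " + "word " * 10, width=20))
```

Every continuation line keeps its quote prefix, so the quotation stays intact after wrapping.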

Have a look at my importer tool here (Just be told: This is WIP and shouldn't be used right now)!

Or at git-dit itself here (I am co-author).

tags: #tools #software #rust #open-source #git #github

This post was written during my trip through Iceland and published much later than it was written.

When writing my last entry, I argued that we need decentralized issue tracking.

Here's why.

Why these things must be decentralized

Issue tracking, bug tracking and of course pull request tracking (which could all be named “issue tracking”, btw) must, in my opinion, be decentralized. That's not only because I think offline-first should be the way to go, even today in our always-connected and always-on(line) world. It's also because of redundancy and failure safety. Linus Torvalds himself once said:

I don't do backups, I publish and let the world mirror it

(paraphrased)

And that's true. We should not need to do backups, we should share all data, in my opinion. Even absolutely personal data. Yes, you read that right, me, the Facebook/Google/Twitter/Whatsapp/whatever free living guy tells you that all data need to be shared. But I also tell you: Never ever unencrypted! And I do not mean transport encryption, I mean real encryption. Unbreakable is not possible, but at least theoretically-unbreakable for at least 250 years should do. If the data is shared in a decentralized way, like IPFS or Filecoin try to do, we can be (almost) sure that if our hard drive fails, we don't lose the data. And of course, you can still do backups.

Now let's get back to the topic. If we decentralize issue tracking, we can make sure that issues are around somewhere. If github closes down tomorrow, thousands, if not millions, of open source projects lose their issues. And that's not only current issues, but also the history of issue tracking, which means also data on how a bug was closed, how a bug should be closed, what features are implemented why or how and these things. If your self-hosted solution loses data, like gitlab did not long ago on their gitlab.com hosted instance, data is gone forever. If we decentralize these things, more instances have to fail to bring the whole project down.

There's a simple probabilistic argument behind this: The more instances a distributed system has, the more likely it is that one instance is dead right now, but at the same time, the less likely it is that the whole system is dead. And this is not a linear relationship: if instances fail independently, the likelihood that all of them are dead at once shrinks exponentially with their number. That means that with 10, 15 or 20 instances you can be practically sure that your stuff is alive somewhere if your instance fails.
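The redundancy argument can be made concrete with a small calculation. The sketch below assumes that every instance is down independently with the same probability `p`, which is an assumption of this example and not a measured value. Under that assumption, all `n` instances are down with probability `p ** n`, while at least one is down with probability `1 - (1 - p) ** n`.

```python
def p_all_down(p, n):
    """Probability that all n independent instances are down at once."""
    return p ** n


def p_any_down(p, n):
    """Probability that at least one of n independent instances is down."""
    return 1 - (1 - p) ** n


# p = 0.1 is an arbitrary example value.
for n in (1, 5, 10, 20):
    print(n, p_any_down(0.1, n), p_all_down(0.1, n))
```

With more instances the first number grows towards 1 while the second one shrinks by a factor of `p` per added instance.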

Now think of projects with many contributors. Maybe not as big as the kernel, which has an exceptionally big community. Think of communities like the neovim one. The tmux project. The GNU Hurd (have you Hurd about that one?) or such projects. If in one of these projects 30 or 40 developers are actively maintaining things, their repositories will never die. And if the repository holds the issue history as well, we get backup safety there for free. How awesome is that?

I guess I made my point.

tags: #open-source #programming #software #tools #git #github

This post was written during my trip through Iceland and published much later than it was written.

Almost all toolchains out there take a CI-first approach. There are clearly benefits to that, but maybe that's not what one wants with their self-hosted solution for their OSS projects?

Here I try to summarize some thoughts.

CI first

CI first is convenient, of course. Take github and Travis as an example. If a PR fails to build, one does not even have to review it. If a change makes some tests fail, either the test suite has to be adapted or the code has to be fixed to get the test working again. But as long as things fail, a maintainer does not necessarily have to have a look.

The disadvantage is, of course, that resources need to be there to compile and test the code all the time. But that's not that much of an issue, because hardware and energy are cheap and developer time is not.

Review first

Review keeps the number of compile- and test-runs low as only code gets tested which is basically agreed upon. Though, it increases the effort a maintainer has to invest into the project.

Review first is basically cheap. If you think of an OSS hobby project, that might be a good idea, especially if your limited funding keeps you from renting or buying good hardware where running hundreds or even thousands of compile jobs per month can be done at decent speed.

What I'd do

I'm thinking of this subject in the context of moving away from github with one of my projects (imag, of course). Because of my limited resources (the website and repository are hosted on Uberspace, which is perfect for that), I cannot run a lot of CI jobs. I don't even know whether running CI jobs on this host is allowed at all. If not, I'd probably rent a server somewhere and if that is the case, I'd do CI-first and integrate that into the Uberspace-hosted site. That way I'd even be able to run more things I would like to run on a server for my personal needs. But if CI is allowed on Uberspace (I really have to ask them), I'd probably go for Review-first and invest the money I save into my Uberspace account.

tags: #open-source #programming #software #tools #git #github

This post was written during my trip through Iceland and published much later than it was written.

From my last article on whether to move away from github with imag, you saw that I'm heavily thinking about this subject. In this post I want to summarize what I think we need for a completely self-hosted open source programming toolchain.

Issue tracking

Do we actually need this? For example, does the kernel do these things? Bug tracking – yes. But Issue tracking as in planning what to implement next? Maybe the companies that contribute to the kernel, internally, but not the kernel community as a whole (AFAIK).

Of course the kernel is a bad example in this case, because of its size, the size of the community around it and all these things. And other smaller projects use issue tracking for planning, for example the nixos community (which is still fairly big) or the Archlinux community (though I'm not sure whether they are doing these things only over mailing lists or via a dedicated forum at archlinux.org).

Bug tracking

Bug tracking should be done on a per-bug basis. I think this is a very particular problem that can be easily solved with a mailing list. As soon as a bug is found, it is posted to the mailing list and discussion and patches are added to the list thread until the issue is solved.

Pull request tracking

With github, a contributor automatically has a web-accessible repository. But for the most part it is sufficient if the patches are sent via an email-patch workflow, which is how many great projects work. Having web-accessible repositories available is just a convenience github introduced and now everybody expects.

I think pull requests (or rather patchsets) are tracked no matter how they are submitted. If you open a PR on github, patches are tracked just as well as with mailing lists. Indeed I even think that mailing lists are way better for tracking and discussion, as one can start a discussion on each individual patch. That's not really possible with github. Also, the tree-shape one can get into when discussing a patch is a major point where mailing lists are way better than github.

CI

Continuous Integration is a thing where solutions like gitlab or github shine. They easily integrate with repositories, are free and (normally) result in better, tested code. I do not know of any bigger open source project that does not use some form of CI. Even the kernel is tested (though not by the kernel community directly but rather by companies like Intel or Red Hat, as far as I know).

A CI solution, though, is rather simple to implement (but I'm sure it is not easy to get it right). Read my expectations below.

How I would like it

Issue and bug tracking should be based on plain text, which means that one should be able to integrate a mailing list into the bug tracking system. Fortunately, there is such an effort named git-dit but it is not usable yet. Well, it is usable, but has neither email integration nor a web interface (for viewing). This is, of course, unfortunate. Also, there is no way to import existing issues from (for example) github. And that's important, of course.

For pull request/patch management, there's patchwork. I've never worked with it, but as far as I can see it works nicely and could be used. But I would prefer to use git-dit for this, too.

I would love to have a CI tool that works on a git-push-based model. For example you install a post-receive hook in your repository on your server, and as soon as there is a new push, the hook verifies some things and then starts to build the project from a script, which preferably lives in the repository itself. One step further, the tool would create a RAM-disk, clone the just pushed branch into it (so we have a fresh clone) and build things there. Even one step further, the tool would create a new container (think of systemd-nspawn) and trigger the build there. That would ensure that the build does not depend on some global system state.

This, of course, also has some security implications. That's why I would only build branches where the newest (the latest) commit is signed with a certain GPG key. It's a really easy thing to do, and because of GPG and git itself, one can be sure that only certain people can trigger a build (which is only execution of a shell script, so you see that this has some implications). Another idea would be to rely on gitolite, which has ssh authentication. This would be even easier, as no validation would be necessary on our side.
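A rough sketch of such a hook in Python could look like the following. It is not a finished tool: the build script path `ci/build.sh` is a hypothetical convention, the temporary directory only lives in RAM if the system's temp dir is a tmpfs, and `git verify-commit` accepts any valid signature from a key in the server's keyring, so this sketch assumes the keyring holds only trusted keys.

```python
import os
import subprocess
import tempfile

BUILD_SCRIPT = "ci/build.sh"  # hypothetical location of the build script in the repo


def parse_updates(stdin_text):
    """Parse post-receive input lines of the form '<old> <new> <refname>'.

    Returns (new_commit, branch) pairs for created or updated branches.
    """
    updates = []
    for line in stdin_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        _old, new, ref = parts
        # An all-zero new id means the ref was deleted; only branches are built.
        if set(new) == {"0"} or not ref.startswith("refs/heads/"):
            continue
        updates.append((new, ref[len("refs/heads/"):]))
    return updates


def tip_is_signed(commit):
    """True if git can verify a GPG signature on the given commit."""
    result = subprocess.run(["git", "verify-commit", commit], capture_output=True)
    return result.returncode == 0


def build_branch(repo_path, branch):
    """Clone the branch into a fresh temporary directory and run the build script."""
    # Hooks run with GIT_* variables set; drop them so the clone is independent.
    env = {k: v for k, v in os.environ.items() if not k.startswith("GIT_")}
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(
            ["git", "clone", "--quiet", "--single-branch", "--branch", branch,
             repo_path, workdir],
            check=True, env=env,
        )
        return subprocess.run(["sh", BUILD_SCRIPT], cwd=workdir, env=env).returncode


def main(stdin_text):
    repo_path = os.getcwd()  # hooks are started inside the repository
    for commit, branch in parse_updates(stdin_text):
        if tip_is_signed(commit):
            print(f"{branch}: build exited with {build_branch(repo_path, branch)}")
        else:
            print(f"{branch}: newest commit is not signed, skipping build")

# A real hook script would end with: main(sys.stdin.read())
```

Running the build inside a container instead of a plain temporary directory would be the next step, but that is left out here.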

The results of the build should be mailed to the author/committer of the build commit.

And yes, now that I wrote these things down I see that we have such a tool already: drone.

That's it!

That's actually it. We don't need more than that for a toolchain for developing open source software with self hosted solutions. Decentralized issue/PR tracking, a decent CI toolchain, git and here we go.

tags: #open-source #programming #software #tools #git #github

This post was published on both my personal website and imag-pim.org.

I have been thinking about closing contributions to imag for about two months. Here I want to explain why I am considering this step and why I am leaning towards a “yes, let's do that”.

github is awesome

First of all: github is awesome. It gives you absolutely great tools to build a repository and finally also an open source community around your codebase. It works flawlessly, I never experienced any issues with pull request tracking, issue tracking, milestone management, merging, external tool integration (in my case and in the case of imag only Travis CI) or any other tool github offers. It just works, which is awesome.

But. There's always a but. Github has issues as well. From time to time there are outages, I wrote about them before. Yet, I came to the conclusion that github does really really well for the time being. So the outages at github are not the point why I am thinking of moving imag away from github.

Current state of imag

It is the state of imag. Don't get me wrong, imag is awesome and gets better every day. Either way, it is still not in a state where I would use it in production. And I have been developing it for almost two years now. That's a really long time frame for an open source project that is, for the most part, only developed by one person. Sure, there are a few hundred commits from others, but right now (the last time I checked the numbers) more than 95% of the commits and the code were written by me.

Imag really should get into a state where I would use it myself before making it accessible (contribution wise) to the public, in my opinion. Developing it more “closed” seems like a good idea for me to get it into shape, therefore.

Closing down

What do I mean by “closing development”, though? I do not intend to make imag closed source or hiding the code from the public, that's for sure. What I mean by closing development is that I would move development off of github and do it only on my own site imag-pim.org. The code will be openly accessible via the cgit web interface, still. Even contributions will be possible, via patch mails or, if a contributor wants to, via a git repository on the site. Just the entry gets a bit harder, which – I like to believe – keeps away casual contributors and only attracts long-term contributors.

The disadvantages

Of course I'm losing the power of the open source community at github. Is this a good thing or a bad thing? I honestly don't know. On the one hand it would lessen the burden on my shoulders with community management (which is not much right now), issue management and pull request management. On the other hand I would lose tools like travis-ci and others, which work flawlessly and are a real improvement for the development process.

The conclusion

I don't really have one. If there were a way to include Travis into a self-hosted repository as well as some possibility for issue management (git-dit isn't ready in this regard, yet, because one cannot extract issues from github just yet), I would switch immediately. But there isn't. And that's keeping me from moving off of github (vendor lock-in at its finest, right?).

I guess I will experiment with a dedicated issue repository with git-dit and check how the cgit web interface works with it, and if it seems to be good enough I will test how it can be integrated (manually) with emails and a mailing list. If things work out smoothly enough, I will go down this road.

What I don't want to do is to integrate the issue repository in the code repository. I will have a dedicated repository for issues only, I guess. On the other hand, that makes things complicated with pull request handling, because one cannot comment on PRs or close issues with PRs. That's really unfortunate, IMO. Maybe imag will become the first project which heavily uses git-dit. Importing the existing issues from github would be really nice for that, indeed. Maybe I'll find a way to script the import functionality. As I want a complete move, I do not have to keep the issue tracking mechanisms (git-dit and github issues) in sync, so at least I do not have this problem (which is a hard one on its own).

tags: #open-source #programming #software #tools #git #github

Right now, github shows you this:

github is down

And these things will happen more frequently in future, I assure you!

In the first half of 2017, we already had 3 major service outages, 3 minor service outages and 21 other problems. Yes, indeed, this is very good service quality. It really is. Anyways, it is not optimal. Github advertises itself with 22M developers, 59M repositories and 117k businesses world-wide, which is a freakin' lot.

I really like github, it is a great platform for the open-source community and individual projects and developers.

But. There's a but.

Github will not scale endlessly. It will vanish at some point. Maybe not in 2 years, maybe not in 5 or even 10 years. But at some point, a competitor will step up and get bigger and better than github. Or it will be taken down by someone. Who knows.

But when that happens we, as a community, have to be prepared. And that's why we need distributed issue tracking like we implemented with git-dit.

Yes, it is unfortunate that git-dit itself is hosted on github. And we do not have the issues from github in the repository, yet, as there is no mapper available. But we will get there.

I won't go into detail how git-dit works here, there's a talk from GPN17 on youtube about that (in german), where you can learn about these things.

With git-dit, we won't be tied to github and if github vanishes overnight, we would be able to continue development seamlessly because the issues are mirrored into the repository itself. In fact, we won't even need github in the first place, because the repository itself would contain everything which is needed for development.

But we are not there yet.

If you're feeling brave, you're more than welcome to try out git-dit or contribute to the codebase.

tags: #git #github #open-source #software #tools