musicmatzes blog

So I started developing an importer that imports github issues into git-dit – the distributed issue tracker built on git.

Turns out it works well, though some things are not yet implemented:

  • Wrapping of text. This is difficult because quotations are wrapped, but the quotation character is not prepended to the new line – which results in broken formatting.
  • Importing only issues. Right now, PRs are imported ... which is not exactly what we want. I really hope I can figure this out to actually attach PR comments to the actual commits of the PR. This would be really nice. Issues shall be imported without a parent (orphaned), as git-dit wants it.
  • Mapping of github handles to real names and email addresses.
  • Mapping github labels to git-dit “trailers”.

Have a look at my importer tool here (just be warned: this is WIP and shouldn't be used right now)!

Or at git-dit itself here (I am co-author).

tags: #tools #software #rust #open-source #git #github

As the attentive reader might have noticed, I got myself into photography lately.

Well, I'm by no means a professional, but it is a nice hobby and it gets me out into nature (not right now, because the weather really sucks here in southern Germany at the moment, but in general) to take pictures.

In the summer, during my trip to Iceland, I took a lot of pictures – and also panorama pictures. My camera (a Panasonic G 70) has a built-in function to take panorama photos, which is really convenient. I did not use it, though. Instead I took a bunch of photos which could later be combined into one panorama photo. Or so I thought.

Now that I am at home and am trying to combine the photos (with the awesome Hugin software), I notice that I should've taken even more photos. The resulting image is okayish at best – lots of areas are blurry and the panorama is basically not usable.

So, when taking photos for a panorama – take more! And maybe even use your camera's panorama function in addition, so you are safe: if the combined image has issues, you at least have the automatically generated panorama.

tags: #photography

Inspired by the Call for Community Blogposts I want to summarize my experiences and thoughts on Rust in 2017 and what I am excited about for 2018.

Reflecting 2017

2017 was an amazing year for Rust. We got 8 releases of Rust itself! We got basic procedural macros allowing custom derive (also known as “macros 1.1”) in the first release last year (1.15.0). This made serde 1.0 possible, if I'm not mistaken? We got 103 stabilized APIs in 2017. This is incredible! Compile times improved and the tooling got so much better. I mean, it was awesome before. But now it is even better!

On a personal side I got a lot better at programming Rust. I wrote about 37800 lines of Rust code in my main project imag and 17380 lines in other crates (authored and contributed, according to a bit of git-fooing around). Is that a lot? I don't know.

Hopes for 2018

Now let's talk about 2018. This year will be amazing, I am sure.

Language features

I am really excited about the “impl Trait” thing. Being able to return a trait from a function will reduce the imag codebase so much, for example. We no longer need to define our own iterator helper types but can simply return impl Iterator<Item = Whatever>!
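To illustrate (a minimal sketch, assuming the feature lands as announced – the Entry type and its field are made up and not actual imag code):

struct Entry { id: String }

// With "impl Trait" in return position, no hand-written iterator wrapper
// type is needed anymore – the mapped iterator can be returned directly:
fn entry_ids(entries: Vec<Entry>) -> impl Iterator<Item = String> {
    entries.into_iter().map(|entry| entry.id)
}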

I have no other hopes for the language itself, because what we have right now is really amazing and I honestly cannot think of ways it could be improved.

Ecosystem needs / Tooling enhancements

I'm still a bit concerned about cargo functionality for building workspace projects. From what I see, building two different crates in one workspace which share dependencies rebuilds the dependencies. This is not as intended, I guess, but that's what I see. I did not dive deep into this, so I might be wrong, though.
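For reference, this is the kind of setup I mean (a minimal sketch, crate names made up):

# Cargo.toml at the workspace root
[workspace]
members = ["crate-a", "crate-b"]

# crate-a/Cargo.toml and crate-b/Cargo.toml both contain, say:
# [dependencies]
# serde = "1.0"

Building crate-a and then crate-b in such a layout seems to compile serde twice for me instead of reusing the artifacts from the shared target/ directory – but again, maybe I'm just holding it wrong.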

What I have been thinking about for several weeks now is a cargo/rust tool for calculating code metrics. I think of things like documentation/code ratio, average function length, simple things... but also about cohesion and coupling metrics and other inter-module/inter-crate metrics.

Also, I tried to set up the Rust language server for vim on my workstation and failed hard. I guess this is a packaging problem with my distro (NixOS), though. Either way, installing the RLS with a stable toolchain would be nice!

Crates I am still missing / should be improved

There are some crates I would love to have which do not exist yet.

  • A (high level) email crate. There is the email crate, but it is largely unstable and does not even have a 0.1.0 yet. There's also lettre_email, which is at 0.7.0, but it doesn't support parsing of emails.
  • I really hope rust-vobject (which is one of the crates I contributed to in 2017) will improve even more and become the de-facto standard crate for handling vcard and icalendar data.
  • I follow the development of Cursive and from what I see it is awesome. I really hope people start writing high-level objects for cursive (like a file explorer, a form builder, a text editor like thing, a tab helper and so on) so I have to do less work when implementing a TUI for imag. (To be fair, there are already some crates available).
  • I hope there will be some awesome crates for handling multi-media files and reading/writing their metadata. Especially audio formats and video formats are important to me with imag.
  • Rust bindings for pass would be awesome.
  • Markdown (and other formats, like asciidoc, restructured text, textile and maybe even bbcode) parsers and renderers should be written/improved
  • An API for IPFS or maybe even a protocol implementation
  • Qt bindings (yeah, I have high hopes for 2018)

There are possibly thousands more... But I won't list them all.

tags: #open-source #programming #software #rust

34c3 was awesome. I prepared a blog article as my recap, though I failed to provide enough content. That's why I will simply list my “toots” from mastodon here, as a short recap for the whole congress.

  • (2017-12-26, 4:04 PM) – Arrived at #34c3
  • (2017-12-27, 9:55 AM) – Hi #31c3 ! Arrived in Adams, am excited for the intro talk in less than 65 min! Yes, I got the tag wrong on this one
  • (2017-12-27, 10:01 AM) – Oh my god I'm so excited about #34c3 ... this is huge, girls and boys! The best congress ever is about to start!
  • (2017-12-27, 10:25 AM) – Be awesome to eachother #34c3 ... so far it works beautifully!
  • (2017-12-27, 10:31 AM) – #34c3 first mate is empty.
  • (2017-12-27, 10:46 AM) – #34c3 – less than 15 minutes. Oh MY GOOOOOOOOOD
  • (2017-12-27, 10:49 AM) – Kinda sad that #fefe won't do the Fnord this year at #34c3 ... but I also think that this year was to shitty to laugh about it, right?
  • (2017-12-27, 10:51 AM) – #34c3 oh my good 10 minutes left!
  • (2017-12-27, 11:02 AM) – #34c3 GO GO GO GO!
  • (2017-12-27, 11:16 AM) – Vom Spieltrieb zur Wissbegierig! #34c3
  • (2017-12-27, 12:17 PM) – People asked me things because I am wearing a #nixos T-shirt! Awesome! #34c3
  • (2017-12-27, 12:59 PM) – I really hope i will be able to talk to the #secushare people today #34c3
  • (2017-12-27, 1:44 PM) – I talked to even more people about #nixos ... and also about #rust ... #34c3 barely started and is already awesome!
  • (2017-12-27, 4:28 PM) – Just found a seat in Adams. Awesome! #34c3
  • (2017-12-27, 8:16 PM) – Single girls of #34c3 – where are you?
  • (2017-12-28, 10:25 AM) – Day 2 at #34c3 ... Yeah! Today there will be the #mastodon #meetup ... Really looking forward to that!
  • (2017-12-28, 12:32 PM) – Just saw ads for a #rust #wayland compositor on an info screen at #34c3 – yeah, awesome!
  • (2017-12-28, 12:37 PM) – First mate today. Boom. I'm awake! #34c3
  • (2017-12-28, 12:42 PM) – #mastodon ads on screen! Awesome! #34c3
  • (2017-12-28, 12:45 PM) – #taskwarrior ads on screen – #34c3
  • (2017-12-28, 3:14 PM) – I think I will not publish a blog post about the #34c3 but simply list all my toots and post that as an blog article. Seems to be much easier.
  • (2017-12-28, 3:15 PM) – #34c3 does not feel like a hacker event (at least not like the what I'm used to) because there are so many (beautiful) women around here.
  • (2017-12-28, 3:36 PM) – The food in the congress center in Leipzig at #34c3 is REALLY expensive IMO. 8.50 for a burger with some fries is too expensive. And it is even less than the Chili in Hamburg was.
  • (2017-12-28, 3:43 PM) – Prepare your toots! #mastodon meetup in less than 15 minutes! #34c3
  • (2017-12-28, 3:50 PM) – #34c3 Hi #mastodon #meetup !
  • (2017-12-28, 3:55 PM) – Whuha... there are much more people than I've expected here at the #mastodon #meetup #34c3
  • (2017-12-28, 4:03 PM) – Ok. Small #meetup – or not so small. Awesome. Room is packed. #34c3 awesomeness!
  • (2017-12-28, 4:09 PM) – 10 minutes in ... and we're already discussing pineapples. Community ftw! #34c3 #mastodon #meetup
  • (2017-12-28, 4:46 PM) – Limiting sharing of #toots does only work if all instances behave! #34c3 #mastodon #meetup
  • (2017-12-28, 4:56 PM) – Who-is-who #34c3 #mastodon #meetup doesn't work for me... because I don't know the 300 usernames from the top of my head...
  • (2017-12-28, 5:17 PM) – From one #meetup to the next: #nixos ! #34c3
  • (2017-12-28, 5:57 PM) – Unfortunately the #nixos community has no space for their #meetup at #34c3 ... kinda ad-hoc now!
  • (2017-12-28, 7:58 PM) – Now... Where are all the single ladies? #34c3
  • (2017-12-28, 9:27 PM) – #34c3 can we have #trance #music please?
  • (2017-12-28, 9:38 PM) – Where are my fellow #34c3 #mastodon #meetup people? Get some #toots posted, come on!
  • (2017-12-29, 1:44 AM) – Day 2 ends for me now. #34c3
  • (2017-12-29, 10:30 AM) – Methodisch Inkorrekt. Approx. 1k people waiting in line. Not nice. #34c3
  • (2017-12-29, 10:43 AM) – Damn. Notebook battery ran out of power last night. Cannot check mails and other unimportant things while waiting in line. One improvement proposal for #34c3 – more power lines outside hackcenter!
  • (2017-12-29, 10:44 AM) – Nice. Now the wlan is breaking down. #34c3
  • (2017-12-29, 10:57 AM) – LAOOOOLAAA through the hall! We did it #34c3 !
  • (2017-12-30, 3:45 AM) – 9h Party. Straight. I'm dead. #34c3
  • (2017-12-30, 9:08 PM) – After some awesome days at the #34c3 I am intellectually burned out now. That's why the #trance #techno #rave yesterday was exactly the right thing to do!
  • (2017-12-30, 11:35 PM) – Where can I get the set from yesterday night Chaos Stage #34c3 ??? Would love to trance into the next year with it!
  • (2017-12-31, 11:05 PM) – My first little #34c3 congress résumé: I should continue on #imag and invest even more time. Not that I do not continue it, but progress is slowing down with the last months of my masters thesis... Understandable I guess.

That was my congress. Yes, there are few toots after 28th... because I was really tired by then and also had people to talk to all the time, so little time for microblogging there. All in all: It was the best congress so far!

tags: #ccc #social

This post was written during my trip through Iceland and published much later than it was written.

In this and maybe also the next few articles, we will focus on things around the code rather than on direct code properties. I hope that's okay.

Planning of an application or library is not easy, not at all. But how much planning do we actually do before writing code? And should we do more?

My thoughts on the subject.

What we've learned

Anyone who has studied computer science should know at least some UML diagram types, like class diagrams, flow charts, module plans and use case diagrams. They are used in (let's call it) “normal” software development and in the professional world out there.

But when we are developing open source software for our own needs and maybe for our friends, we often do that in our own little chambers at home. Class diagrams are often not drawn at all, and I can say that I have never seen a hobby programmer draw a use case diagram before writing the code of an application.

Why we don't use it

Why is that? Well, because open source software is often done as a hobby type of thing, there is often no need for planning ahead. A hobbyist is able to hold use cases, simple class diagrams and flow charts “in his mind” because he has great knowledge of the domain.

In fact, as he defines the domain in its entirety, he is stakeholder, project leader, software architect, programmer, tester and marketing guy at the same time. He knows what problems are about to be solved and can therefore adjust every aspect of the application to the needs at hand.

This holds true for small and medium sized applications or code bases, where the problem is of a certain complexity but not too big. Basically one could say that, in the open-source-programming-at-home world, every aspect of the domain has to fit into one head without much effort. With a bit of training, I believe, one can even get to a point where only a few aspects of the domain have to be in a person's mind to be able to work on a solution.

But there is certainly a point where the effort needed to solve a specific problem explodes. One can still write software to solve the problem at hand, but not in reasonable time.

So why do we hobby programmers not use the planning tools we've learned about in university? Why don't we use diagrams to make things clearer and better documented, even before the real programming starts? The answer is quite simple: because it annoys the hell out of us. We don't like to plan ahead. We don't like to adjust plans as soon as we find out that a small aspect of our library could be changed to gain more flexibility and overall goodness. We don't like to check our plans before hacking on the next module until it works.

Coding is fun, planning is not.

But should we use these things?

In my opinion, this is foolish. We really should use the things we learned in university to plan out software and of course also to document it. It would be such a huge improvement of everything to simply think a bit more about it before actually implementing it!

How we do it

What we do, and why we do not use tools to plan ahead, is explained with one sentence: We program from the user interface down to the implementation, because the other way round is too complicated. Or, in other words: We program top-down because bottom-up needs planning and is therefore not that easy.

Of course, I'm speaking about the average case. I've programmed bottom-up before, but to me it seems much more error-prone than top-down does, especially without a plan.

Also, I do not say that top-down is not error-prone. Not at all. When writing an API without an actual implementation in mind, one easily ends up sacrificing cleanness and speed at some points to keep the API nice, which is not always a good idea. So top-down is only good as long as we get it right.

Tooling

Tooling is one big problem in this context. We do not have a toolchain for planning just yet. At least I do not have one that I would like to use. Because we are really good at controlling (versioning, moving around, managing) our source code (for example with git, and to some extent github), we also want to be able to do this with charts and diagrams. But we also want the niceness of SVG-rendered graphics. We don't want to play around with layout all day long, but use tools to simply get the job done.

And there are no such tools available.

Sure, one can use graphviz to design such things, but then again we do not have a nice overview of what's going on while editing our work. One could use ascii-art to draw all those things, but hey... ascii-art. We are better than that, aren't we? We could render the ascii-art into SVG... though the tooling there is not yet as good as it should be. And even if it were, version controlling these things with git is (I fail to believe otherwise) painful.

Conclusion

Well, I can only conclude the obvious here. We need better tooling for the open source programming community to do their planning, if they need to. Clearly, one does not always have to (or want to) plan things before trying out. But when one does, the tooling should be there and be useful and help with the process.

Next

In the next episode we will talk about version control of open source software projects. I'm not going into details about git or other systems, but rather about the style in which they should be used so everyone is pleased with it. This might be strongly biased, but hey, isn't this whole article series biased?

tags: #open-source #programming #software #tools #rust

Everything starts with an idea.

We had one: traveling North America for one year. Canada, United States, Mexico.

But having an idea is not everything – one has to lay out at least some plans (even though we are not the “plan every bit beforehand” kind of guys when it comes to traveling).

But, of course, first things first.

Visa?

Well, for traveling to the US, one has to get a visa. We needed a traveler's visa. There are several kinds of visas; the B type is the one for tourists. It is actually not as easy as you might think to get one!

Step 1: Requesting the visa online

The first step is to request the visa online, at the American consulate. Sounds easy, doesn't it? Well, it is not. You have to prepare all your things, including pictures of all travelers, your passport, and some more data one might not know off the top of their head.

Also, the websites where you request the visa are rather messy, to be honest. It is not one page, but several different ones. The whole process is not as transparent and simple as it could (or should) be. Also, the “Session expired” notifications one gets several times while putting the data into the form(s) are rather annoying.

After filling out everything (took us about 1.5 hours for the first try, but less for the second one), one should really save a copy of the forms as PDF. You never know whether you'll need it later – so better safe than sorry, right?

Step 2: Going to the consulate

The second step is visiting the American consulate (in Germany), because they want to ask you questions about your visit.

Beware that driving to the consulate takes time. We drove 2.5 hours, but arrived only 3 minutes early because of rush hour (in Frankfurt a. M.). Highways are packed there!

After some security checks which took quite some time (understandably), we were allowed to go into the building (which is really nice). Then we had to wait for about two hours. We had two little interviews where we explained that we want to travel the US with our motorhome for at least six months.

Our expectations of the interviews were completely wrong, though. We thought we would meet a nice person over a cup of coffee and they'd ask us several questions about what we do for a living and so on. Nothing of the sort, though. We were asked what we want to do in the US and what our jobs are. That was all.

Rather easy, but still exhausting (because of the fear of not getting the visa).

Step 3: Getting back your passport

The last step was easy for us: We got back our passports, which now include a really nice imprint of the American visa.

Step 4: The visa for Canada

As complicated and tiresome as the process is for the United States, it is equally pleasant for Canada: You go to the Canadian website, find the application form and fill it out. Then you put in your credit card information and you are done.

That was rather enjoyable, to be honest.

Planning the Route

Oh my, that's a hard one. We knew a handful of places we really wanted to visit, but we had no general route in mind.

We would arrive at the east coast of Canada, so naturally we would see some places in the east of Canada first. I asked the beautiful people over at reddit which places to visit in Canada and got a lot of replies, actually.

After days (or almost weeks) of discussing the subject, we agreed upon these basic points / questions:

  • Arrival in Halifax mid-May.
  • Eastern Canada for about 2-3 weeks.
  • To western Canada (about 2-3 weeks). Not sure whether we want to drive through the States or through Canada. Every other week we change our decision.
  • Banff National Park, Jasper National Park: mid-June.
  • Up North? To Alaska? Or rather save the miles and stay in western Canada?
  • Enter the US (again?) about Mid-August.
  • North-Western US until mid-September.
  • Mid-September: get some visitors from Germany for 6-8 weeks. Probably in Salt Lake City or Las Vegas. No gambling though!
  • 6-8 weeks in the south western states of the US.
  • Early/mid-December to Baja California and down south to Mexico.
  • Parking the car somewhere and flying home.

That list is more an approximation of how things should happen rather than a list of how it will be, though.

Tech-Equipment

As I'm a nerd, I'm constantly thinking about the equipment to bring. Of course I will take my camera with me. But I'm also thinking about my notebook and at least one external harddrive to back up my photos. Maybe even a second one – you never know! I even thought about buying another small notebook for the journey. I'm still not sure about that, though. My thinkpad is 6 years old and some parts have already been replaced. It was not new when I bought it, either. If it dies during the journey, I will have severe problems.

I will bring my mobile phone as well, for listening to music mainly.

Shutdown of my projects?

Another point I think about all the time is my open source projects. I have to pause them for one year! I will ask a friend to take over the minor projects for me and merge bug fixes and respond to issues and requests, as I cannot do that while being off the grid.

I will also remove myself as a maintainer from all the packages I maintain in nixpkgs and add myself again when I'm back home. During my journey, I guess I will not be able to maintain any of them.

And then there's imag. I will continue to develop it as much as possible during the year, but I guess development will slow down anyways.

One thing I really need to do is to put a note on all my projects at github, telling contributors that I might respond rather slowly.

I'm not sure whether I wrote an article like this before. Either way, the “Why I use Linux” project is a nice opportunity to write (again?) about this topic.

History

First of all, some history. I was introduced to “the other operating system” in grade 11 by a close friend of mine. At the time, I didn't know a thing about computers. I had just graduated middle school (“Realschule” in Germany), started high school (“Gymnasium” in Germany) and made new friends. My friend showed me his notebook, which was running Ubuntu with GNOME, 9.04 or something like that, and I was blown away by the fact that there was something else besides Microsoft Windows – and one also got it for free (as in beer).

Soon, I started with Kubuntu – KDE 3 was amazing to me. Maybe half a year later my friend suggested – because I was only using terminal applications by that time, learning vim and so on – that I should try Archlinux. So Archlinux it was.

In 2015 I switched to NixOS, using i3 as window manager and was a Linux-only guy.

Today I use XFCE on NixOS and couldn't be happier. All my machines work perfectly well.

The reasons

Now that you know my history with Linux, let me give you some reasons why I use Linux today:

  1. It works. Sounds strange, but for me this is perfectly reasonable: I've become a pragmatist – and because my setup, my machines and my workflow just suit me perfectly well, I absolutely have no reason to switch operating systems.
  2. It is fast. Just today I had the opportunity to compare my workstation to the workstation of someone else running Windows 7. The workstations are approximately the same (so not handheld vs. datacenter): an AMD 8-core processor with 16GB RAM on my side vs an Intel i7 4-core + hyperthreading with 16GB on the other side. Tell you what? Mine is much faster at everything. From loading websites to loading applications – everything is just smooth and performs really well. Of course this is not a scientific approach to measuring and comparing these machines... but my point is not that my machine is faster, my point is: My machine is fast. My second machine, a Lenovo Thinkpad X220, is fast as hell as well – I have never experienced any lag or anything of the sort.
  3. I know what is going on in my machine. With Linux under the hood, I know what is running on my machine. With free (as in freedom) software, I can actually read what is going on. How many times have Windows users asked me why some application does not work as intended... How many times have I replied “Wait, I will check the source code on what it does... Oh, wait, you're running proprietary software? Well... not my problem then, is it?” (yes, I'm a bit of a jerk in that regard).
  4. I can configure how something should work! If I don't like how my menu works, how switching windows works or on which screen an application appears when starting it... I can change it! Some time ago someone complained that the “Print” dialog on his Windows machine always opens on the wrong screen. That's a problem I've never experienced with Linux or Linux-running machines. Just configure it how it should work – and then it does that!
  5. If I don't like something, I can use something else! Well, if you purchase software and you don't like it, you cannot return it. Sometimes there are trial versions and that helps a lot – but what if you don't like the UI your operating system ships? Well, for me that is “uninstall XFCE, install KDE” ... ready to go, don't even need to reboot.
  6. I can make it look as nerdy or as hipster-like as I want. Customizing a desktop environment is not one of my favourite ideas of how to spend an evening. But I could do it if I would like to.
  7. I can adapt it to my workflow – I don't have to adapt my workflow to it. I'm a heavy user of keybindings. And I love having vim-bindings in my desktop environment, in my bash, in my tmux and in my vim (of course). I can configure that! Hell yeah, that's the power of free software! I can make it behave like I want to!
  8. I don't have to rely on a company to fix their bugs! I can do it myself if I need to. Or I can pay someone to fix them!
  9. It is free as in freedom and as in beer. For me, as a poor student, both count equally. I love to have free (as in freedom) software at my hands I can mess with. But it is also important that I can use it for free (as in beer). My friends at university pay hundreds of bucks for their software,... I can use that money to do something else,... for example buying a camera and starting to learn photography! And because of the free (as in beer) image processing software, I can even post-process the images without purchasing expensive software for it!
  10. I am a programmer. Unix (or in this case Linux) is my IDE! I'm also a techie and Linux is my playground.

Wanna try?

So, there are some good reasons for using Linux. If you want to try using Linux, go ahead with some of the beginner-friendly distros like Linux Mint, Ubuntu MATE or Fedora KDE – and remember that you can make each of them look like the other – because with Linux, you have the choice!

tags: #linux

This post was written during my trip through Iceland and published much later than it was written.

What is a nice and good API? How is “nice” defined when it comes to library interfaces? That's a question I want to discuss in this post, and also how you can create a nice API in your open source library without studying a topic like software architecture or similar.

Definition of a “nice” / “easy to use” API

But first, we have to define what makes an API good. And that's not that easy because this topic is very biased.

For me, a good API is one where I can get the job done without thinking much about it. That means that there shouldn't be much setup code involved just to use the library. So no factory hell if the only thing I want to have is the current time, for example. This also means that the API has to be decently high level, but without losing the ability to do fine-grained work if necessary. So for the most part, low level things (for example implementation details) are not interesting for me. But when I want to bit-fiddle around with the library, it should let me.

If a builder, factory or some other mechanism is necessary to produce objects in some way, the library should make clear (documentation-wise but also code-wise) why it is needed. There's no point in making the user call the tenth factory instantiation if it is not necessary – it only makes the user's codebase blow up in size and complexity.

The naming of things in the library should be good, appropriate and, for the most part, consistent. If a function on an object which returns the string representation of that object is named “to_string”, it should be named that way for all types from that library, not only for some of them.

Statelessness

Calling functions of your API should always result in the same values for the same arguments. That does not mean that your API should be pure in the functional programming sense, but rather that the actions executed when calling a function should not result in some library-internal variables being set, changed or unset. This is easily achievable by letting the user of the API have an object that holds the state, and having the functions of your API work based on that value. In short: your library should not have global variables.

This simple design pattern already results in easy to use APIs and a nice user experience.
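A minimal sketch of that pattern in Rust (all names made up, of course): the caller owns the state and the library only ever works on the value it is handed.

pub struct Counter {
    value: u64,
}

impl Counter {
    pub fn new() -> Counter {
        Counter { value: 0 }
    }

    // Same input, same output – no hidden global is read or written.
    pub fn incremented_by(&self, amount: u64) -> Counter {
        Counter { value: self.value + amount }
    }

    pub fn value(&self) -> u64 {
        self.value
    }
}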

Error exposure

Good libraries don't hide errors. Indeed, it is even better if errors are exposed to the user as much as possible, because the user of the library knows best when and how to handle errors – even the ones from your library.

I'm also a big fan of lots of error cases. The more error cases there are (the better a user of a library can distinguish between different errors), the better. This way, you let the user decide where she doesn't need to distinguish between two almost-equal error cases and where it is better to handle them independently. If your library does not give her that opportunity, she has to write ugly spaghetti-code error handling to be able to tell what is going on. Of course, these things have to be documented properly.

Another thing that can come in handy is when your error types or your library expose functionality to translate errors into text which can be shown to a user of your library. Nothing is worse (from a user's point of view) than a “CallOnInconsistenStateObjectBuilderFactory on line 2832” error message shown in a user-facing interface (and trust me, I've seen such things already).
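In Rust, for example, both points can be covered with a plain error enum that has fine-grained variants and a Display implementation (just a sketch, the error cases are made up):

use std::fmt;

#[derive(Debug)]
pub enum FetchError {
    ConnectionRefused,
    Timeout { seconds: u64 },
    InvalidResponse(String),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            FetchError::ConnectionRefused =>
                write!(f, "the connection was refused"),
            FetchError::Timeout { seconds } =>
                write!(f, "the request timed out after {} seconds", seconds),
            FetchError::InvalidResponse(ref body) =>
                write!(f, "the server sent an invalid response: {}", body),
        }
    }
}

This way a user can match on exactly the cases she cares about and still gets a human-readable message for everything else.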

Completeness

Nothing is worse than an API that is not complete. I mean, don't get me wrong – sometimes one does not think of all the cases a library could be used for – and that's completely okay. But some things are too obvious to be left out. For example, if you provide functions to transform your time object from local time into GMT, why wouldn't you provide functions for converting it into UTC or EST? These also matter!

Also: cleanup routines. In some languages it is necessary to include cleanup routines for your objects. If your library exposes alloc_vacation_location_obj(), it should also provide free_vacation_location_obj()! Sure, a user could use free(), but it is not nice API-wise. Even if your function does nothing more than call free(), it is better to provide it (and if you want to include some more cleanup in a later version of your library, a user does not have to think about it that much when upgrading their dependencies).

Consistency

We had the naming game already, but it always comes back to us, right? Consistent naming is one of the most important things in an API. If allocating worked with functions prefixed with new_ all the time, it shouldn't be done with alloc_ this time. Also not in later versions of your library. Not even in a major version bump.

Even more important than naming is behaviour. A function that is named with some alloc prefix should only allocate, never initialize or do other fancy stuff (debugging output excluded here, if necessary).

Next

In the next episode we will talk about how one can plan an application.

tags: #open-source #programming #software #tools #rust

#matrix , #ipfs , #scuttlebutt and now #mastodon – We're living in awesome times! centralization < decentralization/federation < distribution! #lovefortech

(me, April 10, 2017, on mastodon)

The idea

With the rise of protocols like the matrix protocol, activitypub and others, decentralized social community platforms like matrix, mastodon and others gained power and were made real. I consider these platforms, especially mastodon and matrix, to be great steps into the future and am using both enthusiastically.

But can we do better? Can we do more distribution? I think so!

So far we have a twitter-like microblogging platform (mastodon), a chat platform (matrix) and facebook-like platforms (diaspora and friendica), all of which are federated (some form of decentralization). I think we can make a completely distributed social network platform a reality today.

Let me restate that: I think we can make a facebook/googleplus/etc clone which works without a central component, today. And I would even go one step further and state: all we need for this is IPFS (and related technology like IPLD and IPNS)!

This platform would feature personal profiles, publishing articles/posts/images/videos/voice messages/etc, instant messaging, following others, and all the things one would want in such a platform.

How would it work?

What do we need for this? Well, as stated before: not much! From what I can think of, we would need IPFS, some sort of public/private key functionality (which IPFS already has), a nice frontend-framework and that's basically it.

Let me tell you how I think such a platform would work.

The moment a user starts the application, the application would boot an IPFS node. The username and all other information about the profile are added to IPFS as structured data. If the profile changes because the user edits it, it is added to IPFS again, using IPLD to link to its previous version.

If a user adds a post to her profile, that post is added to IPFS as well and linked from the profile via IPLD. All other nodes are informed about the new content via pubsub and are free to pin the new content (the new profile version) or only cache it for a while (or to not care at all). The post itself could add a link to the IPNS hash of the profile under which the post is published. This way, a link from the post to the current version of the profile would always exist.

Because the profile always links to its previous version as well as to the post content, the node run by the owner of the profile would always keep all the data the user adds to the network. As the data is only referenced via links, the user is free to drop published content at any point in time.

This means that basically each operation would “generate” a new profile, which is of course published as an IPNS name. Following others would be a matter of subscribing to their “pub” channel (as in “pubsub”) or their IPNS name.

Chat

A chat application using IPFS is already implemented with orbit, so that's a matter of integrating one application into another. Peer-to-Peer (or rather Profile-to-Profile) messaging is therefore no problem.

Data format

All the data would be saved in a structured format. For example Json (though order of serialization is important, because of cryptographic hashes) or Bson or any other data serialization format that is widely adopted.

Sidenote: As long as it is made clear that any client must support all formats, the format itself doesn't matter that much. For simplicity of this article, I stick to Json (and also because it is most widely known).

A Profile(-version) would look roughly like this (consider 'ipfs hash' to mean “some kind of IPLD link” in this context):

{
  "previous": [ "<ipfs hash>" ],
  "post": {
    "type": "<post type>",
    "nodes": ["<ipfs hash>"],
    "metadata": {
      "date": "2017-12-12T12:00:00+0200",
      "tags": [],
      "category": "kittens",
      "custom": {}
    }
  }
}

Let me explain:

  • The previous key would point to the previous profile version(s). It would only contain IPFS hashes (Why plural, see below in “Multi-Device Support”).
  • The post key would contain information about the post published with this profile version.
    • The type of the post could be “article”, “image”, “video”... normal stuff. But also “biography” for the biography shown on the profile or other things. Even “username” would be possible, for adding a user name to the profile.
    • The nodes key would point to an IPFS hash containing the actual payload; either the text of the article (only one hash then) or the ipfs hashes of the pictures, the video(s) or other binary content. Of course, posts could be formatted using Markdown, reStructured Text or whatever format one likes to use. It would be a clients job to render it properly.
    • The metadata field would contain plain meta information, like published date, tags, category and also custom metainformation as key-value pairs.

Maybe a version attribute for protocol version could be added as well. Of course, this should be considered an incomplete example, as I almost certainly forgot things here.
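In Rust, such a profile version could be modelled with serde roughly like this (only a sketch – the type and field names are my invention, and the IPLD links are simplified to plain strings):

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;

use std::collections::BTreeMap;

#[derive(Serialize, Deserialize)]
struct Profile {
    /// IPFS hashes of the previous profile version(s)
    previous: Vec<String>,
    post: Post,
}

#[derive(Serialize, Deserialize)]
struct Post {
    /// "article", "image", "video", "biography", "username", ...
    #[serde(rename = "type")]
    post_type: String,
    /// IPFS hashes pointing to the actual payload
    nodes: Vec<String>,
    metadata: Metadata,
}

#[derive(Serialize, Deserialize)]
struct Metadata {
    date: String,
    tags: Vec<String>,
    category: String,
    /// custom key-value meta information
    custom: BTreeMap<String, serde_json::Value>,
}

A BTreeMap is used for the custom fields so that the serialization order stays deterministic – which matters as soon as the serialized bytes get hashed.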

The idea of linking the previous version of a profile from each new version is very much blockchain-like, of course, with the difference that nobody needs to fetch the whole chain – only the latest version – to get a profile. The more content a viewer of the profile wants to see, the more she needs to traverse the graph of profile versions (automatically caching the content for others along the way). This would automatically result in older content being “forgotten” slowly (but the content would not be forgotten until the publisher itself and all other “pinners” drop it). Because the actual payload is not stored in the fetched data, the actual amount of data which is required to simply view a profile is rather small. A client could be configured to fetch all textual content of a profile, but not more than 10 versions, or one screen-page worth, or something like that. The possibilities are endless here.

Federated component

One might think “If I go offline with my node, my posts are not accessible if nobody else is online having them”. And that's true.

That's why I would introduce a federated component, which would run a stripped-down version of the application.

As soon as another instance connects and a new post is announced via pubsub, the instance automatically pins or caches it. Of course, this would mean that all of these federated instances would pin all content, which is surely not nice. One (rather simple and maybe even stupid) option would be to roll a dice and make the chance that a post is pinned a 50-50 thing, or something like that. Also, posts which are pinned for a certain amount of time are most likely distributed well enough so the federated component nodes can drop them... maybe after 90 days, maybe after 10... Details!
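The dice-roll could literally be as dumb as this (a sketch using the rand crate; the 90 days cutoff is just an example value):

extern crate rand;

/// Decide whether this federated node should pin a freshly announced post:
/// 50-50 chance for recent posts, never for posts older than 90 days.
fn should_pin(age_in_days: u64) -> bool {
    age_in_days <= 90 && rand::random::<bool>()
}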

Blockchain-Approaches

The fundamental problem with blockchains is that every peer in the network hosts the complete content. Nobody benefits from that, especially if you think of a social network which should also work on mobile devices. With users uploading images, videos and other large blobs of data, a blockchain is the wrong approach.

That's why I think a social network on Ethereum, Bitcoin or any other crypto-currency/blockchain is not an option at all.

IPLD

IPLD can be used not only to link posts and profiles, but also to link from content to content. Namely to link from one post to another, from a post to an image, a video, a voice message,... but also to link from a post to a git commit, an Ethereum transaction or any other IPLD-supported data structure.

One nice detail is that one does not have to traverse these links. If a user sees a post which links to other posts, for example, she does not have to fetch the linked posts to see the post itself – only if she wants to see the linked content. Caching nodes, on the other hand, can automatically traverse the whole graph and fetch all the content into their cache.

That makes an IPLD-based linking approach really beneficial.

Scuttlebutt

Scuttlebutt is a first step into the right direction. One can say what one wants about electron and the whole technology stack which is used in Scuttlebutt (and like or dislike the whole Javascript world), but so far Scuttlebutt seems like the first social network that is completely distributed.

I thought about whether it would be a great idea to port Scuttlebutt to use IPFS in the backend. From what I know right now, it would be a nice way of bringing IPFS and IPLD into the mix and thereby enhancing and extending the capabilities of Scuttlebutt itself.

I have no final conclusion on that thought, though.

Problems

There are several problems one has to think about when designing such a system.

Comments on Posts (and comments)

Consider you want to comment on a post. Of course you create new content, which links to the post you just commented on. But the person who wrote the original post does not automatically link to your comment, so she is neither able to find the comment (which could be solved via pubsub), nor are others able to find it.

The approach to this problem is simple: notifications about comments can be done via pubsub. And, if a user gets a notification about a new comment, she can approve it and automatically publish a new version of her post, with some added meta information:

  • A link to the comment
  • A link to the “old version of the content in IPFS”

Now, if a client fetches all posts of a profile, it resolves each entry to its newest version (so basically the one which is not linked to as an old version by any other entry) and only shows the latest versions.

Comments on comments (and so on) would be possible with the exact same approach. That would, of course, cause a whole tree of comments to be rebuilt every time a new comment is added.

Maybe not the best idea in that regard.

Multi-Device Support

There are several problems regarding multi-device support.

Publishing content

Publishing from multiple devices with the same profile is possible – one just needs to import the private key for the signatures and the profile information to the other device.

This needs some sort of merging mechanism, though, if two posts are published from two (or more) devices at the same time / without the other devices being online to get notified of the new point of truth.

As creating two posts from two separate devices would create two new versions of the profile (because of IPLD linking), which means two points of truth suddenly exist, a merging mechanism must be implemented to merge multiple points of truth for the profile. This could yield a rather large network of profile versions, but ultimately a DAG (Directed Acyclic Graph).

        Profile Init
             ^
             |
          Post A
             ^
             |
          Post B <----+
             ^        |
             |        |
  +-----> Post C    Post C'
  |          ^        ^
  |          |        |
Post D    Post D'   Post D''
  ^          ^        ^
  |          |        |
  |          +--------+
  |          |
  |       Post E
  |          ^
  |          |
  +----------+
             |
             |
          Post F

A scenario like the one above (each Post also represents a new version of the profile) would be easy to create with three devices:

  1. One starts using the network on a notebook
  2. Post A published from the notebook
  3. Post B published from the notebook
  4. Profile added on the workstation
  5. Post C published from the notebook while off of the internet
  6. Post C' published on the workstation
  7. Profile added to the mobile phone (from the notebook)
  8. Post D published from the mobile while off of the internet
  9. Post D' published from the notebook while off of the internet
  10. Post D'' published on the workstation
  11. Notebook comes back online, Post E published, merging the state from Post D'' from the workstation and Post D' from the notebook itself.
  12. Phone comes online, one of the devices is used to publish Post F, merging the state from Post D and Post E.

In this scenario, there would still be one problem, though: If the profile is published as an IPNS name, branching off of versions would be problematic. If C is published while C' is published, both devices would publish their version as an IPNS name. Now, first come first serve applies. And of course that is problematic, because every device would always see one of the posts, but no device could see the other. Only at E (in the above example), when the branches are merged, both C and C' would be visible (though D wouldn't be visible as long as it isn't merged into the chain). But how does a device discover that there are two “current” versions which have to be linked to the new post?

So, discoverability is an issue in this approach. Maybe someone can come up with a clean and easy solution that would work for netsplit and all those scenarios.

One idea would be that there is a profile key which is used to publish profile versions under an IPNS name, as well as a device key, which is used to announce profile versions as a separate IPNS name. That IPNS name could be added to the profile, so each other device can find it and fetch the “current” versions from each device. Only the initial setup of a new device would then need to be done carefully.

Or, maybe, the whole approach is wrong and another approach would fit better for this kind of problem. I don't know.

Subscribing

Another issue with multi-device support would be subscribing. For example, if a user (let's call her Amy) subscribes to another user (let's call him Sheldon) on her notebook, this information needs to be stored somehow. And because Amy's machines do not necessarily sync with each other, her mobile phone may never know that following Sheldon is a thing now!

This problem could be solved by storing the “follow” information in her public profile. Although some users might not want everyone to know whom they follow. Cryptographic measures could be considered to fix visibility.

But then, users may want to “categorize” their friends, store them in groups or whatever. This information would be stored in the public profile as well, which would create even more noise on the network. Also, because cryptography is hard and information would be stored forever, this might not be an option as some day, the crypto might be broken and reveal all the things that were stored privately before.

Deleting profile versions

At some point, a user may want to remove a biography entry or a user name she once published. Because all the information is chained in a long chain of versions, one may think that deleting a node is not possible. But it is!

Consider the following (simple) graph of profile versions:

A<---B<---C<---D<---E

If the user now wants to delete node C in this graph, she simply drops it. Now, E being the latest point of truth, one may think that finding B and A is not possible anymore. That's true. But why not work around this by creating a new profile version and linking the previous versions:

A<---B     <---D<---E<---F
      \                 /
       -----------------

Of course, D would now point to a node which does not exist. But that is not a problem. Indeed, it's a fundamental concept of the idea – that content may be unavailable.

F does not have to contain new content. It even should not, because if F contained actual content, dropping F later because of that content would be harder. Also, creating new versions of the profile is simple and cheap.

Problems are hard in distributed environments

I do not claim to know the final solution to any of these problems. It's just that I think about them and would love to get an open conversation started on the whole subject of distributed social networks and the problems that come with them.

tags: #distributed #network #open-source #social #software

It is funny how a music taste changes over time.

When I started listening to music extensively, which was about 10 years ago (at the time, the nickname “musicmatze” came up, btw), I mostly listened to German rap and HipHop. I did not listen to “known” artists at the time. Some names I can remember are Pidvalid, Syntheciser, End and Alligatoah (yeah kids, take that – I listened to Alligatoah before it was cool).

After I graduated from middle school and entered the Gymnasium, my friends all listened to Heavy Metal. Soonish, I discovered Heavy Metal myself and found my new favourite genre – Melodic Death Metal. MDM is my favourite genre till today, but at the time it was the only genre I liked. Bands were Norther, Insomnium, Dark Tranquillity, Soilwork and In Flames – my favourite bands still today.

When I turned 18 and finally was able to stay at parties all night, I developed a broader music taste, including Death Metal, Black Metal, Nu Metal, Neue Deutsche Härte, Metalcore and also, a bit later, Hardcore, Screamo, Brutal Deathcore and other related genres.

When I was around 20 or 21 I discovered EBM, Industrial and Dark Wave, and a bit later also Aggrotech. This led me to like electronic music – which led me to love EDM, especially Trance and Psy Trance but also Hardstyle, from when I was about 24 until today.

I really have no point here – I just wanted to write down that I think it is amusing how one's music taste changes over time.

Btw – I rarely listen to HipHop anymore. And if I do, it's not the German kind. Today I think German music (with the exception of Rammstein and a few other bands) just sucks.

tags: #music