How to improve your open source code (4) - API Design

This post was written during my trip through Iceland and published much latern than it was written.

What is a nice and gold API. How is “nice” defined when it comes to library interfaces? That’s a question I want to discuss in this post and also, how you can create a nice API in your open source library without studying a topic like software architecture or similar.

Definition of a “nice”/” easy to use” API

But first, we have to define what makes an API good. And that’s not that easy because this topic is very biased.

For me, a good API is one where I can get the job done without thinking much about it. That means that there shouldn’t be that much setup code involved in my code just to use the library. So no Factory hell if the only thing I want to have is the current time, for example. This also means that the API has to be decent high level, but without losing the ability to do fine-grained work if necessary. So for the most part, low level (for example implementation details) things are not interesting for me. But when I want to bit-fiddle around with the library, it should let me.

If a builder, factory or some other mechanism is necessary to produce objects in some way, the library should make clear (documentation wise but also code wise) why it is needed. There’s no point in making the user call the tenth factory instantiation if it is not necessary and also it makes the users codebase blow up in size and complexity.

The naming of things in the library should be good, appropriate and, for the most part, be consistent. If a function on an object which returns the string representation of that objectbis named “to_string” it should be named that way for all types from that library, not only some parts.

Statelessness

Calling functions of your API should always result in the same values for the same arguments. That does not mean that your API should be pure in a functional programming meaning, but rather that the actions executed when calling a function should not result in some library-internal variables to be set, changed or unset. This is easily achievable by letting the user of the API have an object that holds the state, and functions of your API work based on that value. In short: your library should not have global variables.

This simple design pattern already results in easy to use APIs and a nice user experience.

Error exposure

Good libraries don’t hide errors. Indeed, it is even better if errors are exposed to the user as much as possible. Because the user of the library knows best when and how to handle errors, even from your library.

I’m also a big fan of lots of error cases. The more error cases (the better a user of a library can distinguish between different errors) the better. This way, you let the user decide where she doesn’t distinguish between two almost-equal error cases and where it is better to handle them independently. If your library does not give that opportunity, the user has to make ugly Spaghetti-code handling things to be able to tell what is going on. Of course, these things have to be documented properly.

Another thing that can come in handy is when your error types or your library exposes functionality to translate error types into text which can be shown to a user of your library. Nothing is worse (from a users point of view) than a “CallOnInconsistenStateObjectBuilderFactory on line 2832” error message shown in an user-facing interface (and trust me, I’ve seen such things already).

Completeness

Nothing is worse than an API that is not complete. I mean, don’t get me wrong - sometimes one does not think of all cases a library could be used for - and that’s completely okay. But some things are too obvious for being left out. For example, if you provide functions to transform your time object from local time into GMT, why wouldn’t you provide functions for converting it into UTC or EST? These also matter!

Also cleanup routines. In some languages it is necessary to include cleanup routines for your objects. If your library exposes alloc_vacation_location_obj() it should also provide free_vacation_location_obj()! Sure, a user could use free(), but it is not nice API-wise. Even if your function does nothing more than call to free(), it is better to provide a function (and if you want to include some more cleanup in your function later on, in a new version of your library, a user does not have to think about it that much when upgrading their dependencies).

Consistency

We had the naming game already, but it always comes back to us, right? Consistent naming is one of the most important things in an API. If allocating worked with functions prefixed with new_ all the time, it shouldn’t be done with alloc_ this time. Also not in later versions if your library. Not even in a major version bump.

Even more important than naming is behaviour. A function that is named with some alloc prefix should only allocate, never initialize or do other fancy stuff (debugging output excluded here, if necessary).

Next

In the next episode we will talk about how one can plan an application.