How to improve your open source code (2) – Function naming and size

September 20, 2017

This post was written during my trip through Iceland. It is part of a series on how to improve one's open-source code. Topics (will) include programming style, habits, project planning, management and everything related to them. Suggestions welcome.

Please note that this is totally biased and might not represent the ideas of the broad community.

We're slowly working our way up from if statements in the last episode of this series to functions in this one. I'm using the term “function” here, but you're welcome to interpret the word as you like – whether you're from the C world, a C++ hacker, you're lining up your types in the wonderful lands of Haskelldonia or you like snakes and play with Python all day long doesn't really matter. Functions can be functions, procedures, methods or whatever you call these things in your language of choice.

So when thinking about functions, what do we have to think of first? Well, yes...

Naming things

Computer scientists and nerds like to bikeshed this to death. We can talk about function naming all day long, can't we? I do not like this topic at all because most people cannot stay calm when talking about it. So I'll simply list what I think should be in a general guide on function naming, and it'll help you exactly nothing:

Short, but not too short
Expressive about what the function does, but not too expressive
To the point
Not interpretable
Should only contain the good-case in the name
Should not contain “not” or “or” and “and” – except when it should

Shrtpls

Function names should be short. If you have a look at the C “string” header, you'll find a function named “strlen”. This name is truly wonderful. It is short and to the point.

Your function names shouldn't be too short, though. So single-character names are a no-no! Even two characters are most certainly too short. One-word names are a good way to go for simple functions, so functions named “sum”, “ceil” or “colour_in” are fine.

Expressiveness

A name should always express what the function does. The names from the last section are good examples of this. Bad examples are “enhance”, “transform” or “turn_upside_down”.

If a function name has to be a bit longer to express what the function actually does, that's okay I guess.
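To make the difference a bit more concrete, here is a small Rust sketch. The temperature conversion and both function names are made up by me for illustration. The two functions do exactly the same thing, but only one of them tells you so:

```rust
// Vague: "transform" tells the reader nothing about what goes in or comes out.
fn transform(value: f64) -> f64 {
    value * 9.0 / 5.0 + 32.0
}

// Expressive: a bit longer, but the name alone says what the function does.
fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}
```

A call like celsius_to_fahrenheit(21.5) reads fine without looking at the body, while transform(21.5) sends the reader off to the definition.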

To the point / Not interpretable

A reader of your code should understand what your function does when reading the name alone, maybe including the types involved, if your language offers types. But not only that, she should also be able to tell you, the writer of the code, what the function does without you correcting her.

I think it is always a good idea to think “from a third person's perspective”. If you think someone else can tell you what your code does, you can consider it good enough. Not perfect – all code can be improved. But good enough.

The rest

I want to summarize the rest of the points from above in this section. A good heading for this section might be “good practices for function naming”, but that also might not fit as nicely as it should.

The thing is, if your function actually implements business logic (you might not have a “business” in your open source codebase, but at least a domain), its name really should not contain terms that are boolean operators. For example, a function should never be named “does_not_contain_errors”. Not only is this name way too long, but including boolean logic in function names also makes it harder to actually use them in boolean expressions, because you have to wrap your head around these things all the time. A better name would be “contains_errors”: you can use a negation operator on the result after calling!
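As a rough sketch of how that looks in practice (the log format, the "ERROR" marker and the summarize helper are my own assumptions, not taken from any real project):

```rust
/// Hypothetical check: does any line of the log mention an error?
fn contains_errors(log_lines: &[&str]) -> bool {
    log_lines.iter().any(|line| line.contains("ERROR"))
}

/// The "not" lives at the call site, not in the function name.
fn summarize(log_lines: &[&str]) -> &'static str {
    if !contains_errors(log_lines) {
        "all good"
    } else {
        "needs attention"
    }
}
```

With a name like does_not_contain_errors, the same condition would end up as a double negation as soon as you need the opposite case.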

In the end it's all about size

Out there in the primal world bigger is better. Bigger muscles, bigger knives, guns or tanks, even bigger cities, cars, houses. But in the world of programming, things are reversed. Small things matter!

So your functions should be small. As small as possible, actually. Today's compilers can inline and optimize the hell out of your code – and if you're one of the scripting language enthusiasts out there – does speed actually matter in your domain? Steve Klabnik once said that, in the Rails world, people tell each other that yes, Ruby is a slow language, but network and the database are your bottleneck, so nobody cares.

Also, think of the advantages of short functions: people can more easily understand what you're doing, testability gets improved because you can apply tests on a more fine-grained level and documentation might also get improved because you have to document more functions (that's another topic we'll discuss in a different article as well).
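Here is a minimal sketch of what I mean, assuming a made-up task: summing a comma-separated list of numbers. All three function names are hypothetical. Each piece is small enough to test on its own:

```rust
/// Parses a comma-separated list of integers and skips entries that are not numbers.
fn parse_numbers(input: &str) -> Vec<i64> {
    input
        .split(',')
        .filter_map(|part| part.trim().parse::<i64>().ok())
        .collect()
}

/// Sums the values; an empty slice yields 0.
fn sum(values: &[i64]) -> i64 {
    values.iter().sum()
}

/// Glue function that only combines the two steps above.
fn sum_of_list(input: &str) -> i64 {
    sum(&parse_numbers(input))
}
```

If sum_of_list ever returns something odd, you can test parse_numbers and sum separately and see which half is to blame.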

I really don't want to write down any line counts or approximations of how long a function actually should be. That's not only because I don't want to burden myself with this, but also because I cannot give any advice without knowing what the function should do – if you have a complex function that has to do a lot of things in preparation for the domain logic which cannot be outsourced (for example acquiring some locks), it might be 100 lines or more (also depending on your language). If you're doing really simple things, for example building a commandline interface or setup work, it might even be 1000 lines or more. I really cannot tell you how much is too much.

But in general, shorter is better when it comes to functions.

Scoping

Some programming languages offer scopes. For example C, C++ or Rust, but also Java and even Ruby.

Scopes are awesome. Variables get cleaned up at the end of the scope. You can, of course, use scopes for if statements or loops. But scopes are also good for structuring your code!
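If you want to see the cleanup happen, here is a small Rust sketch. The Resource type and the log are hypothetical and only exist to record the order of events:

```rust
use std::cell::RefCell;

/// Hypothetical resource that writes a log entry when it is cleaned up.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Returns the recorded events in the order they happened.
fn scope_demo() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _inner = Resource { name: "inner", log: &log };
        log.borrow_mut().push("inside scope".to_string());
    } // `_inner` goes out of scope here and gets dropped
    log.borrow_mut().push("after scope".to_string());
    log.into_inner()
}
```

The "dropped inner" entry lands between "inside scope" and "after scope", so the cleanup really happens at the closing brace and not at the end of the function.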

In the above section I wrote that some functions might need to do some setup work or preparation before actually implementing their logic. For example, a function might need to acquire some locks or allocate some resources before doing the actual work.

In some cases it is possible to separate the domain logic of the functions by simply applying some scopes. Let me write down an example in pseudo-rustic code:

    fn calculate_some_value() {
        // do
        // some
        // setup

        {
            // domain
            // logic
        }

        // cleanup
    }

I guess that makes my point pretty clear.
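For the lock case mentioned above, a slightly more concrete version could look like the following sketch. The counter, the function name and the step size are assumptions of mine; the point is only that the lock guard lives inside the inner scope and nowhere else:

```rust
use std::sync::Mutex;

/// Hypothetical function: bumps a shared counter and reports the new value.
fn increment_and_report(counter: &Mutex<u32>) -> String {
    // setup: decide how much to add
    let step = 1;

    let new_value = {
        // domain logic: the lock is only held inside this scope
        let mut guard = counter.lock().unwrap();
        *guard += step;
        *guard
    }; // `guard` is dropped here, which releases the lock

    // cleanup / reporting runs without holding the lock
    format!("counter is now {}", new_value)
}
```

Whoever reads this function sees at a glance which lines run while the lock is held, and nobody has to remember to release it by hand.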

Up next we will talk about modularization of code and how you can structure your programs by separating things into classes, modules, namespaces etc.

tags: #open-source #programming #software

musicmatzes blog