API design — in search of excellence

I have the simplest tastes. I am always satisfied with the best.
Oscar Wilde

API design is a vital topic for software developers and architects. Whenever you create software as a software engineer, even for internal or personal use, you are also creating an API, at least implicitly.

However, there are surprisingly few books, studies and courses available on API design in general. If you know something outstanding, modern and inspirational in this area, please let me know.

Of course, it is hard to ignore that API design principles depend heavily on the computer languages used, business and technological domains, corporate and industry standards, and many other factors.

This article mostly describes general programming API principles related to computer languages and doesn’t cover more specialised topics (like RESTful, gRPC, XSD). There are also popular sets of principles like SOLID, GRASP and others, but because they mostly use OOD terminology, I prefer not to reference them directly.

Personally, I have experience in a few computer languages such as Scala, Java, Haskell, C/C++ and JavaScript, and here I’ll intentionally try to avoid specialised rules that depend heavily on a particular language or programming style (though they probably lean more towards statically-typed languages).

Common design principles

Start with a big picture and divide it logically into a hierarchy of modules (whatever is appropriate for your particular solution). Modules might be OS processes, web services, microservices, components, library/database packages and namespaces, etc. This is a top-down functional decomposition and structural design step to define the boundaries within which we want to define an API.

A rule of thumb: you should be able to define one main responsibility for each module, and describe it using just one plain and simple sentence.

For example, one might define it as:

  • My Scala library does: JSON coding.
  • The library has a package org.my.library.coder.json that contains everything related to JSON encoding.
  • The package has a trait Encoder that describes methods to encode JSON.
  • The trait has a method Encoder.encode[T] that encodes objects to JSON.
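To make that layering concrete, here is a minimal sketch of the same idea in Java (all names are hypothetical, mirroring the Scala example above, and the implementation is a deliberately tiny stub just to make the sketch runnable):

```java
// Hypothetical Java analogue of the module layering above:
// one package responsibility (JSON encoding), one interface, one obvious method.
interface Encoder {
    // Encodes a supported value into its JSON text representation.
    <T> String encode(T value);
}

// A deliberately tiny implementation, only to make the sketch runnable.
class SimpleEncoder implements Encoder {
    @Override
    public <T> String encode(T value) {
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        return String.valueOf(value); // numbers, booleans, null
    }
}
```

A client can still describe the whole module in one plain sentence: Encoder.encode turns objects into JSON.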

To do modularisation properly, you have to comprehend what you are dealing with — domains, requirements and restrictions. You will very likely have to revisit the whole concept again and again, re-engineering the parts that no longer fit or are unsound. Still, we need a decent start, so take into account all of the requirements, use cases, hardware/budget restrictions and other inputs you might have.

Gather all of the inputs you have, but analyse them sceptically, and do some basic probability/statistical analysis of everything that seems suspicious. If you blindly and literally believe everything other people say, you might end up with a very complex “universal” architecture and API design that nobody even wants to use.

And by the way, being sceptical doesn’t mean rejecting everything that people ask you to implement. Discuss unclear points with others, communicate and make assumptions.

It might sound a bit trivial, but you have to name everything you do. Even before you start implementing something.

A good name usually is:

  • Concise
  • Short
  • Self-explanatory in a context — so, if you have, for example, an interface org.example.codec.json.Encoder, you don’t have to repeat part of its name when creating a method by calling it encodeJson(). In this case, just encode() is good enough
  • Locally unique, avoiding well-known names from standard libraries and frameworks, to prevent confusion
  • Consistent with mainstream/de facto/standard coding styles for your computer language and environment (frameworks, company standards). For example, I’d rather not use “camelCase” for Scala/Java class names, because “PascalCase” is much more common in those languages.
  • Free of obscure and unknown abbreviations — so, JSON is ok, while HLLL isn’t (of course you might have some corporate abbreviations, and they are mostly fine if you implement internal software).

If it is really hard to find an appropriate name, it might be a sign that you’re trying to overload a module with too many responsibilities. In this case, it is better to simplify it (by dividing it into multiple modules, for example) instead of looking for better names.

There are some additional (mostly controversial) rules for this. For example, I read about Martin Odersky’s “two-words-max” rule for method/function names:

We are generally averse to methods with more than two names in them.

At the end of the day

You should always keep in mind that everything you expose in your API is something that other people have to learn how to use, understanding its design and its “grammar”. It might even be yourself, after some period of time, trying to remember how it really works.

So, if you define a lot of different models, interfaces, methods and patterns as your API, it will be hard to comprehend and start using, simply because people now have to learn more concepts.

So, it is always a good approach to analyse thoroughly and critically every public API module (interface, class, method, data class, function, etc.) you already have, asking yourself one important question: do we really need this in this API, in this particular place? Is there another way to solve it? Maybe by creating a more generalised solution or model (yet not overcomplicating it and still following modularisation principles)? You may need to do lots of refactoring and deprecate obsolete things to keep your API clean for incoming requirements.

You will also find rare cases that might seriously affect your API design and its clarity. Maybe it is better to handle them by creating another API and modules, instead of ruining your short, clean and concise API for the other 99% of cases.

Almost nobody needs to see the name of an SQL constraint from your database in the error details of a web page (except hackers, and probably your engineers).

Avoid exposing implementation details in your API; generalise and hide them, using:

  • interfaces/traits/type classes and instances;
  • access directives from your computer languages (like private/public/protected, import/export, etc).
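For instance, in Java a sketch might look like this (the names are hypothetical): the interface is the only public contract, the implementation class stays package-private, and a factory is the single entry point:

```java
// Public contract: the only thing API clients see.
interface TemperatureSource {
    double currentCelsius();
}

// Implementation detail: package-private, invisible outside its package.
class StubTemperatureSource implements TemperatureSource {
    @Override
    public double currentCelsius() {
        return 21.5; // a stub; a real implementation would query a sensor or service
    }
}

// The factory is the single public entry point,
// so the concrete class behind it can change freely.
final class TemperatureSources {
    private TemperatureSources() {}

    static TemperatureSource createDefault() {
        return new StubTemperatureSource();
    }
}
```

Clients depend only on TemperatureSource, so swapping the implementation later doesn’t break them.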

Don’t forget about exception and error handling. Error handling is also an essential part of your API.
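As a sketch of that idea in Java (Result, Ports and parsePort are hypothetical names, not a standard API), failures can be made part of the method signature instead of being hidden in unchecked exceptions:

```java
// A minimal Result type: failure becomes part of the method signature.
abstract class Result<T> {
    static <T> Result<T> ok(T value) { return new Ok<>(value); }
    static <T> Result<T> error(String message) { return new Err<>(message); }
    abstract boolean isOk();
}

final class Ok<T> extends Result<T> {
    final T value;
    Ok(T value) { this.value = value; }
    @Override boolean isOk() { return true; }
}

final class Err<T> extends Result<T> {
    final String message;
    Err(String message) { this.message = message; }
    @Override boolean isOk() { return false; }
}

// An API method whose type tells clients that parsing may fail.
final class Ports {
    static Result<Integer> parsePort(String s) {
        try {
            int p = Integer.parseInt(s);
            return (p >= 1 && p <= 65535)
                    ? Result.ok(p)
                    : Result.error("port out of range: " + p);
        } catch (NumberFormatException e) {
            return Result.error("not a number: " + s);
        }
    }
}
```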

I recommend implementing an API from the perspective of its clients. Using approaches from TDD and unit-testing libraries is the best way to do this.
This is definitely something I strongly recommend doing, just to be sure that your API is good enough, at least for yourself.

Suppose you decided to implement a casual weather forecast library for Scala. Then (after the requirements analysis stage), you can start to design an API from the view of an API client like this:

import java.time.LocalDateTime
import com.example.weather.ForecastService

val service = new ForecastService()

// I want a default implementation of prediction for the current date time and some default period
service.predict()

// I want to specify some period of time for prediction
val now = LocalDateTime.now()
service.predict( untilDate = now.plusDays(1) )

// I want to specify temperature units
service.predict(
  untilDate = now.plusDays(1),
  temperatureUnits = TemperatureUnits.Celsius
)

// Now I want to work with the results and error handling.
// Here we decide to go with the standard Either (Right, Left) from
// Scala as a monad for prediction results, so this is also an
// API decision
service.predict() match {
  case Right(prediction) => {
    // get an average value for the whole period of time and the units
    assert ( prediction.avg > 0.0 )
    assert (
      prediction.temperatureUnits == TemperatureUnits.Celsius
    )
    assert (
      prediction.untilDate != null
    )
    // get avg by some period of time (for simplicity's sake
    // we don't specify period units like avg per hour or others)
    prediction.values.foreach { pv =>
      assert ( pv.avg > 0.0 )
    }
  }
  case Left(err) => { // error handling
    fail(err)
  }
}

These should be the very early versions of your unit tests, focused on the API at the beginning and evolving into more structured and complete unit tests as you stabilise your API. Start with the basic and core requirements.

Play with those examples and check that they fit your requirements. At some stage, it will probably become a balance between simplicity, extensibility, clarity, performance and interoperability. Even though we might want to do everything an API client wants, there are other (sometimes conflicting) limitations and requirements, and you have to return to your examples again and again to find that balance.

Of course, it is much better when you use your own API to implement other services or products, so you are your own API user and keep this perspective all the time.

Follow interface segregation and high cohesion principles

Basically, this means that having more specialised, granular and loosely coupled interfaces in an API is generally better.

The interface terminology here is general and isn’t tied exclusively to OOD or to any particular language’s keywords.
While in Java, for example, it is really called the same (interface), in Scala it is a trait, Haskell has a typeclass, and in C++ it might be an abstract class with pure virtual functions or, in some cases, just a template.

Let’s analyse this in detail with an example of controversial API design from Java.

Java has a special class called java.lang.Object, which means that every class you write has some methods inherited from it.

That means that every Java class automatically defines some behaviour you had absolutely no intention to design.

For instance,

class A {}

A a1 = new A();
A a2 = new A();
a1.equals(a2) // false

So, you now have an equals method everywhere, and that wasn’t your API decision.

It gets worse when you decide to override equals (even if it wasn’t your original intention):

class A {
  @Override
  public boolean equals(Object obj) {
    // ???
  }
}

Now you have two additional problems:

  • equals accepts Object, so now you have to implement equals against any possible class
  • there is a hidden (yet well-known and documented) contract that you also have to implement an additional method called hashCode as well

In other words, now you have to decide not just how to compare your objects with other possible objects, but also how they are to be hashed, even if they were never supposed to be.

How might it have been solved better? (Now it is probably too late for Java, indeed).

Two separate interfaces are a good answer to this:

interface Eq<T> {
  boolean equalsTo(T other);
}

interface Hashable {
  int toHashCode();
}

Now we don’t have to implement something we didn’t intend to. (In fact, Haskell is already designed like that with typeclasses, and I have shamelessly borrowed those interface names from there.)

(Just for your information: unfortunately, in Java, you can’t implement the same interface multiple times with a different generic type like class A implements Eq<String>, Eq<A>, so there is a restriction for some cases).

Why might we need abstract interfaces like Eq<T> and Hashable? The answer is simple: to design more generalised APIs that use those interfaces.

For example, the standard java.util.HashMap is defined at the moment as:

class HashMap<K,V> 
// where K — is a type of key, and V is a type of value

which means that you can use it with every possible object in Java, and while that might be convenient sometimes, it also hides issues related to the equals and hashCode implementations. In fact, it works only because every class in Java has implicit implementations of those methods, provided by the Java language designers; you can’t do something similar yourself and add your own methods to all Java classes.

With interfaces like Eq<T> and Hashable you could design it more precisely and accurately as:

class HashMap <K extends Hashable & Eq<K>, V>

With an API like this, you don’t need any more hidden contracts between methods in the documentation. This is the contract, and this time it can be checked by compilers, not by our eyes only.
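Here is a runnable sketch of that design (CheckedMap and UserId are illustrative names, and the map body is a toy list-backed implementation, for illustration only):

```java
import java.util.ArrayList;
import java.util.List;

// The hypothetical interfaces from the text.
interface Eq<T> { boolean equalsTo(T other); }
interface Hashable { int toHashCode(); }

// The bound K extends Hashable & Eq<K> makes the contract explicit
// and compiler-checked. The body is a toy list-backed map.
class CheckedMap<K extends Hashable & Eq<K>, V> {
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    void put(K key, V value) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equalsTo(key)) {
                values.set(i, value); // replace the value for an equal key
                return;
            }
        }
        keys.add(key);
        values.add(value);
    }

    V get(K key) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equalsTo(key)) {
                return values.get(i);
            }
        }
        return null;
    }
}

// A key type that explicitly opts into the contract.
class UserId implements Hashable, Eq<UserId> {
    final String value;
    UserId(String value) { this.value = value; }
    @Override public boolean equalsTo(UserId other) { return value.equals(other.value); }
    @Override public int toHashCode() { return value.hashCode(); }
}
```

Now a key type that forgot to implement Hashable or Eq simply wouldn’t compile as a key.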

Be consistent

Once upon a time, you invented some additional rule and made a decision, for example, that you name all of the classes in your API using abbreviations in uppercase, like EntityID, ProjectID, ModuleID and so on. One year later, you found out that you now prefer doing it in PascalCase, like FileId, FieldId, CustomerId, etc.

Don’t do that silently; you have two options:

  • Follow the previously adopted rule
  • Make the previous rule obsolete, declare old definitions deprecated and provide some way to migrate for legacy code

It is not just about naming, it might be any template, pattern or behaviour visible to API clients.

For example, you have service factories and you create them like: MyCustomerDatabase.create(), FileStoreService.create(), …
So, you provide some API pattern, and all API users expect that this pattern has very similar characteristics (obvious return types, similar complexity, resource management, etc.), and you can’t change it for some service without any explanation or clear reasons.
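A minimal Java illustration (the service names are hypothetical): both factories deliberately share the same shape, so clients can rely on the pattern:

```java
// A common lifecycle shared by all services.
interface Service {
    void close();
}

// Both factories follow the same pattern: a static create()
// with no arguments, an obvious return type and the same lifecycle.
class MyCustomerDatabase implements Service {
    static MyCustomerDatabase create() {
        return new MyCustomerDatabase();
    }
    @Override
    public void close() { /* release connections here */ }
}

class FileStoreService implements Service {
    static FileStoreService create() {
        return new FileStoreService();
    }
    @Override
    public void close() { /* flush and close files here */ }
}
```

If one of them suddenly needed a constructor with five arguments and no close(), that would break the expectation the rest of the API has set.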

Another example is parameter ordering in methods and functions.
Use the same parameter ordering in contextually and logically similar places.
For example, you’re providing an API to draw on canvas some geometric shapes, so there are functions like:

Circle.draw(canvas, x, y, r)
// use the same ordering as before
Rect.draw(canvas, x, y, w, h)
// don't do this, for instance
Rect.draw(x, y, w, h, canvas)

Use specialised types: String isn’t the answer for everything

Let’s have a look at some data type definition example:

UserProfile(
  id : String,
  locale : String,
  address : String
)

This data type contains fields of the same type: they are all strings. Although, in reality, they are very different types, aren’t they? We can now create objects like UserProfile("my-id","en","UK"), but also accidentally like UserProfile("en","UK","my-id"), and you can easily make mistakes like this.

Let’s try to do better:

UserProfile(
  id : UserId,
  locale : Locale,
  address : Address
)

This time, it is a well-defined type that prevents API clients from silly mistakes (as we saw before).

One may be concerned here about the run-time performance and memory overhead of this decision, in cases where you really wanted to store only string values and didn’t want to create other type instances. Well, it depends on what computer language you use and whether you can define types generally known as value classes or value objects (mathematically speaking, they are types that are isomorphic to the value they hold).

This means that the specialised value class can be checked at compile-time, while at run time the two types can be treated essentially the same, ideally without any overhead at all.

In Scala this might be defined as:

case class UserId ( value : String ) extends AnyVal

In Haskell this would be:

newtype UserId = UserId String

Unfortunately, Java has no support for anything similar yet, and you have to create ordinary classes (or still use the String type if you need to). There is JEP 169, but it is still in progress.
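Until something like that lands, a plain immutable wrapper class is the usual Java workaround (a sketch; the non-empty validation rule is just an example):

```java
// A hand-written "value class": immutable, with equality defined
// by the wrapped value. Unlike Scala's AnyVal, this does allocate.
final class UserId {
    private final String value;

    UserId(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("UserId must not be empty");
        }
        this.value = value;
    }

    String value() { return value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof UserId && value.equals(((UserId) o).value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public String toString() { return "UserId(" + value + ")"; }
}
```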

For C++ you can take a similar approach, overloading operators and defining initialising constructors and const members.

In addition to the type checking, defining those fields this way means you will be able to extend their types in the future with more complex structures and specialised methods or functions.
For our example, maybe you would want a more complex Locale type like:

Locale(
  language : Language,
  country : Country
)

I’m not suggesting doing this all the time for all of your string fields. You can find compromises like this (and they are also fine in my opinion):

UserProfile(
  id : String,
  locale : Locale,
  address : Address
)

So, just bear in mind that specialised types are really powerful and, with the help of compilers, let you avoid these kinds of mistakes.

Software Developer & Architect