Blog

What do CORBA, SOAP and gRPC have in common?

A lot actually, even though all three made an impact in different decades (the 1990s, 2000s, and 2010s), were initiated by three different organisations (OMG, Microsoft, Google), and purported to be much better than any previous or existing solution. In fact, when each of these technologies was released, it was billed as the best way to communicate between nodes of distributed software. Common characteristics include:

  • Programming language agnostic, with bindings for all modern languages at the time
  • DSL for describing endpoints and messages
  • Set of standardized types for message fields
  • Client and server code generation
  • Allow remote procedure calls
  • Facilitate developing operation-oriented APIs

While some interesting features stand out, the main problem that I find with these technologies is the last characteristic in the list. The remainder of this article aims to show why this is such a terrible flaw.

Form influences substance

In order to understand my critique, one must realise that I am not saying that each of these technologies is a fraud, nor that large corporations can't build successful software on top of them. The fact of the matter is that many great examples prove the usefulness of all three.

The problems I point out in this article are mainly ones of form, and not of substance. Such problems should not matter, except that in many cases form seems to influence substance very much, and especially where programming is concerned. So as to not get carried too far into a philosophical discussion, I'll illustrate my point.

The C programming language has proved its worth many times over. If it weren't for Dennis Ritchie, Brian Kernighan and countless others from Bell Labs, IT would arguably not be as far advanced as it is today, and I would likely not be a computer programmer. Yet who would write a new program in C today? Unless you had to, you wouldn't. You might use Java, or C#, or NodeJS. Or you might use some newfangled language like Rust. Now anything you can code in any programming language out there today, you can code in C: this concept is called Turing-completeness.

So why is the choice of programming language important? There are various factors: the target platform, the team's skills, the libraries available, the CEO's golf partners, and somewhere on your list of criteria there might be the appropriateness with respect to the problem being solved.

Let's say you wanted to write a Web API that many clients will need to consume. You might choose Java with Spring Boot, or C# with ASP.Net Web API. Both options can result in comparable development costs and total cost of ownership. In reality, one will strive for best practices while the other becomes a big ball of mud. I leave it to the reader to figure out which is which, as an exercise. The only solid reason I can find that decides which outcome occurs is one of form: one language and its community thirst for elegance, while the other wallows in patchwork monstrosities.

Now let's say you wanted to write a highly optimized HTTP load balancer. You could try Go, and be hit by limitations due to the garbage collector and the thread model. You could write it in C, or C++, which won't burden you with a GC, and will allow you to bend the threading model to your will, but you might lose your sanity. In the end, you could choose Rust. Rust is a singular language where the compiler itself can lead you to best practices while being more accessible than a purely functional language like Haskell.

RPC ignores the experience of the World Wide Web

One of the many critiques of SOAP was that it always communicated using the POST HTTP verb. This meant that it was using the Web while disabling all of its optimisations. The gRPC framework advertises that it uses HTTP/2, but you really have to dig deep to find out how to declare operations as idempotent or safe. Currently, there is an open issue to add caching optimizations to gRPC's wire protocol that have been an integral part of HTTP for a couple decades.

This is but one of many wheels that gRPC has reinvented or will have to rediscover. SOAP and CORBA never even attempted to tackle these issues.

RPC leads to poor API design

The inherent operation orientation of RPC means that APIs end up as a list of operations and message types. There are ways of grouping operations together into subsystems — gRPC calls them services — and certain naming conventions can convey information about an operation's constraints. It is possible to design and maintain a beautiful and efficient operation-oriented API. The form, though, usually influences developers to add operation after operation mindlessly, giving little thought to the constraints imposed by the domain and documenting them even less.

Alternatives to the operation-oriented API

The biggest challenger to the operation-oriented style is the resource-oriented one. The main reason to push this style of API is that it fits naturally with Domain Driven Design.

The best-known example of resource-oriented API architecture is ReST.

There are many misconceptions about ReST:

  • It's too difficult to implement
  • Its constraints are pointless
  • It must be implemented as JSON payloads over HTTP
  • Not all requests can be modelled as a creation, read, update, or deletion, so you end up needing endpoints representing RPC anyway
  • You can't generate a client from a standard description like WSDL or Protocol Buffers language
  • HATEOAS constrains the API to be inefficient, especially compared to an API built on GraphQL

I will not try to refute these statements here. If the reader is really in search of the truth, they will find it quite easily elsewhere.

Rather, I would like to underline the importance of resource orientation. Just as the notion of a class channels developers towards better software design, the resource gives much-needed structure to APIs. A resource, much like an object, correlates with something found in the physical world. It has constraints, states, and transitions that should be documented, or can at least be inferred from the API's structure. Resource identity leads to message validation and reuse of validation logic. When implemented over HTTP, a resource usually has a unique URI with a prefix shared by all resources of its type. A controller tends to handle all requests to URIs with that prefix, so the responsibility for data integrity is clearly assigned to a single component. These same concepts carry over to other protocols.

Conclusion

The gRPC technology, a.k.a. the modern iteration of the ideas that brought you CORBA and SOAP, isn't a problem for extremely disciplined developers.

The problem is that it tends towards inefficient and obscure API design.

 

MEAN

For those who don't know, MEAN is an acronym for MongoDB, Express, Angular and Node. It is a standard stack in the same vein as LAMP (Linux, Apache, MySQL, PHP).

What's in this article

The goal is to take a basic MEAN application generated by Yeoman, package it as a container using Docker and run it with the help of an instance of MongoDB also running in a container.

Tools and versions used in this article

Tool | Version
MongoDB | 3.4.2
Express | 4.15.2
Angular | 1.6.3
Node | 5.12.0
Yeoman | 1.8.5
generator-angular-fullstack | 4.1.4
Docker | 1.13.1
Docker Compose | 1.8.0

Generating the project

Yeoman is a great tool for scaffolding a Javascript project. It is extensible, with many generators available. One such generator, which creates an example project using the MEAN stack, is the Angular Full-Stack Generator.

The following commands install the generator and run it to create a mean project.
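Assuming Node and npm are already installed, the commands would look roughly like this (the directory name mean is just an example):

    npm install -g yo generator-angular-fullstack   # install Yeoman and the generator globally
    mkdir mean && cd mean                           # create and enter the project directory
    yo angular-fullstack                            # run the generator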

Yeoman will ask you a bunch of questions. You can just hit Enter to accept the default settings or bend the project to your will.

Once that's finished you can run it by typing
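For this generator, that is most likely its gulp serve task (an assumption based on the generator's defaults):

    gulp serve    # builds the client, starts the Express server and opens a browser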

Setting up a development database

Unless you already have a MongoDB instance running on your computer and listening on the default port 27017, the previous command will generate a few errors. To fix them, you can either use your favorite package manager to install and start an instance, or you can use Docker.
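With Docker, a throwaway development instance can be started with a single command (the container name mongo is arbitrary):

    docker run -d --name mongo -p 27017:27017 mongo:3.4.2   # expose MongoDB on its default port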

Now you should be able to run the project.

Testing and Building

The project has a few tests — written in Jasmine or Karma + Chai + Sinon depending on what you chose — that require the local database to be running.
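With the database up, the generator's test task should be roughly:

    gulp test    # runs the client and server test suites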

Building the project is straightforward and puts everything in the dist folder.
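Again via gulp, presumably:

    gulp build    # compiles and minifies everything into ./dist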

Building a Docker image

In order to build a Docker image, you need to specify how the image should be constructed. One rarely builds an image from scratch. Instead you can take a base image that's close to what you need, and then add the rest. The following Dockerfile specifies just this.

Dockerfile
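A sketch of what the file might contain; the Node base image tag matches the version listed above, and the port and start command follow the description below:

    # Base image with the required Node version
    FROM node:5.12.0

    WORKDIR /usr/src/app

    # Install dependencies first so Docker can cache this layer
    COPY package.json ./
    RUN npm install

    # Copy the rest of the built application (the contents of ./dist)
    COPY dist/ ./

    # The server listens on port 8080 and is launched with npm start
    EXPOSE 8080
    CMD ["npm", "start"]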

We take an image that already contains a specific version of Node, then we add the package.json file, ask npm to install all dependencies, before copying the rest of the built files. The last two instructions indicate that the process in the container will be listening on port 8080, and that the process is started by running the command npm start.

You can build the image with the following command.
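Run from the project root, where the Dockerfile lives; the image name mean-app is just an example:

    docker build -t mean-app .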

Running the application in Docker

Once the application image is available, we just need to start a container with MongoDB, then start the application as another container. We'll need to feed the database connection URL to the application. The easiest way to do this is using an environment variable.

We'll also need to tell our application that it is running in production mode. Again, setting an environment variable does the job.

We could run each container using individual commands, but it is far easier to write a composition that Docker Compose can manage. This makes starting and stopping the containers over and over much easier.

The following is the contents of a docker-compose.yml file that tells Docker Compose how to configure and run the containers needed by our application.

docker-compose.yml
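A sketch of what it might contain; the service names, the mean-app image tag and the MONGODB_URI variable name are assumptions:

    version: '2'
    services:
      mongo:
        image: mongo:3.4.2

      app:
        image: mean-app
        environment:
          - NODE_ENV=production                  # run the app in production mode
          - MONGODB_URI=mongodb://mongo/mean     # connection URL pointing at the mongo service
        ports:
          - "8080:8080"
        depends_on:
          - mongo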

Spaces are very important in the utterly awesome and totally user-friendly YAML syntax. (Seriously (tongue))

This composition is brought up and torn down with the following commands.
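Run from the directory containing docker-compose.yml:

    docker-compose up -d      # create and start the containers in the background
    docker-compose down       # stop and remove them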

Using RESTful web services to implement a distributed system is an idea that has been around for many years now — Roy Fielding's thesis on the matter was published in 2000 — but in recent years it has gradually gained popularity and is now almost indispensable. I say this because ASP.NET Web API has entirely replaced WCF, and Microsoft have long played the role of broom wagon in the internet technology race.

As with any new trend, a question that often comes up is "Are we doing it right?".

In response, Leonard Richardson has developed a scale for measuring the maturity, or right(eous)ness, of a web service. Martin Fowler has written a good article explaining the different levels of this scale. According to this measure, the final level of maturity is attained when the web service in question serves resource representations as hypermedia, and clients leverage this to navigate application state. This is the HATEOAS constraint.

One benefit of complying with HATEOAS is that a client needs very little out-of-band information in order to interact with the service: it needs the URIs of the service's entry points and some knowledge of the semantics of the link relations it references. This comes with a perk: if clients really adhere to this (that is, if they don't embed URIs that aren't entry points), the service can change all other URIs at will without breaking clients.

To take advantage of these benefits, the number of entry points should be kept low. How low? I don't know (big grin). So far, I haven't seen a book or article on the subject that gives any guidelines for this matter.

In any case, HATEOAS helps us with getting RESTful web services right, and provides some rules for how clients consume them. However there are still a few grey areas.

The Problem

Web services can be consumed by different types of clients, from a desktop application for a human end-user to an automated system hosted somewhere on the Internet. In particular, the front-end of an application may serve as a façade to several back-end subsystems implemented as microservices. If this front-end is web based, it needs to define its own namespace of URIs to which it responds. In order to attain the lofty heights of HATEOAS, how should one organise this namespace?

For example, consider an application that manages your contact information. The back-end could be just one web service with the following API:

Method | URI | Description
GET | /contacts | Get a list of all contacts
POST | /contacts | Create a new contact
GET | /contacts/{id} | The details of a contact
PUT | /contacts/{id} | Edit the details of a contact
DELETE | /contacts/{id} | Delete a contact
GET | /contacts/{id}/addresses | Get a list of addresses for a contact
POST | /contacts/{id}/addresses | Add a new address to a contact
GET | /contacts/{id}/addresses/{address-type} | Get a specific address for a contact
... | ... | ...

However, not all of these URLs are to be exposed as entry points, if the full agility of HATEOAS is to be accomplished. In this example, the /contacts URI could be the single entry point.
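To illustrate what that entry point might serve, here is a sketch of a hypermedia representation in a HAL-like style; the article does not prescribe a particular media type, and the link relation names are only illustrative:

    {
      "_links": {
        "self": { "href": "/contacts" },
        "item": [
          { "href": "/contacts/456", "title": "Bob" },
          { "href": "/contacts/457", "title": "Alice" }
        ]
      }
    }

A client that knows only the /contacts entry point and the semantics of the item relation can reach Bob's details without ever hard-coding /contacts/456.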

So what of the front-end that uses this web service?

I have searched far and wide for answers to this question.

Many of the answers I found mapped front-end URIs such as http://front-end/contacts/Bob/addresses/home to back-end URIs such as http://back-end/contacts/Bob/addresses/home, with the back-end URI template embedded in the source code of the front-end. This ignores the fact that these URIs are not entry points.

So how do we do this right?

After measuring these answers against the Richardson Maturity Model, I have come to the conclusion that there are three answers that comply with HATEOAS, and that the difference between them comes down to the multiplicity of entry points chosen for the façade and for the subsystems.

Solution | Façade entry points | Subsystem entry points
Solution 1 | single | single
Solution 2 | multiple | single
Solution 3 | multiple | multiple

The next sections of this article describe each of these solutions in turn, and provide insights into the tradeoffs that they carry.

Solution 1: single entry points

In this solution, the façade and the subsystems behind it each have a single entry point. Other resources that a subsystem exposes must be accessed by traversing links using only knowledge of link relationship names and the semantics they carry. The façade also exhibits this trait. For example, if we're talking about a user facing website, it would be a single page application with no deep links.

We can see the obvious tradeoff here: in the name of HATEOAS, we sacrifice the ability to bookmark a useful subpage for later. For instance, while browsing an e-commerce website I might want to bookmark a few interesting items, or I might be in the middle of checking out but unsure, so I bookmark the URI of the current step in order to come back and complete the process later. Neither of these scenarios is possible when there is only one entry point.

This tradeoff can be mitigated by adding useful links to the representation of the entry point, and in fact many websites do this. Amazon, for example, allows users to add items to a personal wish list and then access that list from the home page.

Also, this setup mirrors the behavior of most graphical desktop applications. Users just double-click on the application's icon and it opens in a standard state. It is unusual to be able to launch a wizard in the application, fill out some of the pages, then create a link on the desktop that allows you to come back at some other time and finish the remaining steps.

Solution 2: multiple façade entry points, single subsystem entry point

This solution allows users to bookmark any façade URIs, but the façade itself only uses one entry point to access all subsystem resources.

How would this work in the case of a contact manager? For example, if the user bookmarked the URI http://front-end/contacts/Bob/addresses/home and accessed it later, the façade might make the following requests to the back-end:

Method | URI | Response
GET | http://back-end/contacts | …
GET | http://back-end/contacts/456/addresses | …
GET | http://back-end/contacts/456/addresses/home | …

The Traverson Javascript library simplifies these kinds of traversals.

It has inspired an API provided as part of the Spring HATEOAS library.

This solution makes the façade very chatty, and therefore trades bandwidth for bookmarkable URIs and fully mature RESTful back-end services.

Solution 3: multiple entry points

But in the end, who cares? HATEOAS is for purists and hypothetical utopian clients. Let's just go back to embedding URIs in clients and make everything easier. (thumbs up)(big grin)

This is not what my third solution is about!

Rather it consists in exposing some deeper back-end URIs as valid entry points, but not all of them. With HATEOAS, we're allowed to have more than one entry point. We just need to keep it to a few. How few? That's up to you. (wink)

To help with your decision, let's look at another benefit of relying on link relations: their presence or absence carries meaning. It hints that the client can or cannot make that transition. This could be determined by several factors: the state of the resource, the user's permissions, to name a couple. These links can be quite volatile, and thus should not be bookmarked. However, you can probably identify a subset of your URIs that are part of the core domain and are stable. In the rare case that they do change, just be cool and put redirects in place.

In our contacts example, the template http://back-end/contacts/{id} could be a useful entry point, but http://back-end/contacts/{id}/addresses/{address-type} would not necessarily be one.

This solution sacrifices some abstraction and agility for less bandwidth usage and bookmarkable URIs, all the while preserving full maturity.

Summary of tradeoffs

Solution | Abstraction & agility | Bandwidth usage | Bookmarking | Maturity
Solution 1 | (plus) | (plus) | (minus) | (thumbs up)(big grin)
Solution 2 | (plus) | (minus) | (plus) | (thumbs up)(big grin)
Solution 3 | (minus) | (plus) | (plus) | (thumbs up)(big grin)

A couple anti-patterns


I've seen some articles or comments alluding to a solution in which the façade caches URIs for later use. However, this runs the risk of treating every back-end URI as an entry point and would expose the client to the volatility of certain state transitions.


I've also come across a similar solution where the façade embeds back-end URIs as query parameters of its own URIs. This is an anti-pattern for a few reasons:

  • the resulting façade URIs do not comply with the resource constraint, level 2 of the Richardson Maturity Model;
  • these URIs break encapsulation, which circumvents the very purpose of a façade;
  • this scheme risks circumventing entry points.

Final thoughts

This trilemma reflects a conclusion I have come to after digesting the current state of the art. However, I wouldn't be surprised if other solutions are found in the future.

Also, I currently feel that each solution has equal merit. This may change as more experience is gained.

Finally, the examples given only integrate with one back-end service, but the solutions apply just as well to façades that integrate multiple services. In fact, one could quite possibly choose different solutions for communicating with each of them.

Why it's the right time for .Net developers to embrace Jenkins and Docker

The Agile movement in software has created a market for various different tools, each automating a specific part of the development process. For instance, the call to continuous integration (CI) has given rise to the likes of Hudson/Jenkins and Maven that allow you to concentrate on the task at hand while getting precious feedback within seconds.

The .Net platform and the ecosystem surrounding it have been latecomers in providing decent tools for CI. Team Foundation Server (TFS) took a long time to provide anything more than a glorified copy of Apache Ant. The introduction of NuGet has helped somewhat with dependency management. Jenkins has had a plugin for MSBuild since 2008, providing an alternative to TFS. However, most IT departments that are fully invested in Microsoft's products are loath to take what they consider to be a huge risk.

However, in a somewhat surprising turnaround, Microsoft has recently declared its love for Linux and open source software. This sentiment has been followed by the release of the first fully Microsoft-supported implementation of the .Net framework for Linux, which enables building and running ASP.Net applications from a Linux command line. This should help companies take a chance on more and more open source tools, especially ones that are currently only available on Linux, like Docker for instance. Microsoft has even announced a partnership with Docker and is working on fully integrating it with Windows.


What's in this lab

This lab will show you how to set up a virtual machine hosting a Git repository and an instance of Jenkins, each inside its own Docker container. A very simple ASP.Net application without any tests will be pushed to the repository. Then, using a Jenkins Pipeline-as-code build, we'll produce a Docker image. Finally, we'll run that image as a container and check that our application is in fact running.


For the most part this lab is written from the point of view of a Windows user. However, it can be done entirely on Linux or Mac OS X. (big grin)

Tools and versions used in this lab

Tool | Version
Docker | 1.11.0
Docker Compose | 1.7.0
Docker Machine | 0.7.0
VirtualBox | 5.0.18 r106667
Jenkins | 2
GitLab | Community Edition 8.7.0
ASP.Net Core (.Net Core) | 1.0.0-rc1-update1

This lab assumes that you are already familiar with these tools.

Prerequisite

Make sure that your machine has a 64-bit CPU that supports virtualization. You may have to activate virtualization support via your machine's BIOS setup utility.

If you're on Linux, you also need Git installed.

Lab

Install VirtualBox

Download and install Oracle VM VirtualBox using the appropriate installer for your OS.

Install Docker & Docker Machine

On Windows or Mac OS X, install Docker Toolbox. This will install both Docker and Docker Machine, along with Git.

On Linux, follow Docker's instructions for installing Docker then Docker Machine.

Install Docker Compose

Again, follow Docker's instructions for installing Docker Compose.

Set up a virtual machine with Docker

Thanks to Docker Machine this is a very easy step. In your favorite terminal — on Windows I recommend PowerShell over CMD — type the following command to create a virtual machine named "default" that runs Linux and has Docker installed.
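Using the VirtualBox driver installed earlier:

    docker-machine create --driver virtualbox default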

After it has been created, the machine boots up in headless mode. On the first run, Docker Machine generates keys for securing SSH connections to the machine as well as TCP connections to the Docker daemon running on it.

Docker Machine provides several commands for stopping and starting the machine, as well as connecting to it via SSH, and setting the appropriate environment variables for connecting to its Docker daemon.

Speaking of which, use the following command to find out how to do that on your terminal and do it.
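That command is docker-machine's env subcommand:

    docker-machine env default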

On PowerShell under Windows, this would output something like this:
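The output would look something like this; the exact paths depend on your user name and where Docker Toolbox is installed:

    $Env:DOCKER_TLS_VERIFY = "1"
    $Env:DOCKER_HOST = "tcp://192.168.99.100:2376"
    $Env:DOCKER_CERT_PATH = "C:\Users\you\.docker\machine\machines\default"
    $Env:DOCKER_MACHINE_NAME = "default"
    # Run this command to configure your shell:
    # & "C:\Program Files\Docker Toolbox\docker-machine.exe" env default | Invoke-Expression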

We can see that Docker Machine set up the Docker daemon on the virtual machine to listen to TCP connections on port 2376. However, not everybody has access; only those that use the right SSL certificates. Thankfully Docker Machine also placed those certificates in .docker/machine/machines/default in your home directory.

Once you execute the suggested command, any subsequent invocations of Docker will talk to the virtual machine's Docker daemon.

Example for Windows
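On PowerShell, that means running:

    & "C:\Program Files\Docker Toolbox\docker-machine.exe" env default | Invoke-Expression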

For the purpose of this lab, I assume that Docker Machine assigned the IP address of the virtual machine it created to 192.168.99.100.

If yours is different, make sure to adjust that IP address in all subsequent commands and files.

Transfer SSL client keys to the virtual machine

In order for Jenkins to build and run Docker containers, it needs to be able to communicate with a running Docker daemon. One way to do this would be to install one inside the container alongside Jenkins: this has been dubbed "Docker in Docker", or DinD for short. This does not work, as someone else explains. Another way is to allow the Jenkins container to see the daemon that's running it: similarly, this is called DooD (Docker outside of Docker). There are many blog posts out there that explain more or less convoluted ways of doing that.

In this lab, we'll have Jenkins connect to that same TCP port that the Docker daemon is already listening to. Since this port is encrypted, we need to provide certificates for making client connections. One easy way is to reuse the ones that Docker Machine has already created.

Use Docker Machine to copy those certificates to the virtual machine for later use.
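One way to do it, using docker-machine's ssh and scp subcommands; the target directory /home/docker/ssl is the one checked further down:

    # create the target directory on the VM, then copy the client certificates into it
    # (on Windows, replace ~ with your home directory if it is not expanded automatically)
    docker-machine ssh default "mkdir -p /home/docker/ssl"
    docker-machine scp ~/.docker/machine/machines/default/ca.pem   default:/home/docker/ssl/
    docker-machine scp ~/.docker/machine/machines/default/cert.pem default:/home/docker/ssl/
    docker-machine scp ~/.docker/machine/machines/default/key.pem  default:/home/docker/ssl/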


If for some reason this doesn't work, you can use another SCP client, as long as it supports SSH key authentication. Log in to 192.168.99.100:22 as user docker and point the client to the SSH key at ~\.docker\machine\machines\default\id_rsa.

To check that this has worked, you can connect to the virtual machine and list the contents of /home/docker/ssl.
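For instance:

    docker-machine ssh default ls -l /home/docker/ssl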

Create Docker containers for GitLab and Jenkins

Create a new directory called aspnetci and a new file within it called docker-compose.yml with the following contents.

docker-compose.yml
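A sketch of what such a composition might look like; the image tags, port mappings, environment variables and volume layout are assumptions based on the versions and addresses used in this lab:

    version: '2'
    services:
      gitlab:
        image: gitlab/gitlab-ce:8.7.0-ce.0
        ports:
          - "8000:80"          # GitLab web UI, reachable at http://192.168.99.100:8000

      jenkins:
        image: jenkins:2.0
        ports:
          - "8001:8080"        # Jenkins web UI, reachable at http://192.168.99.100:8001
        environment:
          # let builds reach the Docker daemon on the VM using the copied certificates
          - DOCKER_HOST=tcp://192.168.99.100:2376
          - DOCKER_TLS_VERIFY=1
          - DOCKER_CERT_PATH=/ssl
        volumes:
          - /home/docker/ssl:/ssl:ro     # SSL client certificates copied earlier
          - dnx:/var/jenkins_home/.dnx   # cache NuGet packages between builds

    volumes:
      dnx: {}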

The dnx volume mentioned in this file will be used to cache NuGet packages between builds. This saves a lot of time and bandwidth since packages will not be downloaded more than once.

This is similar to a technique used in conjunction with Maven builds.

Once this is done, run the following command from the aspnetci directory.
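That is:

    docker-compose up -d    # pull the images and start GitLab and Jenkins in the background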

Configure GitLab and create a project

An instance of GitLab is now available at http://192.168.99.100:8000. It will require some initial configuring so go to that address.

Firstly, it will prompt for a new password for the root user. Provide one (and remember it (wink) ).

Secondly, log in as root, and create a new empty project. Let's call it aspnet-hello.

Thirdly, create a new directory on your machine and open a terminal in that directory. Use the following commands to clone a simple ASP.Net application and commit it to GitLab.
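Along these lines; the first URL is only a placeholder standing in for wherever the sample application lives, while the GitLab URL matches the project created above:

    git clone https://example.com/aspnet-hello.git     # placeholder URL for the sample application
    cd aspnet-hello
    git remote set-url origin http://192.168.99.100:8000/root/aspnet-hello.git
    git push -u origin master                          # authenticate as root with the password chosen earlier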

Configure Jenkins and create a build

An instance of Jenkins is available at http://192.168.99.100:8001. It also requires some configuration.

Firstly, it will prompt for the initial password. This is generated randomly and stored in a file inside the Jenkins container. It can be retrieved using the following command.
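Assuming the composition was started from the aspnetci directory, Docker Compose will have named the container aspnetci_jenkins_1, and the password can be read from it like this:

    docker exec aspnetci_jenkins_1 cat /var/jenkins_home/secrets/initialAdminPassword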

Then, let Jenkins install suggested plugins. Once that is done, you will be prompted to create a new user.

Install the Docker Commons Plugin and add an installation for Docker, entering the following information:

  • Name: Default
  • Install Automatically: check
  • Add installer: Extract *.zip/*.tar.gz
    • Download URL for binary archive: https://get.docker.com/builds/Linux/x86_64/docker-1.11.0.tgz
    • Subdirectory of extracted archive: docker

Finally, create a multibranch pipeline, entering the following information:

  • Branch sources > Add source: Git
    • Project repository: http://192.168.99.100:8000/root/aspnet-hello.git
    • Credentials: root/******

Once you have saved this, the build should trigger. It will scan the Git repository to find all branches that have a pipeline build script named Jenkinsfile, then use that script to build the branch.

Jenkinsfile
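A sketch of what such a script might look like, using the scripted pipeline syntax and the Docker Pipeline plugin; the image name, tool name and dnu commands are assumptions based on the steps listed below:

    node {
        // check out the current branch from the Git repository
        checkout scm

        // ask Jenkins where it installed the Docker client and put it on the PATH
        def dockerHome = tool 'Default'
        env.PATH = "${dockerHome}:${env.PATH}"

        // run the build steps inside an image that has the ASP.Net runtime and tools
        docker.image('microsoft/aspnet:1.0.0-rc1-update1').inside {
            sh 'dnu restore'                 // fetch dependencies from the NuGet repository
            sh 'dnu publish --out ./publish' // package the project as a publishable artifact
            sh 'chown -R 1000:1000 .'        // hand ownership back to the jenkins user (uid 1000 assumed)
        }

        // build the application image from the project's Dockerfile,
        // tagged with the branch name and build number
        docker.build("aspnet-hello:${env.BRANCH_NAME}-${env.BUILD_NUMBER}")
    }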

This script carries out the following steps:

  • check out the current branch from the Git repository
  • ask Jenkins for the directory where it installed a Docker client
  • add that directory to the PATH
  • run the following steps in a Docker image that has the ASP.Net runtime and development tools installed:
    • fetch dependencies from NuGet repository
    • package the project as a publishable artifact
    • assign ownership of all files created during the previous steps to the jenkins user (otherwise Jenkins won't be able to complete the next steps)
  • build a Docker image using the Dockerfile contained in the project, tagging it using a combination of the branch name and the build number

Using separate Docker images to run build steps is an instance of separation of concerns in the build environment! This keeps you from having to add to the base Jenkins image all of the tools for all of the different programming languages that you might use.

Once the build is finished, you can check that the Docker daemon knows of the image by using the following command.
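For example (the image name is the one the pipeline sketch above tags; adjust it if yours differs):

    docker images aspnet-hello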

Run the resulting Docker image

Run the Docker image with the following command.
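Something along these lines, substituting the tag produced by your build (branch name plus build number):

    docker run -d -p 5004:5004 aspnet-hello:master-1    # assumes the application listens on port 5004 inside the container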

The application should be available at http://192.168.99.100:5004.

Where to go from here

So now you have a simple ASP.Net application that can be built with a multibranch pipeline Jenkins build. If you create a new branch and add a feature, you can build and test drive it in a new container.

There's still some way to go before we have a production-ready application and build. Here are just a few things that could be added: