Creator of Tyk.io, digital hippie, tinkerer, collector of books

Horizons: Service Mesh without Sidecars

Executive Summary

Microservices tend to be associated with service mesh, and service mesh has an ugly secret at its heart - the sidecar proxy. When we try to simplify something that is complex, the most elegant solution tends to be the best, and sidecar proxies are definitely not elegant. In this article we’ll discuss some alternatives and what the future for Service Mesh looks like.

Service Mesh as a concept has been with us since around 2018. As most may already know, service meshes emerged from the melange created by the proliferation of container orchestration systems. Those in turn emerged because of the proliferation of containers, which started because the micro-service architecture became the preferred software service delivery method across most digitally-enabled businesses.

In short, a solution (near-idempotent and transportable software packages) led to complexity, which led to a solution, which led to more complexity, which landed us with service meshes.

This is certainly not a criticism. When complexity can be simplified through tooling, it inevitably makes the advantages of that complexity more available to non-specialists, which in turn means more people can benefit, and ultimately that leads to a benefit for the consumer.

This is the whole “standing on the shoulders of giants” trope, and my favourite example is “AI”. It’s a deeply complex topic to master, but it has been made highly accessible by black-box tooling such as TensorFlow, which makes a complex set of tools available to almost any software developer and spreads the benefits of the technology to everyone.

The thing is, service mesh was born to solve a complex problem, and the crux of the solution was essentially to duct-tape two containers together, shoddily weld their pipes to each other, and then add some really solid software on top.

It’s pretty terrifying when you look under the hood, because in order for a sidecar to work, it needs to rewrite a whole bunch of firewall rules for its mated pair in order to capture outbound and inbound application traffic. Of course, all of this kludgery is in the interest of simplicity, and the ability to turn any service into a mesh-enabled one without modifying the application.
It’s a kludge, and an ugly one at that.
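To make that firewall rewriting a little more concrete, here is a minimal sketch of the kind of NAT rules a sidecar injector might install. It only builds the commands and never runs them. The port numbers and the proxy UID are assumptions chosen for illustration (they resemble common mesh defaults), and the function name is made up for this example.

```python
from typing import List

# Assumed values, for illustration only.
INBOUND_PROXY_PORT = 15006   # where the sidecar listens for inbound traffic
OUTBOUND_PROXY_PORT = 15001  # where the sidecar listens for outbound traffic
PROXY_UID = 1337             # user the sidecar process runs as


def sidecar_redirect_rules(
    inbound_port: int = INBOUND_PROXY_PORT,
    outbound_port: int = OUTBOUND_PROXY_PORT,
    proxy_uid: int = PROXY_UID,
) -> List[List[str]]:
    """Build (but do not run) iptables commands that hijack a pod's TCP traffic."""
    return [
        # Inbound: anything arriving at the pod is handed to the proxy first.
        ["iptables", "-t", "nat", "-A", "PREROUTING", "-p", "tcp",
         "-j", "REDIRECT", "--to-ports", str(inbound_port)],
        # Traffic sent by the proxy itself must escape, or it would loop forever.
        ["iptables", "-t", "nat", "-A", "OUTPUT", "-p", "tcp",
         "-m", "owner", "--uid-owner", str(proxy_uid), "-j", "RETURN"],
        # Outbound: everything else the application sends goes to the proxy.
        ["iptables", "-t", "nat", "-A", "OUTPUT", "-p", "tcp",
         "-j", "REDIRECT", "--to-ports", str(outbound_port)],
    ]


if __name__ == "__main__":
    for rule in sidecar_redirect_rules():
        print(" ".join(rule))
```

Even in this stripped-down form, the ordering matters: the rule that lets the proxy's own traffic out has to come before the catch-all redirect, which is exactly the sort of hidden detail that makes the approach feel fragile.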

What's been amazing to watch, though, is that this deep-seated ugliness sitting at the heart of something ultimately rather elegant is finally being addressed by folks who are actually trying to make developers’ lives easier.

First off, Envoy introduced the xDS API specification, also known as the data-plane API. These discovery services enable “data planes” - such as Envoy - and “control planes” - such as Istio - to use a common set of APIs to enable discovery, routing and addressing.
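As a rough illustration of the discovery idea, the sketch below models a versioned request and response exchange between a proxy and a management server. It is a toy, in-memory stand-in: there is no gRPC and no protobuf, the class names are invented for this example, and the global version counter is a simplification. The field names (`type_url`, `version_info`, `resource_names`, `resources`) and the cluster type URL follow the xDS messages as I understand them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CLUSTER_TYPE = "type.googleapis.com/envoy.config.cluster.v3.Cluster"


@dataclass
class DiscoveryRequest:
    type_url: str
    version_info: str = ""  # last version the proxy accepted ("" means none yet)
    resource_names: List[str] = field(default_factory=list)


@dataclass
class DiscoveryResponse:
    type_url: str
    version_info: str
    resources: List[dict]


class ToyControlPlane:
    """Hypothetical in-memory management server, for illustration only."""

    def __init__(self) -> None:
        self._version = 0
        self._resources: Dict[str, Dict[str, dict]] = {}

    def publish(self, type_url: str, name: str, resource: dict) -> None:
        """Store a resource and bump the config version."""
        self._resources.setdefault(type_url, {})[name] = resource
        self._version += 1

    def discover(self, req: DiscoveryRequest) -> Optional[DiscoveryResponse]:
        """Return new config, or None when the caller is already up to date."""
        current = str(self._version)
        if req.version_info == current:
            return None  # a real server would keep the stream open and wait
        known = self._resources.get(req.type_url, {})
        names = req.resource_names or list(known)
        found = [known[n] for n in names if n in known]
        return DiscoveryResponse(req.type_url, current, found)


if __name__ == "__main__":
    cp = ToyControlPlane()
    cp.publish(CLUSTER_TYPE, "orders", {"name": "orders", "endpoints": ["10.0.0.5:8080"]})
    first = cp.discover(DiscoveryRequest(CLUSTER_TYPE))
    print(first)
    print(cp.discover(DiscoveryRequest(CLUSTER_TYPE, version_info=first.version_info)))
```

The point of the shape is that any proxy that can speak this exchange can be driven by any control plane that serves it, which is what makes the proxy layer swappable.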

This API-set was introduced in order to commoditise the proxy layer (and, indirectly, the sidecar layer) of the service mesh. Which is fantastic, as more competition breeds innovation, and so long as it centres around some common standards, everyone benefits.

A wonderful side-effect of the xDS APIs is that you may not need a sidecar at all in your future service mesh.

We’re already seeing this in action with the extremely popular gRPC framework (a framework and SDK primarily aimed at building microservices), which has adopted an xDS layer within the framework itself. This lets the application offload all the sidecar-related functionality to the application framework, bundling it all into a single homogeneous family.
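In practice, a proxyless gRPC client is pointed at a control plane through a small bootstrap file, and then dials an `xds:///` target instead of a hostname. The sketch below only prepares those two pieces with the standard library and does not import gRPC itself. The server address, node id and helper names are hypothetical, and the bootstrap fields shown are the minimal set as I understand the format, so check them against the gRPC documentation before relying on them.

```python
import json
import os
import tempfile


def write_xds_bootstrap(server_uri: str, node_id: str) -> str:
    """Write a minimal gRPC xDS bootstrap file and return its path."""
    bootstrap = {
        "xds_servers": [
            {
                "server_uri": server_uri,
                "channel_creds": [{"type": "insecure"}],  # demo only
                "server_features": ["xds_v3"],
            }
        ],
        "node": {"id": node_id},
    }
    fd, path = tempfile.mkstemp(prefix="xds-bootstrap-", suffix=".json")
    with os.fdopen(fd, "w") as fh:
        json.dump(bootstrap, fh, indent=2)
    return path


def xds_target(service_name: str) -> str:
    """The xds:/// scheme asks gRPC to resolve the name via the control plane."""
    return f"xds:///{service_name}"


if __name__ == "__main__":
    # Hypothetical control-plane address and node id.
    path = write_xds_bootstrap("control-plane.example.internal:18000", "orders-client-1")
    os.environ["GRPC_XDS_BOOTSTRAP"] = path  # gRPC reads the file named here
    print(path)
    print(xds_target("orders"))
```

A gRPC client started with that environment variable set would then create its channel against the `xds:///orders` target, and the routing and load-balancing decisions a sidecar used to make happen inside the library instead.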

Given we are talking about gRPC here, it also means that versioning, compatibility issues and fallbacks are baked right into the framework. That essentially makes integrating into a service mesh a code-first problem that can be checked, tested, and verified at the SDLC level rather than via an integration test within an environment.

This framework-level integration of the xDS APIs may not provide all the functionality that a sidecar might, but it does provide something extremely important: a real reduction in complexity.
In the ever-increasing melange of the kubernetes-microservices-service-mesh-industrial-complex, any reduction in systems complexity is a boon.

This is all well and good for green-field applications. Unfortunately, third-party systems such as databases and message queues are bought in, so for them a sidecar is the only option to play nice with your mesh. However - should the xDS APIs truly standardise the service-mesh management and communications space, it wouldn’t be surprising to see even more cloud-native applications baking some level of compatibility into their core offering.

Think of it this way - when we use a runtime like Java, or .NET, we are outsourcing the complexity of syscall interactions to the runtime, so why can we not move one layer up, into the networking stack, and make that simpler too? There’s a definite need for better-equipped server-side frameworks to embrace the realities of the modern application stack.

From the userspace software level, to the kernel level - the next way to ditch that ugly sidecar has been enabled by a relatively recent Linux kernel technology called the extended Berkeley Packet Filter.

eBPF, as it’s known in “the biz”, basically enables a user-space piece of software (say, your application, or a Linux service) to load small sandboxed programs that set traffic handling rules at the kernel level, and to modify those rules in real-time. Essentially it means that packet-level decisions can be made much earlier.

Instead of traffic needing to pass through the chain of modules that eventually lead a decoded packet to userspace, that packet can be analysed, and a decision be made before any of that happens, protecting the upstream application, or simply routing the traffic to the correct process or service elsewhere.
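To show the shape of such an early decision, here is a user-space simulation in Python. It is not eBPF, and it does not run in the kernel: it simply mimics the logic of looking only at raw packet headers and returning a verdict before anything is handed to the application. The verdict names are borrowed from XDP, the deny-list and allowed ports are hypothetical, and the Ethernet header that a real hook would see first is skipped for brevity.

```python
import socket
import struct

# Verdict names borrowed from XDP; here they are plain strings.
XDP_DROP, XDP_PASS = "XDP_DROP", "XDP_PASS"

BLOCKED_SOURCES = {"203.0.113.7"}  # hypothetical deny-list
ALLOWED_PORTS = {443, 8080}        # hypothetical service ports


def verdict(packet: bytes) -> str:
    """Decide on a raw IPv4/TCP packet by inspecting only its headers."""
    if len(packet) < 20:
        return XDP_DROP  # too short to hold an IPv4 header
    version_ihl, protocol = packet[0], packet[9]
    if version_ihl >> 4 != 4 or protocol != socket.IPPROTO_TCP:
        return XDP_PASS  # not ours to judge; let the normal stack handle it
    ihl = (version_ihl & 0x0F) * 4  # IPv4 header length in bytes
    if len(packet) < ihl + 4:
        return XDP_DROP  # TCP ports are missing
    src = socket.inet_ntoa(packet[12:16])
    (dst_port,) = struct.unpack_from("!H", packet, ihl + 2)
    if src in BLOCKED_SOURCES or dst_port not in ALLOWED_PORTS:
        return XDP_DROP
    return XDP_PASS


def make_packet(src: str, dst: str, dst_port: int) -> bytes:
    """Build a bare IPv4 + TCP header pair for the demo (checksums left at 0)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64,
                     socket.IPPROTO_TCP, 0,
                     socket.inet_aton(src), socket.inet_aton(dst))
    tcp = struct.pack("!HHLLBBHHH", 40000, dst_port, 0, 0, 0x50, 0x02, 65535, 0, 0)
    return ip + tcp


if __name__ == "__main__":
    print(verdict(make_packet("198.51.100.4", "10.0.0.5", 443)))
    print(verdict(make_packet("203.0.113.7", "10.0.0.5", 443)))
```

The interesting part is what is absent: no connection is accepted, no payload is decoded, and no application code runs before the unwanted packet is discarded.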

This is extremely powerful, and has already been jumped on by software-defined-network stacks such as Cilium, where instead of a sidecar, a standardised daemon packaged with your application in your container can handle all routing, load balancing and network-level transactions formerly part of the side-car.

“But you’ve basically just bundled the sidecar into my container! I could do that with Envoy too!” you may say, and to an extent that is true. Except that in the eBPF scenario, the daemon is operating at the kernel level, and so the overhead of all this processing is far less than what to expect from a sidecar. It also completely removes the need for a complex network and firewall kludge, because your eBPF-enabled daemon is replacing your firewall altogether, making configuration and management much clearer than the hidden complexity a sidecar is based on.

We’re not quite there yet with the removal of the sidecar, and as I said above - the sidecar will remain necessary for legacy applications that do not converge around standardised APIs such as xDS. But I can definitely see a future where our software runtimes and frameworks begin to embrace the modern server-side application environment and embed it into their core.

Running a Business for Humans

There is a tendency in the MBA-led, VC-fuelled startup community to take a ruthless, grow-at-all-costs attitude that focuses more on acquiring funds and on vanity growth metrics that validate a valuation than on building a sustainable business.

The most extreme examples of this have been evident in the IPOs of two unicorns: Uber and Lyft, two businesses that have skyrocketed growth but burn cash and haven’t made any significant profit since inception. While everyone hopes that Uber will do an about-face like Facebook did, it is still a significant gamble for any investor looking for a return.

Is a startup a real business?

Now you could say that this should be expected. After all, a “startup” by definition is designed to grow quickly, increase in value-related metrics quickly, move through multiple funding rounds to define a market price and then sell up in some way to generate a return for the investors and ultimately the founders of the startup.

However, a startup is also a business, and for a business to be truly successful, it needs to be self-sustaining. Like a power station, it needs to generate more output than the cost of materials going into it.

Remember, this is only my opinion, but a good business shouldn’t have to focus exclusively on growth. It should be possible to reach a stable equilibrium that can generate a consistent income for those working for the business (I include founders in this definition). This is essentially the very definition and appeal of a small to medium sized enterprise, and is also what most businesses are.

So why and when should a business focus on growth? Well, when it needs to compete, it needs to dominate and carve out a share of the market that it owns and controls, and the more sustainable and controllable that market position is, the better the business is. Competition forces the business into growth phases: since revenue may be lost in an existing market, growing outwards geographically into more markets is the only way for the business to keep its equilibrium.

Of course, ambition is also a major factor in why a business should grow - it is in and of itself deeply exciting and fulfilling to see something that you created flourish into something you could never have imagined. The feeling of that high is incredible, and it’s something that many founders want to maintain.

You are complicit and have a responsibility

But I digress. The reason for this ramble is that ultimately, when you start a business, you have a responsibility, and I feel that many startups do not take that responsibility seriously.

Startups garner employees. All that sweat equity that founders put in eventually turns into other people helping to do the work and helping with the lifting. The difference is, you are paying them for their time. As a left-leaning friend argued: an employee is selling you their time and knowledge for money.

To him, it’s a value exchange: they could spend their time elsewhere - it’s the need for a roof over our heads and food in our (and our families’) bellies that creates the demand to take on work.

That’s a pretty socialist perspective. From a capitalist perspective, an employer is creating opportunity for work, and therefore providing a service to the labourer by giving them a job so that they can live.

However, taking for granted the understanding of the motivations for why people work has also slightly perverted the understanding of what it means to take on staff. Society as it stands in the west is extremely focussed on the perception - and system - of education leading to work, and the extended focus for young adults to gain employment and join the workforce. In many ways, the underlying demand is taken for granted and getting hired is an expected value judgement to maintain our social status quo.

Now I’m not going to rant about how worrying that particular trend is. What I will say is that the expectation to work is a major factor in the capitalist view on labour in general, and it easily distorts how we see our employees when we start recruiting for growth: as cogs, or, in that desperately inhuman term: human resources.

Founders aren’t as important as they think they are

As a business grows, the input and value of the founders should diminish, otherwise they are not doing their job properly. The best kind of business is a self-sustaining machine, that can ramp-up and grow, but also continue to generate value at an equilibrium state.

As soon as a business can do this, the value exchange is no longer between founder and staff, or between company and staff. It is between staff and other members of staff. Ultimately a company, especially a knowledge or service-based one, relies on the people inside it to generate value.

So as a founder, the real responsibility is to make sure that your team are happy, and that they are working for themselves as much as they are working for you. That your team have lives that they live outside of work, and you should be helping them achieve their own long-term goals, not just those that increase your potential wealth.

You must invest in their lives as much as they are investing their lives in your company, and that goes far beyond paying a salary.

The sooner you can reach value equilibrium as a business, the sooner the founder can make themselves obsolete. A business that relies on personalities or heavy-lifters cannot sustain itself when that dependency leaves. That means a team should be mutually inclusive, mutually beneficial and should be able to withstand churn without impact. To do that requires trust, faith and a human bond with the organisation, not just a pay check.

Business continuity can be solved through process, and I can’t argue strongly enough for having good processes in place that help people do their jobs when they are a new hire! However, what is missing is that the underlying value generated by those people only really comes about once there is trust and interactivity between them while they work through a process. Good process is a gateway to value; the people working through that gateway are the ones who generate it.

So, I have rambled for a while here, where am I trying to get to?

A business for humans

I’d like to propose some simple ideas that founders can follow to build “a business for humans”, and here’s my first stab at them:

  • The team is precious, nurture it
  • A business is a garden for generating exceptional people, help them become the best they can be
  • A team that is valued generates more value together than an individual
  • Build good process, but not to generate value, trust in people to generate value
  • Founders are important, but they should aim to become obsolete
  • Grow in the face of competition, but always grow to equilibrium
  • Equilibrium generates sustainable value
  • Sustainable value ensures long-term investor happiness

Happy hacking!

All opinions expressed in this blog post are those of the author and are in no way endorsed or connected to their employer