As a service: the why of Platform Engineering?
As DevOps has expanded and more companies move to complex cloud-based infrastructure, we find ourselves needing more and more abstractions to solve the problem that DevOps was originally designed to solve: letting developers focus on developing.
A common solution is to have dedicated teams that are responsible for implementing and maintaining these services, providing software and platforms ‘as a service’. There are many advantages to operating this way, including cost savings, reliability, and increased developer velocity.
This post will cover how best to set up these internal services, and how to manage them.
Why create internal managed services?
You may ask: with all the tooling that GCP/AWS/Azure provide, why would we want to create extra teams to manage this workload? Can’t the customer feature teams (CFTs) do that themselves with one or two dedicated DevOps engineers?
Even with the tooling and services that cloud providers offer, there is still a huge amount of complexity in how these services can be configured, because they are designed to serve thousands of different businesses and use cases rather than to be a tailored solution for your specific business. In your company, you want things configured in a specific way in order to satisfy regulatory controls, interface with other systems, and meet any other customisations that are required.
Beyond configuration, every service, whether it be cloud infrastructure, build systems, or a customer-facing app, needs lifecycle management, patching, monitoring, and everything else that keeps it reliable and secure. A dedicated team can take on that work once for everyone.
If every CFT solves these same problems in its own way, effort is quickly duplicated, and the one or two DevOps engineers on each team will soon be swamped maintaining the services they have set up.
You already abstract and reuse software libraries, so why not do the same for common software and infrastructure platforms?
Managed internal services provide economies of scale, reduce scope for CFTs, speed up new product development, and save costs.
That sounds great, so what is a managed service?
A managed internal service is a similar concept to any third party SaaS or PaaS, except it’s much more tailored to your specific business, and much more flexible as the business has full control of the internal service.
Common managed internal services are things like CD, observability tools, and Kubernetes clusters, but really it can include anything that solves a common internal business problem.
However, this is not just a ‘service’, it’s a managed service, so you need a dedicated team to create and maintain it.
What do you need in a managed service?
It needs to be reliable, especially if it is a service that your customer-facing apps require at runtime. This means you need to create and stick to SLAs, have appropriate monitoring and alerting, and, most importantly, ensure that BAU work (i.e. patching and maintenance) is prioritised over new features. No CFT will want to use your service if it’s not reliable.
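To make an SLA concrete, it can help to translate an availability target into the downtime it actually permits. The following is a minimal sketch of that arithmetic; the function name `allowed_downtime`, the 30-day window, and the example targets are illustrative assumptions, not figures from any particular SLA.

```python
from datetime import timedelta


def allowed_downtime(availability_target: float, window: timedelta) -> timedelta:
    """Return the downtime an availability target permits over a window.

    availability_target is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    return window * (1.0 - availability_target)


if __name__ == "__main__":
    # Hypothetical targets over an assumed 30-day reporting window.
    month = timedelta(days=30)
    for target in (0.99, 0.999, 0.9999):
        minutes = allowed_downtime(target, month).total_seconds() / 60
        print(f"{target:.2%} availability allows about {minutes:.1f} minutes of downtime")
```

Over 30 days, a 99% target leaves roughly 7.2 hours of downtime, while 99.9% leaves about 43 minutes. Seeing the number this way makes it easier to judge whether patching and maintenance can fit inside the target you are about to promise.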
Ease of use and clear interfaces
The service needs to be easy to use. If you’ve created a complex service that requires in-depth knowledge of the solution you’ve implemented, have you actually solved a problem, or are you just gatekeeping?
Provide a clear interface for CFTs to interact with the service, preferably abstracted from the technology that implements the service itself, so that you can fully control the service while the CFTs can continue using it without having to change anything in their code or processes.
A clear interface also helps you implement guard rails: it defines not only what can be done, but how it must be done and what can’t be done at all. Guard rails enforce consistent patterns, which makes common problems much easier to solve.
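As an illustration of an interface with guard rails, here is a minimal sketch in which a CFT describes what it wants to deploy and the platform checks that request before acting on it. Everything here is hypothetical: the `DeployRequest` fields, the approved regions, the replica limit, and the image tag rule are stand-ins for whatever your own policies require.

```python
from dataclasses import dataclass

# Hypothetical guard rails; real values would come from your own policies.
ALLOWED_REGIONS = {"australia-southeast1", "australia-southeast2"}
MAX_REPLICAS = 10


@dataclass(frozen=True)
class DeployRequest:
    """What a CFT asks for, with no reference to the underlying technology."""
    team: str
    service: str
    image: str
    region: str
    replicas: int = 2


def validate(request: DeployRequest) -> list[str]:
    """Return guard-rail violations; an empty list means the request is allowed."""
    errors = []
    if request.region not in ALLOWED_REGIONS:
        errors.append(f"region {request.region!r} is not approved")
    if not 1 <= request.replicas <= MAX_REPLICAS:
        errors.append(f"replicas must be between 1 and {MAX_REPLICAS}")
    if ":" not in request.image or request.image.endswith(":latest"):
        errors.append("image must be pinned to an explicit tag other than 'latest'")
    return errors


if __name__ == "__main__":
    request = DeployRequest(
        team="payments", service="api", image="payments-api:latest", region="us-east1"
    )
    for problem in validate(request):
        print(problem)
```

Because the request names only the team's intent, the managed service team stays free to change how deployments are implemented without CFTs changing their side, and each violation message tells the CFT what the rule is instead of silently rejecting the request.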
Provide support for your service
As with any third party SaaS/PaaS, you need to have some sort of support system for humans to interact with, and this needs reasonably tight SLAs to reduce turnaround time. If someone has to wait hours or days to ask a simple question about using your service, they’ll probably get fed up and just implement it themselves.
As part of this support, you should also provide training, preferably in person, so that developers can ask questions, get quick answers, and decide how their particular application can work with the service. You may even want to embed a managed service team member in another team to assist with its migration to your service.
Your managed service should be able to scale automatically, without having to add extra people to the managed service team. This means the only cost increase you see when you onboard more CFTs is possibly some extra cloud spend, and depending on the solution, there may be no extra cost at all.
Self service is a great way to ensure that your service is scalable, as teams can onboard themselves with little or no assistance from the managed service team. Just ensure that the self-service mechanism is easy to use and doesn’t expose too much implementation detail.
Integration with other services
Your managed service is only going to solve one piece of the puzzle for CFTs, so you need to ensure that it integrates with other internal and third-party services. For example, you may have a managed Kubernetes platform whose logging and monitoring are provided by another internal or third-party managed service, so that when a CFT runs on your platform it also gets these other services ‘for free’.
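One way to deliver integrations ‘for free’ is for the platform to start from its own defaults and layer each team's settings on top. The sketch below assumes a hypothetical dictionary-based configuration; the keys, the `central-logging` sink name, and the `build_workload_config` function are invented for illustration and do not correspond to any specific product.

```python
# Hypothetical platform defaults: every workload gets logging and monitoring
# wired in unless the team supplies its own settings for them.
PLATFORM_DEFAULTS = {
    "logging": {"enabled": True, "sink": "central-logging"},
    "monitoring": {"enabled": True, "dashboard": "auto"},
}


def build_workload_config(team_config: dict) -> dict:
    """Merge team-supplied settings over a copy of the platform defaults."""
    # Copy each default section so the shared defaults are never mutated.
    config = {name: dict(settings) for name, settings in PLATFORM_DEFAULTS.items()}
    for key, value in team_config.items():
        if isinstance(value, dict) and isinstance(config.get(key), dict):
            # Team overrides individual fields within a default section.
            config[key].update(value)
        else:
            # Anything else is team-specific and passed through as-is.
            config[key] = value
    return config


if __name__ == "__main__":
    team_config = {"image": "orders-api:1.4.2", "monitoring": {"dashboard": "orders"}}
    print(build_workload_config(team_config))
```

A CFT that supplies only its image still ends up with logging and monitoring configured, while a team with specific needs can override individual fields without redefining the whole section.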
Also ensure there is enough flexibility to allow integration with other ‘unknown’ services, like an old on-prem system or a government service, that CFTs can manage themselves. If a CFT has a specific business requirement to integrate with one of these systems and your managed service prevents this, they won’t be able to use it no matter how good it is.
Centralised compliance is probably one of the biggest advantages of a managed service: you can implement internal business rules and government regulations in one place. Often these regulations may not make sense to CFTs, or they don’t even know about them, but any that you implement in your service are automatically implemented for the CFTs.
What do you need to be aware of before starting a managed service?
So it all sounds like a great idea and you want to get started on creating a managed service for your company!
Before you start, there are a few major considerations you need to sort out:
You need strong support from the executive level, not only for the creation of the managed service, but for ongoing staff, cloud resources, and other expenses. Managed services can be a significant expense, and if management pulls its support, either the service will be left to rot or, if it is shut down, CFTs will be forced to do significant work to migrate to other solutions.
Ensure you are actually solving a problem. Don’t reinvent the wheel if there is an existing solution (internal or third party), and don’t just add a layer of obfuscation that fixes nothing or introduces more problems than it fixes. Also ensure that it solves problems for enough CFTs to take advantage of economies of scale. If it’s only applicable to one CFT, that CFT might as well implement and own it.
You should architect your managed service just as you would an end-user application. Consider the user journey, the personas of your users and their technical abilities, and make sure they will be able to use your product easily. CFTs shouldn’t be made to use your managed service; they should want to use it. This matters for your own job security as well: if you implement something that nobody uses, or that everyone complains is blocking productivity, you’ll quickly lose support from upper management, and they may make the entire service, and its staff, redundant.
Conduct research on the current tech that the CFTs are using and take that into account. For example, if your CFTs are mainly managing legacy, stateful, non-cloud-native applications, don’t build your managed service around Kubernetes. Also consider regulatory requirements, especially data sovereignty, as that can restrict hosting locations and rule out some third-party managed services straight away.
Internal managed services are a great way to stop duplication, increase reliability, and increase overall developer velocity, but they are a project in themselves to implement, and something that has to be at the forefront of your engineering strategy. If you’re considering implementing a managed service, the team at Innablr has extensive experience in managed services, and can provide technical and non-technical guidance, as well as the actual creation of the service itself.
Innablr is a leading cloud engineering and next generation platform consultancy