Service discovery for microservices: components, patterns & more

It was once common to build a software application as a monolith: one large service that does everything. Today, it’s increasingly popular to build several microservices instead, each a distinct module with a specific function. Together, these services make up a microservices application. The pros and cons of a microservice architecture are discussed in a separate article. If you decide to pursue this path, it’s important to make sure that the services work well in tandem, which means you’ll need some way of finding out which microservices are used in the application and where they live. This is known as service discovery.

Historical Context 

The problem of finding and using the right network-connected services or devices has existed throughout the history of networking. It gave rise to specific protocols for finding devices (like printers) on a local network. One such example is Bonjour, developed by Apple. Another option is to keep a single configuration file with a static list of network addresses. One example of this is the Hosts.txt file, which maps hostnames (device labels) to IP addresses.

In the software field, many Java, Rails, and Node.js applications are still built and packaged as monoliths. With monoliths, there is little need for advanced service discovery, as most relevant services are part of the same codebase and only one or a few programming languages are used. Any external services are easy to locate via REST APIs or API gateways. However, as these products grow, they run into scalability problems and their dependencies become difficult to analyze. As a result, companies are shifting to service-oriented architectures, which brings up the need to discover services.

Microservice Discovery

When an application is built with a service-oriented architecture, it’s split into separate microservices — often running in Docker containers to make the application more modular — and each service does not intrinsically know about the other services. 

The relevant services may be part of a distributed system, especially with dynamic scaling, so it’s important to accurately identify their locations. It’s also possible that microservices may contain off-the-shelf components written in different programming languages, so a core part of service discovery is ensuring that the separate services can interface with each other well. 

There are several different reasons to invest in a good process for service discovery: 

  • There may be multiple versions of a service running at once, some of them outdated.
  • Services may be managed by different teams, so it’s important to understand who the correct owner is. 
  • Services may scale down (manually turned off by a developer or automatically removed if there is less traffic or the service is malfunctioning) as well as scale up (by adding new services or multiple copies of the same service to distribute load better). 
  • There will also be new versions of services developed and pushed to production, and when these updates are made, locations may change. There may also be a transition period between the old and new services. 

Patterns of Service Discovery

When implementing a service discovery mechanism, there are two general approaches that a development team can follow. Note that any service discovery pattern requires a service registry, which lists all of the available microservices and their network locations, along with details such as ownership, documentation, and recent deployments. It is possible to run multiple copies of this registry for availability and load balancing. Some examples of a service registry are Apache ZooKeeper and Consul (developed by HashiCorp).
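To make the idea of a registry concrete, here is a minimal in-memory sketch. It is not how ZooKeeper or Consul work internally; the class names, fields, and addresses are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceInstance:
    """One running copy of a service and where to reach it (hypothetical shape)."""
    host: str
    port: int
    version: str = "unknown"


@dataclass
class ServiceRegistry:
    """Maps a service name to the instances currently known for it."""
    _instances: dict = field(default_factory=dict)

    def register(self, name, instance):
        self._instances.setdefault(name, set()).add(instance)

    def deregister(self, name, instance):
        self._instances.get(name, set()).discard(instance)

    def lookup(self, name):
        # Sorted so callers always see a stable order.
        return sorted(self._instances.get(name, set()),
                      key=lambda i: (i.host, i.port))


registry = ServiceRegistry()
registry.register("billing", ServiceInstance("10.0.0.5", 8080, "1.4.2"))
registry.register("billing", ServiceInstance("10.0.0.6", 8080, "1.4.2"))
print(registry.lookup("billing"))
```

A production registry would also need persistence, replication, and health information, which this sketch leaves out.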

Client-side service discovery

Here, a client refers to each individual microservice. In client-side discovery, the client first does a lookup to retrieve another service’s location from the registry. Then, the client directly calls the other service. The registry is not involved in the communication between services.

One advantage of this approach is that the infrastructure is relatively simple to set up. However, it couples every client to the registry: each microservice needs its own lookup logic, and all of them must be updated if the way the registry is accessed changes. If the services are written in different programming languages or frameworks, this can involve a large number of changes.
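The client-side flow can be sketched roughly as follows. The registry is a plain dictionary here, the addresses are invented, and round-robin selection is just one possible way to pick an instance; none of this comes from a specific library.

```python
import itertools

# Hypothetical registry contents: service name -> list of "host:port" addresses.
REGISTRY = {
    "billing": ["10.0.0.5:8080", "10.0.0.6:8080"],
}


class DiscoveryClient:
    """Looks up instances in the registry and picks one per call (round robin)."""

    def __init__(self, registry):
        self._registry = registry
        self._counters = {}

    def resolve(self, service_name):
        instances = self._registry.get(service_name, [])
        if not instances:
            raise LookupError(f"no instances registered for {service_name!r}")
        counter = self._counters.setdefault(service_name, itertools.count())
        return instances[next(counter) % len(instances)]

    def url_for(self, service_name, path):
        # The caller then talks to this address directly; the registry is
        # not involved in the request itself.
        return f"http://{self.resolve(service_name)}{path}"


client = DiscoveryClient(REGISTRY)
print(client.url_for("billing", "/invoices/42"))
print(client.url_for("billing", "/invoices/42"))
```

Note that every service, in every language, would need an equivalent of `DiscoveryClient`, which is the maintenance cost described above.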

Server-side service discovery

In server-side discovery, a client does not directly talk to other services. Instead, the client makes a request to a load-balanced endpoint, which then queries the service registry. Once the location is retrieved, the endpoint forwards the request directly to the service. The client itself is not allowed to interact with the service registry; its request is simply forwarded along. 

The advantage of this setup is that there is one central point for discovery. However, the load-balanced endpoint is another part of the system that needs to be maintained, adding complexity. There are also more network hops needed before the destination service is reached, which may add latency. One example of a load-balanced endpoint used for server-side discovery is an AWS Elastic Load Balancer: when incoming requests from a client reach the load balancer, they are forwarded to the correct compute instances via a router. 
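As a rough sketch of the server-side pattern, the router below is the only component that reads the registry. The registry contents, the `send_to_backend` stand-in, and the random instance choice are assumptions made for illustration; a real load balancer does far more.

```python
import random

# Hypothetical registry contents: service name -> list of "host:port" addresses.
REGISTRY = {"billing": ["10.0.0.5:8080", "10.0.0.6:8080"]}


def send_to_backend(address, path):
    """Stand-in for a real network call to the chosen instance."""
    return f"{address} handled {path}"


class Router:
    """Load-balanced endpoint: the only component that reads the registry."""

    def __init__(self, registry, forward=send_to_backend, choose=random.choice):
        self._registry = registry
        self._forward = forward
        self._choose = choose

    def handle(self, service_name, path):
        instances = self._registry.get(service_name, [])
        if not instances:
            return f"503: no instance of {service_name} available"
        return self._forward(self._choose(instances), path)


router = Router(REGISTRY)
# The client only knows the router, never the instance addresses.
print(router.handle("billing", "/invoices/42"))
```

Because the lookup lives in one place, clients stay simple, at the cost of the extra hop through the router.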

Cortex can be connected to Amazon Elastic Container Service to look up services in your AWS account. It can also be connected to Kubernetes to easily import services from multiple Kubernetes clusters into Cortex and periodically send information about active replicas and currently deployed versions back. 

Types of Service Registration Patterns

There are two ways that services are added and removed from the registry. In a service-oriented architecture, this happens quite frequently as new services are added and removed in response to traffic demands.

Self-registration

In self-registration, the services directly interact with the registry. When a service starts up, it adds its name and network location to the registry; when it shuts down, it removes itself. The registry may do a health check to make sure that available service instances are active. In this model, code must be implemented for each service to register and deregister from the registry. One example of a service registry that can be used for self-registration is the open source Netflix Eureka, which services talk to through the Eureka client. If this registry is used, each individual service must contact the service registry directly to add itself.
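Here is a minimal sketch of self-registration. It assumes the health check is done with heartbeats and a time-to-live, which is one common approach but not the only one; the class names and the 30-second default are invented, and this is not the Eureka API.

```python
import time


class HeartbeatRegistry:
    """Keeps an instance listed only while its heartbeats keep arriving."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_seen = {}  # (service name, address) -> last heartbeat time

    def register(self, name, address):
        self._last_seen[(name, address)] = self._clock()

    # A heartbeat simply refreshes the timestamp.
    heartbeat = register

    def deregister(self, name, address):
        self._last_seen.pop((name, address), None)

    def lookup(self, name):
        now = self._clock()
        return sorted(addr for (svc, addr), seen in self._last_seen.items()
                      if svc == name and now - seen <= self._ttl)


class Service:
    """A service that registers itself on start and deregisters on stop."""

    def __init__(self, name, address, registry):
        self.name, self.address, self._registry = name, address, registry

    def start(self):
        self._registry.register(self.name, self.address)

    def stop(self):
        self._registry.deregister(self.name, self.address)


registry = HeartbeatRegistry()
billing = Service("billing", "10.0.0.5:8080", registry)
billing.start()
print(registry.lookup("billing"))
billing.stop()
print(registry.lookup("billing"))
```

The time-to-live covers the case where a service crashes before it can deregister: once its heartbeats stop, it drops out of lookups.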

Third-party registration

In third-party registration, the services do not register themselves. Instead, the service registry, or a separate component often called a registrar, registers all of the microservices. It does this by polling the deployment environment or subscribing to events. All of the microservices are decoupled from the service registry, so registration logic does not need to be implemented for each service. An example is autoscaling groups within Amazon Web Services, where new instances are added and set up automatically, without user input.
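The polling variant can be sketched as a reconciliation step that runs periodically. The function name and data shapes below are assumptions; how the list of running instances is obtained depends entirely on your deployment environment.

```python
def reconcile(registry, running):
    """Make the registry match what the deployment environment reports.

    registry: dict of service name -> set of addresses (mutated in place)
    running:  dict of service name -> set of addresses currently running
    Returns (added, removed) as sorted lists of (name, address) pairs.
    """
    added, removed = [], []
    for name in set(registry) | set(running):
        known = registry.get(name, set())
        actual = running.get(name, set())
        added += [(name, addr) for addr in actual - known]
        removed += [(name, addr) for addr in known - actual]
        if actual:
            registry[name] = set(actual)
        else:
            registry.pop(name, None)
    return sorted(added), sorted(removed)


registry = {"billing": {"10.0.0.5:8080"}}
running = {"billing": {"10.0.0.6:8080"}, "search": {"10.0.1.9:9000"}}
print(reconcile(registry, running))
print(registry)
```

The services themselves contain no registration code here; the trade-off is that the registry is only as fresh as the last poll or event.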

Service Discovery Implementations

So far, we’ve talked about patterns of service discovery and ways that registration is done. Here, let’s discuss specific types of technology and infrastructure used to implement service discovery. 

DNS

The simplest way to implement service discovery is to use Domain Name System (DNS) infrastructure, which maps names to network addresses. Each service name is mapped to an IP address (often a virtual IP), and these records need to be kept up to date as services are created and destroyed. This is simple to use from any programming language and works easily with client-side discovery. However, DNS is not a real-time system: records are cached for up to their time-to-live (TTL), so changes do not take effect quickly. This type of infrastructure can also be operationally expensive to maintain.
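In code, DNS-based discovery can be as small as a hostname lookup. The sketch below uses Python's standard `socket.getaddrinfo`; the hostname `billing.internal` in the comment is a made-up example, and the `resolver` parameter exists only so the lookup can be swapped out in tests.

```python
import socket


def resolve_service(hostname, port, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses DNS reports for a service hostname."""
    results = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the IP address.
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Example usage (the hostname is hypothetical):
#   addresses = resolve_service("billing.internal", 8080)
```

Keep in mind that the operating system and intermediate resolvers may cache answers, so the addresses returned can lag behind the actual state of the service.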

Key/value and sidecar

Another way to implement service discovery is to use a key/value store such as Apache ZooKeeper. The client can query the datastore and retrieve the relevant information about services. It’s also possible to set up a local proxy (sidecar) next to each service to enable server-side discovery patterns without much difficulty. This method works well for connecting hosts over HTTP or TCP, but it does not work as well for discovering other kinds of resources, such as database schemas or topics (as used in Apache Kafka).
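One way to picture the key/value approach is a key per service instance, with discovery done by scanning a prefix. The store below is a tiny in-memory stand-in, not a ZooKeeper client, and the `/services/<name>/<id>` key layout is an assumption chosen for illustration.

```python
import json


class KeyValueStore:
    """Tiny in-memory stand-in for a key/value store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def scan(self, prefix):
        return {k: v for k, v in sorted(self._data.items())
                if k.startswith(prefix)}


def instance_key(service, instance_id):
    return f"/services/{service}/{instance_id}"


def discover(store, service):
    # The trailing slash keeps "billing" from matching "billing-v2".
    prefix = f"/services/{service}/"
    return [json.loads(value) for value in store.scan(prefix).values()]


store = KeyValueStore()
store.put(instance_key("billing", "a1"),
          json.dumps({"host": "10.0.0.5", "port": 8080}))
store.put(instance_key("billing-v2", "b7"),
          json.dumps({"host": "10.0.0.9", "port": 8080}))
print(discover(store, "billing"))
```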

Specialized libraries for service discovery 

A final way to do service discovery is using a custom solution like Netflix Eureka. There are specific libraries used to communicate with these solutions, and the correct APIs need to be explicitly called. The advantage is that it works over any possible paradigm (hosts, database schemas, or topics). However, if multiple programming languages are supported, developers need to make sure each language has relevant library support, similar to client-side discovery. 

Summing it up

When choosing to build a product with many constituent microservices, there are many decisions that need to be made about how the services can find each other and how your infrastructure is set up. There is no one solution that will work for every product; one should be chosen based on your existing setup and the features you need.

Cortex gives a single source of truth for all production services and makes it easy for developers to see which services are working and which need maintenance. Essentially, Cortex handles the “human” side of service discovery. There is one central dashboard for seeing your organization’s services, their owners, dependencies, documentation, and what actions may be needed — all of which empowers the team to make quicker decisions. With Cortex, you can focus more on delivering better products and less on operational reliability. 

If you’re interested in learning more, sign up for a free trial of Cortex.