Service Mesh
Apr 30, 2019 21:55
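A service mesh is a configurable, low-latency infrastructure layer designed to handle, through APIs, the high volume of network communication among application infrastructure services. It keeps communication between containerized applications fast, reliable, and secure. The mesh provides key capabilities including service discovery, load balancing, encryption, observability, traceability, authentication and authorization, and support for the circuit breaker pattern.
A service mesh is implemented by giving each service instance a proxy instance called a sidecar. Sidecars handle interservice communication, monitoring, and security concerns (in fact, anything that can be abstracted out of the services). Developers can then write and maintain the application code in the services, while the operations team maintains the service mesh and runs the application.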
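Istio, backed by Google, IBM, and Lyft, is currently the best-known service mesh architecture. Kubernetes, originally designed by Google, is currently the only container orchestration framework that Istio supports.
Istio is not the only option, though, and other service mesh implementations are in development. The sidecar proxy pattern is the most popular. Alternative architectures also exist: Netflix provides service mesh functionality through application libraries (Ribbon, Hystrix, Eureka, Archaius), while platforms such as Azure Service Fabric embed service mesh-like functionality into the application framework.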
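Components of a service mesh:
- Container orchestration framework: As more and more containers are added to an application's infrastructure, a container orchestration framework (a separate tool for monitoring and managing the set of containers) becomes essential. Kubernetes dominates this market; even its main competitors, Docker Swarm and Mesosphere DC/OS, offer Kubernetes integration as an option.
- Services and instances (Kubernetes pods): An instance is a single running copy of a microservice. Sometimes an instance is a single container; in Kubernetes, an instance is a small group of interdependent containers called a pod. Clients generally do not access an instance or pod directly but go through a service, which is a scalable set of pod replicas, any of which can be killed at any time.
- Sidecar proxy: A sidecar proxy runs alongside a single pod and routes, or proxies, traffic to and from the containers it runs alongside. The sidecar communicates with other sidecar proxies and is managed by the orchestration framework. Many service mesh implementations use sidecar proxies to intercept and manage all traffic entering and leaving the pod.
- Service discovery: When an instance needs to interact with a different service, it must find (discover) a healthy, available instance of that service. Typically the instance performs a DNS lookup. The container orchestration framework keeps a list of instances that are ready to receive requests and provides the interface for DNS queries.
- Load balancing: Most container orchestration frameworks already provide transport-layer (OSI Layer 4) load balancing. A service mesh implements more sophisticated application-layer (OSI Layer 7) load balancing, with richer algorithms and more powerful traffic management. Load-balancing parameters can be configured through an API, which makes it possible to orchestrate blue-green or canary deployments.
- Encryption: The mesh encrypts and decrypts requests and responses, so the services themselves no longer carry that burden. It can also improve performance by preferring to reuse existing persistent connections, which reduces the cost of creating new ones. Mutual TLS (mTLS) is the most common way to encrypt traffic; a public key infrastructure (PKI) generates and distributes the certificates and keys used by the sidecar proxies.
- Authentication and authorization: The mesh authenticates and authorizes requests from both inside and outside the application and sends only validated requests on to instances.
- Support for the circuit breaker pattern: The mesh isolates unhealthy instances, then gradually brings them back into the healthy instance pool when warranted.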
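The load-balancing item in the list above mentions canary deployments driven by configurable parameters. A minimal sketch of the idea in Python follows, assuming a hypothetical routing rule expressed as per-version weights. The names `pick_version`, `v1`, and `v2-canary` are illustrative only and do not come from any particular mesh.

```python
import random
from collections import Counter
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a service version in proportion to its configured weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Hypothetical routing rule: 90% to the stable version, 10% to the canary.
weights = {"v1": 90, "v2-canary": 10}
rng = random.Random(42)  # seeded so repeated runs give the same split
counts = Counter(pick_version(weights, rng) for _ in range(10_000))
print(counts)

# Promoting the canary is a configuration change, not a redeploy.
weights.update({"v1": 0, "v2-canary": 100})
print(pick_version(weights, rng))
```

A real mesh would apply such weights inside each proxy on live requests; the sketch only shows that the traffic split is data that can be changed at run time.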
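The part of a service mesh that manages network traffic between instances is called the data plane. The configuration that controls the data plane's behavior is generated and deployed through a separate control plane. The control plane is typically a command-line or graphical tool that calls an API.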
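A common use case for a service mesh is solving very demanding operational problems when using containers and microservices. It cannot be the answer to every application operations and delivery problem. The elements of a service mesh architecture, such as NGINX, containers, Kubernetes, and microservices, can be used effectively in non-service mesh implementations. For example, Istio was developed as a complete service mesh architecture, but its modular design means developers can pick and choose the components they need. With that in mind, it is worth building a solid understanding of service mesh concepts.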
What Is a Service Mesh?
A service mesh is a configurable, low‑latency infrastructure layer designed to handle a high volume of network‑based interprocess communication among application infrastructure services using application programming interfaces (APIs). A service mesh ensures that communication among containerized and often ephemeral application infrastructure services is fast, reliable, and secure. The mesh provides critical capabilities including service discovery, load balancing, encryption, observability, traceability, authentication and authorization, and support for the circuit breaker pattern.
The service mesh is usually implemented by providing a proxy instance, called a sidecar, for each service instance. Sidecars handle interservice communications, monitoring, and security‑related concerns – indeed, anything that can be abstracted away from the individual services. This way, developers can handle development, support, and maintenance for the application code in the services; operations teams can maintain the service mesh and run the app.
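To make the division of labor concrete, here is a rough Python sketch of a sidecar that wraps a service handler. It is an in-process stand-in, not a real network proxy. The `Sidecar` class, the `greet` handler, and the caller names are all hypothetical and chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Sidecar:
    """Hypothetical in-process stand-in for a sidecar proxy."""
    service_name: str
    handler: Callable[[str], str]  # the application code being wrapped
    allowed_callers: Set[str] = field(default_factory=set)
    metrics: Dict[str, int] = field(
        default_factory=lambda: {"ok": 0, "denied": 0, "error": 0}
    )
    latencies: List[float] = field(default_factory=list)

    def handle(self, caller: str, payload: str) -> str:
        # Security concern, handled outside the application code.
        if caller not in self.allowed_callers:
            self.metrics["denied"] += 1
            raise PermissionError(f"{caller} may not call {self.service_name}")
        start = time.perf_counter()
        try:
            response = self.handler(payload)
        except Exception:
            self.metrics["error"] += 1
            raise
        finally:
            # Monitoring concern: record latency for every forwarded call.
            self.latencies.append(time.perf_counter() - start)
        self.metrics["ok"] += 1
        return response


def greet(payload: str) -> str:
    # Application code: knows nothing about authorization or metrics.
    return f"hello, {payload}"


sidecar = Sidecar("greeter", greet, allowed_callers={"frontend"})
print(sidecar.handle("frontend", "mesh"))  # hello, mesh
```

The point of the design is that `greet` stays free of cross-cutting concerns, so the policy in the wrapper can change without touching the service code.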
Istio, backed by Google, IBM, and Lyft, is currently the best‑known service mesh architecture. Kubernetes, which was originally designed by Google, is currently the only container orchestration framework supported by Istio. Vendors are seeking to build commercial, supported versions of Istio. It will be interesting to see the value they can add to the open source project.
Istio is not the only option, and other service mesh implementations are also in development. The sidecar proxy pattern is most popular, as illustrated by projects from Buoyant, HashiCorp, Solo.io, and others. Alternative architectures exist as well: Netflix’s technology suite is one such approach where service mesh functionality is provided by application libraries (Ribbon, Hystrix, Eureka, Archaius), and platforms such as Azure Service Fabric embed service mesh‑like functionality into the application framework.
Service mesh comes with its own terminology for component services and functions:
- Container orchestration framework. As more and more containers are added to an application’s infrastructure, a separate tool for monitoring and managing the set of containers – a container orchestration framework – becomes essential. Kubernetes seems to have cornered this market, with even its main competitors, Docker Swarm and Mesosphere DC/OS, offering integration with Kubernetes as an alternative.
- Services and instances (Kubernetes pods). An instance is a single running copy of a microservice. Sometimes the instance is a single container; in Kubernetes, an instance is made up of a small group of interdependent containers (called a pod). Clients rarely access an instance or pod directly; rather they access a service, which is a set of identical instances or pods (replicas) that is scalable and fault‑tolerant.
- Sidecar proxy. A sidecar proxy runs alongside a single instance or pod. The purpose of the sidecar proxy is to route, or proxy, traffic to and from the container it runs alongside. The sidecar communicates with other sidecar proxies and is managed by the orchestration framework. Many service mesh implementations use a sidecar proxy to intercept and manage all ingress and egress traffic to the instance or pod.
- Service discovery. When an instance needs to interact with a different service, it needs to find – discover – a healthy, available instance of the other service. Typically, the instance performs a DNS lookup for this purpose. The container orchestration framework keeps a list of instances that are ready to receive requests and provides the interface for DNS queries.
- Load balancing. Most orchestration frameworks already provide Layer 4 (transport) load balancing. A service mesh implements more sophisticated Layer 7 (application) load balancing, with richer algorithms and more powerful traffic management. Load‑balancing parameters can be modified via API, making it possible to orchestrate blue‑green or canary deployments.
- Encryption. The service mesh can encrypt and decrypt requests and responses, removing that burden from each of the services. The service mesh can also improve performance by prioritizing the reuse of existing, persistent connections, which reduces the need for the computationally expensive creation of new ones. The most common implementation for encrypting traffic is mutual TLS (mTLS), where a public key infrastructure (PKI) generates and distributes certificates and keys for use by the sidecar proxies.
- Authentication and authorization. The service mesh can authorize and authenticate requests made from both outside and within the app, sending only validated requests to instances.
- Support for the circuit breaker pattern. The service mesh can support the circuit breaker pattern, which isolates unhealthy instances, then gradually brings them back into the healthy instance pool if warranted.
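The circuit breaker pattern from the last item can be sketched in a few lines of Python. This is a simplified illustration, not how any particular mesh implements it. The class name, the failure threshold, and the cooldown are assumptions, and time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Optional


class CircuitBreaker:
    """Tracks one instance: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold: int = 3, cooldown: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means closed (healthy)

    def allows(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: once the cooldown has passed, trial requests go through.
        return now - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        # A success closes the circuit and returns the instance to the pool.
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open (or re-open) and restart the cooldown


breaker = CircuitBreaker(failure_threshold=2, cooldown=10.0)
breaker.record_failure(now=0.0)
breaker.record_failure(now=1.0)   # threshold reached: instance is isolated
print(breaker.allows(now=5.0))    # False, still cooling down
print(breaker.allows(now=11.0))   # True, trial requests are allowed
breaker.record_success()          # trial succeeded: back in the pool
print(breaker.allows(now=12.0))   # True
```

A production breaker would usually limit how many trial requests pass in the half-open state and ramp traffic back up gradually; the sketch keeps only the state transitions.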
The part of a service mesh application that manages the network traffic between instances is called the data plane. Generating and deploying the configuration that controls the data plane’s behavior is done using a separate control plane. The control plane typically includes, or is designed to connect to, an API, a command‑line interface, and a graphical user interface for managing the app.
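The split between the two planes can be illustrated with a small Python sketch. The `ControlPlane` and `SidecarProxy` classes, the service name, and the addresses are hypothetical. Real control planes distribute configuration over the network rather than through direct method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SidecarProxy:
    """Data plane: forwards requests according to its current config."""
    name: str
    routes: Dict[str, str] = field(default_factory=dict)

    def apply_config(self, routes: Dict[str, str]) -> None:
        self.routes = dict(routes)  # keep a private copy of the pushed config

    def route(self, service: str) -> str:
        return self.routes[service]


@dataclass
class ControlPlane:
    """Control plane: holds the desired config and pushes it to every proxy."""
    proxies: List[SidecarProxy] = field(default_factory=list)
    routes: Dict[str, str] = field(default_factory=dict)

    def register(self, proxy: SidecarProxy) -> None:
        self.proxies.append(proxy)
        proxy.apply_config(self.routes)

    def set_route(self, service: str, address: str) -> None:
        self.routes[service] = address
        for proxy in self.proxies:
            proxy.apply_config(self.routes)


control = ControlPlane()
proxy_a = SidecarProxy("orders-sidecar")
proxy_b = SidecarProxy("billing-sidecar")
control.register(proxy_a)
control.register(proxy_b)

# One change in the control plane reaches every proxy in the data plane.
control.set_route("inventory", "10.0.0.7:8080")
print(proxy_a.route("inventory"), proxy_b.route("inventory"))
```

The proxies never decide policy themselves; they only apply what was pushed, which is what lets operators change behavior across the mesh from one place.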
Service mesh architectures are unlikely ever to be the answer to all application operations and delivery problems. Architects and developers have a great many tools, only one of which is a hammer, and must address a great many types of problems, only one of which is a nail. The NGINX Microservices Reference Architecture, for instance, includes several different models that give a continuum of approaches to using microservices to solve problems.
The elements that come together in a service mesh architecture – such as NGINX, containers, Kubernetes, and microservices as an architectural approach – can be, and are, used productively in non‑service mesh implementations. For example, Istio was developed as a complete service mesh architecture, but its modular design means developers can pick and choose the component technologies they need. With this in mind, it’s worth developing a solid understanding of service mesh concepts, even if you’re not sure if and when you’ll fully implement a service mesh application.