Update Cilium to expose Prometheus metrics
Title: Expose Cilium Metrics to Prometheus
Author: Mitch Murphy
Date: 2023-12-15
Introduction
This guide serves as a follow up to a previous article covering how to configure RKE2 to use Cilium as the CNI (and fully replace kube-proxy)
Prerequisites
- RKE2 Cluster
- cilium
- kubectl
- helm
Prometheus
Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It is particularly well-suited for monitoring dynamic, cloud-native environments such as Kubernetes. Here’s an overview of Prometheus and how it facilitates monitoring in a Kubernetes environment:
Prometheus Overview
- Data Model:
- Prometheus follows a multi-dimensional data model, where time-series data is identified by metric names and key-value pairs called labels. Metrics represent numerical values that can be observed and monitored over time.
- Data Retrieval:
- Prometheus uses a pull-based approach to collect metrics from monitored targets. Each target (e.g., a Kubernetes node or container) exposes an HTTP endpoint where Prometheus can fetch metrics.
- Scalability:
- Prometheus is designed to be highly scalable, making it suitable for large-scale distributed systems like Kubernetes. It can handle a high volume of metrics from numerous targets.
- Query Language:
- Prometheus Query Language (PromQL) allows users to express complex queries to analyze and aggregate collected metrics. PromQL facilitates the creation of custom dashboards and alerts. Monitoring Kubernetes with Prometheus:
- Exporter Components:
- Kubernetes exposes a rich set of metrics through its API server, kubelet, and other components. Prometheus exporters are used to convert these metrics into a format that Prometheus can scrape.
- Service Discovery:
- Prometheus uses service discovery mechanisms to automatically discover and monitor new instances of services as they are deployed in Kubernetes. Kubernetes Service Discovery and DNS-based service discovery are commonly used.
- Instrumentation:
- Kubernetes components are instrumented to expose metrics in a format that Prometheus understands. Prometheus also supports custom instrumentation, allowing users to expose application-specific metrics.
- Alerting and Monitoring:
- Prometheus provides a built-in alerting system that allows users to define alerting rules based on metric conditions. Grafana is often used in conjunction with Prometheus to create dashboards for visualizing metrics.
- Integration with Kubernetes Ecosystem:
- Prometheus is a CNCF (Cloud Native Computing Foundation) project and is well-integrated with the Kubernetes ecosystem. It works seamlessly with other tools like Grafana, Alertmanager, and Kubernetes itself.
In summary, Prometheus is a powerful monitoring solution for Kubernetes, offering a flexible and scalable approach to collecting, querying, and alerting based on the metrics generated by the Kubernetes ecosystem. It plays a crucial role in ensuring the reliability and performance of applications deployed in Kubernetes clusters.
Prometheus Installation
The Prometheus stack includes a few components that greatly help SREs/administrators monitor/observe deployed Kubernetes resources. These include:
- The Prometheus Operator
- Highly available Prometheus
- Highly available Alertmanager
- Prometheus node-exporter
- Prometheus Adapter for Kubernetes Metrics APIs
- kube-state-metrics
- Grafana
Installing the Prometheus stack on Kubernetes using Helm is a common and convenient approach. Helm is a package manager for Kubernetes that simplifies the deployment and management of applications. Here’s a step-by-step guide for Kubernetes administrators on installing the Prometheus stack using Helm:
-
Add Prometheus Helm Repository:
- Add the Prometheus Helm repository to your Helm client.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
-
Create Namespace (Optional):
- Create a dedicated namespace for Prometheus if desired.
kubectl create namespace monitoring
-
Install Prometheus Operator:
- Install the Prometheus Operator using Helm.
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
-
Verify Installation:
- Check the status of the deployed resources.
kubectl get pods -n monitoring
-
Access Prometheus Dashboard:
- Prometheus comes with a web-based dashboard. Expose it for access.
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-prometheus 9090
Access the Prometheus dashboard at http://localhost:9090 in your web browser.
-
Access Grafana Dashboard (Optional):
- Grafana is included in the Prometheus stack. Expose it for access.
kubectl port-forward -n monitoring svc/prometheus-stack-grafana 3000:80
Access the Grafana dashboard at http://localhost:3000 and log in (default credentials are admin/prom-operator).
-
Configure Data Source in Grafana (Optional):
- Configure Prometheus as a data source in Grafana.
- URL: http://prometheus-stack-kube-prom-prometheus:9090
- Save the configuration.
- Configure Prometheus as a data source in Grafana.
-
Explore and Customize:
- Explore the Prometheus dashboard and Grafana to monitor your Kubernetes cluster.
- Customize Grafana dashboards and Prometheus alerting rules as needed.
By following these steps, Kubernetes administrators can easily set up and start using the Prometheus stack for monitoring their Kubernetes clusters. The Helm charts provided by the Prometheus community simplify the installation process and allow for customization based on specific monitoring requirements. Take note of the namespace where you install the prometheus-stack to as we will need to specify that when we upgrade Cilium. Stay tuned when I cover configuring alerts in Prometheus and integration with Alertmanager!
Cilium
Please refer to my previous post to find more information on what Cilium offers.
Cilium and Prometheus
Integrating Cilium and Prometheus in a Kubernetes environment brings several benefits, particularly in terms of exposing advanced metrics and enhancing the observability of your microservices-based architecture. Here are the key advantages of integrating Cilium and Prometheus:
- Rich Network Visibility:
- Cilium Network Security:
- Cilium provides advanced network security features, including API-aware network security policies, load balancing, and encryption.
- Prometheus integration with Cilium allows you to capture and visualize metrics related to network security, such as the number of allowed/denied connections and data on network policies.
- Cilium Network Security:
- Fine-Grained Service Metrics:
- Service-Aware Metrics:
- Cilium can enforce policies at the service or application layer, providing insights into the interactions between microservices.
- Prometheus can scrape and expose fine-grained metrics related to service communication, allowing you to monitor service-level performance and troubleshoot issues.
- Service-Aware Metrics:
- Distributed Tracing:
- Tracepoint Metrics:
- Cilium includes tracepoint-based metrics that can be exposed through Prometheus.
- By integrating Cilium and Prometheus, you can leverage these metrics for distributed tracing, gaining visibility into the flow of requests across microservices and identifying bottlenecks or latency issues.
- Tracepoint Metrics:
- Security Metrics:
- Security Incident Monitoring:
- Cilium enhances security by providing metrics related to security incidents and policy enforcement.
- Prometheus integration enables monitoring of security-related metrics, helping Kubernetes engineers detect and respond to security events in real-time.
- Security Incident Monitoring:
- Efficient Resource Utilization:
- Resource Consumption Metrics:
- Prometheus can collect and expose metrics related to resource consumption at the network level.
- This integration allows Kubernetes engineers to optimize resource allocation, identify performance bottlenecks, and ensure efficient use of network resources.
- Resource Consumption Metrics:
- Customizable Dashboards:
- Grafana Integration:
- Prometheus integrates seamlessly with Grafana, a popular visualization tool.
- Engineers can create customizable dashboards in Grafana to visualize Cilium and Kubernetes metrics, providing a comprehensive view of the entire system’s health and performance.
- Grafana Integration:
- Scalability and Performance Monitoring:
- Scalability Metrics:
- Cilium and Prometheus integration enables monitoring of metrics related to the scalability and performance of microservices.
- Engineers can use this information to optimize configurations, scale resources based on demand, and ensure a responsive and scalable application architecture.
- Scalability Metrics:
- Alerting and Automation:
- Alertmanager Integration:
- Prometheus integrates with Alertmanager to provide alerting capabilities based on predefined rules.
- Engineers can set up alerts for key metrics, enabling proactive monitoring and automated responses to potential issues.
- Alertmanager Integration:
In summary, integrating Cilium and Prometheus in a Kubernetes environment enhances observability by providing detailed metrics related to network security, service communication, distributed tracing, and resource utilization. This integration empowers Kubernetes engineers to monitor, analyze, and optimize the performance, security, and scalability of their microservices applications.
Cilium Upgrade
In order to integrate our previously installed Cilium/Hubble we need to update a few of our Helm values, particularly that involving ServiceMonitors. Update the file containing the values (detailed here) to:
cluster:
name: smig-cluster1
id: 11
prometheus:
enabled: true
serviceMonitor:
enabled: false
dashboards:
enabled: true
hubble:
metrics:
enabled:
- dns:query;ignoreAAAA
- drop
- tcp
- flow
- icmp
- http
dashboards:
enabled: true
serviceMonitor:
enabled: true
namespace: monitoring
labels:
release: prometheus-stack
relay:
enabled: true
prometheus:
enabled: true
serviceMonitor:
enabled: true
namespace: monitoring
labels:
release: prometheus-stack
ui:
enabled: true
baseUrl: "/"
version: 1.14.3
operator:
prometheus:
enabled: true
serviceMonitor:
enabled: true
namespace: monitoring
labels:
release: prometheus-stack
dashboards:
enabled: true
envoy:
prometheus:
enabled: true
serviceMonitor:
enabled: true
namespace: monitoring
labels:
release: prometheus-stack
clustermesh:
apiserver:
metrics:
enabled: false
serviceMonitor:
enabled: true
namespace: monitoring
labels:
release: prometheus-stack
Take note of the label that is added to each ServiceMonitor, this must be present in order for Prometheus to detect the “target” and start scraping it. Now let’s upgrade Cilium:
cilium upgrade -f cilium.yaml
Cilium ServiceMonitor Verification
n order to verify that Prometheus is picking up these new targets, you need to port-forward and make sure that they show up under Status -> Targets.
Cilium Grafana Dashboards
Because we specified in our values file that we wish to create Grafana dashboards, these get created as ConfigMaps and automatically picked up by Grafana. Until we expose Grafana as an Ingress, we must port-forward to it:
kubectl port-forward svc/prometheus-stack-grafana 8080:80 &
Now visit our dashboards, and click on the Hubble dashboard. Because we have not deployed anything to our cluster we should not see much here, but as you can see Cilium/Hubble provides quite advanced metrics! We will visit this dashboard again after we deploy a few things and test our system!