With New Relic’s most recent addition, Pixie
The goal: track network traffic on a Kubernetes (k8s) cluster, capturing each request-response pair and its associated metadata, then stream the captured data out of the cluster to be analysed further.
Since we’re planning on tracing each network request (OSI Layer 7), there are a few considerations that form a part of our requirements. They are:
- Lightweight → Low memory and CPU footprint are critical.
- Scalable → New nodes are added and removed from a cluster, and the solution should scale accordingly.
- Language and Framework agnostic → Containers might be built with any number of languages and frameworks; this should not affect our tracking.
- Rapid → The volume of network data on a server can be massive. Our solution must be able to track and report that data to us within reasonable time limits.
- Comfortable → Users should feel comfortable/safe/assured when deploying this on their production clusters.
- Plug and Play → Low onboarding time and cost. No code changes to attach hooks.
Based on the above requirements, we chose Pixie for this task. Pixie is an open source observability platform originally built by Pixie Labs and now part of New Relic, one of the leaders in the APM space.
The fact that Pixie is a Cloud Native Computing Foundation project and that New Relic uses it for its own Kubernetes observability lends credibility to the project. This addresses the rapid and comfortable considerations listed above.
In their own words:
“Open source Kubernetes observability for developers
Auto-instrumented. Scriptable. Kubernetes native.”
For us, it means easy deployments, custom data points tracking (if required), and native capture of requests and responses made on the cluster.
So, how does Pixie do that? And what makes it special?
Pixie uses eBPF technology to run event-driven programs at the kernel level. This removes the need for any changes to application code and makes the Linux kernel Pixie's first-class citizen.
This makes Pixie language and framework agnostic, and since eBPF works on any system running the Linux kernel, Pixie has a wide install base.
Note: This also means Pixie will not run on Windows servers, though the volume of production k8s clusters running on Windows servers might be pretty low.
Full list of supported Pixie environments and cluster types: Install | Requirements.
[Optional] Further resources on eBPF technology:
But still, why Pixie? What are its advantages?
- Pixie's Edge Module runs at the node level. So, irrespective of the pods and services running, only a single Edge Module deployment is required per node. This addresses lightweight and scalable.
- Pixie has a very simple deployment and installation. Once authenticated, only the px deploy command needs to be run. This addresses plug and play.
- As soon as it is deployed, Pixie is ready to report on network data. No other setup is required.
- Pixie provides an easy-to-use API (in Python and Go) for extracting/fetching/streaming that data via the internet. It also supports in-transit e2e encryption of the data.
We go ahead with Pixie as it addresses most of our questions while providing additional features which make development easier.
In the next sections, we’ll discuss Pixie-specific things, such as:
- What data does Pixie track, and what is its impact on the cluster?
- Finally, streaming data out from the cluster using Pixie’s API.
Installing Pixie is pretty easy; there are only a few steps to follow. Pixie’s documentation: Comprehensive Install Guide for Pixie Community Edition.
If you already have a Kubernetes cluster available, you can skip this section and go straight to the installation steps for Pixie. Here, we’re going to install and run a minikube cluster for local testing.
Steps to set up a local Kubernetes environment:
Step 1: Install minikube.
Step 2: Increase the configured CPU and memory limits. This is required as we’ll be deploying Pixie and a microservice for request simulation.
minikube config set cpus 4
minikube config set memory 8192
If you already have minikube installed and running, you need to run minikube delete before your changes take effect.
To check that the changes were applied correctly, run the commands below.
minikube config get memory
# Output: 8192
minikube config get cpus
# Output: 4
Step 3: Start minikube with the hyperkit driver.
minikube start --driver=hyperkit
Steps to install Pixie on a cluster:
Step 1: The main interface for Pixie to interact with the cluster is its CLI. Install it via:
# Copy and run command to install the Pixie CLI.
bash -c "$(curl -fsSL https://withpixie.ai/install.sh)"
Step 2: Pixie (CLI as well) requires an account to work and authenticate with the cluster. To authenticate Pixie CLI, run:
px auth login
Step 3: Once login is successful, deploy Pixie on the cluster.
px deploy
That’s it ✅ for Pixie’s installation. We’ve set up a cluster and installed Pixie on it. But we can’t track traffic if there isn’t any, so we deploy a demo application: Weaveworks’ Sock Shop.
px demo deploy px-sock-shop
Once this command finishes, to check if everything is installed correctly, run the following command:
kubectl get pods -A
If you look closely, there is a load-test-* pod in Init state, which is simulating traffic.
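Eyeballing the pod list works for a demo cluster, but the same check can also be scripted. The sketch below is not part of Pixie. It assumes kubectl is on the PATH, and the helper names fetch_pods and pods_not_running are hypothetical. It uses the JSON form of the command above to list pods whose phase is neither Running nor Succeeded.

```python
import json
import subprocess
from typing import List


def fetch_pods() -> dict:
    """Return the parsed output of `kubectl get pods -A -o json`."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


def pods_not_running(pod_list: dict) -> List[str]:
    """List 'namespace/name' for pods whose phase is not Running or Succeeded."""
    not_ready = []
    for pod in pod_list.get("items", []):
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in ("Running", "Succeeded"):
            meta = pod["metadata"]
            not_ready.append(f"{meta['namespace']}/{meta['name']}")
    return not_ready
```

Calling pods_not_running(fetch_pods()) against the cluster returns the pods that are still starting up; an empty list suggests everything has been scheduled and started.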
Go to Pixie’s Live UI for exploring all the metrics that it collects:
Some example views:
The most relevant for us:
Pixie comes with a lot of scripts preinstalled to make all sorts of tracking easier — About Pixie | Data Sources
Below is a list of important things to note regarding its data collection and impact on the cluster:
- Pixie stores the data it collects in-memory on the nodes in your cluster; no data is sent to a centralized backend outside of the cluster.
- Pixie has a 2GiB memory requirement. After installing Pixie, it is normal to see a temporary increase in memory usage.
- How much data does Pixie store?
Retention time depends on the level of traffic in your cluster, but will generally be on the order of hours. Each data table in Pixie (e.g. http_events) has its own maximum size. These tables collect data until the maximum size is reached, and then begin dropping the oldest data.
- Typical performance overhead for node CPU usage is between 2–5%, depending on the amount and type of traffic. According to Pixie’s documentation, this is being optimized, so the overhead may go down further.
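The drop-oldest behaviour of Pixie's data tables can be pictured as a bounded buffer. The sketch below is purely illustrative and is not Pixie's implementation: the BoundedTable class and its row-count limit are hypothetical, since Pixie's limits are expressed as a table size rather than a number of rows.

```python
from collections import deque


class BoundedTable:
    """Toy table that keeps at most `max_rows` rows, evicting the oldest first."""

    def __init__(self, max_rows: int) -> None:
        # A deque with maxlen silently drops items from the left once it is full.
        self._rows = deque(maxlen=max_rows)

    def append(self, row: dict) -> None:
        self._rows.append(row)

    def rows(self) -> list:
        return list(self._rows)


table = BoundedTable(max_rows=3)
for i in range(5):
    table.append({"req_id": i})

print([r["req_id"] for r in table.rows()])  # → [2, 3, 4]
```

This also shows why retention time depends on traffic: the busier the cluster, the faster the table fills up and the sooner old rows are evicted.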
Now that we have installed Pixie and it has started collecting HTTP events data, our goal is to stream that data out.
Pixie provides an API that we can use. To get started we require two values:
Once we have these values ready, we can set up a new Python virtual environment and install pxapi in it.
pip install pxapi
If the above fails on macOS, try:
GRPC_PYTHON_BUILD_SYSTEM_ZLIB=true pip install pxapi
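With pxapi installed, a quick connectivity check can be useful before writing a longer script. The sketch below assumes the two required values are a Pixie API key and the cluster ID, which is what pxapi's Client and connect_to_cluster take. The environment variable names PX_API_KEY and PX_CLUSTER_ID and both helper functions are hypothetical, chosen only to keep secrets out of the source file.

```python
import os

# PxL script: read two columns from the http_events table and show 10 rows.
PXL_SCRIPT = """
import px
df = px.DataFrame('http_events')[['resp_status', 'req_path']]
df = df.head(10)
px.display(df, 'http_table')
"""


def load_credentials(env=os.environ):
    """Read the API key and cluster ID from (hypothetical) environment variables."""
    try:
        return env["PX_API_KEY"], env["PX_CLUSTER_ID"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None


def print_http_events():
    """Connect to the cluster and print the rows produced by PXL_SCRIPT."""
    import pxapi  # imported lazily so load_credentials works without pxapi

    api_key, cluster_id = load_credentials()
    client = pxapi.Client(token=api_key)
    conn = client.connect_to_cluster(cluster_id)
    script = conn.prepare_script(PXL_SCRIPT)
    # "http_table" must match the name passed to px.display in the PxL script.
    for row in script.results("http_table"):
        print(row["resp_status"], row["req_path"])
```

After exporting both variables in the shell, calling print_http_events() should print up to ten status and path pairs if the demo traffic is flowing.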
Script to stream