xvs

package module
v0.2.10
Published: Aug 1, 2025 License: GPL-2.0 Imports: 20 Imported by: 1


XDP Virtual Server

An XDP/eBPF load balancer and Go API for Linux.

I have now moved the new IPv6/layer 3 tunneling branch to main. It is not yet fully ready, so continue to use the v0.1 branch/tags for production code. Ubuntu 24.04 (kernel 6.11.0) is used for development, and it is quite possible that older kernels will fail to load the eBPF code.

This code originated in the vc5 load balancer and has been split out to be developed separately.

XVS implements a layer 4 Direct Server Return (DSR) load balancer with an eBPF data plane (that is loaded into the kernel), and a supporting Go library to configure the balancer through the XDP API.

IPv6 and layer 3 tunnels are now supported. Tunnel types implemented are: IP-in-IP (all flavours), GRE, FOU and GUE. A NAT system provides a mechanism to directly query services via the virtual IP address on backends (using the appropriate tunnel type), which allows a client to perform accurate health checks and so enable/disable new connections to targets as necessary.

There is no requirement to use the same address family for virtual and real server addresses; you can forward IPv6 VIPs to backends using an IPv4 tunnel endpoint, and vice versa.

Some facilities may not have been implemented in the new code yet, but will be added shortly.

A compiled BPF ELF object file is committed to this repository (tagged versions) and is accessed via Go's embed feature, which means the library can be used as a standard Go module without building the binary as a separate step. libbpf is still required when linking programs that use the library; the CGO_CFLAGS and CGO_LDFLAGS environment variables may be needed to specify its location (see the Makefile for an example of how to do this).

Portability

eBPF code is JITted to the native instruction set at runtime, so this should run on any Linux architecture. Currently AMD64 and ARM (Raspberry Pi) are confirmed to work.

Devices with constrained memory might have issues loading the default size flow state tables. This can now be overridden with the FlowsPerCPU parameter.

  • cmd/balancer -r 180 wlan0 192.168.0.1/24 192.168.101.1 192.168.0.10

Documentation

Some notes about design are in the doc/ directory, and the Go API is documented below.

The API is loosely modelled on the Cloudflare IPVS library.

Sample application

A simple application in the cmd/ directory balances traffic for a VIP (TCP port 80 by default; configurable with flags) across a number of backend servers on the same IP subnet.

Compile and run with, e.g.:

  • make
  • cmd/balancer -r 180 -t gre ens192 10.1.2.254/24 192.168.101.1 10.1.10.100 10.1.10.101

where:

  • 180 is the number of seconds to run for
  • gre is the tunnel type
  • ens192 is the network card you wish to load the XDP program onto
  • 10.1.2.254/24 is the IP address of the router that will handle tunneled traffic (the /24 allows the library to determine the local IP address to use as the source for tunnel packets)
  • 192.168.101.1 is the VIP
  • 10.1.10.100 and 10.1.10.101 are two real servers to send the traffic to

Only port 80/tcp is forwarded by default, but other ports can be added (-h for help).

On a separate client machine on the same subnet you should add a static route for the VIP, directed at the load balancer's own IP address, e.g.:

  • ip r add 192.168.101.1 via 10.1.2.3

You should then be able to contact the service:

  • curl http://192.168.101.1/

No health checking is done, so you'll have to make sure that a webserver is running on the real servers and that the VIP has been configured on the loopback interface (ip a add 192.168.101.1 dev lo).

This is not intended to be a useful utility in itself, but rather an example of using the library. A more complete example, with health checks and BGP route health injection, is currently available at vc5.

Performance

This has mostly been tested using Icecast backend servers with clients pulling a mix of low and high bitrate streams (48kbps - 192kbps).

A VMware guest (4 core, 8GB) using the XDP generic driver was able to support 100K concurrent clients, with 380Mbps/700Kpps through the load balancer and 8Gbps of traffic from the backends directly to the clients. Going above 700Kpps caused connections to be dropped, regardless of the number of cores or memory assigned to the VM, so I suspect that there is a limit on the number of interrupts that the VM is able to handle per second.

On a single (non-virtualised) Intel Xeon Gold 6314U CPU (2.30GHz, 32 physical cores, with hyperthreading enabled for 64 logical cores) and an Intel 10G 4P X710-T4L-t ethernet card, I was able to run 700K streams with 2Gbps/3.8Mpps ingress traffic and 46.5Gbps egress. The server was more than 90% idle. Unfortunately I did not have the resources available to create more clients/servers. I later realised that these tests were carried out with the server's power profile set to performance-per-watt; in performance mode, CPU usage is barely 2% and latency is under 250 nanoseconds.

The above tests were carried out on the old layer 2 code, but results should be broadly the same; I'll run some updated tests soon.

On a Raspberry Pi (B+) ... don't get your hopes up!

Recalcitrant cards

I initially had problems with the Intel X710 card, but some combination of SFP+ module/optics replacement and moving to Ubuntu 24.04 seems to have fixed the issue.

The Intel X520 cards that I had previously used work flawlessly.

API documentation

Types

type Client

type Client interface {
	Info() (Info, error)

	Config() (Config, error)
	SetConfig(Config) error

	Services() ([]ServiceExtended, error)
	Service(Service) (ServiceExtended, error)
	CreateService(Service) error
	UpdateService(Service) error
	RemoveService(Service) error

	Destinations(Service) ([]DestinationExtended, error)
	CreateDestination(Service, Destination) error
	UpdateDestination(Service, Destination) error
	RemoveDestination(Service, Destination) error

	// SetService combines the functionality of CreateService,
	// UpdateService, CreateDestination, UpdateDestination and
	// RemoveDestination. If the service does not exist it will be
	// created with the given parameters and destinations, or updated
	// to match them if extant.
	SetService(Service, ...Destination) error

	VIPs() ([]VIP, error)
	VIP(netip.Addr) (VIP, error)

	// NAT returns an address which can be used to query a specific
	// virtual IP on a backend ("real") server, this can be used to
	// implement health checks which accurately reflect the ability of
	// the backend to serve traffic for a particular VIP.
	NAT(vip, rip netip.Addr) (nat netip.Addr)

	// ReadFlow retrieves an opaque flow record from a queue written
	// to by the kernel. If no flow records are available then a zero
	// length slice is returned. This can be used to share flow state
	// with peers, storing the flow with the WriteFlow()
	// function. Stale records in the queue (older than a few seconds)
	// are skipped.
	ReadFlow() []byte
	WriteFlow([]byte)
}

func New added in v0.2.0

func New(interfaces ...string) (Client, error)

func NewWithOptions added in v0.2.0

func NewWithOptions(options Options, interfaces ...string) (Client, error)

type Config added in v0.2.0

type Config struct {
	IPv4VLANs map[uint16]netip.Prefix
	IPv6VLANs map[uint16]netip.Prefix
	Routes    map[netip.Prefix]uint16
}

type Destination

type Destination struct {
	Address               netip.Addr
	TunnelType            TunnelType
	TunnelPort            uint16
	TunnelEncapNoChecksum bool
	Disable               bool
}

type DestinationExtended

type DestinationExtended struct {
	Destination       Destination
	ActiveConnections uint32
	Stats             Stats
	Metrics           map[string]uint64
	MAC               [6]byte
}

type Info

type Info struct {
	Stats   Stats
	Latency uint64
	Metrics map[string]uint64
	IPv4    netip.Addr
	IPv6    netip.Addr
}

type Logger added in v0.2.7

type Logger interface {
	Debug(msg string, args ...any)
}

type Options added in v0.2.0

type Options struct {
	DriverMode         bool                    // Use XDP_FLAGS_DRV_MODE flag when attaching interface
	Bonding            bool                    // Explicitly declare interfaces to be aggregated
	BPFProgram         []byte                  // Override the embedded BPF program with this object code
	FlowsPerCPU        uint32                  // Override default size of flow tracking tables
	InterfaceInitDelay uint8                   // Pause (seconds) between each link attach/detach; to prevent bonds flapping
	IPv4VLANs          map[uint16]netip.Prefix // VLAN ID/IPv4 Prefix mapping
	IPv6VLANs          map[uint16]netip.Prefix // VLAN ID/IPv6 Prefix mapping
	Routes             map[netip.Prefix]uint16 // Override route selection for layer 3 backends; prefix-to-VLAN ID map
	Logger             Logger
}

type Protocol

type Protocol uint8

const (
	TCP Protocol = 0x06
	UDP Protocol = 0x11
)

type Service

type Service struct {
	Address  netip.Addr
	Port     uint16
	Protocol Protocol
	Sticky   bool
}

type ServiceExtended

type ServiceExtended struct {
	Service Service
	Stats   Stats
	Metrics map[string]uint64
}

type Stats

type Stats struct {
	Connections     uint64
	IncomingPackets uint64
	IncomingBytes   uint64
}

type TunnelType added in v0.2.0

type TunnelType uint8

type VIP added in v0.2.0

type VIP struct {
	Address netip.Addr
	Stats   Stats
	Metrics map[string]uint64
}
