etcd Snapshot Too Big?

If you run a self-managed Kubernetes cluster, regular etcd backups should be part of your cluster recovery strategy. The simplest way to take an etcd snapshot and save it to disk is the following command:

export ETCDCTL_API=3

timestamp=$(date +%Y%m%d-%H%M%S)

etcdctl --endpoints 127.0.0.1:2379 \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  snapshot save /data/etcd-backup/snapshot-$timestamp.db
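To keep an eye on how large each snapshot is, a small helper along the following lines could be run right after the save. This is only a sketch: the function name `snapshot_size_mib` is made up for illustration, and it assumes nothing beyond a POSIX shell and `wc`.

```shell
# Print the size of a snapshot file in whole MiB (rounded down).
# snapshot_size_mib is a hypothetical helper name, not part of etcdctl.
snapshot_size_mib() {
  # wc -c reads the file via stdin, so it prints only the byte count.
  bytes=$(wc -c < "$1") || return 1
  echo $(( bytes / 1024 / 1024 ))
}
```

For example, `snapshot_size_mib /data/etcd-backup/snapshot-$timestamp.db` prints the size of the snapshot taken above, which makes it easy to log the value or compare it against a threshold of your choosing.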

However, you may find that the snapshot has grown to several hundred megabytes, which is a lot for a simple cluster containing 10-20 active namespaces and hundreds of pods. The usual cause is internal fragmentation of the etcd database file: when old revisions are compacted, the space they occupied is freed inside the file but is not returned to the file system. Defragmenting all the etcd members releases that space, which reduces both the snapshot size and the disk space used in /var/lib/etcd. The following command should be sufficient to accomplish this:

export ETCDCTL_API=3

etcdctl --endpoints 127.0.0.1:2379 \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  defrag --cluster=true
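To judge whether defragmentation is worth running, it can help to compare the size of the database file on disk with the amount of data actually in use. `etcdctl endpoint status --write-out=table` reports the database size per endpoint. The sketch below turns two such byte counts into a percentage. The function name `reclaimable_pct` is hypothetical, and the JSON field names mentioned in the comment (`dbSize`, `dbSizeInUse`) are an assumption you should check against the output of your etcd version.

```shell
# Percentage of the database file that defragmentation could reclaim.
# Arguments: $1 = total size on disk in bytes, $2 = bytes logically in use.
# reclaimable_pct is a hypothetical helper name, not part of etcdctl.
reclaimable_pct() {
  # Guard against an empty or zero size to avoid dividing by zero.
  if [ "${1:-0}" -le 0 ]; then
    echo 0
    return
  fi
  # Integer arithmetic, rounded down.
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Assumed source of the two numbers (verify for your version):
#   etcdctl endpoint status --write-out=json
# is expected to include dbSize and dbSizeInUse for each endpoint.
```

With made-up numbers, `reclaimable_pct 419430400 104857600` (400 MiB on disk, 100 MiB in use) prints `75`, meaning roughly three quarters of the file is free space that a defrag could release.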

Learn more about etcd defragmentation in the maintenance section of the official etcd documentation.