etcd Snapshot Too Big?

Assuming you run a self-managed Kubernetes cluster, you are probably taking etcd backups as part of your cluster recovery strategy. The following command is about the simplest way to take an etcd snapshot and save it to disk.

export ETCDCTL_API=3

timestamp=$(date +%Y%m%d-%H%M%S)

etcdctl --endpoints 127.0.0.1:2379 \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  snapshot save /data/etcd-backup/snapshot-$timestamp.db
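After each backup it can help to record how large the snapshot is, so that unexpected growth is noticed early. The following is a minimal sketch in POSIX shell; the function names `snapshot_size_mib` and `snapshot_too_big` and the threshold value are hypothetical, not part of etcdctl.

```shell
# Hypothetical helpers for checking the size of a snapshot file.
# Variables are not declared local, to stay within plain POSIX sh.

# Print the size of the given file in whole MiB (rounded down).
snapshot_size_mib() {
  bytes=$(wc -c < "$1") || return 1
  echo $(( bytes / 1024 / 1024 ))
}

# Succeed (exit status 0) when the file is larger than the threshold in MiB.
# Usage: snapshot_too_big <file> <threshold-mib>
snapshot_too_big() {
  size=$(snapshot_size_mib "$1") || return 2
  [ "$size" -gt "$2" ]
}
```

For example, `snapshot_too_big "/data/etcd-backup/snapshot-$timestamp.db" 200 && echo "snapshot larger than 200 MiB"` could be appended to the backup script; the 200 MiB limit is only an illustrative value.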

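However, you may sometimes find that the backup has grown to several hundred megabytes, which is large for a simple cluster with 10-20 active namespaces and a few hundred pods. This is usually caused by internal fragmentation of the etcd database file: when old revisions are compacted, the freed space stays inside the file instead of being returned to the file system. The backup size (and the disk space used in /var/lib/etcd) can be reduced by defragmenting all the etcd members. Keep in mind that a member blocks reads and writes while it is being defragmented, so pick a quiet time. The following command should be sufficient to do so.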

export ETCDCTL_API=3

etcdctl --endpoints 127.0.0.1:2379 \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  defrag --cluster=true
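To judge whether defragmentation is worthwhile, you can compare the on-disk size of the database with the space that is actually in use. The sketch below only does the arithmetic; `frag_percent` is a hypothetical helper name, and the byte counts are assumed to come from the `dbSize` and `dbSizeInUse` fields that `etcdctl endpoint status --write-out=json` reports on recent etcd versions.

```shell
# Hypothetical helper: percentage of the database file that is free
# (fragmented) space and could be reclaimed by defragmentation.
# Usage: frag_percent <db-size-bytes> <db-size-in-use-bytes>
frag_percent() {
  # Avoid division by zero for an empty or unknown database size.
  if [ "$1" -le 0 ]; then
    echo 0
    return 0
  fi
  # Integer percentage, rounded down.
  echo $(( ($1 - $2) * 100 / $1 ))
}
```

For instance, `frag_percent 400000000 100000000` prints `75`, meaning about three quarters of a 400 MB database file is reclaimable. Running the same check after the defrag command should show a value close to zero.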

Read more about defragmentation in the etcd maintenance documentation.