Evergreen Google Containers

Slides available at:
slides.mobiusconsortium.org/blake/evergreengoogledocker

Created by Blake Graham-Henderson / blake@mobiusconsortium.org

Crossroads

  • Stay the same
    • Buy new servers
    • Renew contract with hosting company
    • Continue processes for upgrades and rollouts
  • Go another direction
    • Cloud hosting cost analysis

What's needed?

  • Database server(s), SSDs
  • Application/Utility Server(s)
  • Load balancing
  • Networking, public IPs, bandwidth

Virtual Machines

  • VM hosting, 4CPU, 15GB Memory, 100GB disk = $101.09

Docker is interesting

Benefit

No longer need to worry about

  • /etc/ha.d/*
  • Bandwidth issues
  • Network failover
  • Networking in general
  • SCALING!
  • O. M. G's

Start local Docker

Surprise! It's easy!

  • apt-get install docker-engine (read distro docs)
  • docker pull ubuntu
  • docker run -it ubuntu

Customize the container

  • Docker images do not allow fs mounting at runtime
  • Folder mapping -v
  • Ports -p
  • Hostname -h

docker run -it -p 80:80 -p 210:210 -p 443:443 -p 32:22 -p 7680:7680 -p 7682:7682 \
-p 6001:6001 -v /mnt/evergreen:/dfsdump -v /home/blake:/mnt/evergreen \
-v /etc/timezone:/etc/timezone:ro -h docker-app1.missourievergreen.org 5ba180628aec

Bare bones

And I mean BARE!

  • No syslog
  • No logrotate
  • No ssh
  • No sudo
  • No cron
  • NOTHING

Dockerfile

  • FROM
  • EXPOSE
  • RUN
  • ADD
  • ENTRYPOINT

Example here

cd /folder && docker build .
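
As a rough sketch of how the five instructions above fit together, the script below writes a small Dockerfile into a scratch directory. The base image, the package, the ports, and the entrypoint.sh name are illustrative assumptions, not the contents of the example linked from the slide.

```shell
#!/bin/sh
# Sketch only: image, package, ports, and file names are assumptions.
build_dir=$(mktemp -d)

# Placeholder entrypoint so that ADD has something to copy.
cat > "$build_dir/entrypoint.sh" <<'EOF'
#!/bin/sh
echo "container started"
EOF

cat > "$build_dir/Dockerfile" <<'EOF'
# FROM: the base image to build on
FROM ubuntu
# EXPOSE: ports the container listens on
EXPOSE 80 443
# RUN: a command executed at build time
RUN apt-get update && apt-get install -y apache2
# ADD: copy a file from the build folder into the image
ADD entrypoint.sh /entrypoint.sh
# ENTRYPOINT: the command executed when the container starts
ENTRYPOINT ["/bin/sh", "/entrypoint.sh"]
EOF

echo "Dockerfile written to $build_dir"
# To build it: cd "$build_dir" && docker build .
```

The script only writes files; the build step is left as a comment because it needs a working Docker install.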

Automate Evergreen installation

Ansible is sweet!

  • Run commands as different users
  • Variables

Example here

cd /egconfigs && ansible-playbook install_evergreen.yml -v -e "hosts=127.0.0.1"
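
To illustrate the two features listed above, here is a minimal playbook sketch written out by a shell script. It is not install_evergreen.yml; the file name example.yml, the service_user variable, and the opensrf user name are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: this playbook is illustrative, not install_evergreen.yml.
playbook_dir=$(mktemp -d)

cat > "$playbook_dir/example.yml" <<'EOF'
---
# "hosts" is filled in from the command line: -e "hosts=127.0.0.1"
- hosts: "{{ hosts }}"
  vars:
    # A variable reused by the task below (the user name is an assumption)
    service_user: opensrf
  tasks:
    - name: Run a command as a different user
      become: true
      become_user: "{{ service_user }}"
      command: whoami
EOF

echo "Playbook written to $playbook_dir/example.yml"
# To run it:
#   ansible-playbook "$playbook_dir/example.yml" -v -e "hosts=127.0.0.1"
```

The `-e` flag passes extra variables, which is how the `hosts` value in the command above reaches the playbook.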

A word about ENTRYPOINT

  • Docker normally exits once the container's command finishes
  • Need to run something that only exits on error
  • This works
  • && while true; do sleep 1; done
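
The same idea can be sketched as a small helper for an entrypoint script. The names run_and_hold and start_services are hypothetical; start_services stands in for whatever launches the real daemons.

```shell
#!/bin/sh
# Sketch only: start_services is a hypothetical stand-in for the
# commands that launch the real daemons inside the container.
start_services() {
    echo "services started"
}

# Run the given start command. If it succeeds, idle forever so the
# container stays up. If it fails, return its exit status at once so
# the container exits and the failure is visible.
run_and_hold() {
    "$@" && while true; do sleep 1; done
}

# As the last line of an entrypoint script this would be:
#   run_and_hold start_services
```

Because of the `&&`, the loop only starts when the start command succeeds, which matches the "only exits on error" requirement above.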

Start your engines!

  • Wait! First, we need to make a beefy postgres server
  • Standard VM
  • SSDs
    • How many?
    • Speed?
    • Let's see

Postgres needs IOPS!

Google offers

Create DB VM

  • 16 CPU, 102GB Memory
  • Add "Local SSD" 375GB each
  • 280,000 IOPS

Is it really 280,000 IOPS?

fio --name=writefile --size=100G --filesize=100G \
--filename=/dev/sdb --bs=4k --nrfiles=1 \
--direct=1 --sync=0 --randrepeat=0 --rw=read --refill_buffers --end_fsync=1 \
--iodepth=200 --ioengine=libaio

It's 184k!

Let's try RAID10

mdadm \
--create /dev/md0 \
--level=10 \
--raid-devices=4 \
/dev/disk/by-id/google-local-ssd-0 \
/dev/disk/by-id/google-local-ssd-1 \
/dev/disk/by-id/google-local-ssd-2 \
/dev/disk/by-id/google-local-ssd-3

mdadm --detail /dev/md0

RAID10 test 4k

fio --name=writefile --size=100G --filesize=100G \
--filename=/dev/md0 --bs=4k --nrfiles=1 \
--direct=1 --sync=0 --randrepeat=0 --rw=read --refill_buffers --end_fsync=1 \
--iodepth=200 --ioengine=libaio

It's 88k?

RAID10 test 512k

fio --name=writefile --size=100G --filesize=100G \
--filename=/dev/md0 --bs=512k --nrfiles=1 \
--direct=1 --sync=0 --randrepeat=0 --rw=read --refill_buffers --end_fsync=1 \
--iodepth=200 --ioengine=libaio

It's 1,038,000. Not Bad

Let's try RAID0

mdadm \
--create /dev/md0 \
--level=0 \
--raid-devices=4 \
/dev/disk/by-id/google-local-ssd-0 \
/dev/disk/by-id/google-local-ssd-1 \
/dev/disk/by-id/google-local-ssd-2 \
/dev/disk/by-id/google-local-ssd-3

mdadm --detail /dev/md0

RAID0 test 4k

fio --name=writefile --size=100G --filesize=100G \
--filename=/dev/md0 --bs=4k --nrfiles=1 \
--direct=1 --sync=0 --randrepeat=0 --rw=read --refill_buffers --end_fsync=1 \
--iodepth=200 --ioengine=libaio

180,364 IOPS
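
An IOPS figure is easier to compare across block sizes once it is converted to throughput. The helper below is a small sketch of that arithmetic; the function name is hypothetical, the block size is taken in KiB, and integer division rounds the result down.

```shell
#!/bin/sh
# Convert IOPS at a given block size (in KiB) to MiB/s.
# Integer arithmetic, so the result is rounded down.
iops_to_mib_per_s() {
    iops=$1
    block_kib=$2
    echo $(( iops * block_kib / 1024 ))
}

# The RAID0 4k result from above: 180,364 IOPS at 4 KiB per operation.
iops_to_mib_per_s 180364 4   # prints 704
```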

RAID0 test 512k

fio --name=writefile --size=100G --filesize=100G \
--filename=/dev/md0 --bs=512k --nrfiles=1 \
--direct=1 --sync=0 --randrepeat=0 --rw=read --refill_buffers --end_fsync=1 \
--iodepth=200 --ioengine=libaio

It's 1,991,000!

Holy Cow

A word about Local SSD

  • Google considers it "scratch" disk
  • Make backups!
  • Postgres replication

Compute Engine offers always-encrypted local solid-state drive (SSD) block storage for virtual machine instances. Each local SSD is 375 GB in size, but you can attach up to eight local SSD devices for 3 TB of total local SSD storage space per instance. Optionally, you can format and mount multiple local SSD devices into a single logical volume.

Unlike persistent disks, local SSDs are physically attached to the server that hosts your virtual machine instance. This tight coupling offers superior performance, very high input/output operations per second (IOPS), and very low latency compared to persistent disks. See Persistent Disk and Local SSD Performance for details.

Warning: The performance gains from Local SSDs require certain trade-offs in availability, durability, and flexibility. Because of these trade-offs, local SSD storage is not automatically replicated and all data on the local SSD may be lost if the instance terminates for any reason. See Local SSD data persistence for details.

Take note of internal IP

Integrate into ansible and docker images

How Google does it

IT'S ALL PRIVATE IP SPACE

Standard VMs

  • VM private space IP

Docker

  • Google Containers are VMs tuned for docker
  • Pods = running docker image
  • Deployed with YAML configs

Turn the key to AUX position

  • Download JSON key
  • Install gcloud SDK locally

Getting JSON Key

  • Click hamburger -> IAM & Admin -> Service Accounts
  • Create account and download key

Install gcloud SDK

  • gcloud auth activate-service-account --key-file [KEY_FILE]
  • gcloud config set account [email address]
  • gcloud config set compute/zone us-central1-c
  • gcloud config set project [projectID]
  • gcloud auth login
  • gcloud container --account [email address] --project [projectID]

Back to docker

  1. Build local image
  2. Test local image
  3. Contribute image to google docker registry

Watch it fly!

  • docker build .
  • docker engine creates temp images per command

Check your local images

  1. docker images
  2. Test local
    • docker run f42b5f3b175b

Send image up to the cloud

  1. docker tag [image hash] gcr.io/[project-id]/egapp
  2. gcloud docker -- push gcr.io/[project-id]/egapp
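
The two steps above can be wrapped in a pair of shell functions. This is a sketch: image_ref and tag_and_push are hypothetical names, and my-project is a placeholder project id.

```shell
#!/bin/sh
# Build the registry reference for an image: gcr.io/<project-id>/<name>
image_ref() {
    printf 'gcr.io/%s/%s\n' "$1" "$2"
}

# Tag a local image and push it. This needs docker and the gcloud SDK,
# so it is only defined here, not called.
tag_and_push() {
    image_hash=$1
    ref=$(image_ref "$2" "$3")
    docker tag "$image_hash" "$ref" &&
        gcloud docker -- push "$ref"
}

# Example with a placeholder project id:
image_ref my-project egapp   # prints gcr.io/my-project/egapp
```

With an authenticated SDK, a call would look like `tag_and_push f42b5f3b175b my-project egapp`.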

Done with docker

Docker hurdles

  • Using different hosts
  • Cannot communicate across hosts
  • Cannot mount anything
  • That's OK, we have Kubernetes

Set up central storage

  • Docker images do not allow mounting during runtime
  • Need docker images to read/write to same storage
  • Need involved VMs to read/write to same storage
  • Which storage method do we use?
  • glusterFS/Flocker/NFS
  • Supported Persistent Volumes

Just use the example

  • kubectl create -f nfs-pv.yaml (example)
  • kubectl create -f nfs-pvc.yaml (example)
  • kubectl create -f create_apps_service.yml (example)
  • kubectl create -f create_apps.yml (example)
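
For readers without the linked examples at hand, here is a rough sketch of what an NFS PersistentVolume and its claim might look like. These are not the contents of the linked nfs-pv.yaml and nfs-pvc.yaml; the names, sizes, and the server address 10.0.0.2 are assumptions.

```shell
#!/bin/sh
# Sketch only: names, sizes, and the NFS server address are assumptions,
# not the contents of the files linked above.
manifest_dir=$(mktemp -d)

# PersistentVolume: tells the cluster where the NFS export lives.
cat > "$manifest_dir/nfs-pv.yaml" <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.2
    path: "/"
EOF

# PersistentVolumeClaim: what Pods reference to mount that volume.
cat > "$manifest_dir/nfs-pvc.yaml" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi
EOF

echo "Manifests written to $manifest_dir"
# To apply them: kubectl create -f on the PV file first, then the PVC file.
```

The ReadWriteMany access mode is the part that lets several Pods read and write the same storage at once.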

How are things going?

  • kubectl get svc
  • kubectl proxy --address=192.168.11.119 --accept-hosts=".*"
  • http://192.168.11.119:8001/ui

Local Proxy UI

Check out auto stuff

  • First set up a health check

Health Check Detail

Load Balancer

Load Balancer evaluates Containers, not Pods

  • I know, it's called container
  • "Containers" are VMs that run Docker "Pods"
  • I know, it's called container!!!
  • Pods already have this check set up from YAML

Load Balancer stuff

  • Take a pod out of rotation by removing /openils/var/web/ping.txt
  • But how do I access the pod?
  • Funny you should ask, back to that in a minute
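
Toggling the health-check file can be sketched as three small functions to run inside a pod. The function names and the file contents are assumptions; only the path comes from the slide. PING_FILE can be overridden so the functions can be tried outside a pod.

```shell
#!/bin/sh
# Path of the health-check file served by the pod's web server.
PING_FILE=${PING_FILE:-/openils/var/web/ping.txt}

# Remove the file: the health check starts failing and traffic
# stops being routed here.
take_out_of_rotation() {
    rm -f "$PING_FILE"
}

# Recreate the file so the health check passes again.
# The file contents are an assumption; only its presence is used here.
put_in_rotation() {
    echo "pong" > "$PING_FILE"
}

# Succeeds when the file is present.
in_rotation() {
    [ -f "$PING_FILE" ]
}
```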

Containers can have more than one Pod

  • Google "load balances" the Pods onto the available containers
  • When it's time to add a Pod, Google chooses the least utilized container
  • We need to make sure the number of containers is equal to the number of Pods

Solve for X

  • Cron a script for gcloud
  • Use podsync.sh
  • INSTANCENUM=`$PATH_TO_GCLOUD compute instance-groups list|grep -v "INSTANCES" | awk '{print $6}'`
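
A sync script along these lines might look like the sketch below. It is not the actual podsync.sh; count_instances and sync_pods are hypothetical names, and the sample gcloud output is made up for illustration, keeping only the assumption from the line above that the instance count is the sixth column.

```shell
#!/bin/sh
# Print the INSTANCES column (field 6) from the output of
# `gcloud compute instance-groups list`, skipping the header row.
count_instances() {
    grep -v "INSTANCES" | awk '{print $6}'
}

# Scale a replication controller to the given number of Pods.
# This needs kubectl, so it is only defined here, not called.
sync_pods() {
    rc_name=$1
    replicas=$2
    kubectl scale rc "$rc_name" --replicas="$replicas"
}

# Made-up sample of the gcloud output, for illustration only.
sample_output='NAME        LOCATION       SCOPE  NETWORK  MANAGED  INSTANCES
gke-meapps  us-central1-c  zone   default  Yes      3'

printf '%s\n' "$sample_output" | count_instances   # prints 3
```

In a cron job the two would be combined, for example `sync_pods NAME "$(gcloud compute instance-groups list | count_instances)"`, with NAME replaced by the replication controller's name.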

Now Scaling!

A word on Scaling

  • CPU percentage is all containers combined
  • New containers are created automatically but not Pods
  • Pods are created by that cron job - consider timing
  • When a container is removed, the Pod inside is deleted; the Replication Controller automatically creates another Pod on another container
  • Cron job will detect the lower number of containers and scale down to match
  • Connections will be rerouted to other containers

Bibliotheca software has issues

  • Bibliotheca creates a TCP connection for SIP
  • And never terminates it!
  • When Pods scale down, Bibliotheca errors
  • Patrons receive nasty "DB unavailable" error
  • FOR 30 SECONDS!
  • Pirak at Bibliotheca has been alerted

Gcloud blocks port 25

  • Documented here
  • We set up SMTP through gmail account authentication
  • Configure sendmail
  • /etc/mail/authinfo
  • /etc/mail/sendmail.mc
  • All included in ansible script
  • v=spf1 include:_spf.google.com include:_cloud-netblocks.googleusercontent.com ?all

Final Note: Access Pods

  • Find Internal IPs of the Pods
  • kubectl get po|grep -v NAME | awk '{print $1}'|while read line ; do kubectl describe po/$line ; done |grep IP | awk '{print $2}'| tr '\n' ' '
  • Use SSH netcat forwarding
  • ProxyCommand ssh -q
  • Use script for automagic (Script Example)
  • ./remote_cloud_launch.pl --dbname me-db1 --localusername blake --clustername meapps
  • Then you can use clusterssh if you want
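
The IP lookup and the netcat forwarding can be sketched as two helpers. This is not remote_cloud_launch.pl; the function names are hypothetical, the sample describe output is made up, and using me-db1 as the jump host is an assumption.

```shell
#!/bin/sh
# Read `kubectl describe po/...` output on stdin and print the Pod IPs
# on one line, separated by spaces. Matching the first field exactly
# avoids picking up other lines that merely contain "IP".
extract_pod_ips() {
    awk '$1 == "IP:" {print $2}' | tr '\n' ' '
}

# Print (not run) an ssh command that reaches a Pod IP through a
# jump host using netcat forwarding.
ssh_via_jump() {
    jump_host=$1
    pod_ip=$2
    printf 'ssh -o ProxyCommand="ssh -q %s nc %%h %%p" %s\n' "$jump_host" "$pod_ip"
}

# Made-up sample of the describe output, for illustration only.
sample='Name:   egapp-abc12
IP:     10.4.1.7
Name:   egapp-def34
IP:     10.4.2.9'

printf '%s\n' "$sample" | extract_pod_ips
echo
ssh_via_jump me-db1 10.4.1.7
```

The printed ssh command works because Pod IPs are private and only reachable from a machine inside the same network, which is what the jump host provides.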

THE END

Blake Graham-Henderson

MOBIUS

blake@mobiusconsortium.org

Slides available at:
slides.mobiusconsortium.org/blake/evergreengoogledocker

Github repo:
https://github.com/mcoia/eg-docker