
Keep Your OpenStack API Databases Tidy

After running a cloud for 2+ years, our OpenStack API databases are full of cruft. Deleted instances, deleted networks, deleted volumes: they are all still in the databases. OpenStack has no periodic clean-up for this stuff; it’s left up to you. This is partly because there’s no unified way to do it, and also because each operator has different requirements for how long to retain data. Over the past few weeks I’ve been cleaning up our records and would like to share what I’ve found.

Warning: This is an advanced operation. Before doing this, you should back up everything, test it in a dev environment, and assess the impact to your APIs. I did all of those before attempting any of this, including importing prod data into a test OpenStack environment to assess the impact there.

Each service has its own database, and depending on how the code is written and how the service is used, some services store more data than others. Here are our databases ordered from largest to smallest by on-disk size (more on disk size later).

Pre-cleaning DB sizes
3.6G /var/lib/mysql/nova
1.1G /var/lib/mysql/heat
891M /var/lib/mysql/cinder
132M /var/lib/mysql/designate
131M /var/lib/mysql/neutron
103M /var/lib/mysql/glance
41M /var/lib/mysql/keystone
14M /var/lib/mysql/horizon

So with this in mind, I started digging into how to clean this stuff up. Here’s what I found. I’m noting which release we’re on for each service because the tooling may be different or broken in other releases.

Heat – Mitaka

Heat was the first one I did, mainly because if Heat blows up, I can probably still keep my job. Heat has a great DB cleanup tool and it works very well. Heat lets you say “purge deleted records older than X days/months/etc.” When I did this, Heat had so much junk that I “walked it in”, starting with 365 days, then 250, and so on. The Heat developers win the gold medal here for best DB clean-up tool.

heat-manage purge_deleted -g days 30
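
If you’d rather script the walk-in than run each step by hand, a minimal sketch looks like this. The ages below are illustrative; pick values that match your own retention policy.

# Shrink the retention window gradually so each run purges a manageable chunk.
for age in 365 250 180 90 30; do
    heat-manage purge_deleted -g days "$age"
done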

Keystone – All Versions

Guess what? Keystone doesn’t keep ANY deleted junk in its database; once it’s gone, it’s gone. This can actually be an issue when you find a 2-year-old instance with a userid you can’t track down, but that’s how it is. So as long as you’re not storing tokens in here, you’re good. We’re using Fernet tokens, so no issues for us.

Cinder – Liberty

Cinder’s DB cleanup tool is broken in Liberty. It is supposed to be fixed in Mitaka, but we’re not running Mitaka yet, so I’m hoping to try this after we upgrade. We have a lot of volume cruft lying around.
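
For reference, the Mitaka-era invocation should look roughly like this (the age is in days). I haven’t run it myself yet, so treat it as a sketch rather than a tested recipe.

cinder-manage db purge 30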

Glance – Liberty

Glance has no cleanup tool at all that I can find, so I wrote one, but we ended up not using it. Why? Because it seems that Glance can and will report deleted images via the v2 API, and I could never quite convince myself that we wouldn’t break something by doing a cleanup. Anyone know otherwise?

Here’s my code to do the cleanup; be careful with it! Like Heat, you should probably “walk this in” by changing “1 MONTH” to “1 YEAR” or “6 MONTHS” first. These deletes will lock the tables, which will hang API calls while they are running, so plan appropriately. Note that if you look on the internet you might find other versions that disable foreign key constraints; don’t do that.

mysql -u root glance -e "DELETE FROM image_tags WHERE image_id IN \
(SELECT images.id FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH));"
mysql -u root glance -e "DELETE FROM image_properties WHERE image_id IN \
(SELECT images.id FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH));"
mysql -u root glance -e "DELETE FROM image_members WHERE image_id IN \
(SELECT images.id FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH));"
mysql -u root glance -e "DELETE FROM image_locations WHERE image_id IN \
(SELECT images.id FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH));"
mysql -u root glance -e "DELETE FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH);"
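
Before running the deletes, it’s worth counting how many rows you’re about to remove; the same WHERE clause works for that, and the number gives you a feel for how long the locks will be held.

mysql -u root glance -e "SELECT COUNT(*) FROM images WHERE images.status='deleted' \
AND images.deleted_at < DATE_SUB(NOW(), INTERVAL 1 MONTH);"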

Nova – Liberty

Like Heat, Nova also has a clean-up tool, and like Heat, Nova has a huge database. Unlike Heat, Nova’s clean-up tool is more limited: the only thing you can tell it is “don’t delete more than this many rows”. Actually, Nova calls its tool “archiving” because it doesn’t delete records; it moves them to shadow tables. So even if you use this, you need to go back and truncate all the shadow tables.

Also, as near as I can tell, Nova just tries to archive up to the max number of rows, paying no attention to any database constraints, so when you use it you will get warnings. These appear safe to ignore. The Nova archive tool (in Liberty) also doesn’t report any progress (I think this is fixed in Mitaka), so figuring out when you are done takes some interesting guesswork. Basically I just re-ran it over and over and compared the sizes of the shadow tables; when they stopped changing, I was done.

One quick note: when this finishes and you run du, you’re going to find that you are now using more disk space. That’s because you just did a bunch of inserts into the shadow tables.

Like everything else, walk this in.

$ nova-manage db archive_deleted_rows 50000
2016-08-30 21:49:01.404 1 WARNING nova.db.sqlalchemy.api [req-f329a277-4fe2-45d6-ba3a-51f93827ed2f - - - - -] IntegrityError detected when archiving table aggregate_metadata
2016-08-30 21:49:11.900 1 WARNING nova.db.sqlalchemy.api [req-f329a277-4fe2-45d6-ba3a-51f93827ed2f - - - - -] IntegrityError detected when archiving table instances
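
One way to tell when you’ve caught up is to compare shadow table row counts between archive runs (information_schema row counts are only estimates for InnoDB, but they’re good enough to see when things stop moving). Once they’re stable, you can generate the TRUNCATE statements the same way; this is just a sketch, so review the generated SQL before piping it back in.

mysql -u root nova -e "SELECT table_name, table_rows FROM information_schema.tables \
WHERE table_schema='nova' AND table_name LIKE 'shadow_%' ORDER BY table_name;"
mysql -u root nova -N -e "SELECT CONCAT('TRUNCATE TABLE ', table_name, ';') \
FROM information_schema.tables WHERE table_schema='nova' AND table_name LIKE 'shadow_%';" \
| mysql -u root nova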

How Much Disk Will I Get Back?

Surprise, you get nothing back! That’s because the disk space is already allocated. If this is important to you, you will need to OPTIMIZE the tables. This ends up doing a full table recreate (depending on which DB you are using), and it WILL lock your tables and hang API calls, so be very careful when doing this. How much space can you save? For Heat the database got about 6-7x smaller, from 1.1G to 170M; the gain in Nova was more like 30%. Glance was also about 8x, but I was too chicken to take that past our dev environment because of the API behavior mentioned above.
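
If you do decide to reclaim the space, the approach I’d sketch is optimizing one table at a time so you control when each lock happens; mysqlcheck can also sweep a whole database in one pass. On InnoDB, OPTIMIZE maps to a full table rebuild, which is the recreate mentioned above. The table name below is just an example.

mysql -u root heat -e "OPTIMIZE TABLE raw_template;"
mysqlcheck -u root --optimize heat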

Why?

This is a question you should ask yourself before attempting this. Some of these operations are risky, but letting these tables grow without bound will also hurt your performance. Some of you may want to just do the cleanup and skip the optimization steps. If you do the optimizations, I’d recommend knowing how long they take for each table and service; exporting your prod data onto a test node will give you a better idea.

Others

  • Horizon just stores ephemeral session data, so it’s pretty clean.
  • Neutron – the DB is small, so I’ve not looked into it. Has anyone tried it? Comments welcome.
  • Designate – we’re on an ancient version (Juno), so any experiments here will happen on something more modern.

Evolution of a Cloud As Seen Through Gerrit

Over the past two+ years our cloud has evolved, we’ve expanded, we’ve containerized, we’ve added services. Last week I thought it would be interesting to see this evolution by looking at Gerrit reviews over time, so like sand through an hourglass, these are the reviews of our cloud. Note: I used some artistic license to skip reviews that were abandoned or boring.

Review 1000 – Initial version of JDG jobs for git-crypt – Oct 2014

git-crypt is a tool that’s useful when you need to store stuff like certificates in git. We couldn’t find a package for it, so this commit sets up Jenkins Debian Glue jobs to build it.

Review 2000 – Adds monasca rabbitmq checks – Jan 2015

We’ve been running Monasca for some time now; it’s a great tool for monitoring, especially time-series data, like seeing how long it takes to build an instance. It seems we added checks for RabbitMQ around this time. I’m fairly sure that we were the first non-HP team to roll Monasca out; we’re active contributors and authored the puppet module.

Review 3010 – Enable keystone auth for designate – March 2015

Around this time we were rolling out Designate, the DNS as a Service project. It’s been pretty solid, so solid that we’re still running the Juno version today.

Review 4001 – Remove scenario variable – May 2015

Our first install was deployed using Puppet OpenStack Builder. As time permitted, we removed all the references to it; this review removed a scenario variable. A scenario was something like “2 nodes”, “single node”, or “full HA”. Since we were no longer doing a reference architecture, we didn’t need this code anymore.

Review 5000 – Revert “Adding new launch instance dialog for testing” – July 2015

Even when we do testing, the code is pushed out via the standard process. It’s reverted that way too.

Review 6000 – Add more dnvrco03 test boxes – Sep 2015

Around this time we were standing up our v2 network/system architecture. We set up a fake environment here called dnvrco03 that lets us do burn-in testing on systems.

Review 7002 – Implements A10 templates for NTP server – November 2015

We believe that you need to automate all infrastructure, in this review we’re working on automation for our hardware load balancers. This was a key part of moving to our second network architecture.

Review 8000 – Merge branch ‘import/import-2015121002341449714853’ – December 2015

We use git-upstream to track and merge in changes to projects like Horizon, and also all the puppet modules. This was a merge of puppet-cinder into our local copy.

Review 9000 – Parallelize cephosd/swift/compute deploy – Feb 2016

As you grow hardware you need to speed up the deploy process. This ansible commit changes how we parallelize puppet runs on some servers. We’re still tweaking the deploy to this day.

Review 10000 – Add tesora docker support tooling – March 2016

We’re working on Trove with Tesora now and may roll it out in Docker containers. This was prep work for that effort.

Review 11000 – Update IP for bfd02-trove-002 – June 2016

This one seems boring, but it’s interesting to me. The Trove box needs a new IP because it’s an OpenStack VM. We’re going to run some services, like Trove, as VMs on top of OpenStack; this avoids having to chew up hardware for just a simple API server. We’re doing Trove as a separate node specifically because we want it to have a separate Rabbit cluster.

Review 12000 – Update heat/docker with new avi-heat libs – August 2016

Right now we’re testing out new Heat resources for Avi Networks load balancing. The repo is here: https://github.com/avinetworks/avi-heat

We should hit review 20000 in 2017; I’m really curious to see what it will be!