Monthly Archives: November 2014

My Key Learning from Paris

I learned a bunch of things in Paris: French food is amazing, always upgrade OVS, but the most important thing I learned in Paris?

All operators have the same problems.

Every operator session was a revelation for me. As it turns out, I’m not the only one writing Icinga check jobs, Ansible scripts, trying to figure out how to push out updates, or fighting with ovs. These sessions not only provided by validation to let me know that we’re not doing stuff totally wrong, but also allowed everyone to share solutions. Below are some of the themes that I found particularly interesting or relevant to this premise.

Upgrades

One topic in which operators have a common interest was upgrades. Only a few of the operators have any real experience with this, but its been a pain point before for many. Some have resorted to fork-lift style “bring up a new cluster” upgrades, which is not nice for customers and requires extra hardware. How do we solve this? It seems that some projects have upgrade guides, but finding documentation on a holistic upgrade is difficult, especially gathering information on ordering (Cinder before Nova, for a contrived example). Issues specifically called out in the session were config changes (deprecations and new additions) and database migrations (including rollback). Rollback is especially worrying as its not well tested. There was no solution for this except a resolve to share information on upgrading via the Operators Mailing List.

Etherpad for Upgrades

CI/CD & Packaging

Another issue that operators face after an Openstack deployment is how to get fixes and new code out to the nodes. An upstream bug fix might take a few days, plus a week for a backport, plus a month for the distro to pick it up. This means that even if the operator fixes it themselves, they still have a delay in getting it releases. During this delay you might be impacting customers who may not be that patient. The solution for many is using a custom CI/CD system that builds “packages” of some sort, whether distro packages or custom built vdevs. It was interesting to hear that people have a myriad of solutions here. We use a toolchain quite similar to the upstream Openstack tool chain that outputs Ubuntu packages. However even with this method we still rely on dependencies and libraries provided by the Ubuntu Cloud Archive, as much as possible anyway.

There was a bunch of talks on this subjects, not just operator talks, here are a link to the ones I attended:

  • CI/CD Pipeline to Deploy and Maintain an OpenStack Iaas Cloud (HP)
  • CI/CD in Practice (Comcast)
  • Building the RackStack (Rackspace)
  • If you know other ones that I should go back and watch please comment here.

    Automation (Puppet)

    We use Puppet to configure and manage our Openstack deployment, so this was an area of interest to me. It was great to see a 100% full room with everyone focused on improving how Openstack is configured with puppet. From discussions on new check jobs to a conversation about how to better handle HA, it was great to see the real sense of community around Puppet.

    Etherpad

    The second puppet session was more of a “how are you solving X” and “how could it be better”. This was also a great session with some interesting notes.

    Towards a Project?

    Finally, one of the more interesting things happened later in the week when Michael Chapman and Dan Bode grabbed me in the lobby and wanted me to preview an email that was about to go out. In brief, they were proposing an operations project. This was born of the realization that I’d also made, we’re all solving the same issues. This email led to an impromptu meeting of about 30 people in the lobby of the Meridien hotel and the beginnings of an Operations Project.

    Birth of a New Project

    Birth of a New Project

    It was not the ideal venue, but we still had a great discussion on a few topics. The first was, can we have a framework that will allow us to share code? We all agreed that this project’s purpose is not to bless specific tools (for example Icinga) but instead to allow us to share Icinga scripts alongside other monitoring tools. Although this had been tried before with a github repo it had only a couple contributions, hopefully this will improve. We then dove into all the different tools that people use for packaging and whether or not any could be adapted as general purpose. Having a good tool for Ubuntu/Debian packages seemed to be something in great demand. Adding debian support for Anvil seemed to be something worth investigating further.

    Other topics of discussion included:

  • log aggregation tools and filters
  • ops dashboarding tools
  • Ansible playbooks
  • You can see the Etherpad link below to see what options for each were discussed, but the best idea is to catch up on the mailing list threads which are still ongoing and add if you have to share anything.

    Etherpad

    Operations Wiki