Building a portable, scalable, reusable Deployment Pipeline for an arbitrarily complex environment (part 3)

This is the last of three posts about building an advanced deployment pipeline.

CI servers themselves aren’t really much more than glorified remote script runners, and just because you have a CI server set up with some automated tests does not mean you are doing continuous integration. As I mentioned in part 2, the “integration” part of continuous integration actually occurs at the source-control level, where developers merge or rebase their changes with the latest from the master branch. That integration should occur at least once per day, and then automated tests should be run to see if anything has broken. Only if every developer is doing this regularly can you say that you are actually doing continuous integration.

  • Jenkins: is still the most mature and widespread CI server around today. It has an active community with over 1000 available plugins, giving it unrivalled flexibility and functionality. This is really great, because most problems are not new, so whatever your case may be, there is likely already a plugin to help you. It’s written in Java and fully open source. If you have a complex system, it’s almost a certainty that Jenkins will be able to handle it. On the downside, Jenkins can be a pretty complicated beast to configure, and the user interface is pretty clunky and ugly. A few of the key pieces of software and plugins I always use with Jenkins to build deployment pipelines are the swarm plugin, jenkins job builder and the cloudbees flow plugin.
  • Thoughtworks Go: I think it’s a bit surprising that not many people seem to have heard of Go, considering the guys who wrote the book on Continuous Delivery, Jez Humble and Martin Fowler, both work for Thoughtworks. No surprises that Go is designed “out of the box” to be suitable for Continuous Delivery and building Deployment Pipelines. Since they open sourced the product in 2014, you can see how healthy the community is on GitHub. It has probably one of the nicest interfaces of any CI tool out there, however the rate of issues being created versus resolved is something to keep an eye on.
  • Team City: If you have used one of the Jetbrains IDEs, then you’re probably also familiar with Team City. They produce good tools that are popular with many developers. It’s no surprise that Team City is a solid CI server as well, which of course integrates seamlessly with your IDE. You can create dependencies via build chaining between individual jobs to set up a deployment pipeline. Team City is a capable tool and free for a small number of builds and agents, however it is closed source and if you’re expecting to run at large scale it’s going to become pretty expensive.
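To make the build chaining idea a little more concrete, here is a minimal sketch in plain Python. It is not the API of Jenkins, Go or Team City; the stage names and the `run_pipeline` helper are hypothetical and only illustrate the behaviour you want from a chained pipeline: each stage runs only if the one before it passed.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; once one fails, mark the rest as skipped."""
    results = []
    failed = False
    for name, step in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        ok = step()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results


# Hypothetical stages; real ones would call out to build and test scripts.
example_stages = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("integration-tests", lambda: False),  # simulate a broken stage
    ("deploy-staging", lambda: True),
]

example_results = run_pipeline(example_stages)
for stage_name, outcome in example_results:
    print(f"{stage_name}: {outcome}")
```

The point of the sketch is that a failure early in the chain stops anything later from being deployed, which is the property the CI servers above give you through their own job dependency features.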

Artifact Repository:

Probably one of the least exciting topics is the storing and retrieving of build artifacts, however it is important, especially for compiled languages. If releasing to production means pulling the latest changes and then compiling, I’m sorry but you’re just doing it wrong. Each time you compile your source, you’re more likely to end up with a different binary than not, even if the source has not changed. That means it is possible for the runtime execution of your program to be different, and thus any testing and verification can really only be guaranteed for a certain binary, and not the source commit.

If you’re using something like PHP then this is potentially less of an issue, however since Facebook started turning PHP into Java, even that is probably not true in all cases.
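One way to act on the point that testing is only valid for a specific binary is to record a checksum of the artifact that passed the tests and refuse to release anything that does not match it. The sketch below assumes SHA-256 and uses throwaway files with made-up names; `fingerprint` and `safe_to_promote` are hypothetical helpers, not part of Artifactory or Nexus.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_to_promote(path: Path, tested_digest: str) -> bool:
    """Only promote the exact binary whose digest was recorded after testing."""
    return fingerprint(path) == tested_digest


# Demo with throwaway files standing in for real build artifacts.
with tempfile.TemporaryDirectory() as workdir:
    tested = Path(workdir) / "app-1.0.jar"
    tested.write_bytes(b"binary that passed the test stages")
    recorded = fingerprint(tested)  # stored alongside the artifact

    rebuilt = Path(workdir) / "app-1.0-rebuilt.jar"
    rebuilt.write_bytes(b"binary recompiled from the same commit")

    same_binary_ok = safe_to_promote(tested, recorded)
    rebuilt_ok = safe_to_promote(rebuilt, recorded)

print(same_binary_ok, rebuilt_ok)
```

In practice the artifact repository stores the binary and the pipeline fetches that same file for every later stage, so the check becomes a safety net rather than the main mechanism.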

  • Artifactory: is a flexible repository that in its free version can store Java packages such as jar, war and ear, but in its paid-for version can also mirror node npm, python pypi and ruby gems as well as OS packages such as rpm and deb! It also integrates with CI servers such as Jenkins, Team City and Go. It’s open source, which is nice, but to get all the goodies, you will need to fork over the cash.
  • Nexus: will do pretty much all the same things that Artifactory will do, however the additional language support for other package types comes in the open source version. It will also integrate with all the major CI servers and is actually a bit cheaper than Artifactory.
  • rpm/deb mirrors (and other OS packages): I mention this separately, because just like controlling your application dependencies is important, so is controlling your OS dependencies. We’ve probably all been in the situation where the dependency we were downloading from somewhere off the internet went missing, or when we got an update that unexpectedly broke the build or brought down production (because we didn’t test it - oops!).

Workflow Visualisation:

Perhaps workflow tools might seem like an afterthought in the context of a Deployment Pipeline, but unfortunately this is not so. Sooner or later the topic comes up of how to manage releases and what the definition of “done” is, and these tools are a necessary link in that chain.

  • Jira: Atlassian Jira is a popular and powerful issue management and workflow visualisation tool. It is highly configurable, which means it’s great for handling all sorts of agile and ITIL-style processes to fit your organisation, but that is also often where it goes wrong, resulting in a configuration nightmare. Jira’s power is also its curse, however when used correctly it is a fine and effective tool with a lot of in-built features and reports. It supports both scrum and kanban, but is unfortunately opinionated in these areas, so if you are using some kind of blended “scrumban” then you might run into trouble. Jira can be integrated with quite a few different tools, but of course works best if you stick to the Atlassian suite.
  • Trello: is a lightweight cloud service for Kanban-style workflow management. If you don’t want the hassle of complex workflows and just want to get stuff done then Trello could be a good fit for you, if you can live without customizations. Through other services such as Zapier, you can integrate different services with Trello so that you can get a high level overview of progress. You can also upgrade the service to business class to get access to power-ups.
  • Kanban vs Scrum and DevOps: I feel that it’s worth making a note that in my experience, Scrum is not ideal for doing Continuous Delivery and DevOps. Things like time-boxed sprints, backlog grooming, sprint planning, and stakeholder demos all start to feel quite restrictive in their format and routine, especially when you want the flexibility of releasing every day. Kanban is better suited for Continuous Delivery, and I’d go out on a limb to say that I think the DevOps community as a whole is moving towards support of Kanban over Scrum.

Monitoring and metrics:

There are basically two forms of monitoring and metrics that are important. You have real-time monitoring, which you need in order to react to incidents arising from production events, and then you have metrics for analytical and statistical purposes (aka Business Intelligence) that can come from either (or both) log files or database reports.

  • Prometheus: is an open source monitoring tool built by Soundcloud. I first learned about Prometheus from my friend Matthias Grüter at a Stockholm DevOps meetup and thought it looked quite impressive. It seemed like it actually offered something new and better than a lot of the other monitoring tools which had been around for a while, like Nagios and Graphite. It has instrumentation for lots of different languages, support for different frontends and backends, and is easy to set up. Maybe it won’t do everything you want but it certainly should be a good start.
  • ELK: meaning Elasticsearch, Logstash and Kibana, which is a powerful set of tools to perform logfile analysis. ELK is gaining wide acceptance because the tools work well and are open source with a vibrant community. Logstash will handle almost any log you can ship to it, such as web logs, database logs, syslogs and windows event logs, which it parses before they are stored and indexed by Elasticsearch and finally displayed by Kibana. Even though it’s 3 separate components, they are all designed to integrate seamlessly with each other. Compare this to a paid, closed source service like Splunk: it’s hard to imagine that Splunk will survive too much longer without doing something drastic.
  • Pentaho: is an open source BI platform that offers a free community edition as well as an enterprise product with lots of heavy stuff. If your needs aren’t met by the free version, then at least you’ll get to try and feel the product to see if you need all the power that’s offered in the paid version. I’m not sure what their pricing and licensing is like, but there aren’t too many companies in this space that offer products that look as good as this, are open source, and have free community editions.
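The log analysis flow described for ELK (ship a raw line, parse it into fields, then query the fields) can be illustrated in a few lines of plain Python. This is not Logstash configuration and not how ELK is implemented; it is a rough sketch that assumes web logs in the Common Log Format, and the sample lines are made up.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status size
LINE_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Turn one raw log line into a dict of fields, or None if it doesn't match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    return event


# Made-up sample input, including one line that is not a valid log entry.
sample_lines = [
    '10.0.0.1 - - [10/Oct/2015:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2015:13:55:40 +0000] "POST /login HTTP/1.1" 500 120',
    '10.0.0.1 - - [10/Oct/2015:13:56:02 +0000] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]

# "Parse" step: keep only the lines that could be structured.
events = [e for e in map(parse_line, sample_lines) if e is not None]

# "Query" step: count responses per status code.
status_counts = Counter(e["status"] for e in events)
print(status_counts)
```

Once the lines are structured events, questions like "how many 500s did we serve" become simple queries, which is what makes this kind of tooling useful for both incident response and longer-term reporting.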

If you made it this far, I hope it has been a worthwhile read. At some point in the near future I hope to be able to open source some code to show how all these pieces can be assembled, but we’ll see how that goes. Obviously the amount of work involved to get the basics up and running is not really something that you can whip up in just a weekend.

Anyways, if there are certain areas where you wish you had more information and options about tooling, maybe this list will help you. Otherwise if you have questions or comments you can shoot me an email.