Automated Drupal 7 deployments with Atlassian Bamboo

Update 2016/06/18: I finally fixed the markup of this post after migrating from WordPress to Hugo. I also fixed some typo’s and updated this post with some current information. Even though almost a year has passed since writing it, this post is still relevant for the current 5.12 version of Bamboo.

If you are reading this post you probably already know what automated deployment is and why it’s important. I’ll probably write a blog post about that subject in the near future but first I’m going to write this one about doing automated Drupal 7 deployments with Atlassian Bamboo.

This blog post is not going to be a discussion what the best deployment system is. Some people like capistrano, some people like jenkins, some people like Bamboo. For us at Nascom Bamboo works pretty well because it integrates perfectly with JIRA and Bitbucket, allowing us to view linked JIRA issues when making a build and see what on wich environment a JIRA issue has been deployed.

This is a pretty big blog post, so take your time to go through it.

Bamboo in action

Before I dive into the details about the setup I’ll first show you a 4 minute screencast with some minimal comments how the whole setup works. This gives a nice global picture so you have a better understanding of the steps that come next.

Prerequisites

For this blog post I’m going to make some assumptions about your development setup and Drupal site structure:

  • You have a running Bamboo server, either as a dedicated server or just running on your own computer. The version I’m using for this blog is 5.9.3 build 5918 - 28 Jul 15. You could also use the Bamboo On Demand service from Atlassion if you don’t want to setup your own server.
  • The Bamboo server can access your deployment target environment over an ssh connection.
  • Our Drupal source code is in a git repository that can be accessed by the Bamboo server. In my case it’s hosted on Bitbucket
  • Your Drupal site uses the env.settings.php structure I described in a blog post on the site
  • Drush has been installed on the target environment so we can run drush cc all and drush updatedb if needed.

Right, now let’s dive into the setup.

Bamboo concepts

First you need to get used with some Bamboo concepts (also see the official Atlassian Bamboo documentation):

Build plans and artifacts

A build plan is the process that generates a Bamboo artifact. An artifact is something that can be deployed later, most of the time an executable or a jar file when you are talking about software that compiles, but for our Drupal site this will simply be a compressed tar file called drupal.tar.gz that contains the Drupal source code.

Build plans are composed of three pieces: Stages, Jobs and Tasks. If you look at the graph below it should be clear how those three fit inside each other:

Bamboo Build Plan Anatomy

  • Stages execute sequentially (e.g. a Testing Stage, a Package Building stage). If phase x fails, the build process will halt and phases after x will not be executed.
  • A stage consists of jobs that can be executed in parallel (e.g. multiple types of tests in a testing stages that can run at the same time)
  • A job consists of multiple tasks that run sequentially. The first task will always be doign a source code checkout and then the next tasks use this checked out code to do some magic.

As I wrote above, the result of a build plan will be an artifact that we can use for deployment later on.

Releases, deployment plans & environments

Now that we have a build plan that produces an artifact, our drupal.tar.gz file, we need to get that deployed to our servers. We can use releases and deployments plans to achieve that:

  • Releases are simply tagged successful build plans. E.g. build #65 has been tagged as release-2.2.0.
  • A deployment plan is simply a list of environments.
  • An environment has a list of tasks that will be executed sequentally to deploy a release’s artifact to the environment.

A real life example project

I’m going to take my own Narfum IT Services website as an example deployment project. It’s a Drupal 7 site that will be deployed to a staging and a production environment.

Update 2016/06/18: This narfum.eu site is now offline, but the example is still valid.

A Drupal site can always be split into three pieces:

  • Drupal PHP source (Drupal core + all contrib and custom modules and themes)
  • Database
  • User uploaded content (sites/default/files)

If you follow my env.settings.php setup structure for Drupal 7, it’s easy to keep these 3 separated.

Our Bamboo deployment plan will only handle the first item, the Drupal PHP source. This codebase will be stored in our version control system.

The database will most of the time be deployed one time, and then updated via update hooks in Drupal. These update hooks will be run by our deployment plan (via drush updatedb), so it’s not needed to include an automated database deployment.

The user uploaded content is located outside of the Drupal PHP directory, so we can just leave that alone during deployment and just make a new symlink to it. Sidenote: you never commit this content to your version control software!

The build plan

What a build plan comes down to in practice is simply put:

  1. Download the source code
  2. Do some local modification to those files
  3. Package the result as an artifact.

This means that whatever you put in git, is not necessarily going to end up on your deployment environment. For a Drupal website this means we can do a lot of handy things during the build phase:

  • Remove unwanted files developers forgot to remove from git (e.g. remove CHANGELOG.txt and the other .txt files)
  • Compile SASS code to CSS in production mode instead of development mode
  • Remove sources and development files (SASS code, maybe PSD’s you added in git, development modules like devel and coder )
  • Add modules from another repository you need to make sure exist on production sites (e.g. modules like prod_check which is nice to have on a production environment)
  • Or ofcourse remove development modules that should not be deployed to production (e.g. devel, coder, …)
  • And maybe other things specific to your project

It will take you some time to setup all of this and make it error proof, but after that you have a fully automated build system that will never forget a single thing!

Creating a build plan

Ok, let’s start by creating a Build plan. From the “Create” menu at the top chose “Create new plan” and fill in the fields like in the screenshot below. Do not chose a version control system here yet, we’ll add that later. (If you add it here, it will be a global repository and we don’t want that).

(Click the image for a larger version)

Create a new Bamboo build plan

On the next screen just check “Yes please!” to enable the plan and click Create. We will add the tasks later, we just want an empty build plan for now. When we have our empty build plan, go to “Actions” on the right side and chose “Configure plan”. You will get the screen below (click the image for a larger version):

Configure a Bamboo build plan

As you can see Bamboo has made some default items for us: A “Default stage” stage with one job called “Default Job”. We will use these defaults for this example and just add tasks inside this one job.

Connect our git repository

As we need to do git checkouts in more than one task, we will add our Drupal git repository as a local repository for this project. On the build plan configuration page go to the “Repositories” tab and click “Add repository”:

(Click the image for a larger version)

Add a source code repository

It should be pretty obvious what you need to fill in here.

The easiest way to connect to Bitbucket is with the “Bitbucket” option, but that requires entering a password and I don’t like that. So I always chose “Git”, enter the ssh location for the Bitbucket repository and use an ssh private key to authenticate. But chose whatever method works for you.

It’s important that you chose the “master” branch here as that will be the main branch for our builds. Master should always be the code that goes onto production, so try to keep that best practice for your projects too.

If you want to read about a proper git branching model for your development, be sure to checkout the Git branching model.

Add build tasks

The last thing we have to do now is add tasks that will actually do things for us. Below is a screenshot of the real Narfum project (that has 2 stages instead of 1 but we will ignore the test stage for now) where I’m currently showing you the “Package Drupal” job.

(Click the image for a larger version)

Package Drupal job

There are 3 jobs:

  • A “Source Code Checkout” task: Checkout code from Bitbucket to a local directory on the build server
  • A “Script” task: Do some magic (in this case simply compile sass code to css)
  • A “Script” task: Make a tar.gz file

The first step will always be a source code checkout. Remember that jobs can run in parallel and they are sandboxed in their own directory. So one job does not know about another job’s files, meaning you always have to checkout files (or important an artifact) as the first task in a job.

The 3 tasks in detail:

Task 1: Checkout source code from our repository.

These files will be downloaded in the root directory of our job and will be available to be modified for the remaining tasks.

(Click the image for a larger version)

Task 1: Source Code Checkout

Task 2: Magic

Once we have the Drupal code, we can do a lot of things to modify this code. We keep it simple here and just do a production compile of the SASS files for our theme:

(Click the image for a larger version)

Task 2: Magic

Notice the “Working sub directory” at the bottom! This points to the main theme directory.

Task 3: Create Drupal tarball

The last task is always creating a Drupal tarball of our files now that we’re done with modifying them:

(Click the image for a larger version)

Task 3: Create tarball

I prefer to exclude files like the “node_modules” directory rather than removing them so a new build won’t have to download them all again. (I know they are cached in the bamboo user’s homedir yes, but it’s the idea that counts here: we don’t want to re-do too many things for new builds).

After those 3 tasks are done, we will have a drupal.tar.gz file in our root directory. We now need to make this available as a shared artifact so our deployment plan can use it.

The last build step: create the artifact

In the “Package Drupal” stage, go to “Artifacts” and add a new artifact definition:

Artifact overview page:

(Click the image for a larger version)

Artifact overview

Artifact detail page:

Artifact definition

Make sure the “Shared” box is checked, otherwise it will not be available to our deployment project!

And that’s all there is do to for a Drupal build plan. If you run this build now from the “Run” menu and then “Run plan” you should get a green page saying the build was successful. You will also be able to download the artifact manually at the bottom of the page.

This next screenshot is an example build result page from the Narfum website project. You can ignore the right upper box for now, in your project that will be empty as you don’t have a linked deployment project yet:

(Click the image for a larger version)

A successful Bamboo build

The deployment plan

Still with me after the build plan setup? Good. Because now it’s time to deploy our code to an actual environment.

Create a deployment plan

A deployment plan is nothing more than a container for multiple deployment environments. From the “Create” menu chose “Create deployment project” and fill in the screen like in the screenshot below. Make sure you select the right build project you are attaching to this deployment plan:

(Click the image for a larger version)

Create a deployment plan

After that you will see the configuration page of our empty deployment plan.

Creating enviroments

Now chose “Add environment” and simply give it a clear name. I always go for the structure “$hostingprovider - $env_type” so this could be “AWS - Production”. Click on “Create and configure tasks”.

This will be our production environment but you can of course add a staging environment and testing environment too. Using the “Clone environment” option after our production environment is finished this is very fast to setup.

You should now see the screen below. This is a similar list of tasks you saw on the job configuration page for a build plan:

Create deployment tasks

There are 2 tasks made for you already, which you should always leave as first two tasks for your deployment project: the clean working directory and the artifact download. These tasks make sure you have an empty work directory with just our drupal.tar.gz artifact file.

The next tasks will then be using this drupal.tar.gz file and get it on our target environment. The exact tasks in our deployment will be:

  • Copy the tarball to the target environment via scp
  • Extract the tarball to a new release directory
  • Added the needed symbolic links
  • Run databases updates (if needed)
  • Set this new release as the new live version
  • Run cache clears
  • Clean up older releases

This is what it will look like when we’ve set up all these tasks (I’ve switched to my Narfum website deployment plan again now for these screenshots):

Completed deployment environment tasks

Environment variables

Before we continue with the tasks, we first need to setup some variables. End your task setup process and go back to the deployment plan page. You will get an incomplete task warning but just ignore that for now.

Unfinished deployment environment

Click the “Variables” button at the bottom and add the variables “deploy.hostname” and “deploy.username” with the values needed for your server:

Deployment environment variables

We can now go back to configure our environment tasks.

Environment tasks

Remember that tasks inside a job can halt the deployment process if they fail? That’s the main reasons we split up all these things into separate tasks.

Task 1: Copy the artifact to the remote server

This is adding a “SCP Task” where you simply copy the artifact to the remote server. We can use the variable “deploy.hostname” as “${bamboo.deploy.hostname}” inside tasks, the same goes for “deploy.username”.

I’m also not using a password but ssh keys to login to the remote server. Sadly you have to upload the private key in every task, this is one of the few shortcomings Bamboo still has.

Task 1: Copy the artifact to the remote server

Task 2: Extract the tarball on the remote server

This tasks uses the “SSH Task” type we will be using for the rest of the tasks. It simply allows you the enter shell commands that will be executed on the remote server over an SSH connection.

This task makes a new release directory inside the “releases” directory on the server, extracts the tarball there and then deletes it again.

Task 2: Extract the tarball on the remote server

In this task we add the symbolic links to our env.settings.php file and our sites/default/files content. See this blog post how and why we do this.

Task 3: Update symbolic links

Task 4: Databases updates

This task is currently not present for my project, but you can easily add it here yourself. Make the same SSH Script as above and use whatever drush commands you would like.

The idea for this task is:

  • Your Drupal code has all the needed hook_update_xxx() to upgrade your database schema, enable modules, set variables etc
  • Bamboo runs a simple drush updatedb command and all those update hooks get executed

Task 5: Set this new version as the live version

This simply makes the “www” folder, which is the Apache or nginx document root, a symbolic link to the newly uploaded release folder:

Task 5: Set the new uploaded version as the live version

Task 6: Cache clear

Because this is always a good thing to do.

For most of my project I also do a sudo php-fpm reload here, to make sure the PHP opcache is cleared, but permission to execute that command needs to be set up on your server first and is outside the scope of this blog post.

Task 6: Cachec clear

Task 7: Clean up older releases

This is a nice to have task. For production environments we mostly do this manually when the server raises a disk space warning, but for testing and staging environments this can be automated.

This script only keeps the last 5 recent deployments (determined by the timestamp of the release folder) and deletes the rest. The chmod command is needed because Drupal removes the write flag from the sites/default folder:

Task 7: Clean up older releases

And that’s all for the environment tasks.

Running a deployment

Now that we have a working build plan with a linked deployment plan we can run a deployment. The steps we have to do are always:

  • Push your code to the master branch.
  • Run your build plan.
  • If the build is successful create a release on the build result page. Otherwise fix your code and go back to step 1.
  • Deploy this release to an environment.
  • Check if everything is working

You probably made some errors in your config along the way. Luckily Bamboo will show you a nice big log file where you can debug your problem, so go ahead and test with your own projects now. Your automated Drupal deployment setup is now finished!

Sidenote: Using Triggers it’s possible to automate deployments whenever a build runs successful. That might be a good thing to do for automated deployment to a dev or test environment, but for a production environment you still want to keep that a manual task.

Room for improvement

This blog post of course only shows a very simple deployment setup. To keep this blog post somewhat short I only covered the basic steps in creating the whole deployment setyp. It’s up to you to extended these build and deployment plans for your own project.

Here are a few pointers what can still be improved:

  • Use build scripts (e.g. Ant or Maven tasks with a build file) instead of Bamboo SSH scripts for you tasks. This makes re-use of deployment scripts easier and also adds them to your version control system instead of being hardcoded into Bamboo. Bamboo has special tasks for running these build scripts.
  • Add more tests in the build and deployment phase. Make them proper Unit tests and Bamboo will display them in a special tab in a build, making it easy to see how many of your tests failed.
  • While it’s not possible to run actual tests during deployment phases, you can write deployment tasks that have a fail status (when the script exit code is any other number than 0) to halt a deployment that didn’t got as expected.
  • Almost every step of a build or deployment can have triggers and notifications. You can use these to schedule builds, automate deployments on a successful build and to send out mails or Hipchat/Slack notices when a build or deployment has succeeded.
  • Add more branches. A build plan can have multiple branches so you can build your project from other branches than the master branch. Bamboo can even auto-detect and auto-build these branches using triggers.

There is also a big marketplace of plugins for Bamboo, free and commercial ones, that make your life easier.

Newer versions of Bamboo will most likely add more useful features, so make sure you keep upgrading your Bamboo installation to the latest version.

The end.

That’s all folks, I hope you learned something useful from this post. Use the comments section if you have any questions or remarks!

A better alternative for using phpMyAdmin

Almost every week I run into at least one production site that has a phpMyAdmin installed in the document root of the site, or as a separate vhost on the server. While this used to be pretty required in 2005 to make changes to the database in production, now in 2015 we have better ways to do that.

The problem

The reason phpMyAdmin is installed on the website is that the MySQL server only listens on localhost via a UNIX socket, or on the loopback interface 127.0.0.1 via tcp. That way it’s impossible to connect to it from a remote address.

The bad thing about this is that we have an extra web application on our site we need to take care of. These phpMyAdmin installs are often never updated and might contain security issues that allow attackers to gain access to you production database.

A better alternative: SSH tunnels

If your server is reachable via SSH (even via a VPN connection) we can use a better method: SSH tunnels.

How this works is pretty simple:

  • We connect to our server via an SSH connection
  • Over this SSH connection we set up a tunnel with a port forward that allows a SQL client on our own computer to use the remote database as if it was a local connection

This might sound complicated, but there are a lot of SQL clients available that do this SSH tunneling for you. Below is a screenshot from Sequel Pro for OSX:

Sequel Pro for Mac OSX

You can see 2 things here:

  • the MySQL connection (which always connects to 127.0.0.1)
  • the SSH connection (which is your normal SSH login)

Once this connection has been setup the SQL client works just as it would on a local connection.

SQL clients that support tunneling

These are the clients I use on a daily basis:

  • OSX: Sequel Pro
  • Linux, Windows & OSX: MySQL Workbench
  • And ofcourse using the mysql commandline program in an SSH connection (mostly via drush sql-cli when it’s a Drupal website)

Separating your project and environment settings in Drupal 7

For Drupal 7 sites it’s best that you split up your settings.php in 2 different files: settings.php and env.settings.php.

The logic behind this is:

  • settings.php contains all project-related settings, that are the same for all environments.
  • env.settings.php contains all environment-related settings that are different for all enviroments.

Given this logic, it’s safe to commit the settings.php to our git repository while keeping env.settings.php out of it. The env.settings.php is created manual when setting up an environment.

Examples files

Settings.php contains certain module includes, php ini settings, etc. while env.settings.php contains database connection info, memcache prefixes, reverse proxy configuration, etc.

A simple settings.php file:

$update_free_access = FALSE;
$drupal_hash_salt = 'averysecrethash';

ini_set('session.gc_probability', 1);
ini_set('session.gc_divisor', 100);
ini_set('session.gc_maxlifetime', 200000);
ini_set('session.cookie_lifetime', 2000000);

$conf['memcache_persistent'] = TRUE;
$conf['cache_backends'][] = 'sites/all/modules/contrib/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
## The 'cache_form' bin must be assigned no non-volatile storage.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';

$conf['404_fast_paths_exclude'] = '/\/(?:styles)\//';
$conf['404_fast_paths'] = '#\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$#i';
$conf['404_fast_html'] = '<html xmlns="http:##www.w3.org#1999#xhtml"><head><title>404 Not Found</title></head><body>
<h1>Not Found</h1>
The requested URL "@path" was not found on this server.</body></html>';

$conf['drupal_http_request_fails'] = FALSE;

# environment-specific settings
$settings = DRUPAL_ROOT . '/sites/default/env.settings.php';
if (file_exists($settings)) {
 require_once($settings);
}

An example env.settings.php file:

$databases = array (
 'default' => array (
   'default' => array (
     'database' => 'db_name',
     'username' => 'db_user',
     'password' => '',
     'host' => 'localhost',
     'port' => '',
     'driver' => 'mysql',
     'prefix' => '',
    ),
  ),
);

## We use a memcache prefix as multiple sites use the same bin
$conf['memcache_key_prefix'] = 'prod';
$conf['drupal_http_request_fails'] = FALSE;

## See https://www.karelbemelmans.com/2015/07/reverse-proxy-configuration-for-apache-or-nginx-with-varnish/
$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('127.0.0.1');
$conf['reverse_proxy_header'] = 'HTTP_X_FORWARDED_FOR';

## Zen-theme option
$conf['theme_narfum_settings']['zen_rebuild_registry'] = 0;

## Caching is on for production
$conf['cache'] = 1;
$conf['block_cache'] = 1;

## Error display is off for production
$conf['error_level'] = 0;

## Turn off js and css aggregation ON
$conf['preprocess_css'] = 1;
$conf['preprocess_js'] = 1;

## GD image quality
$conf['image_jpeg_quality'] = 90;

## Maintenance mode?
$conf['maintenance_mode'] = 0;

Directory structure

How we use this in practise is that settings.php is always deployed with the rest of the Drupal code, while the env.settings.php is kept outside of the website root but linked to with a symbolic link.

Example directory structure:

httpdocs/
  - sites/
    - default/
      - settings.php
      - env.settings.php -> $HOME/shared/env.settings.php
      - files -> $HOME/shared/files
shared/
  - env.settings.php
  - files/

You can see we use symbolic links for the files and env.settings.php file. These links are actually made by our deployment script. The httpdocs directory (= the website root) is actually also a symbolic to a specific release, like this:

httpdocs -> $HOME/releases/release-1.0.3
shared/
  - env.settings.php
  - files/
releases/
  - release-1.0.3/
    - sites/
      - default/
        - settings.php
        - env.settings.php -> $HOME/shared/env.settings.php
        - files -> $HOME/shared/files

If you maintain this logic, setting up an automated deployment is easy and you will never have to do bad stuff like committing passwords or environment specific settings to your Drupal code.

Puphpet - PHP Vagrant development images made easy

Update 2016/07/06: You should probably not be using Vagrant (and Puphpet) anymore now that we live in the wonderful world of Docker containers. I will leave this blog post online for now, but take this information as ‘extremely outdated’.

Puphpet is, beside a horrible word to type, a great tool to create specific Vagrant images that contain a very fine-tuned development stack for PHP development.

The website generates puppet scripts that will provision a Vagrant image with your specific configuration. Using the online configuration tool you have huge selection of things to include in your image:

  • Ubuntu or CentOS base images
  • apache or nginx webservers
  • php versions from early 5 version to php 7 releases candidates and even [HHVM](https://github.com/facebook/hhvm/
  • MySQL or MariaDB
  • PostgreSQL
  • MongoDB
  • Reddis
  • SQLite
  • Mailcatcher
  • Beanstalkd
  • RabbitMQ
  • Elasticsearch
  • Apache Solr (4.x and 5. versions)

Even if you know nothing of server setups this is a great tool to build a production-like environment to develop on. It’s easy to commit these config files in your git repository and let co-workers use them without having to configure one single thing about it.

If you haven’t used Vagrant yet, you are living in the past and should really catch up

Weird errors with Drupal themes containing a node_modules folder

Nowadays frontend developers are totally into the whole compass, grunt, gulp, bower, etc.. thing to automate things as compiling SASS code into css, minimizing javascript, optimizing image sizes and so on.

These tools include using the node package manager, npm, to install a load of modules using a package.json file. These modules most of the time go inside a folder “node_modules” and “bower_components” inside the root folder of your Drupal theme. And that’s where stuff can go wrong.

The error

If you’ve seen this next error after running a drush cc all or flushing caches via your site, you’re probably having the problem I’m going to describe below:

WD menu: PDOException: SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'title' cannot be null: INSERT INTO {menu_router} (path, load_functions, to_arg_functions, access_callback, [error], access_arguments, page_callback, page_arguments, delivery_callback, fit, number_parts, context, tab_parent, tab_root, title, title_callback, title_arguments, theme_callback, theme_arguments, type, description, position, weight, include_file) VALUES ...

This failing menu rebuild action will probably leave your site in a broken state where no menu callbacks work.

After that error message you probably get a var_dump of the router menu code, which is pretty much useless, except that it contains references to files like this:

sites/all/themes/drupaltheme/node_modules/gulp-eslint/node_modules/eslint/node_modules/doctrine/coverage/lcov.info

The cause

See what’s going wrong there? There is a .info file being read by Drupal, thinking it’s a file for a Drupal module, while it actually is a npm module. For some npm modules this doesn’t cause any problems, but sometimes it does like in the example above and then you get vague errors like cache flushes failing and leaving your site in a broken state.

The solution

What’s the solution then? For now: remove the node_modules directory if you are not theming. At Nascom we automatically exclude this folder during builds (just like we exclude folders called .git and .sass-cache). There might be some proper way to tell Drupal 7 to ignore this folder, but I haven’t found it yet.

Edit: For Drupal 8.x there is already an issue to ignore those 2 directories.

Running rsync over ssh on a non-standard port

It seems a lot of people don’t know how to rsync over ssh when the server is running on another port than 22. Here’s the correct command to do that, with the ssh service running on port 9999:

rsync -a -v -e 'ssh -p 9999' user@host:/path/to/files .

Running individual cron commands in Drupal 7

It took me some searching, but it seems this is the best way to run a single cron command, in this case the scheduler cron job to publish and depublish content:

* * * * *  drush --root=$HOME/httpdocs eval "scheduler_cron();"

It’s safe to run this every minute and leave the normal cron job to run every hour:

0 * * * * /usr/bin/flock -w 0 $HOME/.cron.lock drush --quiet --root=$HOME/httpdocs --uri=http://www.example.org cron

In an upcoming blog post I’ll explain why running a full cron every minute is a bad idea.

Integrity constraint violations on the Drupal 7 block table

One of our Drupal 7 sites was giving a lot of white pages of death (500 errors). Looking through the Drupal logs, we saw a lot of these entries that were the cause of the white pages:

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'spock-system-user-menu' for key 'tmd': INSERT INTO {block}

It’s obviously a problem with the block table, so let’s have a look at it:

[user@server01 www]$ drush sql-query 'describe block'
bid        int(11)      NO PRI NULL auto_increment
module     varchar(64)  NO
delta      varchar(32)  NO 0
theme      varchar(64)  NO MUL
status     tinyint(4)   NO 0
weight     int(11)      NO 0
region     varchar(64)  NO
custom     tinyint(4)   NO 0
visibility tinyint(4)   NO 0
pages      text         NO NULL
title      varchar(255) NO
cache      tinyint(4)   NO 1
css_class  varchar(255) NO

There is no field called ‘tmd’, so it must be an index on the table:

[user@server01 www]$ drush sql-query 'show index from block'
block 0 PRIMARY 1 bid     A 36 NULL NULL BTREE
block 0 tmd     1 theme   A 12 NULL NULL BTREE
block 0 tmd     2 module  A 36 NULL NULL BTREE
block 0 tmd     3 delta   A 36 NULL NULL BTREE
block 1 list    1 theme   A 12 NULL NULL BTREE
block 1 list    2 status  A 18 NULL NULL BTREE
block 1 list    3 region  A 36 NULL NULL BTREE
block 1 list    4 weight  A 36 NULL NULL BTREE
block 1 list    5 module  A 36 NULL NULL BTREE

Yep, there is an index called ‘tmd’ and it’s a unique one (the 0 in the second column means it’s not non-unique). We can see it consists of the 3 colums theme, module and delta (hence the name tmd).

The problem

The most occurring problem here, and this is well known in the Drupal community, is that large delta values make the block query fail. Take the 2 block names below, which will be the value for the delta field in the query:

  • feature_multilingual_frontpage_switcher_with_select
  • feature_multilingual_frontpage_switcher_with_links

They are both longer than 32 characters. The value for the ‘tmd’ index will trim the delta field down to 32 chars, making them both the same string:

  • feature_multilingual_frontpage_s
  • feature_multilingual_frontpage_s

With the module and theme being the same value for both, this will now give a duplicate key error and explain our watchdog error.

The fix?

The easiest fix for that problem is to simply increase the length of the title, delta or theme column, depending which one is giving problems. In our case that would be the delta column:

ALTER TABLE block
CHANGE delta delta VARCHAR(128)
CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '0'
COMMENT 'Unique ID for block within a module.';

Here we raised the length of the column to 128 chars, which should be a lot better than the default 32. We can’t raise if much beyond that size though, as the maximum length of the index key on InnoDB tables is limited by MySQL (see the CREATE INDEX documentation for the full explanation).

Sadly this did not fix our problem, as we kept seeing this error in the logs after raising the column size. You can also see that in the error I posted at the start of this post the size of the key was not exceeding the length of any column, so that was already an indication that there was something else causing the problem.

The real problem gets revealed

If we dive into the Drupal code (simply searching for pieces of the query string from the error), we find the query that’s giving the exception is being called inside the function _block_rehash.

At the top of the Drupal 7 API page for this function we can see a list of all functions that call this function, and one of them is block_flush_caches:

function block_flush_caches() {
  // Rehash blocks for active themes. We don't use list_themes() here,
  // because if MAINTENANCE_MODE is defined it skips reading the database,
  // and we can't tell which themes are active.
  $themes = db_query("SELECT name FROM {system} WHERE type = 'theme' AND status = 1");
  foreach ($themes as $theme) {
    _block_rehash($theme->name);
  }

  return array('cache_block');
}

This function is an implementation of hook_flush_caches, a function that gets called at every full cache flush.

So in short: this _block_rehash function is called at every cache flush. And that’s exactly where our problems lies with this site.

The site we are reviewing is a very busy Drupal 7 site with a lot of concurrent visitors. Drupal cron is scheduled to run every minute as this is needed for scheduled content publishing and sending out push notifications to mobile devices (but that’s not really important for this issue).

A cron run does a partial cache flush (called in system_cron) and after that the first visitor of the site triggers a cache rebuild. But because this site is a very busy site, the _block_rehash function seems to get called more than once at the same time, by different visitors.

As the_block_rehash function does not start a transaction, or have any kind of table locking on the block table, we can get a situation where the same query wants to get inserted more than once. Luckily our database fails on that because of the unique index, and thus saving us from corrupt data. But the user will get a 500 error returned and thus see a white page instead of the content he expected.

The real fix

For now we simply added a try/catch block around the query in the _block_rehash function, so that when it fails it won’t give a white screen of death anymore:

try {
  drupal_write_record('block', $block, $primary_keys);
}
catch (Exception $e) {
  watchdog('block_rehash', 'Trying to insert an existing block with message: @message', array('@message' => $e->getMessage()), WATCHDOG_WARNING);
}

The proper solution would be that Drupal implements a transaction system that prevents these kind of problems happening again. A shoutout to Jasper Knops for helping me figure out this issue (and proofreading this post :)

About Drupal cron

I mentioned above that this site runs a Drupal cron every minute, which is not a good idea on any busy site. If you have jobs that need to run every minute you should invoke individual cron commands instead of running a full Drupal cron.

Reverse proxy configuration for Apache or nginx with varnish

A while ago I wrote a blog post about how you should tell your Drupal 7 site that the webserver is using a reverse proxy setup to make sure the correct ip adress of visitors gets sent to Drupal.

If you want to have the correct ip address in your Apache or nginx logs too, you need to add some configuration. Two things you need to take care of:

  • The Varnish vcl needs to ad an X-Forwarded-For header to backend requests
  • Apache or nginx needs to use this header for logging instead of the client ip address

Varnish vcl configuration

This is an example for a varnish 3.x vcl, where you need to add the X-Forwarded-For header in the vcl_recv function:

sub vcl_recv {
 if (req.restarts == 0) {
    if (!req.http.X-Forwarded-For) {
      set req.http.X-Forwarded-For = client.ip;
    }
  }

Sidenote: If you want a fully working vcl for Drupal or WordPress, you should take a look at Mattias’ github project for varnish 3.x or varnish 4.x templates.

Apache configuration

After the X-Forwarded-For header has been added by varnish, we can tell apache to use it for logs. This is an example from an Ubuntu 12.04 LTS server and the file /etc/apache2/apache2.conf, other setups might be similar.

Our vhosts use the “combined” log method, so that’s actually the only one you need to update. But it’s nice to keep everything in line and update all the log formats that used %h before:

# The following directives define some format nicknames for use
# with a CustomLog directive (see below).
# If you are behind a reverse proxy, you might want to change %h
# into %{X-Forwarded-For}i
#
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

In the example above I already replaced the %h variable with the X-Forwarded-For variable.

Nginx configuration

Nginx has a very simple configuration option you have to change. In your vhost config you have to add this code in the server block:

server {

  ...

  # Pass the real ip from varnish/
  set_real_ip_from 127.0.0.1;/
  real_ip_header X-Forwarded-For;

  ...

This ofcourse assumes the varnish server runs on the same server as the webserver. You might need to change this to the ip of your actual varnish server if it runs on another ip.

If you’ve done all of this correctly, you should now see the actual client ip addresses appear in your webserver logs.

Performance debugging basics: the MySQL slow log

In this blog post I’ll show you how simple and usefull the MySQL slow query log can be to find performance problems in a Drupal 7 site running a complex import task as a drush batch process.

Look at the graph below, showing a Zabbix graph of the CPU utilization of a staging server for one of our projects:

Zabbix CPU utilization graph during a performance problem

On the left side, between midnight and 6:30, you can see a lot of colors in the graph. The green one is idle time, the blue one is a PHP application doing some work but the dark yellow and purple ones are bad ones: they indicate that there’s way too much disk I/O going on while running the PHP application.

As this server has no New Relic APM Pro account, doing a deep code inspection wasn’t possible. I did however see that most of the processing time was spent on MySQL, as you can see in the standard (free) New Relic APM graph below:

New Relic APM graph of a drush command

The first and most basic thing you then have to do is turn on the MySQL slow log on the server. The slow log is useful to log 2 things:

  • queries that are taking longer than a set amount of time to execute
  • queries that are not using indexes where they should be using one

In MySQL you simply have to add these statements in your my.cnf file. This example config is for a MariaDB 5.5 server, if you have a different version or a different fork of MySQL, check the documentation for the proper slow log configuration as it might be different:

# Slow logging
slow_query_log = OFF
log_slow_queries = /var/log/mysql/slow.log
long_query_time = 2
log-queries-not-using-indexes

In this example config I’ve set the query cut-off time to 2 seconds, so anything that takes longer than 2 seconds to execute gets logged. Make sure you create the directory (if it doesn’t exist yet) and touch an empty slow.log file:

# Create the directory
mkdir -p /var/log/mysql
chown mysql:mysql /var/log/mysql

# Create an empty log file
touch /var/log/mysql/slow.log
chown mysql:mysql /var/log/mysql/slow.log

Restart your MySQL server to load the configuration changes.

This config does not turn on slow logging yet, it just configures the log location and sets the options. To actually enable it, you need to run this command in a MySQL prompt with admin privileges.

The example below uses a Plesk server (which we use for our shared development enviroment), but it could be as simple as running “mysql” as root for your system:

[root@acc01 ~]# mysql -uadmin -p`cat /etc/psa/.psa.shadow`
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2849
Server version: 5.5.42-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [mysql]> SET GLOBAL slow_query_log = 'ON';

We don’t enable this by default for the server as the log can fill up quite fast. So do not forget to turn it back off when you are doing by running the same command with the value OFF again:

SET GLOBAL slow_query_log = 'OFF'

The benefit of using this command is that you can run it on the server without having to reload or restart it. So once the configuration is setup, enabling or disabling the slow can be done with just this simple command.

Right, now we have our slow log turned on, let’s see what a tail -f /var/log/mysql/slow.log now shows while our PHP application runs. I’ll filter out all the useless queries (and there probably will be a lot) and go straight to the performance killer:

# Time: 150625 16:46:17
# User@Host: user[user] @ localhost []
# Thread_id: 16 Schema: db_prod QC_hit: No
# Query_time: 1.522648 Lock_time: 0.000053 Rows_sent: 0 Rows_examined: 200343
SET timestamp=1435243577;
UPDATE user_member_status SET is_invoiced=0;

This query would show up every 2 seconds, meaning the PHP script probably calls it every time it starts a new batch of things to process. MySQL logs this query because it doesn’t use an index and therefor has to scan all the rows in the table, which leads to a load of blocks being loaded from disk and is a thing you want to avoid at all cost in databases. There are just over 200 000 rows and the query takes 1.522 seconds to run, which is a very long time for a query.

If you look at the query, you can see that it really doesn’t make sense to run this query over and over again. It clearly sets the value for the is_invoiced column to 0 for all the rows in the database, running it more than once is useless.

It could also be missing a where clause, but that’s something only the developer that wrote the code can tell.

So, with this information I went back to the developer, who then confirmed it was a bug: this query should only be run at the start of the process. He made a fix, pushed it to git, made a new release with our Bamboo deployment server, pushed it to acc01 and within 15 minutes of my “discovery” the performance fix was live.

If we look back at the Zabbix graph from above, we can see that at the right side there is now again a lot of CPU action, but only good things. The disk I/O bottleneck is gone as we removed the “bad” query locking up the server:

Bad query locking up the server

That’s it, another performance problem found and fixed using some basic devops tools. I hoped you learned something for this blog and remember to turn off your slow log when you are done!