How to add useful logging to your Drupal 7 code

A lot of times my job involves debugging problems on Drupal 7 production websites and find out why things aren’t working, or even worse: why do they only work some of the time.

The first thing I then do is check the Drupal logs, but sadly developers have a habit of not adding proper logging. So let me give you some pointers on how to add actual useful logging to your code.

If you are reading this post for the sake of analysing Drupal logs, you might also want to read this blog item about debugging Drupal 7 development.

Do I need to log something or just silently fail?

Let me start with stating that not everything that doesn’t go as expected needs to get logged. We want logs yes, but only useful ones.

Translating that to Drupal means that every exception you catch probably doesn’t need to end up in the logs. Example:

function build_some_render_array($field) {
  $build = array(); // Drupal render array

  // more stuff happens here

  $entity = $field['entity'];
  $entity_wrapper = entity_metadata_wrapper('node', $entity);
  try {
    $value = $entity_wrapper->field_some_name->value();

    // more code

    return $result;
  }
  catch (Exception $e) {
    // do nothing
  }
  return drupal_render($build);
}

What this means is that there most likely is no field called “field_some_name” for this entity. If you are passing different types of entities to this function, this can happen. This doesn’t mean something is wrong with this entity, we simply don’t use this field for building up the $build array.

The problem is that if you don’t try/catch this block, you hit the Drupal default exception handler. That results in a white page of death. So we catch it here, but don’t log anything as it’s not something we need to know about in debugging problems.

Use the proper error levels

In includes/common.inc you have this function defined:

/**
 * Returns a list of severity levels, as defined in RFC 3164.
 *
 * @return
 * Array of the possible severity levels for log messages.
 *
 * @see http://www.ietf.org/rfc/rfc3164.txt
 * @see watchdog()
 * @ingroup logging_severity_levels
 */
function watchdog_severity_levels() {
  return array(
    WATCHDOG_EMERGENCY => t('emergency'),
    WATCHDOG_ALERT => t('alert'),
    WATCHDOG_CRITICAL => t('critical'),
    WATCHDOG_ERROR => t('error'),
    WATCHDOG_WARNING => t('warning'),
    WATCHDOG_NOTICE => t('notice'),
    WATCHDOG_INFO => t('info'),
    WATCHDOG_DEBUG => t('debug'),
  );
}

These log levels are not something Drupal came up with, but are actually defined in the syslog RFC 3164 and are considered to be the standard severity levels for UNIX services:

Numerical     Severity
 Code
 0            Emergency: system is unusable
 1            Alert: action must be taken immediately
 2            Critical: critical conditions
 3            Error: error conditions
 4            Warning: warning conditions
 5            Notice: normal but significant condition
 6            Informational: informational messages
 7            Debug: debug-level messages

Adding the severity level is pretty obvious using these functions:

// Normal watchdog() call
function watchdog(
  $type,
  $message,
  $variables = array(),
  $severity = WATCHDOG_NOTICE,
  $link = NULL) {}

// Just pass the exception on to watchdog:
function watchdog_exception(
  $type,
  Exception $exception,
  $message = NULL,
  $variables = array(),
  $severity = WATCHDOG_ERROR,
  $link = NULL) {}

(Notice how watchdog() uses WATCHDOG_NOTICE as default, so you probably want to add a better severity level for all the calls you make in your code that are actually errors)

One of the ideas behind syslog is that you can attach an alert trigger to a severity level, causing it to send out an email or text message. While your Drupal site doesn’t have to use syslog (although it’s awesome and you should really look into it), using the same severity levels is a good way to be able to filter important things in your logs.

If you prefer Drupal sending out alerts when a certain level of severity is found, the watchdog module has hooks that allow you to create triggers. There are also existing modules that do a similar thing.

Use the correct watchdog calls

You always need to add 4 arguments to your watchdog() calls:

Tip: Are you logging an error that was caught from an exception? Just use watchdog_exception ($module, $exception).

Good stuff to log

This is a list of things you really want to find in your logs:

CRUD operations

Just about everything you can CRUD should be logged (nodes, users, custom entities, etc …). Add as much possible variables as needed in the log entry, but never log passwords or other secret things!

Example message from the Drupal node module that logs these items with type “content”:

Watchdog log entry of a node edit action

Notice how there is a link at the bottom under “Operations”. This is the $link argument the node module passed to the watchdog() call when making this log entry:

API calls

Just like the previous CRUD items you should also log any outgoing API call your system makes. Yes, for a site that is API-driven this will mean a lot of log records, but trust me, it’s worth it.

Include a FULL list of parameters and a response code, combined with using the proper watchdog severity level:

If an error occurs, it’s even a good idea to write more error information into a separate log table in Drupal as this information is vital in debugging problems. You want this information available for calls in the past, so you can easily see what happened back when your customer filed a bug report.

Cache invalidation messages

The modules we use for integrating Drupal with varnish cache invalidation are varnish and expire. The varnish module handles the backend communication module with the varnish server while the expire module determines what should be invalidated and when.

These 2 modules log their actions to watchdog so you can see what ban commands are set to varnish and which urls expire decided to clear. This information is vital when you are debugging the “why is this page not being refreshed?” problems. Spoiler: it’s most of the time the Drupal anonymous page cache that you are not clearing before the expire call is made.

Calls and results of drush commands

Most of our import and sync scripts are written as separate drush commands and not just run from hook_cron or form submissions on the website. Everytime such a script is called it logs a message, even if it doesn’t do anything. We just want to know that it ran.

If you configure your Drupal cron to run via the Linux crontab as “drush cron”, you can already see that log message in your logs.

Bad stuff appearing in logs

Now this is the real import part. Logs are only good if they only have useful things and are not filled with garbage. This doesn’t mean anything about the total amount of log entries generated. If it’s a lot of entries, but all useful ones, it’s still good logging.

These things you really do not want to have clogging up your logs:

403 and 404 errors

There are a few files browsers and crawlers will request without your site linking to them. The best way to prevent those giving a 404 is to actually make those files, even if they are just empty files:

/robots.txt
/favicon.ico
/apple-touch-icon.png
/apple-touch-icon-precomposed.png
...

There are also scripts that generate all the touch icons for you from a single source image.

Another feature your site’s webserver should have is to turn off access logging for those files. This is an example of an nginx vhost:

location = /favicon.ico { log_not_found off; access_log off; }
location = /robots.txt { allow all; log_not_found off; access_log off; }

Enable the Drupal fast_404

In the default.settings.php there is a clear explanation about fast 404’s:

[php]
/**
 * Fast 404 pages:
 *
 * Drupal can generate fully themed 404 pages. However, some of these responses
 * are for images or other resource files that are not displayed to the user.
 * This can waste bandwidth, and also generate server load.
 *
 * The options below return a simple, fast 404 page for URLs matching a
 * specific pattern:
 * - 404_fast_paths_exclude: A regular expression to match paths to exclude,
 * such as images generated by image styles, or dynamically-resized images.
 * If you need to add more paths, you can add '|path' to the expression.
 * - 404_fast_paths: A regular expression to match paths that should return a
 * simple 404 page, rather than the fully themed 404 page. If you don't have
 * any aliases ending in htm or html you can add '|s?html?' to the expression.
 * - 404_fast_html: The html to return for simple 404 pages.
 *
 * Add leading hash signs if you would like to disable this functionality.
 */
$conf['404_fast_paths_exclude'] = '/\/(?:styles)\//';
$conf['404_fast_paths'] = '/\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$/i';
$conf['404_fast_html'] = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>404 Not Found</title></head><body>
<h1>Not Found</h1>
The requested URL "@path" was not found on this server.</body></html>';

/**
 * By default, fast 404s are returned as part of the normal page request
 * process, which will properly serve valid pages that happen to match and will
 * also log actual 404s to the Drupal log. Alternatively you can choose to
 * return a 404 now by uncommenting the following line. This will reduce server
 * load, but will cause even valid pages that happen to match the pattern to
 * return 404s, rather than the actual page. It will also prevent the Drupal
 * system log entry. Ensure you understand the effects of this before enabling.
 *
 * To enable this functionality, remove the leading hash sign below.
 */
# drupal_fast_404();

I can’t explain it better than the function description already says. On very busy sites this can be a big performance increase and save your website from a lot of 404 notices, but it has side-effects you should be aware of.

PHP warnings about variables not defined

These truly are the worst kind. It’s a result of developers not initializing their variables correctly, not adding enough “what if stuff is empty?” checks or simple passing wrong variables to a function.

Examples:

Warning: Invalid argument supplied for foreach() in feature_site_cron() (rule 1410 of /var/www/sites/all/modules/features/feature_site/feature_site.module)
Notice: Undefined index: google_analytics_reports_cache_length in google_analytics_reports_api_admin_submit() (rule 184 van /var/www/sites/all/modules/contrib/google_analytics_reports/google_analytics_reports_api/google_analytics_reports_api.pages.inc)
Notice: Undefined offset: 24 in timer_read() (rule 488 van /var/www/includes/bootstrap.inc)

These are just 3 examples, but you probably get the idea now. This thing should not be appearing in production logs, so check your logs often during development and fix these things!

Yes, it will cost you some time, but you will get that time back in multitudes when you need to debug your production site later on.

Things that cannot be put back in the right context, or are too generic in nature.

I hope it’s pretty clear you should not log things like “Something went wrong” without extra information. But I still see things like that appear on sites, so I’m adding it here for the sake of completeness.

That’s all, I hope you learned something!

comments powered by Disqus