Introduction to automated, static code analysis in PHP

In my daily work, one of my jobs is to ensure the code quality of our web applications written in PHP. Besides the usual measures such as manual code review, training (e.g. XP-style programming sessions) and automated unit tests with PHPUnit, some metrics can be measured automatically, which makes my work easier and helps to reduce error rates in the review process. In this article I want to introduce the most common methods and tools for static code analysis in PHP.

Code files

These code files will be used to illustrate how the following tools work:

working-example.php

<?php

echo "Hello World!";

failing-example.php

<?php
declare(strict_types = 1);
function a() : bool
{
   return 1;
}
var_dump(a());

Lint

Linting is the most obvious thing to do. It checks that your code is at least syntactically valid by parsing it and reporting syntax errors. If PHP were a compiled language, the PHP lint mode (php -l) would basically check whether the code compiles. This is of special interest if you maintain an application for multiple PHP versions such as 5.5, 5.6 and 7: as soon as a developer uses a keyword or syntax that does not exist in a version older than the one they developed against, lint fails on the older version while it passes on the current one. PHP ships with this lint mode built in, as shown below:

php -l working-example.php
No syntax errors detected in working-example.php

Linting failing-example.php, which declares a function with a PHP 7-only return type, against PHP 5.5 fails:

php -v
PHP 5.5.30 (cli) (built: Oct 23 2015 17:21:45)
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2015 Zend Technologies
php -l failing-example.php
Warning: Unsupported declare 'strict_types' in failing-example.php on line 2
Parse error: parse error, expecting `'{'' in failing-example.php on line 3
Errors parsing failing-example.php

To lint all files within a directory (which you will usually want to do), use the following command:

find -L . -name '*.php' -print0 | xargs -0 -n 1 -P 4 php -l
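One caveat with the xargs variant: xargs stops processing as soon as one invocation exits with status 255, so if php -l happens to report a syntax error with that status, only the first broken file is shown. The following is a minimal sketch of a loop that keeps going and lists every failing file. The function name lint_tree and the PHP_BIN environment variable are my own assumptions for illustration, not part of PHP.

```shell
#!/bin/sh
# Sketch: lint every PHP file below a directory and keep going after failures.
# PHP_BIN is an assumed environment variable that selects the interpreter,
# so the same function can be pointed at several PHP versions.
lint_tree() {
    dir="${1:-.}"
    php_bin="${PHP_BIN:-php}"
    lint_failed=0

    # Collect the file list first so the loop runs in the current shell
    # and can update $lint_failed. Paths containing newlines are not supported.
    list=$(mktemp) || return 2
    find -L "$dir" -type f -name '*.php' > "$list"

    while IFS= read -r file; do
        # Only the exit status of "php -l" is used; its output is discarded.
        if ! "$php_bin" -l "$file" > /dev/null 2>&1 < /dev/null; then
            echo "FAIL: $file"
            lint_failed=1
        fi
    done < "$list"

    rm -f "$list"
    return "$lint_failed"
}
```

Called as PHP_BIN=php5.5 lint_tree src (where php5.5 stands for whatever your older interpreter is called), it prints one FAIL line per broken file and returns a non-zero status if there was at least one.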

Code Style Check

Sadly, a large part of the PHP community pays little attention to code style. Classes are defined inline, braces are not where they belong (if they are there at all), code is indented with a mix of tabs and spaces and contains every variation of line endings, PHPDoc comments are outdated or missing, and so on. Even in large open source projects code style is often neglected, because many people do not understand that it is not about beautifying code but about making it readable and understandable. This is where PHPCS (https://github.com/squizlabs/PHP_CodeSniffer) comes into play. It is a tool that checks PHP code style against predefined rulesets. PHPCS ships with rulesets for several common PHP style guides, including Zend, PEAR, Squiz and PSR. My preferred code style is PSR-2 (http://www.php-fig.org/psr/psr-2/), so I’m using it in the examples below.

Installation

The easiest way to install PHPCS is by using the PHAR archive provided by the author:

curl -OL https://squizlabs.github.io/PHP_CodeSniffer/phpcs.phar

Basic Usage

Now we can check the code written in the previous example:

php phpcs.phar --standard=PSR2 failing-example.php

FILE: /var/www/failing-example.php
----------------------------------------------------------------------
FOUND 1 ERROR AND 1 WARNING AFFECTING 2 LINES
----------------------------------------------------------------------
 1 | WARNING | [ ] A file should declare new symbols (classes,
   |         |     functions, constants, etc.) and cause no other
   |         |     side effects, or it should execute logic with side
   |         |     effects, but should not do both. The first symbol
   |         |     is defined on line 3 and the first side effect is
   |         |     on line 7.
 5 | ERROR   | [x] Line indented incorrectly; expected at least 4
   |         |     spaces, found 3
----------------------------------------------------------------------
PHPCBF CAN FIX THE 1 MARKED SNIFF VIOLATIONS AUTOMATICALLY
----------------------------------------------------------------------

Time: 52ms; Memory: 3.5Mb

Creating your own ruleset

PHPCS rulesets can also be customized by creating your own ruleset XML file that composes existing rules (“sniffs”). PSR-2 does not check whether functions have valid PHPDoc comments, although this is usually worth checking. The following example shows how to extend the PSR-2 ruleset with PHPDoc checks, along with a few other things you can do in a ruleset file:

<?xml version="1.0"?>
<ruleset name="MyOwnRuleset">
    <description>PSR-2 and PHPDoc checks</description>

    <!-- Check for function comments -->
    <rule ref="Squiz.Commenting.FunctionComment" />

    <!-- if you want to ignore vendor libraries... -->
    <exclude-pattern>*/vendor/*</exclude-pattern>

    <rule ref="PSR2">
        <!-- or if you want to exclude special rules... -->
        <exclude name="Squiz.Classes.ValidClassName.NotCamelCaps"/>
    </rule>

    <!-- or exclude files from a specific test -->
    <rule ref="PSR1.Files.SideEffects">
        <exclude-pattern>*/Bootstrap.php</exclude-pattern>
    </rule>
</ruleset>

Apart from the other examples of what a ruleset can do, the important part here is this:

<rule ref="Squiz.Commenting.FunctionComment" />

This enables the FunctionComment sniff that will throw an error if a function docblock is missing:

php phpcs.phar --standard=ruleset.xml failing-example.php

FILE: /var/www/failing-example.php
----------------------------------------------------------------------
FOUND 3 ERRORS AND 1 WARNING AFFECTING 4 LINES
----------------------------------------------------------------------
 1 | WARNING | [ ] A file should declare new symbols (classes,
   |         |     functions, constants, etc.) and cause no other
   |         |     side effects, or it should execute logic with side
   |         |     effects, but should not do both. The first symbol
   |         |     is defined on line 3 and the first side effect is
   |         |     on line 7.
 3 | ERROR   | [ ] Missing function doc comment
 5 | ERROR   | [x] Line indented incorrectly; expected at least 4
   |         |     spaces, found 3
----------------------------------------------------------------------
PHPCBF CAN FIX THE 2 MARKED SNIFF VIOLATIONS AUTOMATICALLY
----------------------------------------------------------------------

Time: 66ms; Memory: 3.75Mb

List of available Sniffs

All available sniffs are listed in the corresponding folder of PHPCS: https://github.com/squizlabs/PHP_CodeSniffer/tree/master/src/Standards

Making Exceptions

Sometimes it is not possible to conform fully to a style guide. In that case you can either exclude the affected file completely (in ruleset.xml) or ignore only the lines that cannot be fixed. PHPCS 3.2 and later provide // phpcs:disable and // phpcs:enable comments for this; the older annotations look like this:

// @codingStandardsIgnoreStart
// ... some code
// @codingStandardsIgnoreEnd

Mess Detection

Perhaps the most important utility for improving the maintainability of your code is the PHP Mess Detector (PHPMD, http://www.phpmd.org/). It checks for the most common bad programming habits, including eval expressions, goto statements, poor variable and method naming, tight coupling between objects, excessive code complexity and usage of superglobals. PHPMD is also very easy to use and belongs in any code review process.

Installation

PHPMD can be installed with the following command:

curl -OL http://static.phpmd.org/php/latest/phpmd.phar

Basic Usage

The following example runs PHPMD on failing-example.php, prints the results as plain text (html, xml and text are available) and checks against all available rulesets:

php phpmd.phar failing-example.php text cleancode,codesize,controversial,design,naming,unusedcode

/var/www/failing-example.php:3	Avoid using short method names like ::a(). The configured minimum method name length is 3.

You can also pass a directory instead of a single file.

Rulesets & Rules

PHPMD has sane defaults for most rules. The rules, as shown above, are grouped into rulesets. Here’s a list of all available rulesets and rules, including a description:

https://phpmd.org/rules/index.html

If you want to change the defaults for a specific rule, please refer to the respective rule description on the official homepage.

Custom Rulesets

As in PHPCS, PHPMD rulesets are defined as XML files with a similar structure. The following example shows a ruleset that includes some rules and rulesets and excludes others:

<?xml version="1.0"?>
<ruleset name="Custom Ruleset" xmlns="http://pmd.sf.net/ruleset/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd" xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">
    <description>
        My custom ruleset.
    </description>

    <rule ref="rulesets/design.xml">
        <exclude name="CouplingBetweenObjects" />
    </rule>

    <rule ref="rulesets/codesize.xml" />

    <rule ref="rulesets/controversial.xml/Superglobals" />

    <rule ref="rulesets/naming.xml">
        <exclude name="LongVariable" />
    </rule>

    <rule ref="rulesets/unusedcode.xml">
        <exclude name="UnusedFormalParameter" />
    </rule>
</ruleset>

As you can see, you can either include a full ruleset and then exclude specific rules from it, or include individual rules without pulling in the full ruleset at all.

Pass the path to your ruleset.xml to PHPMD as follows:

php phpmd.phar failing-example.php text ruleset.xml
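Rule defaults can be changed in the same kind of file. As a sketch, the following script writes a small ruleset that keeps the naming rules but lowers the minimum length enforced by ShortMethodName from 3 to 2. The file name phpmd-naming.xml and the RULESET_FILE variable are arbitrary choices of mine; the minimum property is the one the rule description lists, so please double-check it against the PHPMD version you use.

```shell
#!/bin/sh
# Sketch: write a PHPMD ruleset that keeps the naming rules but lowers the
# minimum method name length. RULESET_FILE is an assumed, overridable name.
RULESET_FILE="${RULESET_FILE:-phpmd-naming.xml}"

cat > "$RULESET_FILE" <<'EOF'
<?xml version="1.0"?>
<ruleset name="Naming with shorter method names"
         xmlns="http://pmd.sf.net/ruleset/1.0.0">
    <description>Naming rules with a lower method name minimum.</description>

    <!-- Include the naming ruleset, but drop the rule we want to reconfigure -->
    <rule ref="rulesets/naming.xml">
        <exclude name="ShortMethodName" />
    </rule>

    <!-- Re-add the single rule with a changed property -->
    <rule ref="rulesets/naming.xml/ShortMethodName">
        <properties>
            <property name="minimum" value="2" />
        </properties>
    </rule>
</ruleset>
EOF

echo "Wrote $RULESET_FILE"
```

Passing phpmd-naming.xml to PHPMD instead of ruleset.xml should then accept two-character names, while a() in failing-example.php would still be reported.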

Exceptions

You can also suppress warnings within a given scope, either for all rules or, preferably, for a single rule only:

/**
 * Suppress all warnings
 * @SuppressWarnings(PHPMD)
 */
class Foo
{
    /**
     * Or just a single one
     * @SuppressWarnings(PHPMD.UnusedLocalVariable)
     */
    public function bar()
    {
        // code
    }
}

Have a look at the following document for more information: https://phpmd.org/documentation/suppress-warnings.html.

Copy & Paste Detection

Duplicate code is almost always evidence of a design flaw. Therefore you should also check for duplicate code within your project. PHP Copy/Paste Detection (PHPCPD, https://github.com/sebastianbergmann/phpcpd) allows you to do just that.

Installation

PHPCPD is available as a PHAR:

curl -OL https://phar.phpunit.de/phpcpd.phar

Basic Usage

The command line interface exposed by PHPCPD is as easy as it could be. Just provide the path to your code as the first parameter:

php phpcpd.phar failing-example.php
phpcpd 2.0.2 by Sebastian Bergmann.

0.00% duplicated lines out of 8 total lines of code.

Time: 47 ms, Memory: 2.50Mb

If the code contained duplicates, PHPCPD would report where to find them.

Exceptions

If you want to exclude a specific file or folder from the checks (you will probably want to do this for your vendor libraries), use the following command, which checks every file within the current folder except for the vendor directory:

php phpcpd.phar --exclude=vendor .

Synopsis

As you have seen, it is relatively easy to check many important attributes of your code automatically. Besides the CLI usage, almost every IDE supports automated code checks on save, wrapping the command line tools shown above and exposing them through its GUI. Running these tools right before a manual review lets you concentrate on the logic of the code rather than on style and maintainability.
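To run the three analysers in one go before a review, a small wrapper script is enough. The following is only a sketch: it assumes the PHAR files downloaded above sit in the current directory and that each tool signals findings through a non-zero exit status. The names run_step, run_checks and PHP_BIN are made up for this example.

```shell
#!/bin/sh
# Sketch of a pre-review check script. run_step, run_checks and PHP_BIN are
# illustrative names; the PHAR file names match the downloads in this article.

# Run one command, print PASS or FAIL for it and count the failures.
run_step() {
    name="$1"
    shift
    if "$@"; then
        echo "PASS: $name"
    else
        echo "FAIL: $name"
        failures=$((failures + 1))
    fi
}

# Run all analysers against a file or directory; every step is executed
# even if an earlier one reported problems.
run_checks() {
    target="${1:-.}"
    php_bin="${PHP_BIN:-php}"
    failures=0

    run_step "phpcs" "$php_bin" phpcs.phar --standard=PSR2 "$target"
    run_step "phpmd" "$php_bin" phpmd.phar "$target" text \
        cleancode,codesize,controversial,design,naming,unusedcode
    run_step "phpcpd" "$php_bin" phpcpd.phar "$target"

    # Succeed only if no step failed.
    [ "$failures" -eq 0 ]
}
```

Calling run_checks src prints the output of each tool followed by a PASS or FAIL line, and the function's return status can be used to stop a build or a commit.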