In a series of pull requests I have been working on improving the PHPStan type inference for sprintf, vsprintf and sscanf.

sprintf inference

On the 4th of June, Philippe Villiers (aka @kissifrot) reported an interesting issue regarding sprintf:

// initially reported snippet, which triggered errors in PHPStan up to 1.7.14
<?php declare(strict_types = 1);

class HelloWorld
{
  /**
   * @psalm-param int|numeric-string $divisor
   */
  public static function divide(int|string $divisor): string
  {
    return 'You divided by ' . $divisor;
  }

  public static function extractPercentage(float $percentage): string
  {
    // PHPStan error
    // $divisor of static method HelloWorld::divide() expects int|numeric-string, non-empty-string given.
    $subtotalAmount = self::divide(sprintf('%.14F', $percentage));

    return $subtotalAmount;
  }
}

The code example he provided immediately made me think of our own codebase. It looked so familiar that I realized fixing this problem might also fix issues in our own code. So I started to work on it.

The first iteration on the problem fixed the reported issue. I just had to make the already existing SprintfFunctionDynamicReturnTypeExtension handle a format string that is a ConstantStringType. When the format is constant, we know all its details at analysis time and can infer a more precise return type.
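To illustrate why a constant format helps, here is a minimal runtime sketch. It does not show PHPStan internals; formatPercentage() is a hypothetical helper, and the observation only covers finite floats. With the literal format '%.14F' the output can only consist of an optional sign, digits and a decimal point, so it is always a numeric string:

```php
<?php declare(strict_types=1);

// Hypothetical helper for illustration, not part of PHPStan.
function formatPercentage(float $percentage): string
{
    // The format is a literal with a single float conversion, so for finite
    // input the result is made of an optional sign, digits and a decimal point.
    return sprintf('%.14F', $percentage);
}

var_dump(is_numeric(formatPercentage(12.5)));  // bool(true)
var_dump(is_numeric(formatPercentage(-0.25))); // bool(true)
```

A static analyzer that can read the literal format is in a position to draw the same conclusion without running the code.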

This first PR also sparked some great conversations with other PHPStan contributors, which made it obvious that we can do even better.

One improvement was to add support for positional arguments.
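As a reminder of what positional arguments look like, here is a small sketch with made-up example values. '%2$s' refers to the second argument and '%1$d' to the first, so the order of placeholders in the format no longer matches the order of the arguments:

```php
<?php declare(strict_types=1);

// Example values are made up for illustration; only sprintf() itself is standard PHP.
// '%2$s' picks the second argument, '%1$d' the first one.
$message = sprintf('%2$s has %1$d items', 3, 'cart');

echo $message, PHP_EOL; // cart has 3 items
```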

The more time you invest in a problem area, the better your mental model gets. While working through all this I had a few more ideas about possible use cases that I wanted to cover.

sscanf inference

I always try to step back for a moment and get the overall picture of my change. While taking a walk around my neighborhood I realized that there is an obvious counterpart to sprintf, namely sscanf. While sprintf is used to format a string, sscanf can be used to parse a string back into separate parts.

Until then, PHPStan treated the returned values as a generic array, without any further type information.

$parts = sscanf($mandate, "%s %d %d");
// PHPStan until 1.7.14 treated $parts as a plain array

With a new SscanfFunctionDynamicReturnTypeExtension I was able to give PHPStan a better idea of the types involved:

// PHPStan 1.7.15+ knows the types
list(
  $month, // string
  $day, // int
  $year // int
) = fscanf($r, "%s %d %d");
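The types listed above match what the conversions produce at runtime for a matching input. A minimal sketch with a sample input string (the value of $mandate is chosen for illustration):

```php
<?php declare(strict_types=1);

$mandate = 'January 01 2000'; // sample input, chosen for illustration

// %s yields a string, each %d yields an int.
[$month, $day, $year] = sscanf($mandate, '%s %d %d');

var_dump($month); // string(7) "January"
var_dump($day);   // int(1)
var_dump($year);  // int(2000)
```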

vsprintf inference

After doing all of the above, I realized php-src also contains a vsprintf function, which I had never used before. Since this function accepts the same format string as sprintf, I just had to adjust the already existing extension to also do its magic for this function.
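For readers who, like me, have not used vsprintf before: the only difference to sprintf is that the values are passed as a single array. A short sketch with sample values:

```php
<?php declare(strict_types=1);

$format = '%04d-%02d-%02d';
$values = [1988, 8, 1]; // sample values for illustration

// vsprintf() takes the arguments as one array, sprintf() takes them individually.
$fromArray = vsprintf($format, $values);
$fromArgs  = sprintf($format, ...$values);

var_dump($fromArray);               // string(10) "1988-08-01"
var_dump($fromArray === $fromArgs); // bool(true)
```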

summary

Working on this kind of problem is really fun. I have used the sprintf and sscanf functions a lot before and have a pretty good idea of what to expect from them.

Up to version 1.7.14, PHPStan did not have a good idea about the types involved, so you had to write unnecessary code to make it aware of the obvious, like:

// PHPStan until 1.7.14, you had to work around unknown types
list(
  $month, // mixed
  $day, // mixed
  $year // mixed
) = fscanf($r, "%s %d %d");

if (!is_string($month)) {
  throw new \Exception('month is not a string');
}
if (!is_int($day)) {
  throw new \Exception('day is not an integer');
}
if (!is_int($year)) {
  throw new \Exception('year is not an integer');
}

// work with the parsed values

With the newly added extensions in PHPStan 1.7.15+, you no longer need to write boilerplate to convince PHPStan about the types involved. It just knows them by heart:

// PHPStan 1.7.15+ knows the types
list(
  $month, // string
  $day, // int
  $year // int
) = fscanf($r, "%s %d %d");
