@phpstan-require-extends and @phpstan-require-implements semantics in PHPStan.
People using Psalm might find this feature familiar, as Psalm already supports it.
The idea is to define, at the interface or trait level, which requirements the using class has to fulfill.
The development of this feature was made possible thanks to sponsoring by Pixel & Tonic, the team behind Craft CMS. In addition, Ondřej Mirtes provided excellent feedback and guidance during the development.
The feature was implemented in separate pull requests, which built on top of each other:
- require-extends and require-implements in phpdoc-parser
- @phpstan-require-extends and @phpstan-require-implements
- require-extends and require-implements rules
- require-extends should not error on interfaces
- require-extends and require-implements in result cache

… which fixed the following issues:
- @property
- @phpstan-require-use for requiring implementors/subclasses to use certain traits

… and eventually became the headlining feature of PHPStan 1.10.56.
If you are in need of a certain feature or bugfix in PHPStan, Rector or related tooling, please get in touch.
@phpstan-require-extends trait example
It’s best described with an example, so have a look at the psalm documentation example:
/**
* @phpstan-require-extends DatabaseModel
*/
trait SoftDeletingTrait {
// useful but scoped functionality, that depends on methods/properties from DatabaseModel
}
With this declaration we define that a class which uses the SoftDeletingTrait has to extend the DatabaseModel class.
If not, PHPStan will report an error. See the full example running in the PHPStan sandbox.
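To make the rule concrete, here is a minimal sketch with one class that fulfills the requirement and one that does not. The class names User and AuditLog, the $deleted property and the softDelete() body are made up for illustration; the wording of the reported error is left out, since it is not taken from this post.

```php
<?php
declare(strict_types=1);

class DatabaseModel
{
    public bool $deleted = false;
}

/**
 * @phpstan-require-extends DatabaseModel
 */
trait SoftDeletingTrait
{
    public function softDelete(): void
    {
        // relies on a property declared in DatabaseModel
        $this->deleted = true;
    }
}

// fine: User extends DatabaseModel, so the requirement is fulfilled
final class User extends DatabaseModel
{
    use SoftDeletingTrait;
}

// hypothetical offender: uses the trait without extending DatabaseModel,
// which is the situation PHPStan is expected to report
final class AuditLog
{
    use SoftDeletingTrait;
}
```

PHP itself accepts both classes; only static analysis can point out that AuditLog misses the base class the trait depends on.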
@phpstan-require-extends interface example
Similar to what was shown above, the same is possible with interfaces:
/**
* @phpstan-require-extends DatabaseModel
*/
interface SoftDeletingMarkerInterface {
}
With this declaration we define that a class which implements the SoftDeletingMarkerInterface has to extend the DatabaseModel class.
When using interfaces we can achieve more, though, because the interface type can be used as, e.g., a parameter type:
/**
 * @phpstan-require-extends DatabaseModel
 */
interface SoftDeletingMarkerInterface {
}

class DatabaseModel {
    public string $tableName;

    public function softDelete(): void { /* … */ }
}

// it's allowed to look up properties and call methods of the require-extends type when using the interface type
function runSoftDelete(SoftDeletingMarkerInterface $model): void {
    $tableName = $model->tableName;
    $model->softDelete();
    // …
}
Since it's only valid to implement the SoftDeletingMarkerInterface when extending the DatabaseModel class, PHPStan will not error when accessing public properties or calling public methods of DatabaseModel based on the SoftDeletingMarkerInterface type.
See the full example running in the PHPStan sandbox.
NOTE: Looking up properties/calling methods on the interface type is currently only possible in PHPStan. I have opened a dedicated psalm feature request #10538 for discussion.
@phpstan-require-implements trait example
Similar to the @phpstan-require-extends trait example, it's supported to use @phpstan-require-implements on traits:
/**
* @phpstan-require-implements DatabaseModelInterface
*/
trait SoftDeletingTrait {
// useful but scoped functionality, that depends on methods/properties from DatabaseModel
}
With this declaration we define that a class which uses the SoftDeletingTrait has to implement the DatabaseModelInterface interface.
See the full example running in the PHPStan sandbox.
As with most phpDoc annotations, PHPStan will happily accept a psalm-prefixed @psalm-require-implements.
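A small sketch of why this is useful: the trait can call methods that only the required interface guarantees. The tableName() method, the Article class and the generated SQL string are assumptions made up for this example.

```php
<?php
declare(strict_types=1);

interface DatabaseModelInterface
{
    public function tableName(): string;
}

/**
 * @phpstan-require-implements DatabaseModelInterface
 */
trait SoftDeletingTrait
{
    public function softDeleteSql(): string
    {
        // tableName() is guaranteed by the required interface
        return 'UPDATE ' . $this->tableName() . ' SET deleted = 1';
    }
}

// fulfills the requirement: implements the interface and uses the trait
final class Article implements DatabaseModelInterface
{
    use SoftDeletingTrait;

    public function tableName(): string
    {
        return 'article';
    }
}
```

A class using the trait without implementing DatabaseModelInterface would be the case the annotation is meant to catch.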
The new feature is mentioned in the PHPStan docs and PHPStan blog and was recently announced by Ondřej Mirtes on Twitter and Mastodon.
We plan to support generics in these phpDoc annotations in the future, see the idea described by Ondřej Mirtes. If you are interested in this or any other feature addition, please consider sponsoring it.
To back up the message of the post, I used some contribution statistics similar to the ones shown below (excerpt):
|----------------------------------------------|-----------------------|--------------------|
| project | merged pull requests | addressed issues |
|----------------------------------------------|-----------------------|--------------------|
| phpstan/phpstan* | ~116 (~188 in 2022) | 33 (83 in 2022) |
| rector/rector* | ~178 | 13 |
| FriendsOfREDAXO/rexstan | 88 | 24 |
| FriendsOfREDAXO/rexfactor | 55 | 6 |
| staabm/phpstandba | 44 (~300 in 2022) | 8 |
| redaxo/redaxo | 27 (70 in 2022) | 4 |
| TomasVotruba/unused-public | 25 | 1 |
…
These numbers were crunched with a small tool: staabm/oss-contribs.
I just decided to make this tool available for anyone, so you can generate your own statistics.
The result is grouped by repository.
Find all the details in the tool's repository README.
Enjoy.
In case you find my PHPStan contributions and/or this tool useful, please consider supporting my open source work.
I have plenty of experience in contributing changes to PHPStan core and implementing custom extensions.
As of now, I am available for hire to make the tooling fit your needs.
Are you blocked by a reported issue in PHPStan or related tooling? Would your projects benefit from getting certain features implemented in PHPStan?
I can fix bugs or implement features that keep you from getting the most out of PHPStan.
Is PHPStan/Rector running slow in your project? Do you need help to get a faster feedback loop?
Let me analyse your case at hand and investigate possible solutions. I love analysing performance problems of PHP-based tools.
I can help you build custom extensions and/or rules to seamlessly integrate PHPStan into your framework, libraries, and/or development workflow.
Is PHPStan critical for your business? Consider supporting my open source work with your sponsoring to reduce the PHPStan project's bus factor.
Please reach me via e-mail or contact me on Twitter or Mastodon for paid support.
The announcement tweet / toot got a lot of attention and I received a lot of positive feedback.
The project already got 50 stars within the first week after the announcement.
The main idea is that comments within the source code will be turned into PHPStan errors when a condition is satisfied, e.g. a date is reached or a version is met.
<?php
// TODO: 2023-12-14 This comment turns into a PHPStan error as of 14th december 2023
function doFoo() { /* ... */ }
// TODO: <1.0.0 This has to be in the first major release of this repo
function doBar() { /* ... */ }
// TODO: phpunit/phpunit:5.3 This has to be fixed when updating phpunit to 5.3.x or higher
function doFooBar() { /* ... */ }
// TODO: php:8 drop this polyfill when php 8.x is required
// TODO: APP-2137 A comment which errors when the issue tracker ticket gets resolved
function doBaz() { /* ... */ }
// TODO: #123 fix it when this GitHub issue is closed
// TODO: some-organization/some-repo#123 change me if this GitHub pull request is closed
A todo comment can also consist of just a constraint without any text, like // @todo 2023-12-14.
When a text is given after the date, this text will be picked up for the PHPStan error message.
- the todo, TODO, tOdO keyword is case-insensitive
- the todo keyword can be suffixed or prefixed by a @ character
- … todo@
- … : or - characters
- /* */ and /** */ comments are supported

The comment can expire by different constraints, examples are:
- … YYYY-MM-DD matched against the reference-time
- … composer.lock)

Find more details and configuration options in the project's README.
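To illustrate the date constraint, here is a rough sketch of how an expiring comment could be detected. This is not the extension's actual implementation: the function name, the regular expression and the message format are assumptions for illustration, and only the YYYY-MM-DD case is covered.

```php
<?php
declare(strict_types=1);

/**
 * Returns a message when the todo comment carries a date that is not after
 * the reference time, or null when there is nothing to report.
 * Hypothetical helper, not part of phpstan-todo-by.
 */
function expiredTodoMessage(string $comment, DateTimeImmutable $referenceTime): ?string
{
    // case-insensitive keyword, optional "@" before/after, optional ":" or "-" separators
    $pattern = '/@?todo@?[\s:\-]*(?<date>\d{4}-\d{2}-\d{2})[\s:\-]*(?<text>.*)$/i';
    if (preg_match($pattern, $comment, $matches) !== 1) {
        return null;
    }

    // the leading "!" resets the time part, so the date is compared at midnight
    $expiresAt = DateTimeImmutable::createFromFormat('!Y-m-d', $matches['date']);
    if ($expiresAt === false || $expiresAt > $referenceTime) {
        return null;
    }

    $text = trim($matches['text']);

    return $text === ''
        ? 'Expired on ' . $matches['date'] . '.'
        : 'Expired on ' . $matches['date'] . ': ' . $text;
}
```

Passing the reference time in as a parameter keeps the check deterministic, which mirrors the idea of matching against a configurable reference-time rather than the wall clock.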
In case you find my PHPStan contributions and/or this tool useful, please consider supporting my open source work.
To be honest: The main motivation for this post is raising awareness for all the open source work happening in my free time. I am spending 20-40 hours per month and would love 💕 to reduce hours at my primary job to support the open source community even more.
This will only be possible when more people support my open source work by becoming a sponsor.
First, let's have a look back at 2022: I was able to create 967 pull requests, of which 831 got merged. In comparison, at the time of writing I created ~900 pull requests to 70 open-source repositories in 2023, of which 753 got merged.
As you can see, the numbers in 2023 are a bit lower than in 2022. I think this is due to the fact that last year the focus was on working through low-hanging fruits in PHPStan and Rector. With the experience and knowledge gained while working on these projects, I was able to contribute more advanced features and fixes this year.
The following table shows the distribution of contributions across the different projects I am working on.
project | merged pull requests | addressed issues |
---|---|---|
phpstan/phpstan* | ~116 (~188 in 2022) | 33 (83 in 2022) |
rector/rector* | ~178 | 13 |
FriendsOfREDAXO/rexstan | 88 | 24 |
FriendsOfREDAXO/rexfactor | 55 | 6 |
staabm/phpstandba | 44 (~300 in 2022) | 8 |
staabm/phpstan-todo-by | 33 (~300 in 2022) | 7 |
redaxo/redaxo | 27 (70 in 2022) | 5 |
TomasVotruba/unused-public | 28 | 1 |
staabm/phpstan-baseline-analysis | 22 | |
OskarStark/doctor-rst | 12 | - |
easy-coding-standard/easy-coding-standard | 9 | 1 |
staabm/annotate-pull-request-from-checkstyle | 8 | - |
PHP-CS-Fixer/PHP-CS-Fixer | 4 | - |
Roave/BetterReflection | 4 | - |
symfony/symfony | 3 | - |
qossmic/deptrac | 3 | - |
TomasVotruba/bladestan | 3 | - |
composer/composer | 2 (7 in 2022) | - |
sebastianbergmann/diff | 2 | - |
TomasVotruba/type-coverage | 2 | - |
vimeo/psalm | 1 (4 in 2022) | - |
mautic/mautic | 1 | - |
TomasVotruba/cognitive-complexity | 1 | - |
matomo-org/matomo | 1 | - |
nette/utils | 1 | - |
nikic/PHP-Parser | 1 | - |
briannesbitt/Carbon | 1 | - |
doctrine/orm | 1 | - |
… a lot more | - | - |
numbers crunched with staabm/oss-contribs
In addition to source code contributions, I also took the time to blog about my work. In these 8 posts, I try to explain what I did, how problems were approached and what I learned along the way. That way I hope to inspire others to contribute to open source as well and share their journey.
If you don’t want to miss my articles, consider subscribing to my RSS feed or following me on Twitter or Mastodon.
Let's have a closer look at my personal highlights of 2023.
The PHPStan result cache is a key piece for a fast feedback loop. Why, how it works and how to debug problems with it was described in this blog post. I have dumped everything I know about it into this article.
In June 2022 the first version of rexstan, a PHPStan backed REDAXO CMS Addon, was released. It's open source from day 1 and supports developers working with REDAXO every day.
Since then I was able to publish 147 releases - what a ride.
Similar to rexstan, rexfactor is a new REDAXO CMS Addon. It’s backed by Rector and helps developers to migrate their codebase to newer REDAXO versions. It's open source from day 1 and was first released in March 2023.
The Addon allows using Rector with a simple web UI. Pick your rule/rule-set, define the target source code and get a nice preview of the changes. Push the “Apply” button and the changes are applied to your codebase.
Got interviewed by the Super Duper Developers Club about my open source work (German).
Running Rector on huge projects in a single run was not possible in the past. After implementing process and memory management, this problem is fixed. Even huge projects like the Mautic codebase can now be refactored with Rector without out-of-memory issues.
phpstan-dba is one of my PHPStan extensions which got a bit of traction in 2023. It’s a PHPStan based SQL static analysis and type inference for the database access layer.
I was even keen enough to talk about it at the PHPUGFFM usergroup and the unKonf Barcamp. See the slides of said talk if you are curious.
As a regular reader of my blog you already know that I have spent a few months across different well-known projects to improve their performance. This includes PHPStan, PHPUnit, Symfony, Rector and more. All the details can be found in separate posts of my performance series.
A summary of my performance work and my vita was published on the blackfire.io Blog.
One of the craziest contributions this year. After days of in-depth analysis finally a one line fix resulted in fixing 5 bugs.
As highlighted in various tweets, I was working on falsey-context type inference improvements in PHPStan. This was my most time-consuming and most rewarding contribution this year. It took me several tries to finally get it into a mergeable state - this very first iteration closed 7 bugs, the oldest of them dating back to July 2020.
The main problem this contribution solves is that PHPStan becomes aware when/if variables are defined after a !isset($variable) check.
To get this right, one needs to check whether the involved variables can be null and whether they are defined in the current scope.
The most interesting case is a variable which can never be null: we figured out this also means it can never be defined in the falsey-context.
<?php declare(strict_types = 1);

class HelloWorld
{
    public function sayHello(): void
    {
        $x = 'hello';
        if (rand(0,1)) {
            $x = 'world';
        }

        if (isset($x)) {
            echo $x;
        } else {
            echo $x; // Undefined variable: $x
        }
    }
}
Getting this right additionally means that PHPStan gets smarter for the !empty($variable) case and the null coalescing operator ??.
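The three constructs are related because they all treat null like "not there" at runtime. The following sketch only demonstrates this PHP runtime behaviour; the probe() helper is made up for illustration and has nothing to do with PHPStan internals.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper showing how isset(), empty() and ?? react to a value.
 *
 * @return array{isset: bool, empty: bool, coalesce: string}
 */
function probe(?string $x): array
{
    return [
        'isset' => isset($x),          // false for null
        'empty' => empty($x),          // true for null (and other falsy values)
        'coalesce' => $x ?? 'fallback', // falls back for null
    ];
}
```

For a value which can never be null, the falsey branch of isset() can only be reached when the variable is not defined at all, which is the reasoning described above.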
I have plans to work on !isset($array['offset']) and !isset($object->property) improvements in 2024.
I wish you all the best for the upcoming year. I am looking forward to continuing my open source work and I hope you will support me in doing so.
If one of those open source projects is critical for your business, please consider supporting my work with your sponsoring 💕
In this post I will describe one way to work through the sometimes huge PHPStan baseline.
Not everyone has the luxury to use static analysis from the very start of a project.
When adding PHPStan to an existing project, you usually need to work through the levels for an initial cleanup. Oftentimes the initial budget to set up static analysis is not big enough to level up to a point you are happy with.
When running out of budget, I usually try to find a PHPStan config/rule-set which makes sure newly implemented code has a pretty high quality barrier. At the same time this means I need to baseline a lot of errors, because pre-existing code likely does not match these criteria.
Now we need to figure out how and when to work through the remaining errors in the daily job. The bigger the baseline is, the more important it is to have a good strategy for deciding which errors to work on first.
First, set up phpstan-baseline-analysis to keep track of the current state of the project. Using this tool we can analyze the project and get an overview of the current error distribution. In our projects we generate these numbers in a scheduled GitHub action and create trend reports for the dev-team.
Additionally, you may create graphs of the progress to have a visual representation. It can be a good foundation for a conversation with management people, to give an idea where we are and where we are heading.
Depending on your dev-team focus you might want to work on different PHPStan errors.
Starting with phpstan-baseline-analysis 0.12.4 you can filter the baseline by error classes. This means we can quickly focus on a certain area of errors.
One common problem in legacy projects is related to invalid PHPDocs. PHPStan might already be aware of said problems, but since you didn’t have the time yet to work on them, these errors are buried in your baseline.
Using the new filtering capabilities you can filter out these problems from your already existing baseline:
$ echo "$( phpstan-baseline-filter phpstan-baseline.neon --exclude=Invalid-Phpdocs )" > phpstan-baseline.neon
This means we take the project's baseline, run it through the phpstan-baseline-filter, and keep all errors except those matching the --exclude filter.
Now you can trigger your regular phpstan analyze command, which no longer ignores the filtered errors.
That way you can work on the problems as you are used to, based on the PHPStan result list.
You can use multiple filter keys at once by separating the keys with a comma (,).
As an alternative to --exclude, you can also use --include to filter the baseline, which only outputs the errors matching the filter-key.
This might be useful if you want to further process the filtered error list in a separate tool.
$ phpstan-baseline-filter phpstan-baseline.neon --include=Deprecations,Unknown-Types,Anonymous-Variables > result.neon
If you are curious, just invoke the tool's help command to get an idea which filter keys are supported. At the time of writing it looks like:
$ phpstan-baseline-filter help
USAGE: phpstan-baseline-filter <GLOB-PATTERN> [--exclude=<FILTER-KEY>,...] [--include=<FILTER-KEY>,...]
valid FILTER-KEYs: Classes-Cognitive-Complexity, Deprecations, Invalid-Phpdocs, Unknown-Types, Anonymous-Variables, Unused-Symbols
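Conceptually, --exclude and --include are two sides of the same predicate. The following sketch shows that idea on an already parsed list of baseline entries. It is not the tool's implementation: the entry shape, the function name and the simplistic "deprecated" matcher are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical filter over baseline entries.
 *
 * @param list<array{message: string, count: int, path: string}> $ignoreErrors
 * @param callable(string): bool $matchesFilterKey decides whether a message belongs to the filter key
 * @param bool $exclude true keeps everything except matches, false keeps only matches
 * @return list<array{message: string, count: int, path: string}>
 */
function filterIgnoreErrors(array $ignoreErrors, callable $matchesFilterKey, bool $exclude): array
{
    return array_values(array_filter(
        $ignoreErrors,
        static fn (array $entry): bool => $matchesFilterKey($entry['message']) !== $exclude
    ));
}

// stand-in matcher for a "Deprecations"-like key, purely illustrative
$isDeprecation = static fn (string $message): bool => stripos($message, 'deprecated') !== false;

$baseline = [
    ['message' => 'Call to deprecated method foo().', 'count' => 2, 'path' => 'src/A.php'],
    ['message' => 'Parameter $x has invalid type Foo.', 'count' => 1, 'path' => 'src/B.php'],
];

$withoutDeprecations = filterIgnoreErrors($baseline, $isDeprecation, true);
$onlyDeprecations = filterIgnoreErrors($baseline, $isDeprecation, false);
```

Entries removed from the baseline by the exclude variant are exactly the ones a following PHPStan run would report again.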
In case you find my PHPStan contributions and/or this content useful, please consider supporting my open source work.
In this post we will have a top-level look at PHPStan performance from an end-user perspective.
While we are working hard on squeezing every bit of performance out of PHPStan, you as an end user should foremost make sure that PHPStan can benefit from its result cache as often as it can.
In the projects I am working on, we usually see PHPStan analysis times dropping from 5-10 minutes to 10-30 seconds when everything is going according to plan and the tool can do its job utilizing the result cache.
But what could possibly go wrong? In this post I will write down what I learned from setting up PHPStan in a lot of different projects and environments.
You don’t need to enable result cache explicitly, as it’s enabled by default. PHPStan tries to be as smart as possible about invalidating the cache when required.
To find out when/whether PHPStan is using the result cache, you can use the -vvv flag.
$ phpstan -vvv
Result cache not used because the cache file does not exist.
1562/1562 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 20 secs/20 secs
Result cache is saved.
[OK] No errors
Used memory: 2.13 GB
-> note the initial message, telling you about result cache usage.
-> note the analysis in this project is taking 20 seconds and 2.13 GB of memory.
$ phpstan -vvv
Note: Using configuration file /Users/staabm/workspace/phpstan-src/phpstan.neon.dist.
1562/1562 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% < 1 sec/< 1 sec
Result cache is saved.
[OK] No errors
Used memory: 133.88 MB
-> the analysis process finished in under 1 second in comparison to 20 seconds before.
-> it took 134 MB of memory in comparison to 2.13 GB before.
$ phpstan -vvv
Note: Using configuration file /Users/staabm/workspace/phpstan-src/phpstan.neon.dist.
Result cache not used because the metadata do not match: projectConfig, composerLocks
1562/1562 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 19 secs/19 secs
Result cache is saved.
[OK] No errors
Used memory: 2.14 GB
-> you can see PHPStan realized the composerLocks are different, which made it invalidate the cache. Starting with PHPStan 1.10.36 we print the reason why invalidation happened.
-> There can be different reasons why the cache is invalidated or not used at all. Find all the details in the ResultCacheManager class.
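The idea behind the "metadata do not match" message can be pictured as a comparison of two key/value sets. The sketch below is not taken from the ResultCacheManager class: the function name, the hash values and the phpVersion key are assumptions, only the projectConfig and composerLocks keys appear in the output above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical comparison of cached vs. current metadata.
 *
 * @param array<string, string> $cached  metadata stored together with the cache
 * @param array<string, string> $current metadata computed for the current run
 * @return list<string> keys whose values are missing or different in the cache
 */
function changedMetadataKeys(array $cached, array $current): array
{
    $changed = [];
    foreach ($current as $key => $value) {
        if (!array_key_exists($key, $cached) || $cached[$key] !== $value) {
            $changed[] = $key;
        }
    }

    return $changed;
}

// made-up hash values for illustration
$cached = ['projectConfig' => 'a1', 'composerLocks' => 'b2', 'phpVersion' => '80200'];
$current = ['projectConfig' => 'a9', 'composerLocks' => 'b7', 'phpVersion' => '80200'];

$changed = changedMetadataKeys($cached, $current);
echo $changed === []
    ? 'cache can be used'
    : 'metadata do not match: ' . implode(', ', $changed);
```

Any non-empty result means the cached analysis can no longer be trusted as a whole, which explains why a changed composer.lock leads to a full run.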
… clear-result-cache command. This will also reveal the location of the result cache files:
$ phpstan clear-result-cache -vvv
Note: Using configuration file /Users/staabm/workspace/phpstan-src/phpstan.neon.dist.
Result cache cleared from directory:
/Users/staabm/workspace/phpstan-src/tmp
… --debug option, it will not use the result cache:
$ phpstan --debug -vvv
Note: Using configuration file /Users/staabm/workspace/phpstan-src/phpstan.neon.dist.
Result cache not used because of debug mode.
...
$ phpstan -vvv --generate-baseline
Note: Using configuration file /Users/staabm/workspace/phpstan-src/phpstan.neon.dist.
1562/1562 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% < 1 sec/< 1 sec
Result cache is saved.
[OK] Baseline generated with 645 errors.
Used memory: 147.88 MB
Ondřej Pro-Tip: If you need to know in detail why PHPStan decided to not use the result cache, you can diff the result-cache file before and after the run.
That can be especially helpful in CI environments, where debugging the problem at hand is pretty hard.
resultCachePath
PHPStan by default uses a single result cache file for all projects on your machine. This means when you work on and switch between multiple projects, the very first run after the project-switch will need a full analysis scan.
To get a more efficient experience when switching between projects, you may consider using a different resultCachePath file-name in every project's configuration file.
parameters:
    resultCachePath: %tmpDir%/resultCache-project-X.php
resultCachePath
In case your CI server does not run projects in an isolated filesystem, you should use a dedicated resultCachePath.
When using GitHub Actions you should consider using a cache action to persist the result cache between runs.
- name: "Cache result cache"
  uses: actions/cache@v3
  with:
    path: ./tmp
    key: "result-cache-v1-${{ matrix.php-version }}-${{ github.run_id }}"
    restore-keys: |
      result-cache-v1-${{ matrix.php-version }}-
- … ./tmp on linux based systems
- … ${{ github.run_id }} you can make sure to re-use the most recent result cache
- … ${{ matrix.php-version }} …
- … push GitHub Actions event on the default-branch, to make sure newly created PRs will utilize a fresh cache from the default-branch.

In case you are working with long-running branches, you may consider using separate actions/cache/restore@v3 and actions/cache/save@v3 steps instead, to make sure the result cache is also persisted on failing jobs:
- name: "Restore result cache"
  uses: actions/cache/restore@v3
  with:
    path: ./tmp
    key: "result-cache-v1-${{ matrix.php-version }}-${{ github.run_id }}"
    restore-keys: |
      result-cache-v1-${{ matrix.php-version }}-

# … run phpstan

- name: "Save result cache"
  uses: actions/cache/save@v3
  if: always()
  with:
    path: ./tmp
    key: "result-cache-v1-${{ matrix.php-version }}-${{ github.run_id }}"
Update: The above tip regarding GitHub Actions cache handling works also for other tools, like e.g. RectorPHP.
In case you find my PHPStan contributions and/or this content useful, please consider supporting my open source work.
The article describes how to utilize Rector to maximize type coverage of a legacy project. The more types are defined in the codebase, the better the results of your IDE or static analysis tools will be.
This is usually the first thing you should do before applying more advanced Rector code transformations. Rector can be used in a similar way to apply other rules or rule sets.
Additionally, more type coverage is a great first step after a PHPStan/Psalm setup, to make sure static analysis can find relevant bugs efficiently. Adding types to an old codebase manually can take a lot of time and is also prone to errors.
These are the top level steps I try to follow:
Analysing the baseline is technically not required. Crunching the numbers can help keep a dev team motivated, or the numbers can be used to convince management people about your current state and potential goals.
The preparation steps and the linked articles in the “overall plan”-chapter should contain all you need.
Fixing the mentioned PHPStan errors to make sure Rector can trust your variables.
Start with Rector as described in the introduction. Make sure you have all relevant source paths configured and the setup works as expected.
We will run Rector in the command line on your workstation. Later on you may configure Rector as part of your CI pipeline, but that’s a topic for another article.
Working with Rector usually means you start by adding one Rector rule at a time. Let the tool do its magic and review the generated changes. Make sure you feel confident with them. If you get overwhelmed by the amount of changes, revert the working state and run your current Rector rule only against a few paths instead of the whole project.
Repeat using smaller steps as long as you feel the result is not reviewable. How often you need to divide the steps into smaller ones depends on the rule being applied and your codebase.
Between these steps you should commit the intermediate states. This also eases seeing the actual differences between the steps.
NOTE: Especially in legacy projects it's important to make sure Rector is not relying on PHPDoc types. This is what *Strict* Rector rules are for. If you apply non-Strict Rector rules, take special care that your PHPDoc is precise.
It’s important to add return types first, as it’s the least risky change and should be backwards compatible most of the time.
- … final classes first.
- … __get), review related changes properly.

If Rector changes things you don’t like, you may ignore source files for single rules or even skip the source file completely. You can revisit the skipped cases later. You may feel more confident after the codebase got enriched with types and PHPStan can better understand the code in question.
I had the most success using the ReturnTypeFromStrict* Rector rules first.
Do so one rule at a time, as described above.
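To give an idea of what such a change looks like, here is a before/after sketch. The Invoice classes are made up, and the exact rule producing this change and its output are assumptions; the point is only that the return type is derived from a native type instead of PHPDoc.

```php
<?php
declare(strict_types=1);

// before: no return type, although the returned property is natively typed
final class InvoiceBefore
{
    private int $total = 119;

    public function getTotal()
    {
        return $this->total;
    }
}

// after: the return type follows from the native property type
final class InvoiceAfter
{
    private int $total = 119;

    public function getTotal(): int
    {
        return $this->total;
    }
}
```

Because the added type only states what the method already returned, callers keep working unchanged, which is why return types are the least risky starting point.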
In my experience it’s best to add property types in the next step.
Start with private properties and later move on to protected ones of final classes.
If you are not sure about nullability, keep using nullable types for now.
Last, add types to protected properties of non-final classes and to public properties.
Keep in mind that adding types to public/protected properties of classes which use inheritance can be a BC break.
I had the most success using the PropertyTypeFromStrict* Rector rules first.
After that try the TypedPropertyFrom* rules.
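The advice about nullability has a concrete reason: an untyped property defaults to null, while a typed property without a default is uninitialized and throws when read. A minimal sketch with made-up class names:

```php
<?php
declare(strict_types=1);

// legacy: untyped property, implicitly null until something is assigned
final class LegacyCustomer
{
    private $nickname;

    public function nickname()
    {
        return $this->nickname;
    }
}

// safe first step: nullable type with an explicit null default keeps the old behaviour
final class NullableCustomer
{
    private ?string $nickname = null;

    public function nickname(): ?string
    {
        return $this->nickname;
    }
}

// too strict too early: reading the uninitialized property throws an Error
final class StrictCustomer
{
    private string $nickname;

    public function nickname(): string
    {
        return $this->nickname;
    }
}
```

Once you have verified that a property is always assigned before it is read, the nullable type can be tightened in a later step.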
Last but not least add parameter types. Be careful, as adding parameter types usually breaks backwards compatibility. That’s especially important in case you work on library code, as it might force you to create a new major version.
I had the most success using the *ParamType* Rector rules.
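Why parameter types are the riskiest step can be shown with two made-up functions: callers which got away with passing null to the untyped version fail once the type is added.

```php
<?php
declare(strict_types=1);

// legacy signature: accepts anything, null silently behaves like 0
function legacyDouble($value)
{
    return $value * 2;
}

// typed signature: the same call with null now throws a TypeError
function typedDouble(int $value): int
{
    return $value * 2;
}
```

Every existing call site is a potential break, so it pays off to let PHPStan check the callers before and after applying such a rule.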
In case you find this content useful, please consider supporting my open source work.
After I published my last performance article about Rector, Oskar Stark, one of my Twitter followers, got in touch with me:
@markusstaab we run OskarStark/doctor-rst on all PRs in symfony/symfony-docs 😃 maybe you will check this package for performance too 😍
He is a member of the symfony core team and is working on the symfony-docs.
DOCtor-RST is a linter used in the symfony-docs repo to check *.rst files. Like other static analysis tools it is scanning the sources at hand and provides feedback about common errors and best practices.
Disclaimer: I had never used this tool before and also have zero experience with the RST file format.
At the time of writing, running the linter over the symfony-docs repo takes about 50 seconds in the GitHub Actions workflow. Let's run DOCtor-RST version 1.46.0 locally on my Mac against symfony-docs@ff62e1203 to get a baseline:
$ time php bin/doctor-rst analyze ../symfony-docs/ --no-cache
31.35s user 0.30s system 99% cpu 31.689 total
As you already know, my next step when investigating performance is running the blackfire profiler on the workload.
$ blackfire run --ignore-exit-status php bin/doctor-rst analyze ../symfony-docs/ --no-cache
The profile will be stored in your Personal environment. The "--environment" option can be used to specify the target environment.
Analyze *.rst(.inc) files in: /Users/staabm/workspace/symfony-docs
Used config file: /Users/staabm/workspace/symfony-docs/.doctor-rst.yaml
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 4096 bytes) in /Users/staabm/workspace/doctor-rst/vendor/symfony/string/AbstractUnicodeString.php on line 236
PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 4096 bytes) in /Users/staabm/workspace/doctor-rst/vendor/symfony/string/AbstractUnicodeString.php on line 236
It’s not that unusual that running a profiler requires more memory on a workload, therefore I raised the PHP memory limit to 16GB. Still, I was running into out-of-memory errors… 🤔
As a sanity check, I added a memory debug output at the end of the analysis process into the AnalyzeCommand and ran it again without blackfire:
$output->writeln(memory_get_peak_usage(true) / 1024 / 1024 . ' MB');
PHP reports a peak memory of 12MB, so it was not that high. At this point I concluded we are likely facing a memory issue in the profiler and reported the issue to the blackfire team.
To get the analysis process running nevertheless, I decided to reduce the number of *.rst files to analyse. I locally deleted *.rst files in my symfony-docs checkout until blackfire ran without memory issues. It's not a perfect situation, but at least we could get a first idea of the performance characteristics of the workload.
As we already saw in previous investigations, reducing IO is a good first step.
In the following graph you can see a lot of calls to SplFileInfo->getRealPath():
We just had to introduce a local variable and call it a day.
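The pattern behind this fix is simple, as the following sketch shows. It is not the code from DOCtor-RST; the describeFile() function and its output format are made up to show a single getRealPath() call being reused.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical example: resolve the real path once and reuse the result,
 * instead of asking the filesystem again for every use.
 */
function describeFile(SplFileInfo $file): string
{
    // one filesystem lookup, stored in a local variable
    $realPath = $file->getRealPath();
    if ($realPath === false) {
        return 'missing';
    }

    return basename($realPath) . ' in ' . dirname($realPath);
}
```

Each getRealPath() call has to resolve the path again, so in a loop over thousands of files the repeated calls add up quickly.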
DOCtor-RST internally uses symfony/string which heavily uses multi-byte string functions. These functions are known to be inefficient in PHP - even though with the latest PHP releases they got much better.
The profiles show us a memory bottleneck on said calls:
One experience I had in the past is that in most cases using regular string functions is way more efficient.
I had a look at all used ->matches(…) invocations and decided to concentrate on a few simple ones, which can be expressed without regular expressions.
Rewriting these expressions already yielded a great improvement, as these were invoked quite frequently:
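As an illustration of this kind of rewrite, consider a check whether a line ends with a literal marker. The "::" marker and both function names are assumptions picked for this sketch, not the expressions changed in DOCtor-RST.

```php
<?php
declare(strict_types=1);

// regex variant: has to run the regex engine for every line
function endsWithMarkerRegex(string $line): bool
{
    return preg_match('/::\z/', $line) === 1;
}

// plain string variant: same answer for this literal suffix, no regex involved
function endsWithMarker(string $line): bool
{
    return str_ends_with($line, '::');
}
```

The rewrite is only safe when the pattern is a fixed string; as soon as character classes or alternatives are involved, the regular expression has to stay.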
Another case where I was able to reduce the use of regular expressions was in the ->isFootnote() method.
In this case we had an expression trying to match a string starting with certain characters.
I decided to add some quick checks which in most cases prevent the actual regular expression from being executed.
These yielded another great improvement in memory consumption and a small improvement in runtime:
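The quick-check idea can be sketched as follows. The footnote pattern below is a hypothetical stand-in and not the expression used in DOCtor-RST; what matters is the cheap prefix test in front of it.

```php
<?php
declare(strict_types=1);

// hypothetical pattern, for illustration only: ".. [1]", ".. [#name]" or ".. [*]"
const FOOTNOTE_PATTERN = '/^\.\. \[(\d+|#\w*|\*)\]/';

// baseline: always runs the regular expression
function isFootnoteRegexOnly(string $line): bool
{
    return preg_match(FOOTNOTE_PATTERN, $line) === 1;
}

// guarded: most lines fail the cheap prefix check and never reach the regex
function isFootnote(string $line): bool
{
    if (!str_starts_with($line, '.. [')) {
        return false;
    }

    return preg_match(FOOTNOTE_PATTERN, $line) === 1;
}
```

The guard must never reject a line the expression would accept, so it should only test a prefix that every possible match shares.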
Even though these optimizations were focused on memory, it often turns out they also improve runtime performance. When PHP needs to handle huge amounts of data in memory, this management results in slower script execution. Garbage collection also gets heavily involved, and tracking the memory takes time.
I did a few more performance-oriented pull requests, but nothing of big interest which needs further explanation.
After all the changes landed, let's have another look at the workload:
$ time php bin/doctor-rst analyze ../symfony-docs/ --no-cache
20.35s user 0.30s system 99% cpu 21.689 total
We are now able to run the workload ~10 seconds faster than the initial ~30 seconds. This should reduce wait time when contributing to the symfony-docs.
As always, these improvements were crafted in my free time. I am not a Symfony framework user either. Please consider supporting my work, so I can make sure open source tools stay as fast as possible and evolve to the next level.
Happy documenting! 📖
You want to look into PHPStan performance instead?
Requires Rector 0.16.1 or later
First we need a single run across the whole project which collects some useful information we can later look into:
vendor/bin/rector -vvv --debug --no-diffs | tee rector.log
Analyse the generated rector.log file with parse.php (the script is listed at the end of this post):
php parse.php
Now you get a list of files sorted by the time it took Rector to refactor which looks like:
Slowest files
4.90 seconds: [file] packages/Testing/PHPUnit/AbstractRectorTestCase.php
4.07 seconds: [file] packages/FamilyTree/Reflection/FamilyRelationsAnalyzer.php
2.99 seconds: [file] packages/Caching/ValueObject/CacheFilePaths.php
2.95 seconds: [file] packages/BetterPhpDocParser/Attributes/AttributeMirrorer.php
2.93 seconds: [file] packages/BetterPhpDocParser/PhpDocNodeVisitor/TemplatePhpDocNodeVisitor.php
2.68 seconds: [file] packages/FamilyTree/Reflection/FamilyRelationsAnalyzer.php
2.61 seconds: [file] packages/NodeTypeResolver/TypeAnalyzer/ArrayTypeAnalyzer.php
2.53 seconds: [file] packages/PHPStanStaticTypeMapper/TypeMapper/OversizedArrayTypeMapper.php
2.07 seconds: [file] packages/BetterPhpDocParser/ValueObject/PhpDocAttributeKey.php
1.71 seconds: [file] bin/clean-phpstan.php
1.52 seconds: [file] config/set/php80.php
1.18 seconds: [file] packages/PhpAttribute/NodeAnalyzer/ExprParameterReflectionTypeCorrector.php
1.01 seconds: [file] packages/Testing/PHPUnit/AbstractRectorTestCase.php
0.97 seconds: [file] config/set/php52.php
0.89 seconds: [file] packages/BetterPhpDocParser/PhpDocNodeVisitor/TemplatePhpDocNodeVisitor.php
0.83 seconds: [file] packages/PHPStanStaticTypeMapper/TypeMapper/ResourceTypeMapper.php
0.83 seconds: [file] packages/Caching/ValueObject/Storage/MemoryCacheStorage.php
0.78 seconds: [file] config/set/php80.php
0.69 seconds: [file] packages/PhpAttribute/AnnotationToAttributeMapper/ArrayItemNodeAnnotationToAttributeMapper.php
0.63 seconds: [file] packages/StaticTypeMapper/PhpDocParser/NullableTypeMapper.php
0.62 seconds: [file] rules/CodingStyle/Rector/ClassConst/VarConstantCommentRector.php
0.60 seconds: [file] packages/PhpAttribute/NodeAnalyzer/ExprParameterReflectionTypeCorrector.php
0.56 seconds: [file] packages/PhpAttribute/AnnotationToAttributeMapper/ArrayItemNodeAnnotationToAttributeMapper.php
...
Starting from here you can use your favorite profiler to analyse only the slowest files in isolation.
Example with blackfire and the path to a slow file:
blackfire run --ignore-exit-status php vendor/bin/rector -vvv --debug --no-diffs packages/Testing/PHPUnit/AbstractRectorTestCase.php
If performance analysis is not your thing, feel free to open an issue on Rector and bring with you all the information you already gathered in the above process. It's important that you include all files and configs required to reproduce your performance issue as part of the report.
If you support my work with a GitHub sponsorship, I can have a look at your performance problem.
parse.php script
<?php // parse.php

declare(strict_types=1);

// inspired and adopted from https://gist.github.com/ruudk/41897eb59ff497b271fc9fa3c7d5fb27

$log = new SplFileObject("rector.log");

$logs = [];
$file = null;
while (! $log->eof()) {
    $line = trim($log->fgets());
    if ($line === '') {
        continue;
    }
    if (str_starts_with($line, '[file]')) {
        $file = $line;
        continue;
    }
    if ($file === null) {
        continue;
    }
    if (preg_match('/took (?<seconds>[\d.]+) s/', $line, $matches) === 1) {
        $accu = 0.0;
        if (array_key_exists($file, $logs)) {
            $accu = $logs[$file][0];
        }
        $logs[$file] = [$accu + ((float) $matches['seconds']), $file];
        $file = null;
    }
}

usort($logs, fn(array $left, array $right) => $right[0] <=> $left[0]);
$logs = array_slice($logs, 0, 100);

echo "Slowest files" . PHP_EOL;
foreach ($logs as $log) {
    echo sprintf("%.2f seconds: %s", $log[0], $log[1]) . PHP_EOL;
}
Script to analyse and sort the rector.log