The next episode of the PHP performance series:
If you are interested in how I start a performance investigation, please read the previously mentioned articles first. They will give you an idea of how I approach such a task.
Looking at the profiles of my workload I noticed that Rector had a bottleneck in file IO operations.
IO means input/output and is a term for operations which read or write data from/to a file, network, database, …
With this finding in mind I had a closer look at the file-finding stage and saw considerable time was spent there.
The file traversal uses Symfony's GlobResource class, and looking into it made me realize that we could re-order some operations. Where possible, I changed the code so that file IO is only triggered after all other checks have passed.
The result is a ~18% performance improvement in Symfony's GlobResource, which in turn speeds up a lot of code relying on the symfony-config component, obviously even outside of Rector.
Later on this optimization was mentioned on the Symfony blog: New in Symfony 6.3: Performance Improvements
The initial profile revealed a few more small costs, which I worked through with some small pull requests:
Bottom line of these changes is:
- Don’t do IO when not necessary
- Try to defer IO when possible
- Do non-IO related stuff before IO related stuff - cheap checks first
File IO is not only expensive but also very unpredictable: timings for the same workload, run over and over on the same machine, can vary a lot.
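To make the "cheap checks first" idea concrete, here is a minimal sketch. It is not the actual GlobResource change; the function name `shouldProcess()` and the exclusion-prefix parameter are made up for illustration, and it assumes PHP 8.0+ for `str_ends_with()` and `str_starts_with()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not taken from Symfony or Rector): decide whether
 * a path is a PHP file worth processing. Pure string checks run first,
 * the filesystem is only touched at the very end.
 *
 * @param list<string> $excludedPrefixes
 */
function shouldProcess(string $path, array $excludedPrefixes): bool
{
    // Cheap: plain string comparison, no IO involved.
    if (!str_ends_with($path, '.php')) {
        return false;
    }

    // Still cheap: prefix checks on the string, no IO involved.
    foreach ($excludedPrefixes as $prefix) {
        if (str_starts_with($path, $prefix)) {
            return false;
        }
    }

    // Expensive: hits the filesystem, so only surviving candidates get here.
    return is_file($path);
}
```

Calling `shouldProcess('/vendor/foo.php', ['/vendor/'])` returns `false` without a single filesystem call. With the checks in the opposite order, every excluded path would still pay for an `is_file()` lookup.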
Defer type resolving and AST traversal
Similar to the previous section, there is another class of operations which can be slow in a static analysis context.
We are talking about type resolving - e.g.
$scope->getType() and friends - or AST traversal - e.g.
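The same ordering idea applied to type resolving can be sketched as follows. This is a simplified illustration, not Rector or PHPStan code: `CountingTypeResolver` and `isInterestingCall()` are hypothetical stand-ins, and the "expensive" resolver only counts its invocations so the effect becomes visible.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for an expensive type resolver; counts how often it runs. */
final class CountingTypeResolver
{
    public int $calls = 0;

    public function resolveType(string $expression): string
    {
        ++$this->calls;

        // A real analyser does far more work here; this is only a placeholder.
        return str_starts_with($expression, 'new ') ? 'object' : 'mixed';
    }
}

/**
 * Hypothetical rule: only calls to a method named "format" on an object matter.
 * The cheap name comparison runs before the expensive type lookup.
 */
function isInterestingCall(string $methodName, string $receiver, CountingTypeResolver $resolver): bool
{
    if ($methodName !== 'format') {
        return false; // cheap rejection, the resolver is never invoked
    }

    return $resolver->resolveType($receiver) === 'object';
}

$resolver = new CountingTypeResolver();
$calls = [['getName', 'new User()'], ['format', 'new DateTime()'], ['save', '$repo']];

$hits = 0;
foreach ($calls as [$method, $receiver]) {
    if (isInterestingCall($method, $receiver, $resolver)) {
        ++$hits;
    }
}

printf("%d hit(s), resolver invoked %d time(s)\n", $hits, $resolver->calls);
```

Of the three calls only one reaches the resolver, because the other two are rejected by the name check. Resolving the type first would have triggered three lookups for the same result.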
Examples for this approach can be found in
and some of them were really fruitful:
Above I described just a few of the things I had a look at. The sum of all these - and a lot more not mentioned here at all - led to a really awesome Rector 0.16 release:
Also be aware that not all my changes improve things and some ideas will just be put into the trash-bin after a few hours. Feel free to look through the full list… not all things I try are successful or land in the project in the end.
💡Tip: Don’t apply the above concepts blindly to your code. Make sure you have evidence with some sort of timing tool (e.g. a profiler) before diving deep into the performance optimizing process. As you can see in most PRs: oftentimes only a few lines of code need to be tweaked. The actual challenge is to find those in between a few hundred thousand/million lines of code.
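If no profiler is at hand, even a tiny timing helper gives some evidence before and after a change. The following is a rough sketch, not a replacement for a real profiler; `medianMillis()` is a hypothetical helper name, and it assumes PHP 7.3+ for `hrtime()`. It reports the median of several runs, because single measurements of IO-heavy code vary a lot.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: run a callable several times and return the median
 * wall-clock time in milliseconds (the upper median for an even run count).
 */
function medianMillis(callable $work, int $runs = 5): float
{
    $samples = [];
    for ($i = 0; $i < $runs; ++$i) {
        $start = hrtime(true);          // monotonic clock, in nanoseconds
        $work();
        $samples[] = (hrtime(true) - $start) / 1e6;
    }

    sort($samples);

    return $samples[intdiv(count($samples), 2)];
}

// Usage: measure a directory listing of the current directory.
$millis = medianMillis(static function (): void {
    scandir(__DIR__);
});

printf("median: %.3f ms\n", $millis);
```

Measuring the same callable before and after a change, on the same machine, gives a first hint whether an optimization is worth pursuing.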
Chances are high that you or your company are saving a lot of money with recent releases. Please consider supporting my work, so I can make sure open source tools stay as fast as possible and evolve to the next level.