Iterate or not to Iterate … that is the question!
Apologies for misquoting Shakespeare, dear readers, but I had to grab your attention away from your current attention-grabbing addiction somehow!
So, now that I’ve got your attention (presumably not having lost it by now!),
I would like to discuss the pressing topic of grabbing entire directory trees with a single command.
Granted, this would normally be a rather mundane task, so to add a further twist, I want to exclude certain directories at the same time.
Let's jump right in and have a look at RecursiveDirectoryIterator, an incredibly useful iterator that lives in the SPL (Standard PHP Library).
As a usage example, let's parse the directory structure of a typical Laminas project:
├── composer.json
├── config
│   ├── application.config.php
│   ├── autoload
│   │   ├── global.php
│   │   └── laminas-developer-tools.local-development.php
│   ├── development.config.php.dist
│   └── modules.config.php
├── COPYRIGHT.md
├── data
│   └── cache
├── docker-compose.yml
├── Dockerfile
├── LICENSE.md
├── module
│   ├── Application
│   │   ├── config
│   │   │   └── module.config.php
│   │   ├── src
│   │   │   ├── Controller
│   │   │   │   ├── IndexControllerFactory.php
│   │   │   │   └── IndexController.php
│   │   │   ├── Module.php
│   │   │   └── Service
│   │   │       └── Calendar.php
│   │   ├── test
│   │   │   └── Controller
│   │   │       └── IndexControllerTest.php
│   │   └── view
│   │       ├── application
│   │       │   └── index
│   │       │       ├── calendar.phtml
│   │       │       └── index.phtml
│   │       ├── error
│   │       │   ├── 404.phtml
│   │       │   └── index.phtml
│   │       └── layout
│   │           └── layout.phtml
├── public
│   ├── css
│   ├── img
│   ├── index.php
│   ├── js
│   └── web.config
├── README.md
├── Vagrantfile
└── vendor
    ├── autoload.php
    ├── bin
    ├── composer
    │   ├── autoload_classmap.php
    │   └── LICENSE
    ├── laminas
    │   ├── laminas-component-installer
    │   ├── laminas-config
etc.
As you can imagine, the vendor directory has a ton of open-source software installed via Composer.
Further, the public directory includes lots of stylesheets and JavaScript.
So the task at hand is to iterate through the directory structure, excluding these two directories.
Your first thought might be to define a path, create a RecursiveDirectoryIterator, and be done with it.
Throw in a simple foreach() loop and we’re good to go, right? (Don’t answer! Rhetorical question.)
Before we dive into the code, please be aware that by default RecursiveDirectoryIterator yields the full filename (including path) as the key, and an SplFileInfo object as the value.
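To see those keys and values for yourself, here is a minimal sketch. The temporary directory and the file names in it are made up purely so the snippet runs anywhere; they are not part of the Laminas project.

```php
<?php
// Build a tiny throwaway tree (hypothetical names) so the example is self-contained.
$root = sys_get_temp_dir() . '/rdi_demo_' . uniqid();
mkdir($root . '/config', 0777, true);
file_put_contents($root . '/README.md', "hello\n");

$iteration = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

$seen = [];
foreach ($iteration as $key => $value) {
    // $key is the full pathname; $value is an SplFileInfo describing the entry.
    $seen[basename($key)] = [
        'key_is_string'     => is_string($key),
        'value_is_fileinfo' => $value instanceof SplFileInfo,
        'is_dir'            => $value->isDir(),
    ];
}
ksort($seen);
print_r($seen);

// Remove the temporary tree again.
unlink($root . '/README.md');
rmdir($root . '/config');
rmdir($root);
```

Because the value is an SplFileInfo object, methods such as isDir(), getFilename() and getSize() are available inside the loop without any extra filesystem calls on your part.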
So, let’s get down to producing some code to achieve the desired results.
A good place to start might be to define a function (or class method) that determines the acceptance criteria.
In this case we want to be able to exclude one or more directory paths from the final output.
Thus we define a simple function accept() that returns FALSE if the given path includes any of the directory paths in the $excludes array:
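function accept(string $name, array $excludes = []): bool
{
    // Accept by default; reject as soon as any excluded fragment appears in the path.
    $result = TRUE;
    foreach ($excludes as $item) {
        // Plain substring match: '/public' would also match '/publications'.
        if (strpos($name, $item) !== FALSE) {
            $result = FALSE;
            break;
        }
    }
    return $result;
}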
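A quick way to get a feel for accept() is to feed it a few paths by hand. The sketch below restates the function compactly so that it runs on its own, and the sample paths are invented for illustration. The last one shows a side effect of substring matching worth keeping in mind: a directory whose name merely starts with an excluded name is rejected too.

```php
<?php
// accept() restated compactly so this snippet is self-contained.
function accept(string $name, array $excludes = []): bool
{
    foreach ($excludes as $item) {
        if (strpos($name, $item) !== false) {
            return false;
        }
    }
    return true;
}

$excludes = ['/vendor', '/public'];

// Hypothetical paths, chosen only to exercise the function.
$samples = [
    '/path/to/laminas_project/config/modules.config.php',
    '/path/to/laminas_project/vendor/autoload.php',
    '/path/to/laminas_project/public/index.php',
    '/path/to/laminas_project/docs/publications/index.md',
];

$verdicts = [];
foreach ($samples as $sample) {
    $verdicts[$sample] = accept($sample, $excludes);
    echo ($verdicts[$sample] ? 'keep ' : 'skip ') . $sample . "\n";
}
```

If that matters for your project, one possible tightening is to list the excludes with a trailing slash ('/vendor/', '/public/') so that only whole directory names match files beneath them.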
Next we define a function show() that performs the actual iteration, using accept() to include or exclude iteration entries.
function show(string $path, Iterator $iteration, array $excludes): string
{
    $output = '';
    foreach ($iteration as $key => $value) {
        // Keep only accepted entries, shown relative to the starting path.
        if (accept($key, $excludes)) {
            $output .= str_replace($path, '', $key) . "\n";
        }
    }
    return $output;
}
Finally, we create the RecursiveDirectoryIterator instance, giving it the initial path, and a flag to eliminate the “dot” directories (i.e. “.” and “..”).
$path = '/path/to/laminas_project';
$excludes = ['/vendor','/public'];
$iteration = new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS);
echo show($path, $iteration, $excludes);
And here is the resulting output:
README.md
module
phpcs.xml
composer.phar
COPYRIGHT.md
composer.json
docker-compose.yml
.gitignore
Vagrantfile
phpunit.xml.dist
data
config
composer.json.bak
CHANGELOG.md
Dockerfile
LICENSE.md
composer.lock
Wait, you might cry out (well, maybe not, but for the sake of argument, imagine an outraged developer screaming
insults at the PHP engine :-), what happened to all the subdirectories and associated files?
Good question!
Oddly enough, RecursiveDirectoryIterator was doing its job!
A plain foreach() only walks the entries of the directory you hand it.
The “recursive” part of the name means that the class implements the RecursiveIterator interface: for each entry it can tell you whether there are children (hasChildren()) and hand you an iterator over them (getChildren()).
So, in the case of RecursiveDirectoryIterator, its recursion isn’t that it goes "deep" on its own, but rather that it knows how to go deep when something asks it to.
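To make this concrete, the following sketch descends the tree by hand using the two methods RecursiveDirectoryIterator inherits from the RecursiveIterator interface, hasChildren() and getChildren(). The walk() helper and the temporary tree are hypothetical and exist only for this illustration.

```php
<?php
// Throwaway tree: <root>/README.md and <root>/config/global.php (made-up names).
$root = sys_get_temp_dir() . '/rdi_children_' . uniqid();
mkdir($root . '/config', 0777, true);
file_put_contents($root . '/README.md', '');
file_put_contents($root . '/config/global.php', '');

// Hypothetical helper: recurse manually instead of relying on a wrapper class.
function walk(RecursiveDirectoryIterator $dir, string $root): array
{
    $found = [];
    foreach ($dir as $key => $value) {
        $found[] = substr($key, strlen($root) + 1);   // path relative to $root
        if ($dir->hasChildren()) {
            // getChildren() returns a new iterator positioned on the subdirectory.
            $found = array_merge($found, walk($dir->getChildren(), $root));
        }
    }
    return $found;
}

$top   = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
$paths = walk($top, $root);
sort($paths);
print_r($paths);

// Remove the temporary tree again.
unlink($root . '/config/global.php');
rmdir($root . '/config');
unlink($root . '/README.md');
rmdir($root);
```

Nobody wants to write that helper for every project, which is why a ready-made wrapper that makes these calls for you is so welcome.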
To up its game, so to speak, we need to call upon the mighty RecursiveIteratorIterator class.
The relationship between a recursive iterator and RecursiveIteratorIterator is like that of a bodybuilder to steroids.
This class calls hasChildren() and getChildren() on the associated iterator for you, and continues to iterate until all child nodes have been explored.
By default it returns only the leaves, which here means files rather than the directories themselves.
When associated with RecursiveDirectoryIterator, it is perfect for parsing entire directory sub-trees.
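RecursiveIteratorIterator accepts an optional second constructor argument that controls which entries it reports. The sketch below compares the default, RecursiveIteratorIterator::LEAVES_ONLY, with RecursiveIteratorIterator::SELF_FIRST, which also reports each directory before its contents. The list_tree() helper and the temporary tree are assumptions made for the sake of the example.

```php
<?php
// Throwaway tree: <root>/config/autoload/global.php (made-up names).
$root = sys_get_temp_dir() . '/rii_modes_' . uniqid();
mkdir($root . '/config/autoload', 0777, true);
file_put_contents($root . '/config/autoload/global.php', '');

// Hypothetical helper: list paths relative to $root using the given mode.
function list_tree(string $root, int $mode): array
{
    $dir = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
    $out = [];
    foreach (new RecursiveIteratorIterator($dir, $mode) as $key => $value) {
        $out[] = substr($key, strlen($root) + 1);
    }
    return $out;
}

$leaves = list_tree($root, RecursiveIteratorIterator::LEAVES_ONLY); // files only
$all    = list_tree($root, RecursiveIteratorIterator::SELF_FIRST);  // directories too

print_r($leaves);
print_r($all);

// Remove the temporary tree again.
unlink($root . '/config/autoload/global.php');
rmdir($root . '/config/autoload');
rmdir($root . '/config');
rmdir($root);
```

In leaves-only mode a directory that contains no files never shows up at all, which is worth knowing if you expect to see empty folders in the listing.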
One word of caution, however: if you point it at the wrong path, especially one with thousands of files and hundreds of subdirectories … you can quickly enter PHP Fatal Error territory, for example by running into the script's execution-time or memory limits.
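One way to keep a traversal on a leash is RecursiveIteratorIterator::setMaxDepth(), which stops the descent at a given level (0 being the starting directory). The sketch below is only an illustration on a made-up temporary tree; it uses SELF_FIRST so that directories sitting at the depth limit are still listed.

```php
<?php
// Throwaway tree: <root>/top.txt and <root>/a/b/c.txt (made-up names).
$root = sys_get_temp_dir() . '/rii_depth_' . uniqid();
mkdir($root . '/a/b', 0777, true);
file_put_contents($root . '/top.txt', '');
file_put_contents($root . '/a/b/c.txt', '');

$dir     = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
$recurse = new RecursiveIteratorIterator($dir, RecursiveIteratorIterator::SELF_FIRST);

// Visit the starting directory (depth 0) and one level below it (depth 1), no deeper.
$recurse->setMaxDepth(1);

$paths = [];
foreach ($recurse as $key => $value) {
    $paths[] = substr($key, strlen($root) + 1);
}
sort($paths);
print_r($paths);

// Remove the temporary tree again.
unlink($root . '/a/b/c.txt');
rmdir($root . '/a/b');
rmdir($root . '/a');
unlink($root . '/top.txt');
rmdir($root);
```

Here a/b/c.txt is never visited because it lives at depth 2, which is beyond the limit.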
That consideration aside, RecursiveIteratorIterator is a really cool classname, isn’t it?
It gives one a warm fuzzy Department-Of-Redundancy-Department kind of feeling, doesn’t it? (Monty Python Fans take note!)
All joking aside, let’s have a look at its application to the code described above.
Really, the only thing that needs to be done is to wrap the RecursiveDirectoryIterator instance into a RecursiveIteratorIterator instance, and we’re good to go. The modified code might appear as follows:
$iteration = new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS);
$recurse = new RecursiveIteratorIterator($iteration);
echo show($path, $recurse, $excludes);
And here are the results we expected from the start:
README.md
module/Signups/view/signups/index/events.phtml
module/Signups/view/signups/index/index.phtml
module/Signups/src/Module.php
module/Signups/src/Controller/IndexController.php
module/Signups/config/module.config.php
module/Test/view/test/list/index.phtml
module/Test/view/test/index/index.phtml
module/Test/src/Module.php
module/Test/src/Controller/ListController.php
module/Test/src/Controller/IndexController.php
module/Test/config/module.config.php
module/Test/config/module.config.php.bak
module/Application/test/Controller/IndexControllerTest.php
module/Application/view/application/index/calendar.phtml
module/Application/view/application/index/index.phtml
module/Application/view/error/404.phtml
module/Application/view/error/index.phtml
module/Application/view/layout/layout.phtml
module/Application/src/Module.php
module/Application/src/Models/EventsModel.php
module/Application/src/Factory/AdapterFactory.php
module/Application/src/Factory/EventsModelFactory.php
module/Application/src/Controller/IndexControllerFactory.php
module/Application/src/Controller/IndexController.php
module/Application/src/Service/Calendar.php
module/Application/config/module.config.php
phpcs.xml
composer.phar
COPYRIGHT.md
composer.json
docker-compose.yml
.gitignore
Vagrantfile
phpunit.xml.dist
data/cache/.gitkeep
config/application.config.php
config/modules.config.php
config/autoload/README.md
config/autoload/laminas-developer-tools.local-development.php
config/autoload/db.local.php
config/autoload/development.local.php
config/autoload/global.php
config/autoload/.gitignore
config/autoload/local.php.dist
config/autoload/development.local.php.dist
config/development.config.php
config/development.config.php.dist
composer.json.bak
CHANGELOG.md
Dockerfile
LICENSE.md
composer.lock
But wait … there’s more! Let me pose you a question: wouldn’t it be nice to do all this with a single iterator?
Hah! That got your attention, didn’t it?
So, without further ado, let’s have a look at the last iterator class to be discussed in this article: FilterIterator.
As with RecursiveIteratorIterator, the FilterIterator class cannot stand alone: it provides a wrapper for an existing iterator.
But there’s a bigger problem: this class is marked abstract, which means you cannot instantiate it directly!
This makes perfect sense when you understand that the abstract method accept() (sound familiar?) simply cannot be defined by the PHP core development team. They have no idea what needs to be filtered.
Accordingly its definition is left to the developer. This still doesn’t stop it from being super-annoying, however!
Why do I need to develop an entirely new class which extends FilterIterator just because it’s abstract?
Arghhhh!!! Hang on folks … there is another way!
Many of us tend to forget one of the most discussed new features of PHP 7.0: the anonymous class. This feature was discussed endlessly, and was the subject of many an article or blog post. Eventually it was forgotten and faded into obscurity. But it just so happens that an anonymous class might be just the ticket in the situation we are discussing in this article.
Imagine the following:
The only real change that needs to be made in accept() is to not accept any arguments, and to substitute parent::current() in place of $name. (The current value is an SplFileInfo object, which turns into its full pathname when used as a string.)
If $excludes becomes a public static property, it can be assigned from the calling program.
Here is how the alternative code solution might appear:
$iteration = new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS);
$recurse = new RecursiveIteratorIterator($iteration);
$filter = new class($recurse) extends FilterIterator {
    public static $excludes = [];
    // Declaring the bool return type matches FilterIterator::accept()
    // and avoids a deprecation notice on PHP 8.1 and later.
    public function accept(): bool {
        if (!self::$excludes) return TRUE;
        $actual = 0;
        foreach (self::$excludes as $item) {
            // current() is an SplFileInfo; the cast yields its full pathname.
            if (strpos((string) parent::current(), $item) !== FALSE) $actual++;
        }
        return !((bool) $actual);
    }
};
$filter::$excludes = $excludes;
foreach ($filter as $key => $value) {
    echo str_replace($path, '', $key) . "\n";
}
An added benefit is that we no longer need the show() function.
In this example the iteration itself already includes filtering, so all we need to do is to iterate through the pre-filtered result.
The resulting output is not shown here as it’s identical to the output from the previous code example.
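One thing to note about filtering the flattened result is that every file under an excluded directory is still visited before being rejected. As an alternative worth considering (a different technique from the one above, not a replacement prescribed by this article), the SPL class RecursiveCallbackFilterIterator can sit between the two recursive classes and reject a directory before it is ever opened. The sketch below runs on a made-up temporary tree and, as an assumption of this example, matches excludes by bare directory name rather than by path fragment.

```php
<?php
// Throwaway tree with one directory we want to skip entirely (made-up names).
$root = sys_get_temp_dir() . '/rcfi_demo_' . uniqid();
mkdir($root . '/vendor/laminas', 0777, true);
mkdir($root . '/config', 0777, true);
file_put_contents($root . '/vendor/laminas/big.php', '');
file_put_contents($root . '/config/global.php', '');
file_put_contents($root . '/README.md', '');

$excludes = ['vendor', 'public'];   // bare directory names (assumption of this sketch)

$dir = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

$pruned = new RecursiveCallbackFilterIterator(
    $dir,
    function (SplFileInfo $current) use ($excludes): bool {
        // Rejecting a directory here means its children are never visited.
        return !($current->isDir() && in_array($current->getFilename(), $excludes, true));
    }
);

$paths = [];
foreach (new RecursiveIteratorIterator($pruned) as $key => $value) {
    $paths[] = substr($key, strlen($root) + 1);
}
sort($paths);
print_r($paths);

// Remove the temporary tree again.
unlink($root . '/vendor/laminas/big.php');
rmdir($root . '/vendor/laminas');
rmdir($root . '/vendor');
unlink($root . '/config/global.php');
rmdir($root . '/config');
unlink($root . '/README.md');
rmdir($root);
```

Be aware that this version excludes a directory named vendor or public at any depth, so adjust the callback if only the top-level ones should be skipped.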
So, to summarize, RecursiveDirectoryIterator by itself will only walk a single directory level.
If you wrap it in RecursiveIteratorIterator, it can traverse an entire directory tree.
Adding one more iterator, FilterIterator, allows you to produce a single iteration that doesn’t need any additional logic if you need to filter the results.
That’s about all for today, dear readers. Happy coding!