Refactoring a massive function into many files
I was trying to refactor a "bit" of code that I was previously developing. Basically, the project was my answer to the fact that I didn't know how to use XSLT efficiently, so I developed an XML transformation system in PHP. The program reads the tags of the XML file and does something along these lines to convert it to HTML:
private function getTemplate(...) {
switch ($nodeName) {
case "a" :
// code here to generate a link tag
break;
case "box" :
// code here to generate the divs and whatnot to create a box
break;
case "ref" :
// look up an external reference file and include a bibliography
break;
default :
// do the default thing
}
}
This all worked fine, except that I had 26 branches on my switch, and that when the switch was over 1000 lines of code. Needless to say, this made maintenance a little more difficult.
What I have done now is to output the code of each branch into its own file (named "a.php", "box.php", "ref.php" ...) and include
which file each time:
if (file_exists("templates/$nodeName.php")) {
include "templates/$nodeName.php";
} else {
// do the default thing
}
Again, this works, but benchmarking shows that it slows down processing times by 50%. I assume this is because now up to 4000 include
is being executed.
What I considered was to put the code for each template in a function and if the function was not declared then include the file and then run that function - the only problem is that the existing code has been written within the framework the original function using $this
etc.
Given that this code does not run in real time (eg: it just processes XML into static HTML files that are stored - it doesnβt do it on the fly), do you have any advice for me here?
source to share
caveat: I don't know PHP, but I can google and I like function pointers better than huge switch statements, so if you can't just use XSLT ...
... one option would be to adopt a naming convention for your "work" functions, which included the html node tag name, eg. <tagname> _converter, includes all functions and replaces the switch statement as follows:
private function getTemplate(...)
{
$func = $nodeName + "_converter";
if (function_exists($func)) //see if function is defined, somehow...
{
$func(); //call conversion function (pass whatever is needed)
}
}
source to share
Try something like this:
//utils.php function handle_box ($ node) { // ... } function handle_link ($ node) { // .... } ? \>
then
require_once 'templates / utils.php'; function getTemplate () { switch ($ node) { case "a": handle_link ($ node, $ otherParams); break; } } }
Basically this will refactor all the functions for each node into its own function. From here, you can work on a more generalized solution. It's very similar to what you had the original, but the switch / case statements will be much more manageable without a ton of code inside them. Then you can take a look at a potential implementation of the Stratgy design pattern if you think it's necessary.
source to share
Without getting involved in XSLT and focusing on the refactoring issue: you've already covered many different strategies. I will offer optimization advice on one of the approaches you've already experimented with, and another approach that might work for you.
When you said there was a 50% decrease in performance when using many files. I assume you don't have PHP Op-code cache. You do this and the 50% loss should disappear.
In another approach, I would suggest creating a base class NodeProcessor
or so. And then create a new class for each node ANodeProcessor
, RefNodeProcessor
, ... etc.
Alternatively, you can only create one class NodeProcessor
that contains methods in the form of: processTag
. Then:
- You may call
$this->processNode($tag);
- In this method:
$name = 'process'. $tag;
- Then:
return $this->nodeProcessor->{$name};
-
NodeProcessor
must have a__call()
default method for tags without a specific methodprocessTag
.
Again, don't forget to add opcode caching.
source to share