Refactoring a massive function into many files

Question

Refactoring a massive function into many files

I was trying to refactor a "bit" of code that I was previously developing. Basically, the project was my answer to the fact that I didn't know how to use XSLT efficiently, so I developed an XML transformation system in PHP. The program reads the tags of the XML file and does something along these lines to convert it to HTML:

private function getTemplate(...) {
    switch ($nodeName) {
        case "a" :
            // code here to generate a link tag
            break;
        case "box" :
            // code here to generate the divs and whatnot to create a box
            break;
        case "ref" :
            // look up an external reference file and include a bibliography
            break;
        default :
            // do the default thing
    }
}

This all worked fine, except that I had 26 branches on my switch, and that when the switch was over 1000 lines of code. Needless to say, this made maintenance a little more difficult.

What I have done now is to output the code of each branch into its own file (named "a.php", "box.php", "ref.php" ...) and include

which file each time:

if (file_exists("templates/$nodeName.php")) {
    include "templates/$nodeName.php";
} else {
    // do the default thing
}

Again, this works, but benchmarking shows that it slows down processing times by 50%. I assume this is because now up to 4000 include

is being executed.

What I considered was to put the code for each template in a function and if the function was not declared then include the file and then run that function - the only problem is that the existing code has been written within the framework the original function using $this

etc.

Given that this code does not run in real time (eg: it just processes XML into static HTML files that are stored - it doesn’t do it on the fly), do you have any advice for me here?

+1

xml include php refactoring

nickf 05 dec. '08 at 3:50 am

source to share

3 answers

Steven A. Lowe · Answer 1 · 2008-12-05T04:01:47+0000

caveat: I don't know PHP, but I can google and I like function pointers better than huge switch statements, so if you can't just use XSLT ...

... one option would be to adopt a naming convention for your "work" functions, which included the html node tag name, eg. <tagname> _converter, includes all functions and replaces the switch statement as follows:

private function getTemplate(...) 
{
    $func = $nodeName + "_converter";
    if (function_exists($func))    //see if function is defined, somehow...
    {
        $func();    //call conversion function (pass whatever is needed)
    }
}

grepsedawk · Answer 2 · 2008-12-05T04:43:50+0000

Try something like this:

//utils.php
function handle_box ($ node)
{
// ...
}

function handle_link ($ node)
{
// ....
}
? \>

then

require_once 'templates / utils.php';


function getTemplate ()
{
  switch ($ node)
  {
      case "a":
        handle_link ($ node, $ otherParams);
      break;
  }
}

}

Basically this will refactor all the functions for each node into its own function. From here, you can work on a more generalized solution. It's very similar to what you had the original, but the switch / case statements will be much more manageable without a ton of code inside them. Then you can take a look at a potential implementation of the Stratgy design pattern if you think it's necessary.

Amr Mostafa · Answer 3 · 2008-12-05T12:04:12+0000

Without getting involved in XSLT and focusing on the refactoring issue: you've already covered many different strategies. I will offer optimization advice on one of the approaches you've already experimented with, and another approach that might work for you.

When you said there was a 50% decrease in performance when using many files. I assume you don't have PHP Op-code cache. You do this and the 50% loss should disappear.

In another approach, I would suggest creating a base class NodeProcessor

or so. And then create a new class for each node ANodeProcessor

, RefNodeProcessor

, ... etc.

Alternatively, you can only create one class NodeProcessor

that contains methods in the form of: processTag

. Then:

You may call $this->processNode($tag);
In this method: $name = 'process'. $tag;
Then: return $this->nodeProcessor->{$name};
NodeProcessor

must have a __call()

default method for tags without a specific method processTag

.

Again, don't forget to add opcode caching.

Refactoring a massive function into many files

More articles: