Tree traversal in C ++ and Haskell
I am new to Haskell. I am trying to understand how well haskell can handle a recursive function call along with its lazy evaluation. The experiment I did was just building a binary search tree in both C ++ and Haskell and pass them accordingly in postorder. The C ++ implementation is standard with a helper stack. (I just print the item as soon as I visit it).
Here is my haskell code:
module Main (main) where
import System.Environment (getArgs)
import System.IO
import System.Exit
import Control.Monad(when)
import qualified Data.ByteString as S
main = do
args <- getArgs
when (length args < 1) $ do
putStrLn "Missing input files"
exitFailure
content <- readFile (args !! 0)
--preorderV print $ buildTree content
mapM_ print $ traverse POST $ buildTree content
putStrLn "end"
data BSTree a = EmptyTree | Node a (BSTree a) (BSTree a) deriving (Show)
data Mode = IN | POST | PRE
singleNode :: a -> BSTree a
singleNode x = Node x EmptyTree EmptyTree
bstInsert :: (Ord a) => a -> BSTree a -> BSTree a
bstInsert x EmptyTree = singleNode x
bstInsert x (Node a left right)
| x == a = Node a left right
| x < a = Node a (bstInsert x left) right
| x > a = Node a left (bstInsert x right)
buildTree :: String -> BSTree String
buildTree = foldr bstInsert EmptyTree . words
preorder :: BSTree a -> [a]
preorder EmptyTree = []
preorder (Node x left right) = [x] ++ preorder left ++ preorder right
inorder :: BSTree a -> [a]
inorder EmptyTree = []
inorder (Node x left right) = inorder left ++ [x] ++ inorder right
postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++ postorder right ++[x]
traverse :: Mode -> BSTree a -> [a]
traverse x tree = case x of IN -> inorder tree
POST -> postorder tree
PRE -> preorder tree
preorderV :: (a->IO ()) -> BSTree a -> IO ()
preorderV f EmptyTree = return ()
preorderV f (Node x left right) = do
f x
preorderV f left
preorderV f right
My test result shows that C ++ is vastly superior to Haskell:
C ++ performance: (note that first15000.txt is about 5 times larger than first3000.txt)
time ./speedTestForTraversal first3000.txt > /dev/null
real 0m0.158s
user 0m0.156s
sys 0m0.000s
time ./speedTestForTraversal first15000.txt > /dev/null
real 0m0.923s
user 0m0.916s
sys 0m0.004s
Haskell with the same input file:
time ./speedTestTreeTraversal first3000.txt > /dev/null
real 0m0.500s
user 0m0.488s
sys 0m0.008s
time ./speedTestTreeTraversal first15000.txt > /dev/null
real 0m3.511s
user 0m3.436s
sys 0m0.072s
What I expected haskell shouldn't be too far from C ++. Did I make some mistake? Is there a way to improve my haskell code?
thank
Edit: Oct 18, 2014
After testing the serval cases, the haskell traversal is still significantly slower than the C ++ implementation. I would like to give Cirdec a full credit answer as it points out the inefficiency of my haskell implementation. However, my initial question is about comparing C ++ and haskell implementation. So I would like to leave this question open and post my C ++ code for further discussion.
#include <iostream>
#include <string>
#include <boost/algorithm/string.hpp>
#include <fstream>
#include <stack>
using namespace std;
using boost::algorithm::trim;
using boost::algorithm::split;
template<typename T>
class Node
{
public:
Node(): val(0), l(NULL), r(NULL), p(NULL) {};
Node(const T &v): val(v), l(NULL), r(NULL), p(NULL) {}
Node* getLeft() {return l;}
Node* getRight(){return r;}
Node* getParent() {return p;}
void setLeft(Node *n) {l = n;}
void setRight(Node *n) {r = n;}
void setParent(Node *n) {p = n;}
T &getVal() {return val;}
Node* getSucc() {return NULL;}
Node* getPred() {return NULL;}
private:
T val;
Node *l;
Node *r;
Node *p;
};
template<typename T>
void destoryOne(Node<T>* n)
{
delete n;
n = NULL;
}
template<typename T>
void printOne(Node<T>* n)
{
if (n!=NULL)
std::cout << n->getVal() << std::endl;
}
template<typename T>
class BinarySearchTree
{
public:
typedef void (*Visit)(Node<T> *);
BinarySearchTree(): root(NULL) {}
void delNode(const T &val){};
void insertNode(const T &val){
if (root==NULL)
root = new Node<T>(val);
else {
Node<T> *ptr = root;
Node<T> *ancester = NULL;
while(ptr && ptr->getVal()!=val) {
ancester = ptr;
ptr = (val < ptr->getVal()) ? ptr->getLeft() : ptr->getRight();
}
if (ptr==NULL) {
Node<T> *n = new Node<T>(val);
if (val < ancester->getVal())
ancester->setLeft(n);
else
ancester->setRight(n);
} // else the node exists already so ignore!
}
}
~BinarySearchTree() {
destoryTree(root);
}
void destoryTree(Node<T>* rootN) {
iterativePostorder(&destoryOne);
}
void iterativePostorder(Visit fn) {
std::stack<Node<T>* > internalStack;
Node<T> *p = root;
Node<T> *q = root;
while(p) {
while (p->getLeft()) {
internalStack.push(p);
p = p->getLeft();
}
while (p && (p->getRight()==NULL || p->getRight()==q)) {
fn(p);
q = p;
if (internalStack.empty())
return;
else {
p = internalStack.top();
internalStack.pop();
}
}
internalStack.push(p);
p = p->getRight();
}
}
Node<T> * getRoot(){ return root;}
private:
Node<T> *root;
};
int main(int argc, char *argv[])
{
BinarySearchTree<string> bst;
if (argc<2) {
cout << "Missing input file" << endl;
return 0;
}
ifstream inputFile(argv[1]);
if (inputFile.fail()) {
cout << "Fail to open file " << argv[1] << endl;
return 0;
}
while (!inputFile.eof()) {
string word;
inputFile >> word;
trim(word);
if (!word.empty()) {
bst.insertNode(word);
}
}
bst.iterativePostorder(&printOne);
return 0;
}
Edit: Oct 20, 2014 Chris answers below, very carefully, and I can repeat the result.
I generated a file containing all 4-letter ASCII lowercase strings abcdefghijklmnopqrstuvwxyz
separated by spaces; and I think I got it in the correct order so that the tree your code generates is perfectly balanced.
I chose this length because it takes 3.4 seconds on my computer, just like your 3.5s Haskell. I named it 26_4.txt
for obvious reasons. It looks like your dataset is close to 26 words 4 so it compares in length as well.
The bottom border at runtime will look something like this:
import System.IO
main = do
mylist <- readFile "26_4.txt"
mapM_ putStrLn (words mylist)
and that for my dataset boils down to taking 0.4s (stdout on /dev/null
). So we can't expect more than, say, a speedup factor of 10 on such an issue from Haskell, it seems. However, this factor is within your problem; C ++ takes twice as long as this super simple program.
But no amount of processing is an unrealistic goal. We can get an estimate that is more realistic if we use a data structure that has already been optimized for us by professionals who understand Haskell better:
import System.IO
import qualified Data.Map.Strict as Map
balancedTree = Map.fromList . map (\k -> (k, ()))
serializeTree = map fst . Map.toList
main = do
mylist <- readFile "26_4.txt"
mapM_ putStrLn (serializeTree $ balancedTree $ words mylist)
This works for anything larger than 1.6s on my machine. It's not as fast as your C ++, but your C ++ won't balance the tree as far as I can tell.
I made a Cirdec modification to the code and your code dropped to 3.1s, so it only shaved off about 10% of that file runtime.
However, on my machine this file doesn't even work unless you give it more memory using RTSopts. And that points to a really important optimization: tail call optimization. The code from both you and Cirdec: suboptimal in a special way: it is not tail-recursive, which means it cannot be turned into a GHC loop. We can make it tail-recursive by writing an explicit "things to do" stack we descend with:
postorder :: BSTree a -> [a]
postorder t = go [] [t]
where go xs [] = xs
go xs (EmptyTree : ts) = go xs ts
go xs (Node x a b : ts) = go (x : xs) (b : a : ts)
This change seems to bring it up to 2.1.
Another time-consuming difference between C ++ and Haskell is that the Haskell version will allow you to lazily build your search tree, whereas your C ++ code won't. We can make Haskell code strict to deal with this by providing something like:
data BSTree a = EmptyTree
| Node !a !(BSTree a) !(BSTree a) deriving (Show)
This change, combined with Cirdec, brings us down to 1.1 seconds, which means we are compatible with your C ++ code, at least on my machine. You should check this on your machine to see if these are the underlying issues as well. I think that further optimizations cannot be done "out of the chair" and should be done using a proper profiler.
Remember the ghc -O2
code, otherwise tail calls and other optimizations may fail.
Concatenating lists with ++
is slow, every time it happens ++
its first argument has to be traversed to the end to find where to add the second argument. You can see how the first argument goes to []
in the definition ++
from the standard prelude :
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)
When ++
used recursively, this traversal must be repeated for each level of recursion, which is inefficient.
There is another way to create lists: if you know what will be at the end of the list before you start building it, you can build it with completion in place. Let's look at the definitionpostorder
postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++ postorder right ++ [x]
When we do postorder left
, we already know what happens after it, it will postorder right ++ [x]
, so it makes sense to build a list for the left side of the tree from the right side, and the value from the node is already in place. Similarly, when we do postorder right
, we already know what should happen after it, namely x
. We can do just that by creating a helper function that passes the accumulated value to the rest
list
postorder :: BSTree a -> [a]
postorder tree = go tree []
where
go EmptyTree rest = rest
go (Node x left right) rest = go left (go right (x:rest))
This is about twice as fast on my machine when running with a 15k lexicon dictionary as input. Let's look at this a little more to see if we can get a deeper understanding. If we rewrite our definition postorder
using the function construct ( .
) and appended ( $
) instead of the nested parenthesis, we would have
postorder :: BSTree a -> [a]
postorder tree = go tree []
where
go EmptyTree rest = rest
go (Node x left right) rest = go left . go right . (x:) $ rest
We can even drop the rest
function's argument and application $
, and write it in a slightly quieter style.
postorder :: BSTree a -> [a]
postorder tree = go tree []
where
go EmptyTree = id
go (Node x left right) = go left . go right . (x:)
Now we can see what we have done. We have replaced the list [a]
with a function [a] -> [a]
that adds the list to the existing list. An empty list is replaced by a function that adds nothing to the top of the list, which is an identification function id
. Parsing a list [x]
it is replaced by a function that adds x
to the beginning of the list (x:)
. List concatenation a ++ b
is replaced by function composition f . g
- first add things to g
add to the beginning of the list, then add things to f
add to the beginning of this list.