Tree traversal in C ++ and Haskell

I am new to Haskell. I am trying to understand how well haskell can handle a recursive function call along with its lazy evaluation. The experiment I did was just building a binary search tree in both C ++ and Haskell and pass them accordingly in postorder. The C ++ implementation is standard with a helper stack. (I just print the item as soon as I visit it).

Here is my haskell code:

module Main (main) where

import System.Environment (getArgs)
import System.IO
import System.Exit
import Control.Monad(when)
import qualified Data.ByteString as S

main = do
     args <- getArgs
     when (length args < 1) $ do
          putStrLn "Missing input files"

     content <- readFile (args !! 0)
     --preorderV print $ buildTree content
     mapM_ print $ traverse POST $ buildTree content
     putStrLn "end"

data BSTree a = EmptyTree | Node a (BSTree a) (BSTree a) deriving (Show)
data Mode = IN | POST | PRE

singleNode :: a -> BSTree a
singleNode x = Node x EmptyTree EmptyTree

bstInsert :: (Ord a) => a -> BSTree a -> BSTree a
bstInsert x EmptyTree = singleNode x
bstInsert x (Node a left right)
          | x == a = Node a left right
          | x < a  = Node a (bstInsert x left) right
          | x > a  = Node a left (bstInsert x right)

buildTree :: String -> BSTree String
buildTree = foldr bstInsert EmptyTree . words

preorder :: BSTree a -> [a]
preorder EmptyTree = []
preorder (Node x left right) = [x] ++ preorder left ++ preorder right

inorder :: BSTree a -> [a]
inorder EmptyTree = []
inorder (Node x left right) = inorder left ++ [x] ++ inorder right

postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++  postorder right ++[x]

traverse :: Mode -> BSTree a -> [a]
traverse x tree = case x of IN   -> inorder tree
                            POST -> postorder tree
                            PRE  -> preorder tree

preorderV :: (a->IO ()) -> BSTree a -> IO ()
preorderV f EmptyTree = return ()
preorderV f (Node x left right) = do 
                                     f x
                                     preorderV f left
                                     preorderV f right


My test result shows that C ++ is vastly superior to Haskell:

C ++ performance: (note that first15000.txt is about 5 times larger than first3000.txt)

time ./speedTestForTraversal first3000.txt > /dev/null 

real    0m0.158s
user    0m0.156s
sys     0m0.000s
time ./speedTestForTraversal first15000.txt > /dev/null 

real    0m0.923s
user    0m0.916s
sys     0m0.004s


Haskell with the same input file:

time ./speedTestTreeTraversal first3000.txt > /dev/null 

real    0m0.500s
user    0m0.488s
sys     0m0.008s
time ./speedTestTreeTraversal first15000.txt > /dev/null 

real    0m3.511s
user    0m3.436s
sys     0m0.072s


What I expected haskell shouldn't be too far from C ++. Did I make some mistake? Is there a way to improve my haskell code?


Edit: Oct 18, 2014

After testing the serval cases, the haskell traversal is still significantly slower than the C ++ implementation. I would like to give Cirdec a full credit answer as it points out the inefficiency of my haskell implementation. However, my initial question is about comparing C ++ and haskell implementation. So I would like to leave this question open and post my C ++ code for further discussion.

#include <iostream>
#include <string>
#include <boost/algorithm/string.hpp>
#include <fstream>
#include <stack>
using namespace std;
using boost::algorithm::trim;
using boost::algorithm::split;

template<typename T>
class Node
    Node(): val(0), l(NULL), r(NULL), p(NULL) {};
    Node(const T &v): val(v), l(NULL), r(NULL), p(NULL) {}
    Node* getLeft() {return l;}
    Node* getRight(){return r;}
    Node* getParent() {return p;}
    void  setLeft(Node *n) {l = n;}
    void  setRight(Node *n) {r = n;}
    void  setParent(Node *n) {p = n;}
    T  &getVal() {return val;}
    Node* getSucc() {return NULL;}
    Node* getPred() {return NULL;}
    T val;
    Node *l;
    Node *r;
    Node *p;

template<typename T>
void destoryOne(Node<T>* n)
    delete n;
    n = NULL;

template<typename T>
void printOne(Node<T>* n)
    if (n!=NULL)
    std::cout << n->getVal() << std::endl;

template<typename T>
class BinarySearchTree
    typedef void (*Visit)(Node<T> *);

    BinarySearchTree(): root(NULL) {}
    void delNode(const T &val){};
    void insertNode(const T &val){
    if (root==NULL)
        root = new Node<T>(val);
    else {
        Node<T> *ptr = root;
        Node<T> *ancester = NULL;
        while(ptr && ptr->getVal()!=val) {
        ancester = ptr;
        ptr = (val < ptr->getVal()) ? ptr->getLeft() : ptr->getRight(); 
        if (ptr==NULL) {
        Node<T> *n = new Node<T>(val);
        if (val < ancester->getVal())
        } // else the node exists already so ignore!
    ~BinarySearchTree() {
    void destoryTree(Node<T>* rootN) {

    void iterativePostorder(Visit fn) {
    std::stack<Node<T>* > internalStack;
    Node<T> *p = root;
    Node<T> *q = root;
    while(p) {
        while (p->getLeft()) {
        p = p->getLeft();
        while (p && (p->getRight()==NULL || p->getRight()==q)) {
        q = p;
        if (internalStack.empty())
        else {
            p =;
        p = p->getRight();

    Node<T> * getRoot(){ return root;}
    Node<T> *root;

int main(int argc, char *argv[])
    BinarySearchTree<string> bst;
    if (argc<2) {
    cout << "Missing input file" << endl;
    return 0;
    ifstream inputFile(argv[1]);
    if ( {
    cout << "Fail to open file " << argv[1] << endl;
    return 0;
    while (!inputFile.eof()) {
    string word;
    inputFile >> word;
    if (!word.empty()) {


    return 0;


Edit: Oct 20, 2014 Chris answers below, very carefully, and I can repeat the result.


source to share

2 answers

I generated a file containing all 4-letter ASCII lowercase strings abcdefghijklmnopqrstuvwxyz

separated by spaces; and I think I got it in the correct order so that the tree your code generates is perfectly balanced.

I chose this length because it takes 3.4 seconds on my computer, just like your 3.5s Haskell. I named it 26_4.txt

for obvious reasons. It looks like your dataset is close to 26 words 4 so it compares in length as well.

The bottom border at runtime will look something like this:

import System.IO
main = do
    mylist <- readFile "26_4.txt"
    mapM_ putStrLn (words mylist)


and that for my dataset boils down to taking 0.4s (stdout on /dev/null

). So we can't expect more than, say, a speedup factor of 10 on such an issue from Haskell, it seems. However, this factor is within your problem; C ++ takes twice as long as this super simple program.

But no amount of processing is an unrealistic goal. We can get an estimate that is more realistic if we use a data structure that has already been optimized for us by professionals who understand Haskell better:

import System.IO
import qualified Data.Map.Strict as Map

balancedTree = Map.fromList . map (\k -> (k, ()))

serializeTree = map fst . Map.toList

main = do
    mylist <- readFile "26_4.txt"
    mapM_ putStrLn (serializeTree $ balancedTree $ words mylist)


This works for anything larger than 1.6s on my machine. It's not as fast as your C ++, but your C ++ won't balance the tree as far as I can tell.

I made a Cirdec modification to the code and your code dropped to 3.1s, so it only shaved off about 10% of that file runtime.

However, on my machine this file doesn't even work unless you give it more memory using RTSopts. And that points to a really important optimization: tail call optimization. The code from both you and Cirdec: suboptimal in a special way: it is not tail-recursive, which means it cannot be turned into a GHC loop. We can make it tail-recursive by writing an explicit "things to do" stack we descend with:

postorder :: BSTree a -> [a]
postorder t = go [] [t]
    where go xs [] = xs
          go xs (EmptyTree : ts) = go xs ts
          go xs (Node x a b : ts) = go (x : xs) (b : a : ts)


This change seems to bring it up to 2.1.

Another time-consuming difference between C ++ and Haskell is that the Haskell version will allow you to lazily build your search tree, whereas your C ++ code won't. We can make Haskell code strict to deal with this by providing something like:

data BSTree a = EmptyTree
          | Node  !a !(BSTree a) !(BSTree a) deriving (Show)


This change, combined with Cirdec, brings us down to 1.1 seconds, which means we are compatible with your C ++ code, at least on my machine. You should check this on your machine to see if these are the underlying issues as well. I think that further optimizations cannot be done "out of the chair" and should be done using a proper profiler.

Remember the ghc -O2

code, otherwise tail calls and other optimizations may fail.



Concatenating lists with ++

is slow, every time it happens ++

its first argument has to be traversed to the end to find where to add the second argument. You can see how the first argument goes to []

in the definition ++

from the standard prelude :

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)


When ++

used recursively, this traversal must be repeated for each level of recursion, which is inefficient.

There is another way to create lists: if you know what will be at the end of the list before you start building it, you can build it with completion in place. Let's look at the definitionpostorder

postorder :: BSTree a -> [a]
postorder EmptyTree = []
postorder (Node x left right) = postorder left ++ postorder right ++ [x]


When we do postorder left

, we already know what happens after it, it will postorder right ++ [x]

, so it makes sense to build a list for the left side of the tree from the right side, and the value from the node is already in place. Similarly, when we do postorder right

, we already know what should happen after it, namely x

. We can do just that by creating a helper function that passes the accumulated value to the rest


postorder :: BSTree a -> [a]
postorder tree = go tree []
        go EmptyTree rest = rest
        go (Node x left right) rest = go left (go right (x:rest))


This is about twice as fast on my machine when running with a 15k lexicon dictionary as input. Let's look at this a little more to see if we can get a deeper understanding. If we rewrite our definition postorder

using the function construct ( .

) and appended ( $

) instead of the nested parenthesis, we would have

postorder :: BSTree a -> [a]
postorder tree = go tree []
        go EmptyTree rest = rest
        go (Node x left right) rest = go left . go right . (x:) $ rest


We can even drop the rest

function's argument and application $

, and write it in a slightly quieter style.

postorder :: BSTree a -> [a]
postorder tree = go tree []
        go EmptyTree = id
        go (Node x left right) = go left . go right . (x:)


Now we can see what we have done. We have replaced the list [a]

with a function [a] -> [a]

that adds the list to the existing list. An empty list is replaced by a function that adds nothing to the top of the list, which is an identification function id

. Parsing a list [x]

it is replaced by a function that adds x

to the beginning of the list (x:)

. List concatenation a ++ b

is replaced by function composition f . g

- first add things to g

add to the beginning of the list, then add things to f

add to the beginning of this list.



All Articles