Packed large bit vector with efficient xor and bit count in Haskell

I'm looking for a data type that is efficient in both space and time, can hold a 384-bit vector, and supports fast XOR and "bit count" operations (the number of bits set to 1).

Below is my demo program. The operations I need are captured by the type class SOQuestionOps, and I have implemented it for Natural and for Data.Vector.Unboxed.Bit. The latter especially seems ideal, since it offers a zipWords operation that should let me implement "bit count" and XOR word-wise rather than bit-wise. It also claims to store the bits packed (8 bits per byte).

{-# LANGUAGE FlexibleInstances #-}
import Data.Bits
import Data.List (foldl')
import Numeric.Natural
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed.Bit as BV

class SOQuestionOps a where
    soqoXOR :: a -> a -> a
    soqoBitCount :: a -> Int
    soqoFromList :: [Bool] -> a

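-- A list of n Bools alternating between True and False; used to build the 384-bit initialVec.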
alternating :: Int -> [Bool]
alternating n =
    let c = n `mod` 2 == 0
     in if n == 0
           then []
           else c : alternating (n-1)

instance SOQuestionOps Natural where
    soqoXOR = xor
    soqoBitCount = popCount
    soqoFromList v =
        let oneIdxs = map snd $ filter fst (zip v [0..])
         in foldl' (\acc n -> acc `setBit` n) 0 oneIdxs

instance SOQuestionOps (BV.Vector BV.Bit) where
    soqoXOR = BV.zipWords xor
    soqoBitCount = BV.countBits
    soqoFromList v = BV.fromList (map BV.fromBool v)

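-- XOR-fold ten million 384-bit vectors into initialVec and, separately, sum their bit counts.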
main :: IO ()
main =
    let initialVec :: BV.Vector BV.Bit
        initialVec = soqoFromList $ alternating 384
        lotsOfVecs = V.replicate 10000000 (soqoFromList $ take 384 $ repeat True)
        xorFolded = V.foldl' soqoXOR initialVec lotsOfVecs
        sumBitCounts = V.foldl' (\n v -> n + soqoBitCount v) 0 lotsOfVecs
     in putStrLn $ "folded bit count: " ++ show (soqoBitCount xorFolded) ++ ", sum: " ++ show sumBitCounts


So, let's work out the numbers for the best case: lotsOfVecs shouldn't need to allocate much, because it is just 10,000,000 references to the same vector initialVec. The fold obviously has to create one of these vectors per XOR step, so it must create 10,000,000 bit vectors. The bit counting shouldn't need to create anything other than 10,000,000 Ints. So in the best case, my program should use very little (and constant) memory, and the overall allocations should be roughly 10,000,000 * sizeof(bit vector) + 10,000,000 * sizeof(Int) = 520,000,000 bytes.
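Spelled out as a quick sanity check (this just mirrors the per-item figures the estimate above uses, namely 48 bytes of packed payload per 384-bit vector and 4 bytes per count, and ignores any constructor/header overhead):

payloadBytes, countBytes, expectedBytes :: Int
payloadBytes  = 384 `div` 8                            -- 48 bytes of packed bits per vector
countBytes    = 4                                      -- per-count figure used in the estimate
expectedBytes = 10000000 * (payloadBytes + countBytes) -- 520,000,000 bytes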

OK, let's run the Natural version first (i.e. with initialVec :: Natural), compiled with

ghc --make -rtsopts -O3 MemStuff.hs


(this is with GHC 7.10.1):

$ ./MemStuff +RTS -sstderr
folded bit count: 192, sum: 3840000000
1,280,306,112 bytes allocated in the heap
201,720 bytes copied during GC
80,106,856 bytes maximum residency (2 sample(s))
662,168 bytes maximum slop
78 MB total memory in use (0 MB lost due to fragmentation)

Tot time (elapsed)  Avg pause  Max pause
Gen  0      2321 colls,     0 par    0.056s   0.059s     0.0000s    0.0530s
Gen  1         2 colls,     0 par    0.065s   0.069s     0.0346s    0.0674s

INIT    time    0.000s  (  0.000s elapsed)
MUT     time    0.579s  (  0.608s elapsed)
GC      time    0.122s  (  0.128s elapsed)
EXIT    time    0.000s  (  0.002s elapsed)
Total   time    0.702s  (  0.738s elapsed)

%GC     time      17.3%  (17.3% elapsed)

Alloc rate    2,209,576,763 bytes per MUT second

Productivity  82.7% of total user, 78.7% of total elapsed


real    0m0.754s
user    0m0.704s
sys 0m0.037s


which, at 1,280,306,112 bytes allocated in the heap, is in the ballpark (2x) of the expected figure. Btw, on GHC 7.8 this allocates 353,480,272,096 bytes and takes an absolute age, because popCount is not very efficient for GHC 7.8's Natural.

EDIT: I changed the code a bit. In the original version, every other vector in lotsOfVecs was 0, which gave much better allocation figures for the Natural version. I changed it so that the vectors alternate between two different representations (both with many bits set), and now we see the expected 2x allocations. That's another downside of Natural (and Integer): the allocation rate depends on the actual values.
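As a small illustrative sketch of that effect: the size of a Natural, and therefore the allocation per operation, tracks the value rather than a fixed 384-bit width.

import Data.Bits
import Data.List (foldl')
import Numeric.Natural

dense384 :: Natural
dense384 = foldl' setBit 0 [0, 2 .. 382]   -- 192 bits set, spanning 384 bits

collapses, staysWide :: Natural
collapses = dense384 `xor` dense384        -- 0: shrinks to a minimal representation
staysWide = dense384 `xor` bit 383         -- still a ~384-bit, multi-limb number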

But maybe we can do better; let's try the densely packed Data.Vector.Unboxed.Bit version. This is with initialVec :: BV.Vector BV.Bit, recompiled and re-run with the same options:

$ time ./MemStuff +RTS -sstderr
folded bit count: 192, sum: 1920000000
75,120,306,536 bytes allocated in the heap
54,914,640 bytes copied during GC
80,107,368 bytes maximum residency (2 sample(s))
664,128 bytes maximum slop
78 MB total memory in use (0 MB lost due to fragmentation)

Tot time (elapsed)  Avg pause  Max pause
Gen  0     145985 colls,     0 par    0.543s   0.627s     0.0000s    0.0577s
Gen  1         2 colls,     0 par    0.065s   0.070s     0.0351s    0.0686s

INIT    time    0.000s  (  0.000s elapsed)
MUT     time   27.679s  ( 28.228s elapsed)
GC      time    0.608s  (  0.698s elapsed)
EXIT    time    0.000s  (  0.002s elapsed)
Total   time   28.288s  ( 28.928s elapsed)

%GC     time       2.1%  (2.4% elapsed)

Alloc rate    2,714,015,097 bytes per MUT second

Productivity  97.8% of total user, 95.7% of total elapsed


real    0m28.944s
user    0m28.290s
sys 0m0.456s


That is very slow and roughly 100x the expected allocations :(.

So let's recompile and profile both runs (ghc --make -rtsopts -O3 -prof -auto-all -caf-all -fforce-recomp MemStuff.hs):

The Natural version:

COST CENTRE         MODULE  %time %alloc
main.xorFolded      Main     51.7   76.0
main.sumBitCounts.\ Main     25.4   16.0
main.sumBitCounts   Main     12.1    0.0
main.lotsOfVecs     Main     10.4    8.0


The Data.Vector.Unboxed.Bit version:

COST CENTRE         MODULE  %time %alloc
soqoXOR             Main     96.7   99.3
main.sumBitCounts.\ Main      1.9    0.2


So: is Natural really the best option for a fixed-size bit vector? And what is going on with GHC 7.8? And is there anything better with which I could implement my SOQuestionOps type class?



1 answer


Have a look at the Data.LargeWord module in the Crypto package:

http://hackage.haskell.org/package/Crypto-4.2.5.1/docs/Data-LargeWord.html

It provides Bits instances for large words of various sizes, e.g. 96 to 256 bits.
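For illustration, here is a minimal sketch of how the SOQuestionOps class from the question could sit on top of it. It assumes you stitch a 384-bit synonym together yourself (the package exports LargeKey and fixed widths up to 256 bits, not a Word384), and that the resulting Bits instance provides popCount; if it does not on your base/Crypto combination, Data.Bits.popCountDefault can be dropped in instead.

{-# LANGUAGE FlexibleInstances #-}
import Data.Bits
import Data.List (foldl')
import Data.LargeWord (LargeKey, Word192)

-- Hypothetical 384-bit word built from two 192-bit halves.
type Word384 = LargeKey Word192 Word192

-- Reuses the SOQuestionOps class defined in the question.
instance SOQuestionOps Word384 where
    soqoXOR      = xor
    soqoBitCount = popCount
    soqoFromList v =
        let oneIdxs = map snd $ filter fst (zip v [0..])
         in foldl' (\acc n -> acc `setBit` n) 0 oneIdxs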
