The cryptic word ("LPS") appears in the Haskell output list

I am new to Haskell and am trying to tinker with some test cases that I usually come across in the real world. Let's say I have a text file "foo.txt" that contains the following:

45.4 34.3 377.8
33.2 98.4 456.7
99.1 44.2 395.3

      

I am trying to produce the output

[[45.4,34.3,377.8],[33.2,98.4,456.7],[99.1,44.2,395.3]]

      

My code is below, but I am getting a dummy "LPS" in the output ... not sure what it represents.

import qualified Data.ByteString.Lazy.Char8 as BStr
import qualified Data.Map as Map

readDatafile = (map (BStr.words) . BStr.lines)

testFunc path = do
    contents <- BStr.readFile path
    print (readDatafile contents)

      

When calling testFunc "foo.txt" the output is

[[LPS ["45.4"],LPS ["34.3"],LPS ["377.8"]],[LPS ["33.2"],LPS ["98.4"],LPS ["456.7"]],[LPS ["99.1"],LPS ["44.2"],LPS ["395.3"]]]

      

Any help is appreciated! Thank you. PS: Using ByteString as this will be used on large files in the future.

EDIT:

I am also puzzled as to why the output list is grouped as above (with each number wrapped in its own []), when in ghci the line below gives a different arrangement.

*Main> (map words . lines) "45.4 34.3 377.8\n33.2 98.4 456.7\n99.1 44.2 395.3"
[["45.4","34.3","377.8"],["33.2","98.4","456.7"],["99.1","44.2","395.3"]]

      

+2




5 answers


What you see is indeed a constructor. When you read the file, the result is of course a list of lists of ByteStrings, but what you want is a list of lists of Floats.

What you can do:

readDatafile :: BStr.ByteString -> [[Float]]
readDatafile = (map ((map (read .  BStr.unpack)) . BStr.words)) . BStr.lines

      



This unpacks each ByteString (i.e. converts it to a String); read then converts the String to a Float.
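One caveat with the unpack-then-read approach is that read raises a runtime error on any token that is not a valid number. A minimal sketch of a more defensive variant follows, assuming Text.Read.readMaybe from base is available; the names parseToken and readDatafileSafe are hypothetical and not part of the answer above.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BStr
import Text.Read (readMaybe)

-- Hypothetical helper: parse one whitespace-separated token,
-- returning Nothing instead of failing at runtime on malformed input.
parseToken :: BStr.ByteString -> Maybe Float
parseToken = readMaybe . BStr.unpack

-- Hypothetical variant of readDatafile: split into lines, then words,
-- and parse every word. The result is Nothing if any token fails to parse.
readDatafileSafe :: BStr.ByteString -> Maybe [[Float]]
readDatafileSafe = mapM (mapM parseToken . BStr.words) . BStr.lines
```

In ghci, readDatafileSafe (BStr.pack "45.4 34.3\n33.2 98.4") should give Just [[45.4,34.3],[33.2,98.4]], while an input containing a token such as "oops" gives Nothing.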

Not sure if using bytestrings here even helps your performance.

+8




This is a sign of the internal representation type of lazy bytestrings pre-1.4.4.3 (search the page for "LPS"). LPS is a constructor.



+2




readDatafile returns [[ByteString]], and what you see is a "packed" representation of all those characters that you are reading.

readDatafile = map (map BStr.unpack . BStr.words) . BStr.lines

      

Here is a sample ghci session demonstrating the problem. My output differs from yours because I am using GHC 6.10.4:

*Data.ByteString.Lazy.Char8> let myString = "45.4"
*Data.ByteString.Lazy.Char8> let myByteString = pack "45.4"
*Data.ByteString.Lazy.Char8> :t myString
myString :: [Char]
*Data.ByteString.Lazy.Char8> :t myByteString
myByteString :: ByteString
*Data.ByteString.Lazy.Char8> myString
"45.4"
*Data.ByteString.Lazy.Char8> myByteString
Chunk "45.4" Empty
*Data.ByteString.Lazy.Char8> unpack myByteString
"45.4"

      

+2




That is just the lazy bytestring constructor. You are not parsing those strings into numbers yet, so you see the underlying string. Note that lazy bytestrings are not the same as String, so they have a different printed representation when "Show'n".

+1




LPS was the old constructor for the old Lazy ByteString type. It has since been replaced with an explicit data type, so the current behavior is slightly different.

When you call show on a lazy ByteString, it prints code that would generate roughly the same lazy ByteString you gave it. However, the usual imports for working with ByteStrings do not export LPS or, in later versions, the Chunk / Empty constructors. So it is shown with the LPS constructor wrapped around a list of strict bytestring chunks, which print themselves as strings.

On the other hand, I wonder whether the lazy ByteString Show instance should do what most other Show instances for complex data structures do, and print something like:

fromChunks ["foo","bar","baz"]

      

or even:

fromChunks [pack "foo",pack "bar", pack "baz"]

      

since the former relies on {-# LANGUAGE OverloadedStrings #-} for the resulting code fragment to actually parse as Haskell code. On the other hand, printing bytestrings as if they were strings is rather convenient. Alas, both options are more verbose than the old LPS syntax, but more concise than the current Chunk "Foo" Empty. In the end, Show just needs to be left-invertible by Read, so it is probably best not to go around changing things, lest it accidentally break a ton of serialized data. ;)
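For illustration, the chunk structure discussed above can be inspected through the exported functions fromChunks and toChunks rather than through the hidden constructors. This is only a small sketch; the names example, chunks and contents are made up for the demonstration.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy.Char8 as BL

-- Build a lazy ByteString from three strict chunks.
example :: BL.ByteString
example = BL.fromChunks [BS.pack "foo", BS.pack "bar", BS.pack "baz"]

-- Recover the strict chunks: the kind of list the old LPS constructor wrapped.
chunks :: [BS.ByteString]
chunks = BL.toChunks example

-- The chunk boundaries do not affect the content of the string.
contents :: String
contents = BL.unpack example
```

Evaluating chunks in ghci should print ["foo","bar","baz"], and contents should be "foobarbaz". How example itself is printed depends on the bytestring version in use, as described above.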

As for your problem, you are getting a [[ByteString]] instead of a [[Float]] because you are only mapping words over your lines. You need to unpack each ByteString and then call read on the resulting String to produce your floating-point numbers.

+1

