Reading files: [Char] vs. ByteString (continued)


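Following on from the previous post, this time I look at string operations. I read in a file with 72 characters per line and wrote two tests: one counts the occurrences of the character 'a', and the other counts the lines in which 'a' appears four or more times. For ByteString, the B.fold* functions seem to be the way to do this. This time I also compared ByteString.Char8 and ByteString.Lazy.Char8.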
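
As a small illustration of the two tasks, here is a minimal sketch using the strict ByteString.Char8 fold on an in-memory string. The names step, countA and linesWithFourA, and the sample input, are made up for this example and are not part of the measured program.

```haskell
import qualified Data.ByteString.Char8 as B

-- Fold step: bump the counter whenever the character is 'a'.
step :: Int -> Char -> Int
step acc c = if c == 'a' then acc + 1 else acc

-- Total number of 'a' characters in the input.
countA :: B.ByteString -> Int
countA = B.foldl' step 0

-- Number of lines that contain four or more 'a' characters.
linesWithFourA :: B.ByteString -> Int
linesWithFourA = length . filter ((>= 4) . countA) . B.lines

main :: IO ()
main = do
  -- Made-up sample input: "banana" has 3 'a's, "abracadabra" has 5, "xyz" has 0.
  let sample = B.pack "banana\nabracadabra\nxyz\n"
  print (countA sample)          -- 8
  print (linesWithFourA sample)  -- 1
```

The same shape works for [Char] and for ByteString.Lazy.Char8 by swapping in that type's foldl' and lines.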

Results. Counting characters:

	100,000 lines	1,000,000 lines	2,000,000 lines	5,000,000 lines
[Char]	0.19	0.39	2.7	6.7
ByteString.Char8	0.08	0.08	0.46	1.12
ByteString.Lazy.Char8	0.07	0.08	0.29	0.63

Results. Counting lines with four or more 'a's:

	100,000 lines	1,000,000 lines	2,000,000 lines	5,000,000 lines
[Char]	1.38	3.38	6.7	17
ByteString.Char8	0.27	0.33	0.60	1.38
ByteString.Lazy.Char8	0.20	0.28	0.50	1.15
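
The input files themselves are not shown here. One way to produce input of the stated shape (72 characters per line) at the line counts in the tables might look like the sketch below. The line contents (a rotating lowercase alphabet) and the names lineWidth and mkLine are assumptions made for illustration, so timings on such a file need not match the figures above.

```haskell
import System.Environment (getArgs)

-- Line width taken from the text: 72 characters per line.
lineWidth :: Int
lineWidth = 72

-- Assumed content: line i is the lowercase alphabet, rotated by i and
-- repeated until the line is 72 characters long.
mkLine :: Int -> String
mkLine i = take lineWidth (drop (i `mod` 26) (cycle ['a' .. 'z']))

main :: IO ()
main = do
  [arg] <- getArgs              -- number of lines to generate
  let n = read arg :: Int
  mapM_ (putStrLn . mkLine) [1 .. n]
```

Redirecting the output to a file and feeding that file to the measured program on standard input would then give one data point per line count.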

Program measured

import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.List (foldl')
import System.Environment (getArgs)

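-- Type of a left fold over a string-like container s with accumulator a.
type TpFold s a = ((a -> Char -> a) -> a -> s -> a)

-- Fold step: increment the counter for each 'a'.
opFunc :: Int -> Char -> Int
opFunc acc c = if c == 'a' then acc + 1 else acc

-- Test 1: count the 'a' characters in the whole input.
testFn1 :: TpFold s Int -> s -> Int
testFn1 fold = fold opFunc 0

-- Test 2: count the lines that contain four or more 'a' characters.
testFn2 :: TpFold s Int -> [s] -> Int
testFn2 fold = sum . map (\s -> if fold opFunc 0 s >= 4 then 1 else 0)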

main :: IO ()
main = do
  [arg] <- getArgs
  -- Options 1-3 run test 1, options 4-6 run test 2.
  case (read arg :: Int) of
    1 -> print . testFn1 foldl'    =<< getContents
    2 -> print . testFn1 B.foldl'  =<< B.getContents
    -- Note: this case uses the non-strict BL.foldl; all others use foldl'.
    3 -> print . testFn1 BL.foldl  =<< BL.getContents
    4 -> print . testFn2 foldl'    . lines    =<< getContents
    5 -> print . testFn2 B.foldl'  . B.lines  =<< B.getContents
    6 -> print . testFn2 BL.foldl' . BL.lines =<< BL.getContents
    _ -> error "unknown option"
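
As a cross-check on the fold-based character count, both ByteString modules also export a count function. The sketch below was not part of the measurements, so it says nothing about relative speed; countStrict and countLazy are names made up for this example.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL

-- Count 'a' characters in a strict ByteString.
countStrict :: B.ByteString -> Int
countStrict = B.count 'a'

-- Count 'a' characters in a lazy ByteString.
-- BL.count returns an Int64, so convert it to Int.
countLazy :: BL.ByteString -> Int
countLazy = fromIntegral . BL.count 'a'

main :: IO ()
main = do
  print (countStrict (B.pack "banana"))  -- 3
  print (countLazy (BL.pack "banana"))   -- 3
```

Both functions should agree with testFn1 on the same input, which makes them a convenient sanity check before timing anything.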

keyword: haskell performance
