bytestring-0.10.8.1: Fast, compact, strict and lazy byte strings with a list interface

Copyright: (c) Don Stewart 2006-2008, (c) Duncan Coutts 2006-2011
License: BSD-style
Maintainer: dons00@gmail.com, duncan@community.haskell.org
Stability: stable
Portability: portable
Safe Haskell: Trustworthy
Language: Haskell98

Data.ByteString.Char8

Description

Manipulate ByteStrings using Char operations. All Chars will be truncated to 8 bits. It can be expected that these functions will run at identical speeds to their Word8 equivalents in Data.ByteString.

More specifically these byte strings are taken to be in the subset of Unicode covered by code points 0-255. This covers Unicode Basic Latin, Latin-1 Supplement and C0+C1 Controls.

This module is intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.

import qualified Data.ByteString.Char8 as C

The Char8 interface to bytestrings provides an instance of IsString for the ByteString type, enabling you to use string literals, and have them implicitly packed to ByteStrings. Use {-# LANGUAGE OverloadedStrings #-} to enable this.
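As a minimal sketch of the points above (the names greeting and truncated are illustrative and not part of the library), the following combines the qualified import with OverloadedStrings, and shows a Char above code point 255 being truncated to its low 8 bits when packed:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as C

-- A string literal is packed implicitly through the IsString instance.
greeting :: C.ByteString
greeting = "hello"

-- The Greek letter lambda is code point 955 (0x3BB); only the low byte
-- 0xBB (187) survives packing, so unpacking yields a different Char.
truncated :: String
truncated = C.unpack (C.pack "\955")
```

Text outside the 0-255 range therefore does not round-trip through pack and unpack; encoding it explicitly (for example as UTF-8) before storing it in a ByteString is the usual alternative.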

Synopsis

The ByteString type

data ByteString #

A space-efficient representation of a Word8 vector, supporting many efficient operations.

A ByteString contains 8-bit bytes, or by using the operations from Data.ByteString.Char8 it can be interpreted as containing 8-bit characters.

Instances

Eq ByteString #
Data ByteString #

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> ByteString -> c ByteString #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c ByteString #

toConstr :: ByteString -> Constr #

dataTypeOf :: ByteString -> DataType #

dataCast1 :: Typeable (* -> *) t => (forall d. Data d => c (t d)) -> Maybe (c ByteString) #

dataCast2 :: Typeable (* -> * -> *) t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c ByteString) #

gmapT :: (forall b. Data b => b -> b) -> ByteString -> ByteString #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> ByteString -> r #

gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> ByteString -> r #

gmapQ :: (forall d. Data d => d -> u) -> ByteString -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> ByteString -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> ByteString -> m ByteString #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> ByteString -> m ByteString #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> ByteString -> m ByteString #

Ord ByteString #
Read ByteString #
Show ByteString #
IsString ByteString #
Semigroup ByteString #
Monoid ByteString #
NFData ByteString #

Methods

rnf :: ByteString -> () #

Introducing and eliminating ByteStrings

empty :: ByteString #

O(1) The empty ByteString

singleton :: Char -> ByteString #

O(1) Convert a Char into a ByteString

pack :: String -> ByteString #

O(n) Convert a String into a ByteString

For applications with large numbers of string literals, pack can be a bottleneck.

unpack :: ByteString -> [Char] #

O(n) Converts a ByteString to a String.

Basic interface

cons :: Char -> ByteString -> ByteString infixr 5 #

O(n) cons is analogous to (:) for lists, but of different complexity, as it requires a memcpy.

snoc :: ByteString -> Char -> ByteString infixl 5 #

O(n) Append a Char to the end of a ByteString. Similar to cons, this function performs a memcpy.

append :: ByteString -> ByteString -> ByteString #

O(n) Append two ByteStrings

head :: ByteString -> Char #

O(1) Extract the first element of a ByteString, which must be non-empty.

uncons :: ByteString -> Maybe (Char, ByteString) #

O(1) Extract the head and tail of a ByteString, returning Nothing if it is empty.

unsnoc :: ByteString -> Maybe (ByteString, Char) #

O(1) Extract the init and last of a ByteString, returning Nothing if it is empty.
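Because head, last, tail and init require a non-empty ByteString, uncons and unsnoc are the total alternatives. A small sketch (firstAndLast is a hypothetical helper, not a library function):

```haskell
import qualified Data.ByteString.Char8 as C

-- First and last characters, or Nothing for an empty ByteString.
-- Both lookups go through the Maybe-returning functions, so no
-- exception can be raised on empty input.
firstAndLast :: C.ByteString -> Maybe (Char, Char)
firstAndLast bs = do
  (h, _) <- C.uncons bs
  (_, l) <- C.unsnoc bs
  pure (h, l)
```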

last :: ByteString -> Char #

O(1) Extract the last element of a ByteString, which must be non-empty.

tail :: ByteString -> ByteString #

O(1) Extract the elements after the head of a ByteString, which must be non-empty. An exception will be thrown in the case of an empty ByteString.

init :: ByteString -> ByteString #

O(1) Return all the elements of a ByteString except the last one. An exception will be thrown in the case of an empty ByteString.

null :: ByteString -> Bool #

O(1) Test whether a ByteString is empty.

length :: ByteString -> Int #

O(1) length returns the length of a ByteString as an Int.

Transforming ByteStrings

map :: (Char -> Char) -> ByteString -> ByteString #

O(n) map f xs is the ByteString obtained by applying f to each element of xs

reverse :: ByteString -> ByteString #

O(n) reverse xs efficiently returns the elements of xs in reverse order.

intersperse :: Char -> ByteString -> ByteString #

O(n) The intersperse function takes a Char and a ByteString and intersperses that Char between the elements of the ByteString. It is analogous to the intersperse function on lists.

intercalate :: ByteString -> [ByteString] -> ByteString #

O(n) The intercalate function takes a ByteString and a list of ByteStrings and concatenates the list after interspersing the first argument between each element of the list.

transpose :: [ByteString] -> [ByteString] #

The transpose function transposes the rows and columns of its ByteString argument.

Reducing ByteStrings (folds)

foldl :: (a -> Char -> a) -> a -> ByteString -> a #

foldl, applied to a binary operator, a starting value (typically the left-identity of the operator), and a ByteString, reduces the ByteString using the binary operator, from left to right.

foldl' :: (a -> Char -> a) -> a -> ByteString -> a #

foldl' is like foldl, but strict in the accumulator.
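A strict left fold is the usual choice for reducing a ByteString to a single value. As an illustrative sketch (digitSum is a made-up name), this sums the decimal digits found in a ByteString and ignores every other character:

```haskell
import qualified Data.ByteString.Char8 as C
import Data.Char (digitToInt, isDigit)

-- Sum of the decimal digits occurring in the input. The strict fold
-- forces the running total at each step instead of building thunks.
digitSum :: C.ByteString -> Int
digitSum = C.foldl' step 0
  where
    step acc c
      | isDigit c = acc + digitToInt c
      | otherwise = acc
```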

foldl1 :: (Char -> Char -> Char) -> ByteString -> Char #

foldl1 is a variant of foldl that has no starting value argument, and thus must be applied to non-empty ByteStrings.

foldl1' :: (Char -> Char -> Char) -> ByteString -> Char #

A strict version of foldl1

foldr :: (Char -> a -> a) -> a -> ByteString -> a #

foldr, applied to a binary operator, a starting value (typically the right-identity of the operator), and a ByteString, reduces the ByteString using the binary operator, from right to left.

foldr' :: (Char -> a -> a) -> a -> ByteString -> a #

foldr' is a strict variant of foldr.

foldr1 :: (Char -> Char -> Char) -> ByteString -> Char #

foldr1 is a variant of foldr that has no starting value argument, and thus must be applied to non-empty ByteStrings

foldr1' :: (Char -> Char -> Char) -> ByteString -> Char #

A strict variant of foldr1

Special folds

concat :: [ByteString] -> ByteString #

O(n) Concatenate a list of ByteStrings.

concatMap :: (Char -> ByteString) -> ByteString -> ByteString #

Map a function over a ByteString and concatenate the results

any :: (Char -> Bool) -> ByteString -> Bool #

Applied to a predicate and a ByteString, any determines if any element of the ByteString satisfies the predicate.

all :: (Char -> Bool) -> ByteString -> Bool #

Applied to a predicate and a ByteString, all determines if all elements of the ByteString satisfy the predicate.

maximum :: ByteString -> Char #

maximum returns the maximum value from a ByteString

minimum :: ByteString -> Char #

minimum returns the minimum value from a ByteString

Building ByteStrings

Scans

scanl :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString #

scanl is similar to foldl, but returns a ByteString of successive reduced values from the left (shown here in list notation):

scanl f z [x1, x2, ...] == [z, z `f` x1, (z `f` x1) `f` x2, ...]

Note that

last (scanl f z xs) == foldl f z xs.

scanl1 :: (Char -> Char -> Char) -> ByteString -> ByteString #

scanl1 is a variant of scanl that has no starting value argument:

scanl1 f [x1, x2, ...] == [x1, x1 `f` x2, ...]

scanr :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString #

scanr is the right-to-left dual of scanl.

scanr1 :: (Char -> Char -> Char) -> ByteString -> ByteString #

scanr1 is a variant of scanr that has no starting value argument.

Accumulating maps

mapAccumL :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString) #

The mapAccumL function behaves like a combination of map and foldl; it applies a function to each element of a ByteString, passing an accumulating parameter from left to right, and returning a final value of this accumulator together with the new ByteString.

mapAccumR :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString) #

The mapAccumR function behaves like a combination of map and foldr; it applies a function to each element of a ByteString, passing an accumulating parameter from right to left, and returning a final value of this accumulator together with the new ByteString.

Generating and unfolding ByteStrings

replicate :: Int -> Char -> ByteString #

O(n) replicate n x is a ByteString of length n with x the value of every element. The following holds:

replicate w c = fst (unfoldrN w (\u -> Just (u,u)) c)

This implementation uses memset(3).

unfoldr :: (a -> Maybe (Char, a)) -> a -> ByteString #

O(n), where n is the length of the result. The unfoldr function is analogous to the list unfoldr. unfoldr builds a ByteString from a seed value. The function takes the seed and returns Nothing if it is done producing the ByteString, or returns Just (a,b), in which case a is the next character in the string and b is the seed value for further production.

Examples:

unfoldr (\x -> if x <= '9' then Just (x, succ x) else Nothing) '0' == "0123456789"

unfoldrN :: Int -> (a -> Maybe (Char, a)) -> a -> (ByteString, Maybe a) #

O(n) Like unfoldr, unfoldrN builds a ByteString from a seed value. However, the length of the result is limited by the first argument to unfoldrN. This function is more efficient than unfoldr when the maximum length of the result is known.

The following equation relates unfoldrN and unfoldr:

fst (unfoldrN n f s) == take n (unfoldr f s)
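Note from the signature that unfoldrN returns a pair, so the ByteString has to be projected out with fst. A short sketch reusing the digit generator from the unfoldr example (nextDigit and digitsUpTo are illustrative names):

```haskell
import qualified Data.ByteString.Char8 as C

-- Step function: emit the current digit and advance, stopping after '9'.
nextDigit :: Char -> Maybe (Char, Char)
nextDigit x = if x <= '9' then Just (x, succ x) else Nothing

-- At most n digits starting from '0'. The first component is the
-- ByteString that was built; the second is the Maybe seed reported
-- back by unfoldrN.
digitsUpTo :: Int -> (C.ByteString, Maybe Char)
digitsUpTo n = C.unfoldrN n nextDigit '0'
```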

Substrings

Breaking strings

take :: Int -> ByteString -> ByteString #

O(1) take n, applied to a ByteString xs, returns the prefix of xs of length n, or xs itself if n > length xs.

drop :: Int -> ByteString -> ByteString #

O(1) drop n xs returns the suffix of xs after the first n elements, or empty if n > length xs.

splitAt :: Int -> ByteString -> (ByteString, ByteString) #

O(1) splitAt n xs is equivalent to (take n xs, drop n xs).

takeWhile :: (Char -> Bool) -> ByteString -> ByteString #

takeWhile, applied to a predicate p and a ByteString xs, returns the longest prefix (possibly empty) of xs of elements that satisfy p.

dropWhile :: (Char -> Bool) -> ByteString -> ByteString #

dropWhile p xs returns the suffix remaining after takeWhile p xs.

span :: (Char -> Bool) -> ByteString -> (ByteString, ByteString) #

span p xs breaks the ByteString into two segments. It is equivalent to (takeWhile p xs, dropWhile p xs)

spanEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString) #

spanEnd behaves like span but from the end of the ByteString. We have

spanEnd (not.isSpace) "x y z" == ("x y ","z")

and

spanEnd (not . isSpace) ps
   ==
let (x,y) = span (not.isSpace) (reverse ps) in (reverse y, reverse x)

break :: (Char -> Bool) -> ByteString -> (ByteString, ByteString) #

break p is equivalent to span (not . p).

breakEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString) #

breakEnd behaves like break but from the end of the ByteString

breakEnd p == spanEnd (not.p)
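A typical use of breakEnd is splitting at the last occurrence of a delimiter. The sketch below (splitAtLastDot is a hypothetical helper) separates a file name at its final '.'; as with spanEnd, the delimiter stays with the first component:

```haskell
import qualified Data.ByteString.Char8 as C

-- Split a file name at its last '.', keeping the dot with the base part.
-- If there is no '.', the first component is empty and the second is
-- the whole input.
splitAtLastDot :: C.ByteString -> (C.ByteString, C.ByteString)
splitAtLastDot = C.breakEnd (== '.')
```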

group :: ByteString -> [ByteString] #

The group function takes a ByteString and returns a list of ByteStrings such that the concatenation of the result is equal to the argument. Moreover, each ByteString in the result contains only equal elements. For example,

group "Mississippi" == ["M","i","ss","i","ss","i","pp","i"]

It is a special case of groupBy, which allows the programmer to supply their own equality test. It is about 40% faster than groupBy (==).

groupBy :: (Char -> Char -> Bool) -> ByteString -> [ByteString] #

The groupBy function is the non-overloaded version of group.

inits :: ByteString -> [ByteString] #

O(n) Return all initial segments of the given ByteString, shortest first.

tails :: ByteString -> [ByteString] #

O(n) Return all final segments of the given ByteString, longest first.

stripPrefix :: ByteString -> ByteString -> Maybe ByteString #

O(n) The stripPrefix function takes two ByteStrings and returns Just the remainder of the second iff the first is its prefix, and otherwise Nothing.

stripSuffix :: ByteString -> ByteString -> Maybe ByteString #

O(n) The stripSuffix function takes two ByteStrings and returns Just the remainder of the second iff the first is its suffix, and otherwise Nothing.

Breaking into many substrings

split :: Char -> ByteString -> [ByteString] #

O(n) Break a ByteString into pieces separated by the Char argument, consuming the delimiter. For example:

split '\n' "a\nb\nd\ne" == ["a","b","d","e"]
split 'a'  "aXaXaXa"    == ["","X","X","X",""]
split 'x'  "x"          == ["",""]

and

intercalate (singleton c) . split c == id
split == splitWith . (==)

As for all splitting functions in this library, this function does not copy the substrings, it just constructs new ByteStrings that are slices of the original.
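The round-trip property can be seen in a small sketch for comma-separated fields (fields and joinFields are illustrative names). Empty fields are preserved, which is what makes the join an exact inverse:

```haskell
import qualified Data.ByteString.Char8 as C

-- Split a record on commas; adjacent commas yield empty fields.
fields :: C.ByteString -> [C.ByteString]
fields = C.split ','

-- Rebuild the record by putting a single comma between the fields.
joinFields :: [C.ByteString] -> C.ByteString
joinFields = C.intercalate (C.singleton ',')
```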

splitWith :: (Char -> Bool) -> ByteString -> [ByteString] #

O(n) Splits a ByteString into components delimited by separators, where the predicate returns True for a separator element. The resulting components do not contain the separators. Two adjacent separators result in an empty component in the output, e.g.

splitWith (=='a') "aabbaca" == ["","","bb","c",""]

Breaking into lines and words

lines :: ByteString -> [ByteString] #

lines breaks a ByteString up into a list of ByteStrings at newline Chars. The resulting strings do not contain newlines.

words :: ByteString -> [ByteString] #

words breaks a ByteString up into a list of words, which were delimited by Chars representing white space.

unlines :: [ByteString] -> ByteString #

unlines is an inverse operation to lines. It joins lines, after appending a terminating newline to each.

unwords :: [ByteString] -> ByteString #

The unwords function is analogous to the unlines function, on words.
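As a rough illustration of lines and words together, here is a wc-style counter (lineAndWordCount is a made-up name, not a library function):

```haskell
import qualified Data.ByteString.Char8 as C

-- Number of lines and number of whitespace-delimited words in the input.
lineAndWordCount :: C.ByteString -> (Int, Int)
lineAndWordCount bs = (length (C.lines bs), length (C.words bs))
```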

Predicates

isPrefixOf :: ByteString -> ByteString -> Bool #

O(n) The isPrefixOf function takes two ByteStrings and returns True if the first is a prefix of the second.

isSuffixOf :: ByteString -> ByteString -> Bool #

O(n) The isSuffixOf function takes two ByteStrings and returns True iff the first is a suffix of the second.

The following holds:

isSuffixOf x y == reverse x `isPrefixOf` reverse y

However, the real implementation uses memcmp to compare the end of the string only, with no reverse required.

isInfixOf :: ByteString -> ByteString -> Bool #

Check whether one string is a substring of another. isInfixOf p s is equivalent to not (null (findSubstrings p s)).

Search for arbitrary substrings

breakSubstring #

Arguments

:: ByteString                  String to search for
-> ByteString                  String to search in
-> (ByteString, ByteString)    Head and tail of string broken at substring

Break a string on a substring, returning a pair of the part of the string prior to the match, and the rest of the string.

The following relationships hold:

break (== c) l == breakSubstring (singleton c) l

and:

findSubstring s l ==
   if null s then Just 0
             else case breakSubstring s l of
                      (x,y) | null y    -> Nothing
                            | otherwise -> Just (length x)

For example, to tokenise a string, dropping delimiters:

tokenise x y = h : if null t then [] else tokenise x (drop (length x) t)
    where (h,t) = breakSubstring x y

To skip to the first occurrence of a string:

snd (breakSubstring x y)

To take the parts of a string before a delimiter:

fst (breakSubstring x y)

Note that calling `breakSubstring x` does some preprocessing work, so you should avoid unnecessarily duplicating breakSubstring calls with the same pattern.
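The tokenise example above can be written as a self-contained sketch that applies breakSubstring to the pattern once and reuses the resulting search function. The guard for an empty pattern is an addition made here, because an empty pattern never consumes input and the recursion would not terminate:

```haskell
import qualified Data.ByteString.Char8 as C

-- Split the input on every occurrence of a multi-character delimiter,
-- dropping the delimiters.
tokenise :: C.ByteString -> C.ByteString -> [C.ByteString]
tokenise pat
  | C.null pat = \s -> [s]          -- assumption: treat "" as "no delimiter"
  | otherwise  = go
  where
    search = C.breakSubstring pat   -- partially applied once, then reused
    go s =
      let (h, t) = search s
      in h : if C.null t then [] else go (C.drop (C.length pat) t)
```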

findSubstring #

Arguments

:: ByteString    String to search for.
-> ByteString    String to search in.
-> Maybe Int

Deprecated: findSubstring is deprecated in favour of breakSubstring.

Get the first index of a substring in another string, or Nothing if the string is not found. findSubstring p s is equivalent to listToMaybe (findSubstrings p s).

findSubstrings #

Arguments

:: ByteString    String to search for.
-> ByteString    String to search in.
-> [Int]

Deprecated: findSubstrings is deprecated in favour of breakSubstring.

Find the indexes of all (possibly overlapping) occurrences of a substring in a string.

Searching ByteStrings

Searching by equality

elem :: Char -> ByteString -> Bool #

O(n) elem is the ByteString membership predicate. This implementation uses memchr(3).

notElem :: Char -> ByteString -> Bool #

O(n) notElem is the inverse of elem

Searching with a predicate

find :: (Char -> Bool) -> ByteString -> Maybe Char #

O(n) The find function takes a predicate and a ByteString, and returns the first element matching the predicate, or Nothing if there is no such element.

filter :: (Char -> Bool) -> ByteString -> ByteString #

O(n) filter, applied to a predicate and a ByteString, returns a ByteString containing those characters that satisfy the predicate.

Indexing ByteStrings

index :: ByteString -> Int -> Char #

O(1) ByteString index (subscript) operator, starting from 0.

elemIndex :: Char -> ByteString -> Maybe Int #

O(n) The elemIndex function returns the index of the first element in the given ByteString which is equal (by memchr) to the query element, or Nothing if there is no such element.

elemIndices :: Char -> ByteString -> [Int] #

O(n) The elemIndices function extends elemIndex, by returning the indices of all elements equal to the query element, in ascending order.

elemIndexEnd :: Char -> ByteString -> Maybe Int #

O(n) The elemIndexEnd function returns the last index of the element in the given ByteString which is equal to the query element, or Nothing if there is no such element. The following holds:

elemIndexEnd c xs ==
(-) (length xs - 1) `fmap` elemIndex c (reverse xs)

findIndex :: (Char -> Bool) -> ByteString -> Maybe Int #

The findIndex function takes a predicate and a ByteString and returns the index of the first element in the ByteString satisfying the predicate.

findIndices :: (Char -> Bool) -> ByteString -> [Int] #

The findIndices function extends findIndex, by returning the indices of all elements satisfying the predicate, in ascending order.

count :: Char -> ByteString -> Int #

count returns the number of times its argument appears in the ByteString.

count = length . elemIndices

Also, for input in which every line ends with a newline,

count '\n' == length . lines

but count is more efficient than taking the length of the intermediate list.

Zipping and unzipping ByteStrings

zip :: ByteString -> ByteString -> [(Char, Char)] #

O(n) zip takes two ByteStrings and returns a list of corresponding pairs of Chars. If one input ByteString is short, excess elements of the longer ByteString are discarded. This is equivalent to a pair of unpack operations, and so space usage may be large for multi-megabyte ByteStrings

zipWith :: (Char -> Char -> a) -> ByteString -> ByteString -> [a] #

zipWith generalises zip by zipping with the function given as the first argument, instead of a tupling function. For example, zipWith (==) applied to two ByteStrings produces the list of Bools recording where corresponding elements are equal.

unzip :: [(Char, Char)] -> (ByteString, ByteString) #

unzip transforms a list of pairs of Chars into a pair of ByteStrings. Note that this performs two pack operations.

Ordered ByteStrings

sort :: ByteString -> ByteString #

O(n) Sort a ByteString efficiently, using counting sort.

Reading from ByteStrings

readInt :: ByteString -> Maybe (Int, ByteString) #

readInt reads an Int from the beginning of the ByteString. If there is no integer at the beginning of the string, it returns Nothing; otherwise it returns Just the Int read and the rest of the string.

readInteger :: ByteString -> Maybe (Integer, ByteString) #

readInteger reads an Integer from the beginning of the ByteString. If there is no integer at the beginning of the string, it returns Nothing; otherwise it returns Just the Integer read and the rest of the string.
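Because both readers return the unconsumed remainder, a caller can decide whether trailing input is acceptable. The following sketch (parseInts is a hypothetical helper) parses whitespace-separated integers and rejects any word with leftover characters:

```haskell
import qualified Data.ByteString.Char8 as C

-- Parse every whitespace-separated word as an Int. The result is Nothing
-- if any word is not an integer or has trailing characters after it.
parseInts :: C.ByteString -> Maybe [Int]
parseInts = traverse parseWord . C.words
  where
    parseWord w = case C.readInt w of
      Just (n, rest) | C.null rest -> Just n
      _                            -> Nothing
```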

Low level CString conversions

Copying ByteStrings

copy :: ByteString -> ByteString #

O(n) Make a copy of the ByteString with its own storage. This is mainly useful to allow the rest of the data pointed to by the ByteString to be garbage collected, for example if a large string has been read in, and only a small part of it is needed in the rest of the program.
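A sketch of that use case: take and the other slicing functions share storage with the original, so wrapping the slice in copy is what lets the large buffer be collected. The name header is illustrative only:

```haskell
import qualified Data.ByteString.Char8 as C

-- Keep only the first n bytes of a (possibly very large) ByteString.
-- Without copy, the returned slice would keep the whole original
-- buffer alive for as long as the slice is reachable.
header :: Int -> C.ByteString -> C.ByteString
header n = C.copy . C.take n
```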

Packing CStrings and pointers

packCString :: CString -> IO ByteString #

O(n). Construct a new ByteString from a CString. The resulting ByteString is an immutable copy of the original CString, and is managed on the Haskell heap. The original CString must be null terminated.

packCStringLen :: CStringLen -> IO ByteString #

O(n). Construct a new ByteString from a CStringLen. The resulting ByteString is an immutable copy of the original CStringLen. The ByteString is a normal Haskell value and will be managed on the Haskell heap.

Using ByteStrings as CStrings

useAsCString :: ByteString -> (CString -> IO a) -> IO a #

O(n) construction. Use a ByteString with a function requiring a null-terminated CString. The CString is a copy and will be freed automatically.

useAsCStringLen :: ByteString -> (CStringLen -> IO a) -> IO a #

O(n) construction. Use a ByteString with a function requiring a CStringLen. As for useAsCString, this function makes a copy of the original ByteString.
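As a sketch of passing a ByteString to C, the following imports the C standard library's strlen through the FFI and calls it via useAsCString. The names c_strlen and cLength are chosen here for illustration:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString.Char8 as C
import Foreign.C.String (CString)
import Foreign.C.Types (CSize(..))

-- The C standard library's strlen, imported through the FFI.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Length as seen by C: the number of bytes before the first NUL.
-- The CString is a temporary copy valid only inside the callback.
cLength :: C.ByteString -> IO Int
cLength bs = C.useAsCString bs (fmap fromIntegral . c_strlen)
```

A ByteString may contain NUL bytes, in which case a C function that expects a null-terminated string sees only the part before the first one; useAsCStringLen passes the length explicitly and avoids that ambiguity.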

I/O with ByteStrings

ByteString I/O uses binary mode, without any character decoding or newline conversion. The fact that it does not respect the Handle newline mode is considered a flaw and may be changed in a future version.

Standard input and output

getLine :: IO ByteString #

Read a line from stdin.

getContents :: IO ByteString #

Read stdin strictly. Equivalent to hGetContents stdin. The Handle is closed after the contents have been read.

putStr :: ByteString -> IO () #

Write a ByteString to stdout

putStrLn :: ByteString -> IO () #

Write a ByteString to stdout, appending a newline byte

interact :: (ByteString -> ByteString) -> IO () #

The interact function takes a function of type ByteString -> ByteString as its argument. The entire input from the standard input device is passed to this function as its argument, and the resulting string is output on the standard output device.
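A minimal sketch of a filter built on interact (shout and shoutStdin are illustrative names). Keeping the pure transformation separate from the I/O action makes it easy to test on its own:

```haskell
import qualified Data.ByteString.Char8 as C
import Data.Char (toUpper)

-- Pure transformation: upper-case every character.
shout :: C.ByteString -> C.ByteString
shout = C.map toUpper

-- Read all of stdin, transform it, and write the result to stdout.
shoutStdin :: IO ()
shoutStdin = C.interact shout
```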

Files

readFile :: FilePath -> IO ByteString #

Read an entire file strictly into a ByteString. This is far more efficient than reading the characters into a String and then using pack. It also may be more efficient than opening the file and reading it using hGet.

writeFile :: FilePath -> ByteString -> IO () #

Write a ByteString to a file.

appendFile :: FilePath -> ByteString -> IO () #

Append a ByteString to a file.

I/O with Handles

hGetLine :: Handle -> IO ByteString #

Read a line from a handle

hGetContents :: Handle -> IO ByteString #

Read a handle's entire contents strictly into a ByteString.

This function reads chunks at a time, increasing the chunk size on each read. The final string is then realloced to the appropriate size. For files > half of available memory, this may lead to memory exhaustion. Consider using readFile in this case.

The Handle is closed once the contents have been read, or if an exception is thrown.

hGet :: Handle -> Int -> IO ByteString #

Read a ByteString directly from the specified Handle. This is far more efficient than reading the characters into a String and then using pack. First argument is the Handle to read from, and the second is the number of bytes to read. It returns the bytes read, up to the requested number, or empty if EOF has been reached.

hGet is implemented in terms of hGetBuf.

If the handle is a pipe or socket, and the writing end is closed, hGet will behave as if EOF was reached.
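Since hGet returns empty at EOF, a Handle can be consumed in bounded memory with a simple loop. The sketch below counts the bytes in a file; countBytes, countFileBytes and the 4096-byte chunk size are choices made for this example:

```haskell
import qualified Data.ByteString.Char8 as C
import System.IO (Handle, IOMode(ReadMode), withBinaryFile)

-- Read fixed-size chunks until hGet returns empty (EOF), summing
-- their lengths. Only one chunk is held in memory at a time.
countBytes :: Handle -> IO Int
countBytes h = go 0
  where
    go n = do
      chunk <- C.hGet h 4096        -- 4096 is an arbitrary chunk size
      if C.null chunk
        then pure n
        else let n' = n + C.length chunk in n' `seq` go n'

-- Open the file in binary mode and count its bytes.
countFileBytes :: FilePath -> IO Int
countFileBytes path = withBinaryFile path ReadMode countBytes
```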

hGetSome :: Handle -> Int -> IO ByteString #

Like hGet, except that a shorter ByteString may be returned if there are not enough bytes immediately available to satisfy the whole request. hGetSome only blocks if there is no data available, and EOF has not yet been reached.

hGetNonBlocking :: Handle -> Int -> IO ByteString #

hGetNonBlocking is similar to hGet, except that it will never block waiting for data to become available, instead it returns only whatever data is available. If there is no data available to be read, hGetNonBlocking returns empty.

Note: on Windows and with Haskell implementations other than GHC, this function does not work correctly; it behaves identically to hGet.

hPut :: Handle -> ByteString -> IO () #

Outputs a ByteString to the specified Handle.

hPutNonBlocking :: Handle -> ByteString -> IO ByteString #

Similar to hPut except that it will never block. Instead it returns any tail that did not get written. This tail may be empty in the case that the whole string was written, or the whole original string if nothing was written. Partial writes are also possible.

Note: on Windows and with Haskell implementations other than GHC, this function does not work correctly; it behaves identically to hPut.

hPutStr :: Handle -> ByteString -> IO () #

A synonym for hPut, for compatibility

hPutStrLn :: Handle -> ByteString -> IO () #

Write a ByteString to a handle, appending a newline byte