Created: 2023-01-11 10:11
Persistent data structures vs Ephemeral data structures
Here’s a quick overview of some of the most common data structures in Haskell.
List
is O(n) for most operations. Efficient access to the begging. Good for stacks since it’s implemented as a linked list.- Binary trees (used for
Map
/Set
in Haskell) are O(log n) for most operations. Seq
(sequence) provides efficient access to both ends, is good for queues. Uses Finger trees for the implementation.Array
(Vector
) provides random access with O(1) to the elements. C like sequences of memory. In Haskell coping of such data structures is quite expensive, since we want to avoid mutation so we cannot share memory on updates. Arrays are useful when often lookups are required on an stable list/table.Data.Vector.Unboxed
=> the elements must be members of theUnboxed
type class, which can be implemented for types that can be unboxed. This vector is equivalent to a C array, just a chunk of memory with the actual elements in there, instead of pointers to the elements as the regular Haskell Vector.String
=> bad,List
of characters. Eg. forcing the string to WHNF provides almost no benefit since the WHNF of a list only fully evaluates the head (ie. a single character).Text
=> UTF-8 encoded text, using a boxed vector. Much better performance than strings.ByteString
=> unencoded sequence of bytes, should not be used for text, only for binary formats.- Builders (Text/ByteStrings) => much better performance than manually building big chunks of text (like HTML generation), by allocating a big buffer instead of a bunch of small ones.