Thanks to the FSharpx team I had the opportunity to add double-ended queues to FSharpx.Core. Think of double-ended queues, or deques (Okasaki’s spelling, which I like), as lists that work in both directions. They have mirror-image values to the familiar head and tail, namely last and init, and a value named snoc (cons spelled backwards) that performs the mirror-image operation of cons. Imagine the possibilities of list when you can also add and remove elements at the end. For instance, use pattern matching on head and last:
match q with
| Cons(hd, Snoc(mid, lst)) -> hd, mid, lst //first, middle, and last
| _ -> ...
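Fleshed out a little, that match can be wrapped in a function. This is a minimal sketch; it assumes FSharpx’s Cons and Snoc active patterns are in scope (the exact namespace to open depends on your FSharpx version):

```fsharp
// Sketch: deconstruct both ends of a deque in one pattern.
// Assumes the Cons/Snoc active patterns from FSharpx are in scope;
// the open below is illustrative and may differ by FSharpx version.
open FSharpx

let firstMiddleLast q =
    match q with
    | Cons(hd, Snoc(mid, lst)) -> Some(hd, mid, lst)  // first, middle deque, last
    | _ -> None                                       // fewer than two elements
```

On a deque holding 1 through 5 this would yield the first element 1, a middle deque of 2 through 4, and the last element 5.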
Which deque is right for you?
The four deques available in FSharpx.Core implement the following signature:
inherit System.Collections.IEnumerable
inherit System.Collections.Generic.IEnumerable<'a>
Cons : 'a -> IDeque<'a>
Head : unit -> 'a
Init : unit -> IDeque<'a>
IsEmpty : unit -> bool
Last : unit -> 'a
Length : unit -> int
Snoc : 'a -> IDeque<'a>
Tail : unit -> IDeque<'a>
Uncons : unit -> 'a * IDeque<'a>
Unsnoc : unit -> IDeque<'a> * 'a
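Here is a quick sketch of that surface in use. The member names follow the signature above; Deque.empty is assumed as the starting point:

```fsharp
// Sketch: grow and shrink a deque at both ends.
// Member names follow the IDeque signature; Deque.empty is assumed.
let q = ((Deque.empty.Cons 1).Snoc 2).Snoc 3   // conceptually [1; 2; 3]

let first = q.Head()          // 1
let last  = q.Last()          // 3
let mid   = q.Tail().Init()   // drop the first and last elements

let hd, rest   = q.Uncons()   // 1 and the deque [2; 3]
let front, lst = q.Unsnoc()   // the deque [1; 2] and 3
```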
They also all have a module-level value, OfCatLists, which concatenates two lists to create a deque. Internally a deque is implemented as two lists (or lazy lists), with the second list reversed. That’s basically all there is to allowing additions and removals at both ends. (For an in-depth discussion of double-ended queues see either Okasaki’s thesis or his book.) This makes the OfCatLists operation a very fast way to concatenate two lists. Across all four deque implementations the operation is competitive with the FSharpx data structure DList (difference list), which implements an append value that is truly O(1). OfCatLists is not truly O(1), but two of the deque structures, Deque and BatchedDeque, actually out-perform DList right through a scale of concatenating 10,000-element lists, with DList only barely out-performing them at 100,000. (To be sure, we are talking time scales of less than 0.3 milliseconds on a 64-bit system.)
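A sketch of OfCatLists next to DList append, for comparison (names as described above; DList.ofSeq is assumed here for constructing the difference lists):

```fsharp
// Sketch: concatenating two lists into a deque in one cheap step.
// The second list is stored reversed internally, which is exactly
// the representation OfCatLists needs.
let front = [1; 2; 3]
let back  = [4; 5; 6]

let q = BatchedDeque.OfCatLists front back   // behaves like [1 .. 6]

// For comparison, DList's append is truly O(1)
// (DList.ofSeq is assumed for construction):
// let d = DList.append (DList.ofSeq front) (DList.ofSeq back)
```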
Two deques, BankersDeque and RealTimeDeque, implement additional values:
Lookup : int -> 'a
Remove : int -> BankersDeque<'a>
TryRemove : int -> BankersDeque<'a> option
Rev : unit -> BankersDeque<'a>
Update : int -> 'a -> BankersDeque<'a>
TryUpdate : int -> 'a -> BankersDeque<'a> option
Rev is always O(1), a nice feature of deques.
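A sketch of the index-enabled values and Rev on a BankersDeque (construction via OfCatLists as above; the non-Try forms are assumed to raise on a bad index):

```fsharp
// Sketch: index-enabled operations on BankersDeque.
// Member names follow the signature above.
let q = BankersDeque.OfCatLists [10; 20] [30; 40]   // [10; 20; 30; 40]

let x  = q.Lookup 2        // element at index 2: 30
let q2 = q.Update 2 99     // [10; 20; 99; 40]
let q3 = q.Remove 0        // [20; 30; 40]
let r  = q.Rev()           // always O(1): [40; 30; 20; 10]

// The Try variants return an option instead of raising on a bad index:
match q.TryRemove 42 with
| Some smaller -> printfn "removed, %d elements left" (smaller.Length())
| None         -> printfn "index out of range"
```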
Although nowhere near as adept at the update operation as the AltBinaryRandomAccessList data structure (which is specifically designed for update), deques exhibit good across-the-board performance in the index-enabled operations Lookup, Remove, and Update. They have a theoretical worst-case time complexity of ~O(n/2) on index operations because the elements are partitioned into two lists.
But watch out! The actual time complexity on index operations is O(d), where “d” is the displacement into the internal list containing the indexed element, and the relative sizes of the internal lists can vary. BankersDeque and RealTimeDeque require a “front-back stream ratio constant” in their construction. I’ve implemented a default value of 2 where this is required, but you can also specify a higher value. Essentially it ensures neither list is ever larger than the constant times the size of the other. BatchedDeque does nothing with the relative sizes until you request a Tail or Init; then it brings the lists into a relation where neither is more than twice the size of the other. And finally, Deque is a naive implementation that does not regulate the relative sizes at all.
They all implement IEnumerable and so have all the Seq module values available, but be sure to always use the native Lookup and not Seq.nth. This is one of those cases where a straight-up comparison of time complexities (~O(n/4) versus O(n) over many random iterations) does not tell the whole performance story. Actual Seq.nth performance is much worse than its time complexity would suggest.
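To make the point concrete, a sketch (BankersDeque.OfCatLists as above; Seq.nth was the idiomatic sequence-indexing function at the time, later renamed Seq.item):

```fsharp
// Sketch: prefer the native Lookup over Seq.nth for random access.
let q = BankersDeque.OfCatLists [1 .. 500] [501 .. 1000]

let viaLookup = q.Lookup 750   // walks at most one internal list
let viaSeq    = Seq.nth 750 q  // enumerates 751 elements; much slower in practice
```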
So, which deque is right for your purposes? As is always the case with complexity that makes reasoning about performance hard, do your own profiling!