List as a Semantic Element
The singly-linked list is the fundamental data structure of functional programming. Suspending concern with how it is implemented (forward-linked elements) and dealing with it strictly as an abstract construct of its host language, List is an extension of the programming language, which in turn brings semantic structure to data. Growing and shrinking linear data representations is so powerful List alone suffices for solving a vast number of problems.
Eventually we do need to concern ourselves with implementation. There is a class of purely functional List-like data structures that extend List, either substituting more efficient versions of existing functions or supplementing additional functions for working with linear data representations.
The supplemental List-like data structures derive from familiar structures in imperative programming, but once constructed as purely functional (immutable) data structures, thinking of them as algebraic mutations of List reveals semantic cohesion.
List’s core functionality consists of three functions. (These are the names used in F#, and for clarity I will adhere mostly to the naming standard favored by Okasaki for extension functions, rather than the more familiar function names from imperative programming.)
cons
— returns List with new element added to left
head
— returns left-most element
tail
— returns List excluding left-most element
F# List also provides non-core functionality (again, disregarding implementation you could do without these).
rev
— returns reversed List
append
— returns List of first List elements followed by second List elements
length
— returns count of elements in the List
Extending Serial Representation
The extensions we will consider for our List-like vocabulary:
conj
— returns List with new element added to right
initial
— returns List excluding right-most element
last
— returns right-most element
iLookup
— returns element at index
iUpdate
— returns List with element at index updated
It is perhaps surprising that List is so useful given that only the most recently added element is accessible! Significantly, you can readily see the extension functions either invert one of the core functions or provide short-cut access to an interior element. RandomAccessList provides indexed lookup and update.
RandomAccessList = List + iLookup + iUpdate
LazyList is one List-like data structure that most probably comes from a functional programming heritage. It is the only List-like structure reviewed here incorporating functionality the core List functions alone cannot replicate, namely lazy evaluation. Significantly in F# it already uses the List-like functional naming standard.
LazyList = ListLazy
Heap is frequently overlooked as being a List-like data structure. Even Okasaki reverted to using function names from imperative programming that obscure the close relationship to List. Heap is always sorted and provides a more efficient, O(log n), append
(merge).
Heap = List + sorted + append
Difference List facilitates adding elements to the other end of the List, whether singly or another entire Difference List in O(1) time.
DList = List + conj + append
After introducing adding elements to both ends, now consider the fully symmetrical linear representation provided by a Double-ended Queue. Deque is the union of its tail
and tail’s compliment, initial
. This symmetry finds additional expression in an O(1) rev
.
Deque = List + conj + last + initial + rev = initial U tail
Queue, a workhorse in imperative programming, is simply a List that can only add elements on the right, but still extracts them on the left with head
.
Queue = List - cons + conj
While the algebra deriving Vector from List is the longest, it is simply an inverted RandomAccessList.
Vector = List - cons - head - tail + conj + last + initial + iLookup + iUpdate
Vector = RandomAccessList-1
Why the Cacophony of Names?
Perhaps perfect computer language implementations are an unrealizable ideal, but consistent nomenclature reveals semantics otherwise easily overlooked. Data structures from imperative programming take on a new character when made purely functional, and it is appropriate for them to give up part of their former identity (function names) in the functional realm.
All of the these F# Data Structures and more are available in the FSharpx.Collections and FSharpx.Datastructures namespaces here or from Nuget in FSharpx.Core.
UPDATE: Updated to naming standard adopted in FSharpx.Collections.
Nice one. Okasaki’s “Purely Functional Data Structures” is on my new years reading list.
Pingback: F# Weekly #1, 2013 « Sergey Tihon's Blog