F# Data Structure Libraries in FSharpx

Last week I blogged about supplementing the FSharpx library with a collection of purely function sequential data structures. This week the news is a restructuring of FSharpx providing for the first time a separate project for experimental data structures. I really like this project set-up as opposed to an experimental branch for accessing experimental structures at the same time as the structures in Core.Collections.

Submit your Data Structures to FSharpx

So let the experimenting begin! The experimental project is actually a copy of the old FSharpx.DataStructures namespace. That namespace has been marked “obsolete” going forward, so if you are using it, switch to FSharpx.Collections.Experimental, which is a separate NuGet package from FSharpx.Core.

Experimental is intended primarily for purely functional (immutable) structures, but feel free to add mutable structures of interest. Good candidates include F# ports of structures available in other functional languages and implementations from academic papers. The main criteria is the structure will be useful and interesting to others. Include some documenting comments and pointers to sources, if appropriate.

Do include tests with your submissions. Not only is that all the QA there is, they also serve as functionality samples.

Staging for the FSharpx.Core library

Collections.Experimental also serves as the staging area for promotion to the FSharpx.Collections and FSharpx.Collections.Mutable namespaces. There is higher acceptance criteria for pull requests here. See the core collections readme. Structures should generally have the look and feel of belonging to a professional library.

There are already several candidates in experimental (in my opinion) worthy of polishing for core collections. I have not taken the time to look closely at most of these, so please excuse me if some of my comments are “uninformed”:

BKTree

In addition to reviewing for core standards, this structure would benefit from review by someone experienced with practical application of distance measures, perhaps adding more distance measures. A quick search of recent F# blogs finds

Revisiting String Distances

F# on Algorithms – Wagner–Fischer algorithm

Edit Distance (EDIST) with F#

Levenshtein Distance and the Triangle Inequality

Metrics on Words

BottomUpMergeSort

Maybe. Needs polishing.

Zippers

There are a couple of zippers in experimental. I’ll leave it to someone who uses these more to determine what it takes to promote them (or not) to core.

IntMap

Needs review for function naming. (Should match naming standard set by MS.FSharp.Collections.Map where appropriate.) Remove some unnecessary module level functions. (E.g. ofArray, which only composes List.ofArray. Better not to hide this from users.) And hide behind an fsi file or entirely remove the “unsafe” functions.

RoseTree

Looks useful. Needs polishing.

TimeSeries

I don’t know about this one.

An exercise for the reader

I recently uploaded EagerRoseTree to experimental. (It’s “eager” because the first RoseTree in experimental implements LazyList for the children forest.) EagerRoseTree is unfinished in many ways.

For one thing it exposes an unsafe internal structure. A user could implement a dysfunctional tree. The children forest is a Vector. This allows for indexed lookups in the EagerRoseTree right out of the box, but updates to forest nodes will only return a new vector, not propagate up to returning a new tree.

Here’s a couple more possibilities:

1) Adapt a zipper to navigate in the tree.

2) Adapt the recently released FsCoreSerializer to serialize this, and other structures.

Upcoming talk on F# Data Structures

If you are in the area, come to my data structures talk at SF F# User Group Monday night, February 4. It’s a great and knowledgeable user group.

3 thoughts on “F# Data Structure Libraries in FSharpx

  1. If you want to build an eager rose tree, I’d just replace the LazyList with a regular list. In fact, the first draft I wrote used regular lists ( https://github.com/fsharp/fsharpx/commit/ed5966bd49e8ef47889945fb55a3736b6d718aec ), which has the nice property of making equality in RoseTree automatic. Ultimately I decided to use a LazyList to model the Haskell tree more closely, allowing for infinite trees.
    You can change LazyList to list just by changing these two lines: https://github.com/fsharp/fsharpx/blob/master/src/FSharpx.Core/DataStructures/RoseTree.fs#L28-L30
    IMHO random access into a tree is a rare operation, it’s not worth using a Vector here.

    • I agree with what you say, but the use I have for this particular RoseTree requires indexed lookup and update. Perhaps I should call it “IndexedRoseTree”. Also my application requires a specialized equality, which I will probably not update to the library.

  2. Pingback: F# Weekly #5, 2013 « Sergey Tihon's Blog

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>