Twitter and the F# Programming Community

Twitter has been an instrumental medium in building the F# programming community, and like all information media, the client plays a crucial role. IMO the state of Twitter client technology is appalling compared to what it could be. I’m going to use MetroTwit as a straw man to describe how I would like to use a Twitter client, because I think MetroTwit’s UI paradigm starts off in the right direction.

Columns seem so natural, but…

…but MetroTwit has its own idea of what those columns should contain. How about giving me the tools to filter and arrange the columns the way I want? Nested include/exclude filter lists are a simple and proven technology (there is probably an OO design pattern for this; see the sketch after the filter lists below). Here’s how I would use it.

My left-most column

Include one or more Twitter lists.
Include/exclude individuals, whether or not they are in lists.
Include/exclude tweets containing a link (filter available for lists and individuals).
Include/exclude tweets containing a hash tag (filter available for lists and individuals).
Include/exclude tweets containing a word or phrase (filter available for lists and individuals).

I might even find a use for filters like

Include/exclude retweets.
Include/exclude retweets by individual.
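
To make the nested include/exclude idea concrete, here is a minimal F# sketch of filters as one composable type. Everything in it (Filter, myLeftColumn, the list and user names) is my own invention for illustration, not anything MetroTwit offers.

type Filter =
    | FromList of string        // tweets from a named Twitter list
    | FromUser of string        // tweets from an individual
    | HasLink                   // tweet contains a link
    | HasHashTag of string      // tweet contains a given hash tag
    | ContainsText of string    // tweet contains a word or phrase
    | IsRetweet
    | Not of Filter             // an exclude is just Not of an include
    | All of Filter list        // nesting: every filter must pass
    | Any of Filter list        // nesting: at least one must pass

// my left-most column expressed as one nested filter
let myLeftColumn =
    All [ Any [ FromList "fsharp"; FromList "dotnet" ]
          Not (All [ FromUser "somenoisyaccount"; IsRetweet ]) ]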

My second and third columns

I like having a column of tweets that mention me and one of direct tweets to me. Optional retweet aggregation here would be nice too.

Fourth column

This becomes the bucket for all the tweets that don’t make it to my first three columns. Even so, the option to exclude from here should be available.

Option for Retweet aggregation

I would like the option for retweets to appear with aggregated information about who has retweeted them, but when the same tweet is retweeted again, previous occurrences should drop out of my column.

Notifications

Personally, I only want desktop notifications for tweets that pass my filters for the first column, tweets mentioning me or direct to me, and I only want to see a notification for the first retweet. Notifications should have all the same filtering capabilities as columns.

Mobile

Of course I want the same client on my Android and iOS devices, inheriting my desktop configuration, but allowing me to tweak that configuration by device. I want to see all my columns on my tablet, but one column at a time on my phone.

Drag and Drop

Dragging a tweet from one column to another should pop up the filter dialog, prepopulated with the program’s best guess as to the filter change I intend to make. Easy, intuitive filter changes should be a key design principle.

Where are my favorites, MetroTwit?

Oh I see, if I add a column I can make it a column of my favorites…but now I feel like I’m overloading my UI with too many columns, and I only want to reference my favorites occasionally. A “pop-up” style interface to favorites would suit me better. Sorting and filtering are a plus.

ProTip:

Use Twitter to build your personal professional brand. Keep your tweets more or less within a professional domain and keep them informative. (Many consider presence of a link a key indicator of information content.) Off-topic opinions and rants diminish your professional brand. Open another Twitter account for that brand.

FSharpx Data Structures Discussion

In the ten weeks since I began submitting pull requests contributing Functional Data Structures to the FSharpx open source project I’ve learned a lot, not just expanding my knowledge of the F# language, .NET Framework, and Functional Data Structures, but Git, GitHub, and perhaps most interesting of all, contributing to an open source project.

Steffen Forkmann and Mauricio Scheffer, two of the core FSharpx team, deserve special call-outs for helping me come up to speed. Steffen ported some of my work before I began doing Git pull requests, and they’ve both been very patient with my Git blunders and other mistakes. It took some time, but I’m starting to understand the somewhat anarchic governance structure of open source. With my background leading medium-sized production projects, I try to develop a big picture of the entire stack. I’m learning that OS contributors tend to be much more focused on immediate issues, and I think this is for good reason, as commit history is crucial to the health and long-term benefit of the project. It works like a self-organizing system, with big-picture benefits as emergent phenomena.

I still can’t resist formulating a bigger picture framework for the FSharpx Data Structures sub-project. What follows is purely my opinion, and I hope FSharpx contributors and others will correct me where I err, and encourage me where I am right with their feedback.

FSharpx.DataStructures Namespace is Purely Functional

This namespace is for Purely Functional Data Structures. There is already one exception: TransientVector, which shares the IVector interface with the purely functional PersistentVector.

Mutable structures should have their own namespace, which currently doesn’t appear to exist. There is a “Collections” folder, but its components are in the FSharpx namespace. It includes extension modules to functional and mutable structures as well as one functional structure, NonEmptyList.

Structure Type and Module

The structure’s type should exist directly in the DataStructures namespace, along with a module of the same name as the type; the type should not be nested under the module. There are a few structures in the latter situation with no non-breaking remedy, except perhaps duplicating the entire type at the namespace level. The type should have all the members that operate on an instantiated type. The module should have let bindings to all the type members, as well as let bindings for all other “utility” functions on the type, like ofSeq and empty.
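
A sketch of that layout with a hypothetical MyQueue (the ModuleSuffix attribute is what allows the module to share the type’s name):

namespace FSharpx.DataStructures

type MyQueue<'a> (front : 'a list, rBack : 'a list) =
    member this.IsEmpty = front.IsEmpty && rBack.IsEmpty
    member this.Conj x = MyQueue<'a> (front, x :: rBack)

[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module MyQueue =
    // let bindings mirroring the type's members...
    let isEmpty (q : MyQueue<'a>) = q.IsEmpty
    let conj x (q : MyQueue<'a>) = q.Conj x
    // ...plus the "utility" functions on the type
    let empty<'a> : MyQueue<'a> = MyQueue<'a> ([], [])
    let ofSeq (xs : seq<'a>) = MyQueue<'a> (List.ofSeq xs, [])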

Understand the difference between a property and a method

See the Property Usage Guidelines. One rule of thumb: if it is not O(1), it is not a property. (The inverse does not necessarily hold.)
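
For instance, with a hypothetical list-backed Stack:

type Stack<'a> (items : 'a list) =
    // O(1): fine as a property
    member this.IsEmpty = List.isEmpty items
    // O(n): walks the whole list, so expose a method rather than a property
    member this.Length () = List.length items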

Pattern Discriminator

Include any relevant pattern discriminators in the module. Hint: if the type implements TryUncons or TryUnsnoc you can implement pattern discriminators.
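
A sketch of that hint, using a plain list as a stand-in for a structure whose TryUncons returns ('a * structure) option:

// stand-in for the structure's TryUncons member
let tryUncons (q : 'a list) =
    match q with
    | h :: t -> Some (h, t)
    | [] -> None

// the pattern discriminators fall out directly
let (|Cons|Nil|) q =
    match tryUncons q with
    | Some (h, t) -> Cons (h, t)
    | None -> Nil

let describe q =
    match q with
    | Cons (h, _) -> sprintf "head is %A" h
    | Nil -> "empty"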

Try to Avoid Exceptions

If a structure’s method can throw an exception, include a TryMethodName method that returns option.
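
For example, a head that throws pairs with a tryHead that does not (again sketched over a plain list):

/// Throws on an empty structure.
let head (s : 'a list) =
    match s with
    | h :: _ -> h
    | [] -> invalidOp "structure is empty"

/// Returns None on an empty structure instead of throwing.
let tryHead (s : 'a list) =
    match s with
    | h :: _ -> Some h
    | [] -> None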

Necessary Members

Every data structure should have a length member (even if length is O(n)) and an IsEmpty member in the type and ofSeq in the module.

Additional Members

Of course implement all the members central to the structure, but there can be some nice O(1) or O(log n) values just waiting to be implemented that perhaps were not directly addressed in the research paper because they were not interesting. A couple of examples: the internal structure of the type allows O(1) rev or ofList.

Necessary Interface

Implement IEnumerable in the type. It should sequence elements in the inherent order of the data structure. For instance, if a structure implements the head/tail paradigm, IEnumerable should sequence elements in the order of unconsing the structure down to empty. Almost every structure has some logical, if not physical, order to its elements when flattened. (I can’t think of an exception right now.)

Implementing ofSeq and IEnumerable makes practically every Data Structure composable into every other, a very useful feature!
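
A sketch of the pattern for a cons/tail structure (ConsList is hypothetical; Seq.unfold does the unconsing):

open System.Collections.Generic

type ConsList<'a> (items : 'a list) =
    member this.TryUncons =
        match items with
        | h :: t -> Some (h, ConsList<'a> t)
        | [] -> None
    interface IEnumerable<'a> with
        member this.GetEnumerator () =
            // uncons down to empty, yielding elements in the structure's logical order
            (Seq.unfold (fun (s : ConsList<'a>) -> s.TryUncons) this).GetEnumerator ()
    interface System.Collections.IEnumerable with
        member this.GetEnumerator () =
            (this :> IEnumerable<'a>).GetEnumerator () :> System.Collections.IEnumerator

// composability in action: any IEnumerable flows into any ofSeq
let roundTrip = ConsList [1; 2; 3] |> List.ofSeq   // [1; 2; 3]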

Type Interface

In theory data structures implementing the same paradigm, e.g. queue, should implement an interface that both restricts upcasting the interface to the original type and is a usable interface. Meeting both the restrictive and usable requirements has proven difficult. I haven’t given up on this project yet, just deferred it.

Hide unsafe underlying structures

When I first went to work on data structures in the FSharpx project, I noticed some of the structures exposing unsafe (in the sense that arbitrary use could lead to a dysfunctional structure) discriminated unions. I didn’t fully appreciate at the time how an OS project like FSharpx incrementally works toward ideal practices, and considered this an intentional practice, which I emulated.

As a result of Mauricio’s initiative to test for monoid law adherence, this [mal]practice has been exposed here and here.

This is a fair amount of clean-up work. Still, the exposed union does provide one more handle for the determined hacker to be inventive with, so perhaps the correct solution is to expose the union in the type and module in some manner that is clearly identified as unsafe, or even to collect all of the unsafe unions in a single unsafe module. “Unsafe” or “unchecked” would be ideal names, except they are already C# keywords.

Still thinking this over, and hope to get feedback on this.

Intellisense

Every public member and let binding should have intellisense documentation to make it user friendly and not require a literature search or inspection of the code to determine functionality. At the type level there should be intellisense documenting the purpose and use of the structure. Include URLs to additional literature where appropriate.

Data structures should give developers useful tools, not research projects.

Time Complexity

Adopting a practice from Haskell documentation, I put the time complexity for members and let bindings at the beginning of their intellisense documentation. I even went so far as to put O(1) on empty and singleton to accustom users to seeing the time complexity in the intellisense, even though strictly speaking time complexity for these bindings has no meaning.
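
In source the convention looks like this (list-backed stand-ins for illustration):

/// O(1). Returns the first element.
let head (s : 'a list) = List.head s

/// O(n). Returns the number of elements.
let length (s : 'a list) = List.length s

/// O(1). The empty structure.
let empty<'a> : 'a list = []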

Take time to evaluate the naming standard

While it’s a good practice to keep the name a structure is best known by, consider renaming properties and methods if it makes sense to fit in with F# paradigms. Use the FSharp.Collections and existing FSharpx.DataStructures namespaces as models; don’t just unthinkingly adopt the names from a research paper or the implementation in another language. Use ofList and ofArray, not fromList or fromArray. I intentionally renamed the Heap members findMin and deleteMin to head and tail because I think that accurately conveys the head/tail paradigm like several other data structures (and I implemented Heaps as ascending or descending, so “min” is inappropriate). Stick with cons, snoc, head, tail, init, last wherever possible.

IReadOnlyList and IReadOnlyDictionary

IReadOnlyCollection and IReadOnlyDictionary, new to .NET 4.5, would appear to be useful to implement, however backwards compatibility of FSharpx raises a problem. A discussion of this issue is here.

Should FSharpx take over the PowerPack Data Structures?

Discussion: here

There are related threads for migrating other functions from PowerPack, for instance here and here

I don’t know what the right answer is.

Conclusion

Data Structures make F# a richer multi-technology-stack language. I keep a running collection of the structures from Okasaki’s 1998 book not yet ported to FSharpx here. (There are at least one or two covered in the book not in this collection.) Feel free to request that porting of any or all be expedited, or work on a port yourself (I’m happy to provide any assistance I can). There are lots of other useful structures still “in the wild”.

Simple Lookups in F#, Choices and Trade-offs

What is your go-to data structure for lookups in F#? Most any choice for key-based lookups will work just fine, but if your requirements go beyond simple lookup or you have a demanding edge case what other criteria might you consider? Here’s an overview of the available lookup data structures:

Map

Purely functional and persistent. Over 20 methods providing various custom folds, mappings, filters, etc.

IntMap

Purely functional and persistent, but restricted to integer keys. Dozens of specialized methods including folds, mappings, filters, and inequality lookups (e.g. greater than key). Part of the FSharpx open source project.

Dictionary

Mutable. Almost always the fastest lookup. The fastest object to initialize with data, provided you initialize the dictionary with the expected capacity. If the capacity is not given, data loading performance will degrade significantly in the 100 to 10,000 element range. It has none of the ancillary functional methods available to native F# data structures.
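
A minimal illustration of the capacity point:

open System.Collections.Generic

let data = Array.init 100000 (fun i -> i, string i)

// passing the expected capacity up front avoids repeated internal
// resizing and rehashing while the dictionary loads
let d = Dictionary<int, string> (data.Length)
for (k, v) in data do
    d.Add (k, v)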

Hashtable

Mutable. Very good lookup performance and data initialization performance on string keys. If the capacity is not given data loading performance will degrade significantly at all ranges of element size. None of the ancillary functional methods available to native F# data structures.

HashMultiMap

Mutable. Available in the PowerPack.Core.Community library. The distinguishing feature of this data structure is a FindAll method returning a list of all values bound to a hash key. A fold method is also available.

So what are the raw lookup performance numbers? (All timings are in milliseconds.)

Integer Key by Number of Elements in Lookup Structure, 10,000 random lookups

       Map   IntMap   Dict   HashTbl   HshMltMap
10^2   0.7   0.7      0.2    0.3       1.8
10^3   0.9   0.9      0.2    0.3       1.8
10^4   1.2   1.3      0.3    0.4       1.9
10^5   2.1   2.2      0.6    0.8       2.3
10^6   3.3   3.1      0.9    1.0       2.6

String Key by Number of Elements in Lookup Structure, 10,000 random lookups

       Map   IntMap   Dict   HashTbl   HshMltMap
10^2   1.3   n/a      0.4    0.3       1.5
10^3   1.7   n/a      0.4    0.4       1.5
10^4   3.0   n/a      0.7    0.7       1.8
10^5   5.3   n/a      1.5    1.2       2.4
10^6   8.4   n/a      1.6    1.5       6.3

Good old System.Collections.Generic.Dictionary consistently performs lookups about 4 times faster than Map, but all the numbers are in the very low milliseconds or even tenths. This couldn’t make a performance difference unless you are doing a bazillion lookups in some long-running machine learning system.

So what else distinguishes the lookup choices? Maybe your app repeatedly loads data into new lookup tables.

Integer Key by Number of Elements loaded into Lookup Structure

* indicates capacity was not set for Dictionary and Hashtable

       Map    IntMap   Dict   Dict*   HashTbl   HashTbl*   HshMltMap
10^2   1.6    2.6      0.6    1.7     0.3       1.4        1.2
10^3   1.8    2.8      0.6    1.7     0.3       2.7        1.3
10^4   5.6    5.0      0.9    2.2     0.8       4.7        1.9
10^5   79.0   49.2     4.3    7.4     11.6      68.7       11.3

String Key by Number of Elements loaded into Lookup Structure

       Map     IntMap   Dict   HashTbl   HshMltMap
10^2   4.0     n/a      0.6    0.3       1.2
10^3   4.5     n/a      0.7    0.4       1.3
10^4   12.1    n/a      1.3    1.3       2.4
10^5   144.0   n/a      11.2   13.5      19.6

Once again Dictionary outperforms the competition, and more so if you set the capacity properly in the constructor.

If you are doing updates, performance differences become very pronounced. This is largely due to mutable structures’ advantage of doing updates in place; in the case of Map there is no update-equivalent method, so updates require a remove followed by an add, returning new map instances and generating work for garbage collection.

Integer Key by Number of Elements in Lookup Structure, 10,000 random updates

       Map    IntMap   Dict   HashTbl   HshMltMap
10^2   4.3    2.4      0.2    0.3       0.8
10^3   6.8    3.1      0.2    0.3       0.8
10^4   10.6   4.3      0.2    0.4       0.9
10^5   17.9   9.3      0.6    0.6       1.3

String Key by Number of Elements in Lookup Structure, 10,000 random updates

       Map    IntMap   Dict   HashTbl   HshMltMap
10^2   7.5    n/a      0.3    0.4       0.8
10^3   12.0   n/a      0.4    0.5       0.8
10^4   17.9   n/a      0.6    0.7       1.1
10^5   34.6   n/a      1.2    1.2       1.7

In conclusion, for those rare applications doing huge volumes of lookups and/or repeatedly creating new lookup tables, Dictionary is your best choice, unless you require “functional” characteristics of the lookup table. Dictionary also enjoys a clear performance advantage in iterating over its contents as IEnumerable (but I’ve already bored you with enough tables), and because of this you can overcome the lack of “functional” methods (fold, etc.) by enumerating it and using the Seq module’s methods.
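
For example, folding over a Dictionary through Seq (the KeyValue active pattern ships in FSharp.Core):

open System.Collections.Generic

let d = Dictionary<string, int> ()
d.Add ("a", 1)
d.Add ("b", 2)

// Dictionary enumerates as KeyValuePairs, so the Seq module supplies
// the fold, map, filter, etc. that Dictionary itself lacks
let total = d |> Seq.fold (fun acc (KeyValue (_, v)) -> acc + v) 0   // 3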

Note on methodology: Integer keys are in the range 1 to number of elements and string keys range in length from 1 to 26 characters, evenly distributed. Both key types are shuffled before loading the data structure (i.e. the order is somewhat random). Data is presented in array format for loading. Timed events are isolated to the greatest extent possible. Each timed event is performed 50 times and the best 40 times are averaged. This is to discard arbitrarily long timed events which inevitably occur and would skew the results if included in the average. Taking the median of 50 timed events would serve the same purpose. Original timings are in ticks and divided by 10,000 to arrive at milliseconds. Timings measured using DS_Benchmark. The machine of record is a low-end dual-core processor running 64-bit Windows 7. Not covered by this analysis is garbage collection of large monolithic structures like large dictionaries or hashtables.

Let’s Make a Bayesian Deal

Let’s have some fun with the Monty Hall problem. The set-up is three closed doors. Behind one is a nice prize, behind the other two are dud prizes. You (having no prior knowledge) choose a door, at which point the game’s host (with prior knowledge) reveals a dud prize behind one of the doors you did not choose and asks if you want to stay with the door you chose or switch and choose the other door that is still closed.

There are many published articles on why the proper strategy is to switch doors, including the article above, but when I tried to explain this to someone a few months ago I was told that I was “full of sh*t, the odds don’t change”. Well, actually, the odds do change, because the game’s host has revealed new information. It all has to do with Bayes’ Theorem, whose basic formula

P(A|B) = P(B|A)P(A) / P(B)

reads as follows:

The probability of event “A” occurring, given event “B” has occurred
equals
The probability of event “B” occurring, given event “A” has occurred
times
the probability of event “A”
divided by
The probability of event “B”.
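
Applied to the game, say you pick door 1 and the host then opens door 3:

A = the prize is behind door 1 (your pick)
B = the host opens door 3

P(B|A) = 1/2 (the host could have opened door 2 or door 3)
P(A) = 1/3
P(B) = 1/2 (summing over the three equally likely prize locations)

P(A|B) = P(B|A)P(A) / P(B) = (1/2)(1/3) / (1/2) = 1/3

Sticking with door 1 wins only 1/3 of the time, so switching to door 2 wins with the remaining 2/3 probability.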

Since I am apparently a failure at explaining this, I wrote an F# Monte Carlo program to do my talking for me. The F# code really expresses the way the game works. There is not much to it. Here is the type for an individual game:
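
The shape is something like this (a sketch; the actual record in the downloadable source may differ):

type Game =
    { DoorWithPrize : int   // randomly generated, 1 through 3
      FstChoice : int       // the player's first choice, also random
      RevealedDoor : int }  // the dud door the host opens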

It constructs each game from doorWithPrize (randomly generated), fstChoice (also randomly generated), and revealedDoor, the door the host opens during game play. There are two rules for choosing that door.

If the player happened to choose the door hiding the real prize then the host arbitrarily chooses between the other two doors:

Otherwise the host is constrained to open one and only one of the doors:
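
Both rules together look something like the following sketch (the revealDoor name and door numbering are my assumptions; the linked source has the original):

let revealDoor (rnd : System.Random) doorWithPrize fstChoice =
    if fstChoice = doorWithPrize then
        // rule 1: the host picks arbitrarily between the two dud doors
        let duds = [1; 2; 3] |> List.filter (fun d -> d <> doorWithPrize)
        duds.[rnd.Next 2]
    else
        // rule 2: the host is constrained to the single door that is
        // neither the player's choice nor the prize door
        [1; 2; 3] |> List.find (fun d -> d <> fstChoice && d <> doorWithPrize)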

Over a large enough sample of games the host is constrained two-thirds of the time in his choice of doors to open because the player will make the wrong first choice two-thirds of the time. So the act of exposing one of the dud prizes reveals valuable information to you the player. (At least it tells you something with better than 50/50 odds.) And that’s why this works.

For your friends who still won’t believe and think you have played some sort of trick the program comes with an audit function that will print the records of up to 100 games. The source code is available here or on GitHub. Have fun with it.

The Miracle of F# Continuation Passing

One thing I consistently include in preparing functional data structures for the FSharpx.DataStructures library is the IEnumerable interface, which facilitates function composition by seamlessly flattening structures. Using Brian McNamara’s indispensable series on catamorphisms as a guide I set about to flatten the BinomialHeap.

Ignoring isMaximalist : bool for now, we see a recursive structure of

list -> tuple -> list -> tuple -> etc.

and we want to flatten the elements 'a to 'a list. A classic case for a continuation passing style (CPS) fold.

This code shows a first attempt at hard-coding the flatten we want to accomplish. There are only four patterns to account for, so reasoning about the steps to take is simple. The fifth “can’t get here” pattern is purely to prevent a warning message. (The compiler doesn’t see there are only four paths.)
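
A sketch along those lines (the BinomialTree shape comes from the FSharpx source; the exact original snippet may differ):

type BinomialTree<'a> = Node of int * 'a * BinomialTree<'a> list

let flattenAcc (h : BinomialTree<'a> list) =
    let rec loop l cont =
        match l with
        | Node (_, a, []) :: [] -> cont [a]
        | Node (_, a, []) :: tl -> loop tl (fun acc -> cont (a :: acc))
        | Node (_, a, h') :: [] -> loop h' (fun acc -> cont (a :: acc))
        | Node (_, a, h') :: tl ->
            loop h' (fun lacc ->
                loop tl (fun racc -> cont (a :: lacc @ racc)))   // the list append
        | _ -> failwith "can't get here"
    loop h (fun x -> x)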

This code does what we want, producing an 'a list of all the heap’s elements, but the continuation on the last pattern is appending lists, an inefficient procedure we want to avoid…but how?

Let’s see if we gain anything by abstracting out the flatten logic to make this a true generic fold.

So far I have been treating this problem as a classic example from part three of Brian’s series. In abstracting a CPS fold over a discriminated union, you end-up with one continuation function for each case in the union. Our data type is not a four-case discriminated union, but I have been treating the number of match patterns as a proxy for a discriminated union. (The second and third patterns take the same function. That should be a hint there is something wrong about this approach.)

Ultimately abstracting over (a proxy for) a discriminated union has been a dead-end. We do have an abstracted fold method using CPS, but there seems to be no way to get rid of that annoying list append.

The correct approach is through the tree CPS method Brian teaches us in part two. Except for the name “BinomialTree” it is not immediately obvious how a recursive structure like

list -> tuple -> list -> tuple -> etc.

is any kind of tree, let alone a binary tree. Only when we write out the pattern

Node(_, a, h')::tl

do we see the binary tree:

                Node(_, a, h')::tl
             ('a element to flatten)
              /                    \
            h'                      tl
   (list within the          (tail of the list of which
    element’s tuple)          the element is the head)

So the tree structure has been apparent from the time we wrote out the patterns for our match algorithm, and we know those patterns represent the complete set when iterating through our heap, but to accomplish a binary tree CPS fold you need a “null” pattern (a pattern which does not contribute data to the fold operation) at the bottom of each continuation in order to thread through the identity function. This is why you see so many tree examples in functional languages ending in “leaf”, a seemingly useless end of branch marker carrying no data.

There is no “leaf” in our data structure, but let’s go through the motions of implementing the classic tree CPS fold, and maybe that will help us figure out something.

And then a miracle occurs

I don’t believe in software miracles, but all of a sudden our newest generation of the fold starts failing with “can’t get here”. We are now passing something new through our loop: the empty list. It turns out there was a “null” pattern all along. Change

| _ -> failwith "can't get here"

to

| _ -> cont leafV

and we have a working CPS fold on the BinomialHeap. (I’m using Brian’s naming standard for the fold parameters.)

Let me break apart the inOrder call to foldHeap to make the fold mechanism more explicit:
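
Reconstructed along those lines (reusing the BinomialTree type sketched above):

let foldHeap nodeF leafV (h : BinomialTree<'a> list) =
    let rec loop h cont =
        match h with
        | Node (_, a, h') :: tl ->
            loop h' (fun lacc ->
                loop tl (fun racc ->
                    cont (nodeF a lacc racc)))
        | _ -> cont leafV                    // the "null" pattern: the empty list
    loop h (fun x -> x)

let inOrder (h : BinomialTree<'a> list) : 'a list =
    let nodeF = fun a lacc racc acc -> lacc (a :: racc acc)
    let leafV = fun acc -> acc
    let x = foldHeap nodeF leafV h           // x : 'a list -> 'a list, partially applied
    x []                                     // [] seeds the accumulator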

A heap sidebar

The code is now perfectly suited for exploring CPS execution. The inOrder member we have been evolving does an in order tree traversal of the heap’s internal data representation, which in the case of heaps is entirely meaningless. We only care that a complete traversal take place. We will sort the resulting list to get it into proper head/tail order.

You don’t believe the internal order is meaningless? Here’s the internal structure we will walk through:


Node (0, "b", [])
Node (1, "a", [Node (0, "c", [])])

In this case the isMaximalist : bool I told you to ignore above is set to false, so the resulting list should be

["a"; "b"; "c"]

but not as result of the traversal, only after we sort the output list of the fold ourselves. The in order traversal results in

["b"; "c"; "a"]

Now back to our story

The step four code makes it easier to follow the logic and clearly shows how the empty lists satisfy the null pattern. Let’s walk through it.

current Node                          executing code

                                      cont = (fun x -> x)

(0, “b”, [])                          Node(_, a, [])::tl -> loop [] (fun lacc ->
(head of top list)
                                      cont =
                                        (fun lacc ->
                                          loop [] (fun racc ->
                                            (nodeF a lacc racc) |> (fun x -> x)))

                                      cont leafV

The remaining in order tree traversal builds out increasingly nested continuations.

                                      Node(_, a, [])::tl -> loop [] (fun lacc ->
                                        loop [] (fun racc ->

(1, “a”, …)                           Node(_, a, h')::[] -> loop h' (fun lacc ->
(head of top right list)

(0, “c”, [])                          Node(_, a, [])::[] -> loop [] (fun lacc ->

                                      cont leafV

                                      Node(_, a, [])::tl -> loop [] (fun lacc ->
                                        loop [] (fun racc ->

                                      cont leafV

(1, “a”, …)                           Node(_, a, h')::[] -> loop h' (fun lacc ->
                                        loop [] (fun racc ->

                                      cont leafV

Swapping in the nodeF parameter and substituting its parameter names and the element parameter name for “x”, we get the crucial code fragment. I am also using the result-passing symbol (|>) for readability.

(fun a lacc racc acc -> lacc (a :: (racc acc)))|> (fun x -> x))

Another substitution, dropping the parameters that have now been consumed: the symbol “a” from the code gets replaced, coincidentally, by the value “a”, and lacc by its value (fun acc -> acc).

(fun a lacc racc acc -> ("a" :: (racc acc)))|> (fun acc -> acc) |> (fun x -> x))

One more result passing symbol for clarity.

(fun a lacc racc acc -> ("a" :: (acc |> racc)))|> (fun acc -> acc) |> (fun x -> x))

At the bottom of all the right trees we eventually hit an empty list, so we make one more substitution.

(fun acc -> ("a" :: (acc |> (fun acc -> acc)))) |> (fun acc -> acc) |> (fun x -> x))

(fun x -> x) starts the whole continuation unwinding (executing). Notice the anonymous function from nodeF only has three of its four parameters filled in. This is where the code we broke-out in inOrder comes into play. I teased apart inOrder to explicitly show how the CPS pattern wraps the call to foldHeap in a function taking the implicit parameter acc : 'a list.

x []

We have built a partially applied function with one remaining parameter; [] provides the seed empty list passed down the nested acc parameters.

Here is a reasonable facsimile of the generated function and execution:

And here is the finished product:

If not for adding line breaks for better readability, this is seven lines of code. Six to create a generic fold structure, and one to perform an in-order traversal. However this is a case where F#’s famous expressiveness suffers from the succinctness. If you did not know beforehand how this works, good luck figuring it out!

The complete source code for the BinomialHeap can be found here.

Crisis in Open Source Land

Crisis #1 – WTF Happened to the Git Reference

For those of us who have not had the “aha” Git moment, and do not have every Git command memorized, a complete reference is indispensable. For G_d knows how long, more topics in the only known complete reference have been broken than I care to count. Looking up every Git issue we encounter on Stackoverflow, without a reference to put it all in context, is a helluva way to come up to speed. This is like the Git cognoscenti putting a big sign on their clubhouse: “Hey, we learned it on our own”.

UPDATE: Github fixed this October 30. Thanks Github!

Crisis #2 — Microsoft Research Abandons Power Pack Collections Documentation

And the same goes for the Collections.Tagged documentation page. For now you can still get collections and collections.tagged on the Way Back Machine.

OK, not that many people care about F#, but if you come to my blog there is a good chance you do. MSR has probably been intending to take this page down for a long time. It was documenting an old release of the Power Pack Collections, but still better than nothing.

HashSet – Who cares. I think this got deprecated in favor of something in the .NET generic collections.

HashMultiMap – Seems no one cares about this one either. Even though it is not a functional data structure, someone might find it useful, if only it were documented.

LazyList – <hyperbole>Everybody</hyperbole> uses LazyList. In fact, I’m baffled why the FSharp Team hasn’t integrated LazyList into FSharp.Core; it is already in the Microsoft.FSharp.Collections namespace. So it’s bad enough that F# newbies wonder why they can’t call up the docs in Visual Studio Help Viewer.

The Unifying Crisis

What I consider a bigger deal is the difficulty in formatting open source project documentation. Let’s face it. People who like to code open source projects like to code, and generally dislike writing documentation. My interest here is primarily .NET open source projects (my wish list applies to .NET projects), and admittedly I have not dug very deep into the solution. Open source authors and consumers would benefit greatly, and make project adoption much easier, if there were good tools for generating nicely formatted documentation outlines.

Ideally a markup language to generate Visual Studio Help Viewer format. (And ideally a way to integrate it into VS Help Viewer.)

Ideally a tool to generate a reference skeleton from the project code.

Less ideal would be generation and markup for any decent pseudo-standardized reference format. (There is probably something already available that will do this into Java Docs format, but really, do we have to?)

And I am sure there are some who are thinking “Unix man pages are just fine”.

My Silicon Valley Code Camp Wrap-up

We are living in the Golden Age of Software Engineering, so it’s a special fate to be living and working in the Bay Area connected to this industry. Do you want to have this conversation in your old age as you bounce your grandchild on your knee?

“Grandpa, what did you do during the Golden Age of Software Engineering?”

“Well…I wired-up CRUD apps in Milpitas.”

Get out there and benefit from the learning and socializing opportunities in our craft!

A couple weekends ago was the 7th annual (but my first) Silicon Valley Code Camp. I’ve been missing out, and so are you if you didn’t go. It’s two days of hundreds of technical presentations by informed speakers and the fellowship of over 3,000 techies, mostly from the Bay Area, but I met attendees from as far away as Germany.

This is my personal retrospective of the event, covering the sessions I attended and the one session I hosted.

A special shout-out to Peter Kellner for not only organizing SVCC, but being the inspiration behind it.

Scala for Java Developers

Ramnivas Laddad

My main interest was learning enough about Scala to compare it and the Java environment to F# and .NET, my area of interest. I learned Scala has a flatten operation useful as a catamorphism, for instance for extracting values from Some options and eliminating None from a list. This would be useful in F#, where you would have to write your own function over a fold to do this. There is also a specific annotation for tail recursion that gives a compile-time error if your operation is not tail recursive. It takes experience to easily recognize the difference, so I can see the usefulness of this, especially for beginners.
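
For the curious, the fold-based helper in F# would look something like this (flattenOptions is my name for it; F#’s built-in List.choose id achieves the same thing):

let flattenOptions (xs : 'a option list) =
    List.foldBack (fun x acc ->
        match x with
        | Some v -> v :: acc
        | None -> acc) xs []

flattenOptions [Some 1; None; Some 2]   // [1; 2]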

Coming from an F# perspective I was surprised to see so much emphasis on the OO aspects of Scala at the beginning; however, it has been years since I first started learning functional programming. Ramnivas’ introduction was well-structured and effective. As I expected, the lack of “white space as syntax” in Scala jumped out at me. Anyone designing a new language should really consider this feature. In our information-overloaded lives, extraneous symbols are noise providing no utility.

Why ‘Friends of Friends’ matters: Applying graph databases to much more than social

I’ve wanted to learn about graph databases for a long time, and David Montag’s session was a much better “intro” than a focus on the nuts and bolts of Neo4j operations would have been. Jumping right into real use cases for the entire session was a very effective introduction to the technology.

Improving your .NET build process with Nant and CruiseControl.NET

I wanted to wrap my head around .NET build/deploy again, and Joe Enos provided a good session concentrating on CruiseControl.NET, which I have used before, though I learned TeamCity is perhaps the more modern way to go.

Working with Functional Data Structures, Practical F# Application

At least 8 (no more than 10) people showed up for my talk, which wasn’t too bad considering the somewhat esoteric nature of an intermediate F# topic, and my time slot was paired with the only other F# talk of the entire conference. It was, however, the only talk attended by two pigeons (more on that below).

First, what went wrong

This was the first time I gave this talk, in fact the first time I spoke at any conference. I needlessly worried my session wouldn’t last long enough. So I got off-plan and babbled through the first several slides. I apologize to my audience for that.

Next, there was one early slide where I intentionally mis-used a technical term intending to make a point. Bad mistake. A presentation is no place to try to pull that trick…so that slide gets reworked.

Third thing that went really bad was the first demo code session. I rarely use fsi, and this demo relied heavily on fsi. By the time I put the finishing touches on the presentation and demo code I was really tired, and opted for a good night’s sleep before the conference (right choice) instead of working through the entire presentation one more time. Everyone wants to see code demos, but to pull it off really takes a lot of practice, and/or they have to be kept really simple. My plan now is to change the first two of three demos into additional slides, which will keep the story moving.

Then the pigeons. Two pigeons flew in and roosted at the back of the room right at the end of what felt like a horribly botched first code demo. They continued ruffling their feathers and generally making a nuisance for what seemed like about two-thirds of my talk. Finally a member of the audience, with much trouble, managed to encourage them back out the door they flew in.

Because I didn’t stay succinct enough, time ran out on the finale of my presentation, but not by much.

Then, what went right

I think I perfectly judged the amount of material to present in an hour and a quarter. I just babbled too much. Next time I will be confident in staying succinct.

As a whole, my material and the presentation tell an interesting technical story that flows to a good conclusion. With a little reworking, the next delivery (still unscheduled, at the San Francisco Bay Area F# User Group) should be a success.

[For Those About to Mock](http://www.siliconvalley-codecamp.com/Sessions.aspx?ForceSortBySessionTime=true&AttendeeId=583)

Mathias Brandewinder provided an effective session conveying an understanding of unit test mocking and dependency injection by working through an example unit test situation. Dependency injection is surrounded by obfuscation in the literature. This session cleared it up for me. Not a lot to say other than this was a very worthwhile session.

The Art & Science of Managing a Web Presence

As Alyson Harrold predicted at the beginning of her presentation, assisted by colleague Massimo Paolini, I only paid attention to 7% of Alyson’s words. The rest of my attention was spent reflecting on my own efforts at building a web presence and personal brand.

This Blog

One of my first motivations for starting this blog was a way of organizing notes for myself. In terms of branding, this is rather appalling.

My second motivation was learning by teaching. Most of the topics I blog about I have just learned myself, or I am in the process of learning. Writing an article deepens my understanding. The trade-off is I proceed in covering new material at a snail’s pace. Teaching is not a natural instinct with me. It has required a lot of effort, and I believe I have progressed moderately at improving the quality and interest of my articles.

But all in all I find improving the quality of my blog is a slow laborious process.

Twitter

I think I do a good job of controlling my Twitter presence. Just factual tweets with content on a consistent range of tech topics, no politics, culture, or blah-blah about how I’m feeling right now. Massimo was adamant that a tweet without a link is worthless.

Still looking for a Twitter client. I think I’ll write a whole post on this topic.

Google+ … or my lack of presence on Google+

Google is, and will continue to be for the foreseeable future, the dominant external force in shaping my web presence, yet I have absolutely no presence on Google+ and have never used it. This is probably the right venue for on-line note publishing (not my blog).

SQL TDD

SQL Design Patterns

Just learning from Jim Bears about tSQLt.org in the TDD session would have been worth the price of admission to both his talks (had it not been free, of course). He could have just powered through more design patterns, but I did at least find course material on his website.

Sensor Fusion on mobile devices for Lifecare

I could say I enjoyed attending a “blue sky” session as a fitting finale to SVCC, except the exciting material Siamak Ashrafi presented about applications using the sensors in your smart phone is already reality today. The night before at the speakers and volunteers dinner I sat with some mobile developers who were positively excited about developing mobile sensor apps that benefit mankind, and this session made me understand that feeling. The session ended with lively discussions among speaker and audience. Very exciting stuff.

SVCC F# Working with Functional Data Structures Bibliography

Bibliography for my presentation, Working with Functional Data Structures, Practical F# Application, October 6 at Silicon Valley Code Camp.

Slides

Companion F# Demo

Purely Functional Data Structures (1996)
Purely Functional Data Structures (1998)
Purely Functional Stepchildren
DS_Benchmark
Benchmarking F# Data Structures — Introduction
Benchmarking F# Data Structures — Loading Data
FSharpx.DataStructures
Double-ended Queues for F#
Eric Lippert’s Comma Quibbling
Nick Palladinos’ Comma Quibbling
GC Performance Counters
LazyList
Are functional languages inherently slow?
The benefits of purely functional data structures
Functional Pearls Red-Black Trees in a Functional Setting
F# vs Mathematica: red-black trees
Red-Black Tree Remove
Implementing IComparable in F#
Catamorphism

My Suggestions for New F# Language Features

I recently had the opportunity to communicate my suggestions for F# to a member of the F# team. Here is my message.

Sorry I haven't used your unicode operator add-on yet. As I explained,
right now I am doing all open source development, so I currently need to
write source everyone can use. I do think this would be a good language
integrated feature. I am thinking of your examples of math-oriented
analysts like actuaries, etc. able to integrate the math and logic
symbols they use right into the language. (So that is feature #1.)

Feature #2a. I have been busy adding Functional Data Structures to
FSharpx.Core.Datastructures. There are now many (and more to come)
structures that implement "cons", "head", and "tail". It would be very
good if the List cons operator :: and cons pattern discriminator ::
could be reused by other data structures that implement cons, head, and
tail. Perhaps this could be done through an interface, call it ICons or
IHeadTail, or something like that. (Reference IRandomAccessList, IDeque,
and IHeap in FSharpx.Core.Datastructures.)

Feature #2b. Several FSharpx.Core.Datastructures structures (more to
come) implement snoc (the inverse of cons). A reusable snoc operator and
pattern discriminator would be good (I propose ;;). Once again maybe an
interface could enable this. (Reference IDeque and DList [conj in DList,
which could be implemented with a proper name of "snoc"] in
FSharpx.Core.Datastructures.)

Feature #2c. Easy reuse of some other operators. I haven't experimented
much with others, so I might be giving you a wrong example, but I think
there are issues with re-using + (plus), for example.

Feature #3. Not really a language feature. I have never used
Mathematica, but I have a vague understanding of Mathematica Notebooks.
An integrated scripts / type provider explorer / graphics environment
with intellisense and integrated help like this would be great.

Thanks for your interest in my opinion and your great work on F#.

I went to post one of these suggestions to Visual Studio User Voice, only to find I have run out of votes. Anyone interested (especially in item #2a) is invited to post these themselves.

It’s great the F# team is so interested in outreach and continues improving the language.

Silicon Valley Code Camp 2012

I’m pleased to announce I will be speaking at this year’s Silicon Valley Code Camp on Saturday, October 6 at 3:30 PM. My session, Working with Functional Data Structures, Practical F# Application, extends my recent articles on the topic. Here is the session description:


Beyond the bread-and-butter singly linked list are dozens of practical Functional Data Structures available to mask complexity, enable composition, and open possibilities in pattern matching. This session focuses on data structures available today in F#, but has practical application to most any functional language. The session covers what makes these structures functional, when to use them, why to use them, choosing a structure, time complexity awareness, garbage collection, and practical insights into creating and profiling your own Functional Data Structure. This session is intended for intermediate level functional programmers and above, but should be of interest to functional programming newbies as well.

SVCC 2012 is two full days of tech talk, including breakfasts and lunches, and it’s all FREE! But you have to register. Dozens of interesting and useful sessions, great speakers, thousands of coders / hackers / technophiles, and did I mention free food?

So click my session above (it helps promote my talk) then check out the rest of the SVCC site and register.