Failure Is Not An Option (In F#)

Why all the hate on Option types? You said you wouldn’t pass them from a function. Why not? What’s wrong with options?[1]

Programming should be fun. All you need is a good set of values, some skill, and the right attitude. F# is the most fun I’ve had in a programming language in years. This essay series is about that: having fun. Most books and essays are about writing awesome code. There’s not a lot of material about writing code awesomely. The code you’ll find here has bugs! Just like your code. Can you find them? If you’re looking for the final version of the code, you can find it on the project’s GitHub page. You can find the story of how all of this started on the series index page.

There’s nothing wrong with options. I love options. I use options all of the time. Options are my friend.

I just use them in the right place. Understanding where the right place is? That’s the purpose of this essay.

But first we gotta talk about the Unix Philosophy and Total Programming. Or at least talk enough about them that the option stuff makes sense. Warning: I am going to vastly over-simplify a bunch of stuff so that this essay isn’t 40,000 words long.

When I started writing true microservices, I learned a lot of things I wouldn’t have learned otherwise. Microservices need to run at a certain time. They need to work with one another. They need to use the same data types and storage/transfer mechanism. (Some folks use a database for this but there are all kinds of problems with that. I will not go into them here.) Heck, they need to stop running — no matter what happens.

Most of the things I learned rested on a weak version of the Unix Philosophy and Total Programming. Instead of my trying to convince you that the things I learned were good, I’ll just point you to those.

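The Unix Philosophy. The usual summary of the Unix Philosophy goes like this:

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.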

There’s a lot more in the linked article. I have a much simpler working definition: Make it work like the Unix command “ls”. That is, when I type “ls” into a Linux prompt? It just works. I can’t think of any time it’s failed. It always does something. And I can join ls together with a bunch of other Linux commands to do useful things. It’s just one program, but when combined with other programs it becomes tremendously more valuable than it is by itself. It does one thing, does it well, never fails, and infinitely connects to other programs to make more useful stuff than I could have imagined while building it.

That’s what I want in my code. Tiny pieces of rock-solid stuff that I can assemble later into various things I might need without having to write (much) code.

Total Programming. This is another rabbit hole you can dive down if you have a lot of time on your hands. The dumbed-down version goes like this: your program has to provably stop running at some point. When you start it up, you have to know — without a doubt — that it is going to stop. Once again, it doesn’t have to do anything. It just can’t hang.

This spins off into all kinds of type and category theory stuff. I am not a Computer Science person. I’m just some old fart that likes to code, so the dumb version is enough for me. It looks like there’s some cool stuff there, though, if a person wanted to study up on it. Where you end up is mathematically creating a type system that is deterministic. Put differently, you stick all the rules, flows, validation, and the rest of it into the type system so that it is impossible for the program not to complete in some way. That’s where we’re headed. (But that’s not where we’re starting.)
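Here is a minimal sketch of the dumb version, just to make it concrete. It is my own illustration, not code from the project, and the names `sumAll` and `waitUntil` are made up for this example. One function has to stop; the other only stops if the outside world cooperates.

```fsharp
// Illustration only: these names are made up for this sketch.

// Always stops: every recursive call gets a strictly shorter list,
// so it has to reach the empty case eventually.
// (Not tail-recursive, which is fine for a sketch with short lists.)
let rec sumAll (numbers: int64 list) : int64 =
    match numbers with
    | [] -> 0L
    | first :: rest -> first + sumAll rest

// No such promise: whether this stops depends entirely on
// isReady ever answering true.
let rec waitUntil (isReady: unit -> bool) : unit =
    if isReady () then () else waitUntil isReady

printfn "%d" (sumAll [ 1L; 2L; 3L ])
```

The first function is the kind of thing I want everywhere past the outer layers. The second is the kind of thing that belongs at the edge, with a decision made about what happens when the answer never comes.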

We’re not laser-focused on the type system. Instead, like we said in the last essay, we’re always focusing on output. Behavior. Flow. What are you doing for me? Not structure. Structure is always derivative, but that’s material from the Info-Ops book and too much to go into here. For now, if we can remember never to focus on structure, to instead do things that “force” ourselves to create structure based on other constraints? We’ll go far.

But wait! What the heck are you going on about, Daniel?!? Console apps? Text streams? I don’t want to write no stinking console apps! Linux commands? What the heck? I’m doing modern web development. It’s not the dark ages anymore. This command-line crap doesn’t look like it has anything at all to do with my day-to-day work. What’s next, building our own C compiler out of coconuts?

It’s a fair point, but misguided. The purpose is to structure your code as if it were a command-line app, not to deploy it that way. I would even make it work as a command-line app for testing purposes, no matter where it went. This philosophy, along with the onion, will tell you where everything goes, no matter what kind of architecture you’re using. Like we said in the last essay: if you’re doing it right, the architecture doesn’t matter. Carpenters don’t spend all day staring at their hammer.

Remember the onion?

This is not about command-line apps. It’s about how good programs are constructed. How the pieces go to live in various places is not relevant here. In fact, if it’s relevant, you’re probably focusing on the wrong thing. The command-line just provides the simplest and easiest-to-use first platform for the code to live on. I can take an appropriately-structured program from the command-line and run it anywhere, on dozens of platforms. All I need to do is add some shell code, the grunt-work stuff in layer 1. (Which ends up being reusable. Yay!) And if I can’t? Then I haven’t structured the program correctly.

This is the entire point of programming, right? Write stuff once, then use it in a bunch of places. I’m lazy. I want to get the maximum value for the minimum amount of work. Don’t you?
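A rough sketch of what "structured like a command-line app" could mean in practice. Everything here is hypothetical (`coreTransform`, `consoleShell`, and `testShell` are names I made up, and the uppercase logic is a stand-in for real work). The idea is one core function and several thin shells around it.

```fsharp
// Core: a plain function from input lines to output lines. It knows
// nothing about consoles, files, or web servers. (Stand-in logic.)
let coreTransform (lines: string[]) : string[] =
    lines
    |> Array.map (fun line -> line.Trim())
    |> Array.filter (fun line -> line <> "")
    |> Array.map (fun line -> line.ToUpperInvariant())

// Console shell: layer 1 grunt work. Prints results, returns an exit code.
let consoleShell (args: string[]) : int =
    coreTransform args |> Array.iter (printfn "%s")
    0

// Test shell (a web handler or queue listener would look similar):
// same core, different plumbing.
let testShell (input: string list) : string list =
    input |> List.toArray |> coreTransform |> Array.toList

consoleShell [| " hello "; ""; "world" |] |> ignore
```

If moving to a new platform means touching `coreTransform`, that is a hint the program isn't structured the way this essay is arguing for.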

Let’s make up some dummy example to walk through this. Let’s say I have a text file that’s supposed to have lines where there’s a value which equals something, like this:

[Image: Our data]

The job is simple. Group together things by letter and total up the numbers. Then display the results to the console. The first thing I do is create a program that takes one command-line parameter for an input file. I’ll use that boilerplate stuff I wrote years ago.

open Utils

/// Command-line parameters for this particular (OptionExample) program
type OptionExampleProgramConfig =
    {
        // Field names come from loadConfigFromCommandLine below. The type
        // names are assumed to match what the Utils library defines.
        configBase: ConfigBase
        inputFile: ConfigEntry<string * Option<System.IO.FileInfo>>
    }
    member this.printThis() =
        printfn "OptionExample Parameters Provided"

let programHelp = [|"This is an example program for talking about option types."|]
let defaultBaseOptions = createNewBaseOptions "optionExample" "Does some thing with some stuff" programHelp defaultVerbosity
let defaultInputFile = 
    createNewConfigEntry "I" "Input File (Optional)" 
        [|"/I:<filename> -> full name of the file to use for input."|]
        ("OptionEssayExampleFile.txt", Option<System.IO.FileInfo>.None)

let loadConfigFromCommandLine (args:string []):OptionExampleProgramConfig =
    // Any of the usual "help me" spellings bails out early
    if args.Length>0 && (args.[0]="?"||args.[0]="/?"||args.[0]="-?"||args.[0]="--?"||args.[0]="help"||args.[0]="/help"||args.[0]="-help"||args.[0]="--help") then raise (UserNeedsHelp args.[0]) else
    let newVerbosity = ConfigEntry<_>.populateValueFromCommandLine(defaultVerbosity, args)
    let newConfigBase = {defaultBaseOptions with verbose=newVerbosity}
    let newInputFile = ConfigEntry<_>.populateValueFromCommandLine(defaultInputFile, args)
    {configBase = newConfigBase; inputFile=newInputFile}

let doStuff (opts:OptionExampleProgramConfig) =
    () // nothing to do yet

[<EntryPoint>]
let main argv = 
    try
        let opts = loadConfigFromCommandLine argv                
        commandLinePrintWhileEnter opts.configBase (opts.printThis)
        doStuff opts
        0 // remember to return an integer exit code
    with
        | :? UserNeedsHelp as hex ->
            printfn "%s: %s" defaultBaseOptions.programName hex.Data0
            printfn "========================"
            printfn "Command Line Options:"
            // Manually list program config entries here 
            0
        | :? System.Exception as ex ->
            System.Console.WriteLine ("Program terminated abnormally " + ex.Message)
            System.Console.WriteLine (ex.StackTrace)
            // Only print inner exception details when there is one
            if ex.InnerException <> null then
                System.Console.WriteLine("---   Inner Exception   ---")
                System.Console.WriteLine (ex.InnerException.Message)
                System.Console.WriteLine (ex.InnerException.StackTrace)
            1 // non-zero exit code for abnormal termination (exact value assumed)

This took me an hour! Why? Is my toolkit overblown? Do I not know anything at all about coding? Am I a moron? (Please don’t answer that last question). No. I had indented the first line one way and the rest of the code another. It took me 10-20 minutes to put the code in. The next 40 minutes I spent trying to figure out why my code wasn’t working.

My code was working. The spacing was off. Sometimes I say very unkind things about F#.

[Image: Looks like it’s working]

And look! We’re not even a third of the way into the boilerplate code and there’s an Option type. It’s this line:

("OptionEssayExampleFile.txt", Option<System.IO.FileInfo>.None)

Why do I need an option when all I’m doing is getting the input file? Because the input file might not exist. Why not just fail? Because that’s not the job of this code. This code just gets command-line parameters for whatever programs might need them. The programs themselves may fail — or not. That’s a program decision, not a decision for this library.

I’m working the outside of the onion, the part where my application touches the rest of the world. The rest of the world has unknowns and empty values! So the option type accurately reflects what might happen when I interact.
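To show the shape of that idea without my whole config library, here is a small sketch. `tryFindInputFile` is a made-up helper, not something from Utils. It asks the outside world about a file and honestly reports that the answer might be "nothing there."

```fsharp
open System.IO

// Hypothetical outer-layer helper. It makes no decision about what a
// missing file means; it only reports what the outside world said.
let tryFindInputFile (fileName: string) : FileInfo option =
    try
        let info = FileInfo fileName
        if info.Exists then Some info else None
    with _ ->
        // Null, empty, or otherwise unusable names count as "not there"
        None

let found = tryFindInputFile "OptionEssayExampleFile.txt"
printfn "Input file present: %b" found.IsSome
```

Notice there's no failing and no defaulting here. Whoever calls this gets to decide what `None` means for their program.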

What’s next? Well, I have my parameters loading up, and I know I’m working from the outside of the onion inwards. What if the input file doesn’t exist?

That’s a decision that’s not part of the outer layer. It’s part of layer 2. At this point I consume the option and decide what I want to do. Maybe I provide default data. Maybe I just go away. It varies — so it’s not an outside layer question.

        let inputFileDoesntExist = 
            (snd opts.inputFile.parameterValue).IsNone
            || not (System.IO.File.Exists(fst opts.inputFile.parameterValue))
        if inputFileDoesntExist then
            0 // no input file: nothing to do, so just go away quietly
        else
            doStuff opts
            0

Now I’m transitioning from the nature of the outside world to how I want my program to run. And I want my program to run like a clock. No muss, no fuss. When I type in “ls” on the Linux command line, it works. That’s what I want.

I’ve found that allowing option types past level 2 is basically a way of deferring important decisions until I’m in the middle of doing something else. This complicates things and is always a bad idea. I’m working on this other thing. Why the heck should I be concerned right now about whether the file is there or not? You get four or five option types floating around? They could take a simple five-line method and turn it into a 40-line logic monstrosity. That’s no fun. And it leads to crappy, muddled code with mixed responsibilities.
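A small sketch of the alternative: settle the question once, at level 2, and hand the inner code something definite. The names `linesOrEmpty` and `countLines` are made up, and "no file means no lines" is just one policy you might pick.

```fsharp
open System.IO

// Level 2: turn "maybe a file name" into plain lines, once.
// Assumed policy for this sketch: anything missing or unreadable
// becomes an empty array.
let linesOrEmpty (maybeFileName: string option) : string[] =
    match maybeFileName with
    | Some name when File.Exists name ->
        try File.ReadAllLines name with _ -> [||]
    | _ -> [||]

// Further in: no option in sight, nothing to branch on.
let countLines (lines: string[]) : int = lines.Length

printfn "%d" (countLines (linesOrEmpty None))
```

The inner function stays a one-liner because the decision already happened. Pass the option inward instead and every function along the way has to carry its own `match`.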

What’s the next step? Well I’m not going to do anything if there’s no file. What if there’s a file with spaces? Or bad lines? Or lines without the name-equals-number format?

Now I’m fully in level 2. I have successfully interacted with the outside world. I have some hunk of stuff in my hand that I have to do something with. Now I need to decide on how to clean, filter, sort, or replace data I don’t like. I’m transforming the outside world data into my application data.

There’s no right or wrong answer here. It’s up to you and your app. But you have to decide. Once we leave level 2, failure is not an option. That is, you only have stuff that you know you can process. So let’s add a little more code around our “doStuff” function. (It’s very important to use names that describe things. Ha!)

type OptionExampleFileLines = string[]

let makeStringListToProcess (fileName:string) :OptionExampleFileLines =
    try
        let textLines = System.IO.File.ReadAllLines fileName
        // Keep only the lines that have an equals sign in them
        let textLinesWithEquals = textLines |> Array.filter(fun x-> x.Contains("="))
        // Of those, keep only the lines shaped like name=number
        let textLinesWithEqualsAndANumberOnTheEnd = textLinesWithEquals |> Array.filter(fun x->
            let splitText = x.Split([|'='|])
            splitText.Length = 2
            && fst (System.Int64.TryParse(splitText.[1])))
        textLinesWithEqualsAndANumberOnTheEnd
    with
        | :? System.Exception as ex ->
            printfn "I am loading the file to process. I should never fail here, just return an empty array"
            [||]

let doStuff (opts:OptionExampleProgramConfig) =
    let linesToProcess=makeStringListToProcess (fst opts.inputFile.parameterValue)
    () // nothing useful happens with the lines yet

A few things to notice. First, I’ve added a type, OptionExampleFileLines. It doesn’t have a lot around it, but the day is still young. We’re just getting started. At level two we’re translating into our application types, so we need application types.
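Where might that type go as the day gets older? Here is one possible direction, purely as a sketch. `NameValue`, `tryParseLine`, and `toNameValues` are hypothetical names, and dropping bad lines is only one of the cleaning policies you could choose.

```fsharp
// Hypothetical application type: one clean name/number pair.
type NameValue = { name: string; value: int64 }

// Options are fine here: this is still level 2, where we decide
// what to do with junk from the outside world.
let tryParseLine (line: string) : NameValue option =
    let parts = line.Split([| '=' |])
    if parts.Length <> 2 then
        None
    else
        let name = parts.[0].Trim()
        let ok, number = System.Int64.TryParse(parts.[1].Trim())
        if ok && name <> "" then Some { name = name; value = number } else None

// The option is consumed right here. Callers only ever see clean data.
let toNameValues (lines: string[]) : NameValue[] =
    lines |> Array.choose tryParseLine

printfn "%A" (toNameValues [| "a=1"; "junk"; "b=x"; "c = 4" |])
```

The option shows up and disappears inside the same layer. Nothing downstream ever has to ask whether a line was any good.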

When I mentioned I was going to write about option types today, a friend said, “I use them in smart constructors.”

Smart constructors are a way to control how types are created so that you have more control over being sure the type isn’t going to blow up later on. (Apologies if I missed some details here.)
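For anybody who hasn't seen one, a smart constructor can look something like this. It's a minimal sketch of the general technique, not my friend's code, and the `Letter` type is made up to fit our letter-equals-number example.

```fsharp
// The case constructor is private, so code outside this file
// cannot build a Letter directly.
type Letter = private Letter of char

module Letter =
    // The smart constructor: the one sanctioned way in.
    // Returns None for anything that is not exactly one letter.
    let tryCreate (text: string) : Letter option =
        if isNull text || text.Length <> 1 || not (System.Char.IsLetter(text.[0])) then
            None
        else
            Some (Letter (System.Char.ToUpperInvariant(text.[0])))

    // Read the wrapped value back out.
    let value (Letter c) : char = c

printfn "%A" (Letter.tryCreate "a" |> Option.map Letter.value)
printfn "%A" (Letter.tryCreate "42" |> Option.map Letter.value)
```

Once you're holding a `Letter`, you know it passed the check. The option only exists at the moment of creation.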

The crazy thing is, we’re saying the same thing. My friend is saying, “Look! You can make constructors such that you always know your types will run on your application. You have tight control, and by adding it to the type system you’re creating a program that cannot fail.” Whereas I am saying, “Look! Once you begin interacting with the outside world, you’ll get messy things like null values and bad data. The first thing you have to do is add code to make sure your program cannot blow up.”

This is one of these things where you could end up violently agreeing. What you have to know is 1) It’s the same goal, and 2) You don’t have to choose one or the other. In fact, use both! There’s cleaning data to protect the program from the outside world, and then there’s cleaning data to protect the type system from bad data. Write some code to clean the data in general, then figure out where it should go.

After all, once I leave level 2, I want strongly typed data that I know won’t blow up. We’re headed the same way, the only difference is that my friend is looking at it from a type perspective and I’m looking at it from a data flow perspective. Remember! I’m always focusing on output. What are you doing for me? That drives structure, not the other way around.
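Here's a sketch of what that buys you for our little job of grouping by letter and totaling the numbers. The `NameValue` record and `totalByName` are hypothetical names for this example. There's not an option or a null check in sight, because the data arriving here is already known to be good.

```fsharp
// Hypothetical application type for clean name/number pairs.
type NameValue = { name: string; value: int64 }

// Past level 2: clean data in, totals out. Nothing to defend against.
let totalByName (entries: NameValue[]) : (string * int64)[] =
    entries
    |> Array.groupBy (fun entry -> entry.name)
    |> Array.map (fun (name, group) -> name, group |> Array.sumBy (fun entry -> entry.value))
    |> Array.sortBy fst

let sample =
    [| { name = "a"; value = 1L }
       { name = "b"; value = 2L }
       { name = "a"; value = 3L } |]

// Prints a=4 then b=2
totalByName sample |> Array.iter (fun (name, total) -> printfn "%s=%d" name total)
```

An empty array in gives an empty array out, so "there was no file" and "the file was all junk" both flow through here without any special handling.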

The other thing to notice is my huge names and wordy code. Couldn’t I collapse that? Isn’t functional programming always supposed to look like “foo |> bar |> foobar”?

Yes and no. Functional programming can look dang near like anything you want it to. The computer doesn’t care. The important thing is whether or not you can look at a piece of code and immediately understand what it’s doing. When I visited my TDD guru friends Bob and James, one of the problems I noticed was that as good programmers, they’d almost immediately start refactoring, collapsing stuff, making the code cleaner.

That was a bad idea, because with FP, it all kinda collapses into nothingness. I end up losing track of what I’m doing. What’s the code supposed to do for people? Instead I’m taking some function and making it disappear. (It’s a nice trick. It just doesn’t help me reason about usefulness or not. Instead I’m wallowing around in how cool FP is.) Could I take that “makeStringListToProcess” code and make it smaller? How about adding the program type right there, have it return a name/value collection? Move it off to a generic function that takes any file and only returns the name/value parts of it?
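Just so you can see what I'm resisting, here is roughly what one round of collapsing might produce. This is a sketch, not the project's code, and the names `isNameEqualsNumber` and `makeStringListToProcessCollapsed` are invented for the comparison.

```fsharp
// True only for lines shaped like name=number.
let isNameEqualsNumber (line: string) : bool =
    match line.Split([| '=' |]) with
    | [| _; number |] -> fst (System.Int64.TryParse(number))
    | _ -> false

// Same job as the wordy version: read the file, keep the good lines,
// and return an empty array if anything goes wrong.
let makeStringListToProcessCollapsed (fileName: string) : string[] =
    try System.IO.File.ReadAllLines fileName |> Array.filter isNameEqualsNumber
    with _ -> [||]

printfn "%d" (makeStringListToProcessCollapsed "OptionEssayExampleFile.txt").Length
```

It's shorter, sure. Whether it's easier to pick up again after a month away is the question I care about.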

The collapsing/refactoring game can go on almost forever, pushing outward towards generic IO functions, inwards towards new language additions, and up the type chain to a more structured program type system.

These are all wonderful and great things, and I’ll be pushing the hell out of this code once it starts doing something useful. Then I’ll use what it has to do (that’s useful) as a guideline for what to clean up first and how to clean it up. (Pushing as much as possible into the type system before you start is how you get Domain-Driven coding.) Until then, however, I want to read what I’m doing in nice, human language. I especially want to think through all the outer onion issues around processing data. I am a distracted, busy, forgetful, lazy programmer. A month from now I don’t want to load up the IDE and see something that looks like Klingon. Instead, I’ll refactor as I go and over time it all works out the same.

Finally, why not carry options into layer 3? What if I wanted to take the lines that had non-numeric values on them and output them to another file? What if I wanted to write a report for numeric entries and another for alpha entries?

This is where I got hung up a lot as an OO guy. What I was doing was focusing on the structure instead of the behavior. “I have this structure to read these files I want to do four or five things with. So I’ll keep the structure and just add in branches to do the other stuff.”

Nope nope nope nope nope. Remember the Unix Philosophy. “Write programs that do one thing and do it well.” What I was doing was trying to be lazy and force re-use by taking the same structure and making it do multiple things. A divided house cannot stand, and it’s enough to do one thing and do it well. Then do the next thing.

And surprise! We get re-use, just like we wanted! We just get it by using shared libraries that we develop over time. In fact, I have never seen code reusability work so well as I have using F#, and each time I re-use it, it keeps getting better, because I’m developing and factoring various functions based on several real-world behaviors they have to support.

That’s extremely cool.

Option types are great, and there are multiple ways of looking at programming that are all valid: type-driven, flow, constraint-driven, test-driven, and so on. At the end of the day, however, everybody’s trying to do the same thing. So when you hear somebody who works one way talk about something, you can betcha the same thing happens when you’re working another way. Coding is coding. The important thing is not to forget any of this important stuff simply because you’ve decided to use one method of coding over another.

Most of all, have fun! And make stuff people want.

I hate to do a crummy commercial, but this essay is already huge and I really haven’t explained some critical things. If you’re interested in why I chose the approach of looking at F# coding in terms of behavior, value, and flow instead of types, you should read my Info-Ops book. Programming is one of many forms of project information. Once you learn how to organize all of your project information, not only will you be a better programmer, your paperwork and BS reports and meetings will decrease. Plus you’ll have more fun and make better stuff.

Daniel Markham sucks at programming, but he still loves it. What he’s good at is helping groups of technical people make stuff people want. He teaches them to do that using an understanding of value creation, maximizing time with users, minimal tool and paperwork overhead, and good technical practices, including things like ATDD and TDD. He’s been a fan of F# since it first came out.

1: This is paraphrased from a comment on a previous essay.

July 12, 2018