The Two Kinds Of Technology Thinkers

Edward de Bono had his “Six Types of Thinking” hats to describe the different kinds of thinking that go into solving problems. Those are great, but two kinds of thinking that happen on every technology team are far more important: Platonic thinking and Pragmatic thinking.

Although most folks use both types of thinking, people have a favorite they rely on when problem-solving. It’s important to know what your “default setting” is. It’s also important to call out and identify each type of thinking as it’s used. Each type has huge benefits and huge drawbacks.

Platonic thinkers like to think of things in the abstract, in their pure form. They’re conceptual thinkers and work with the pure and generic form of things, the way they should be. Once they’ve oriented themselves in the abstract, they take what they’ve learned and try to make it work with real things.

Pragmatic thinkers may never orient themselves. In fact, some resist the idea that orienting yourself against abstractions is even a laudable goal. Instead, they think of things in the concrete, cause and effect. If under these circumstances I do this one thing? This other thing seems to happen a lot. I don’t know why. I probably don’t even have time to figure it out. I can use that implied causal relationship to make the other thing happen. Now I can move to the next problem.

Ever get excited about a new software platform, download and install it, only to have popular things work without a hitch and oddball things totally flake out? If so, you’ve been a victim of Platonic thinking. The general idea was good. The concepts are in place to do a great number of useful things. It’s just the actual application of those cool ideas was limited to things the developers thought were most important. What you have is a beautiful system of organization that looks like it might work but not all the little pieces required to prove that it actually does work.

Ever work with a piece of code that’s been around so long that nobody knows exactly what it does? Somebody asks for something trivial, like a new field on a report, and the team estimates it might take six months. Six months! What you have is a bunch of little pieces that all work separately that aren’t organized into anything that’s easy to understand and maintain.

You find this kind of thinking everywhere, not just in tech teams. Pick up some political essays about any random topic from any year in the past. Some essays will argue platonically: these are our values, these are the reasons for these things existing, these reasons come together in this way to create/limit another high-level concept. Some essays will argue pragmatically: We do this certain set of things because they fix these other things. No, they are inconsistent with one another, but they’ve been working up until now. Sometimes we’re not even sure why they work, but they do.

There is no right or wrong way of thinking. It’s important to be able to use both of them seamlessly when working with systems of any complexity. As an example, in TDD and OO code, we first start with a pragmatic question: what’s the smallest thing that we want this thing to do? Then we write a test. Then we make the test pass. Finally we refactor, making sure everything is in the right place. We’ve started pragmatically and moved to platonic thinking once we’ve established value.

Oddly enough, there are all sorts of problems trying to do it backwards, starting with platonic ideas of what your code “should” look like, what the pure and perfect form is, then moving to the nuts and bolts of things. Back when I first started OO I managed to do it several times, but more often than not working from platonic to pragmatic ends up with architecture astronaut syndrome and software that promises everything for everybody and ends up doing almost nothing for a very few people.

I’m no master of functional programming, but I suspect in true functional programming this pattern might be reversed. First we ask the platonic question, what’s the smallest/simplest structure/types that support the next thing we want to do? Then we ask the pragmatic question, how can I make that work without increasing code paths? (Run from the ZOMBIES!). Finally we get extremely pragmatic by covering our code paths with tests.

We get the word platonic from the Greek philosopher Plato. Plato believed in a universal set of truths. There is a chair. There is another chair. There exists a universal idea of “chair” that all chairs reflect. It is this universal set of ideas where truth lies. Everything else, all that we see, are merely shadows of that universal truth.

Plato’s top student, Aristotle, disagreed. He was more concerned with things in the real world, watching them, understanding them, creating catalogs of what he saw. He was more concerned with understanding each thing he experienced than speculating on what the universal pure form of something might be. And if he kept organizing his thoughts as he worked, didn’t he end up in basically the same place Plato was, only getting there from the bottom up instead of the top down?

So when you have these conversations, you are not the first. You are not alone. This conflict has been going on since the dawn of recorded history. I’ve always said that creating technology is applied philosophy. You walk into a new domain. You have to understand it and the people who live there. You have to take that understanding and create a workable and provable set of hypotheses that drive out an executable theory of operations to deliver real value. Then you code it and begin the science of “making and keeping users happy” that’s different for every domain/product pairing.

When creating a new science, both kinds of thinking are critical. Don’t get stuck in a rut — and always remember the weakness of whichever one you’re currently using.

Hi, I’m Daniel Markham. I wrote a book called Info-Ops that talks about how to have conversations and organize what we’re doing so that we build the right thing without a lot of the usual BS. As it turns out, looking at what we do from an information standpoint tells us a lot more than simply talking about what activities people do every day or how various tools are configured or used.

June 13, 2018

Why Don’t Organizations Use Their Own Defect-Tracking Systems?

Picture this: you’re working on a critical application for your company, used by countless people around the world. This morning, as the new update rolls out, a user in Detroit pushes the main button on your app — and nothing. Your app hangs. Something went wrong. Now nobody can use your app.

Now picture this: you walk into a new team. You’re the person who knows WhizzBang 7.0, and the team desperately needs your help. The new update is failing! You grab a seat and ask for a working computer.

But they can’t give you a working computer. Security policy says only the person assigned to each computer is allowed to use it. So you ask for your own computer. But that’s going to take a couple of days to sort out — the infrastructure staff is slammed right now with customer complaints from some kind of app deployment problem. You ask if it’s possible to get a new login on an existing computer. It’s possible, but usually takes a few hours. So then you ask if somebody could code on their computer while you tell them what to do. That works, but it’s technically against policy. Pair-programming has not been approved yet for all dev teams.

Both of these situations involve defects. The first defect is about a deployed product. People are expecting value from your product and are only finding frustration. Everybody’s familiar with that one. The second defect is about a broken organization. The people who make things happen in your organization are expecting to keep creating value and grow your business…but they are only finding frustration.

Why Don’t Organizations Use Their Own Defect-Tracking Systems?

I can only come up with three reasons:

  • Organizations actively avoid unpleasant conversations – Logging organization defects would require pointing out a lot of places where the emperor has no clothes. Then somebody would have to work on each item. There would be meetings, discussions. They would not be fun.
  • Organizations are lax about making people responsible for stuff – If you have an app in your hand and the button doesn’t work, there’s a team (or teams) responsible for that app. There’s probably even a person responsible for that button, or at least somebody who’s an expert in helping you meet your goals. In good organizations, everybody is either responsible for everything or they are clearly assigned to help users or developers meet specific and defined goals. In poor organizations, problems are swept up, organized by some metric orthogonal to value like technology or architectural tier, given a title, and assigned to somebody. What does Joe do? Joe’s the DBA Team Lead. That’s fine, but what does Joe do? Things that involve databases? Without a connection to value delivery, how do I work with that? In this situation, everybody is responsible for nothing.
  • There’s no defect taxonomy – This is a little better. So you decide to start logging org defects. How do you group and classify them? It’s not an impossible thing, but it’s something only a few organizations are familiar with.

Why are defect-tracking systems only good for the users and not ourselves? Isn’t it much more important to make sure the organization is running well? Doesn’t fixing that prevent a lot of the defects that our users end up finding downstream? Isn’t it much more important to fix and optimize the machine that makes stuff people want instead of constantly playing whack-a-mole with bad stuff it’s made?

May 28, 2018

Technical Debt Edge Cases

Is Technical Debt always bad?

Everybody talks about Technical Debt. Most of the time it’s considered to be A. Bad. Thing.

Nobody talks about when it might not be a bad thing, or when some folks think Technical Debt exists and it doesn’t.

April 4, 2016

Technical Story Slicing 3 of 3

Too many times user stories and backlogs are taught at such a high level of abstraction that folks can’t get value from them. So let’s take a real project, developed on AWS using Microservices, and walk through how the backlog is created, prioritized, and delivered: the whole thing. Including code. Due to space limitations, it will be a personal project done as a hobby over a few weeks.

The last of our videos runs about 35 minutes. It’s the third in a three-part series, all of which is published on this blog. Technologies covered to some degree include F#, Mono, Ubuntu, AWS, TDD, DevOps, Error Handling, scoping, MVP, and debugging. (Because this is a front-to-back real-world example, none of these are covered in depth. Although we see quite a bit of code, the videos are suitable both for programmers and for business people who have to interact with programmers.)

November 6, 2015

Technical Story Slicing 2 of 3

Too many times user stories and backlogs are taught at such a high level of abstraction that folks can’t get value from them. So let’s take a real project, developed on AWS using Microservices, and walk through how the backlog is created, prioritized, and delivered: the whole thing. Including code. Due to space limitations, it will be a personal project done as a hobby over a few weeks.

This second video runs about 17 minutes. It’s the second in a three-part series, all of which is published on this blog. Technologies covered to some degree include F#, Mono, Ubuntu, AWS, TDD, DevOps, Error Handling, scoping, MVP, and debugging. (Because this is a front-to-back real-world example, none of these are covered in depth. Although we see quite a bit of code, the videos are suitable both for programmers and for business people who have to interact with programmers.)

November 5, 2015

Technical Story Slicing 1 of 3

Too many times user stories and backlogs are taught at such a high level of abstraction that folks can’t get value from them. So let’s take a real project, developed on AWS using Microservices, and walk through how the backlog is created, prioritized, and delivered: the whole thing. Including code. Due to space limitations, it will be a personal project done as a hobby over a few weeks.

The video runs about 15 minutes. It’s the first in a three-part series, all of which is published on this blog. Technologies covered to some degree include F#, Mono, Ubuntu, AWS, TDD, DevOps, Error Handling, scoping, MVP, and debugging. (Because this is a front-to-back real-world example, none of these are covered in depth. Although we see quite a bit of code, the videos are suitable both for programmers and for business people who have to interact with programmers.)

November 4, 2015

Potemkin Village Agile

An interesting trend I’m noticing as Agile becomes more popular is seeing teams that say and do things that look good, but don’t seem to be accomplishing much or going anywhere. There are two types of these teams: Cargo Cult teams and Potemkin Village teams. You’ve probably heard of the first one, but the second one might be new to you.

“Potemkin Village Agile” is where everything is set up to appear as if the team has a high level of agility, but in fact it’s just another old-school project. Note that this is different from Cargo Cult Agile. Cargo Cult Agile is where teams go through the motions without understanding what they’re doing, because they think that somehow magic will happen and things will get better.

Potemkin Village Agile is under no such illusions. The goal here is just to create a nice-looking facade so that managers or whoever else will get off the team’s back so they can “get some real work done”.

These are the guys that show up at the nightclub in the cool car and expensive clothes — all of it rented. They’re the ones who want to appear smart online on a certain topic so they Google something and just rehash what Wikipedia says. These are Agile posers. In some ways, as long as they’re honest with themselves, they’re actually much better off than the Cargo Cult guys, because at least they’re under no illusion that any of it is going to amount to much.

Cargo Culters, on the other hand, are true believers who want to make it look like X because once it looks like X all of our problems will be solved. Potemkin Villagers are just putting on a puppet show for anybody who comes by to visit.

In either case, it’s not unusual to be presented with a team that looks like it’s doing the right things but where performance sucks and folks outside the team don’t understand why. I thought it would be interesting to put together a quick list to sort out the Cargo Cult/Potemkin Village folks from the folks who may just be having a bad sprint or two. [Standard Disclaimer: As with any other list, I’m not saying teams have to conform to any of this. I’m not saying these things describe the perfect Agile team.] If you’re presented with a team that’s supposed to have a lot of agility but isn’t getting the results they’re supposed to get, take a look at this list and see what matches up and what doesn’t. You might have a CC or PVA situation on your hands.

  1. Is the team physically working alongside each other, or are they retreating to cubes or solo areas?
  2. During whatever daily chat the team has, are they focused on advancing the work, or finding work to fit whatever job roles each of them thinks they have?
  3. Do people code together, writing tests before they write solutions? And by “together”, I mean: is there a customer in the room working along with everybody else?
  4. Is the room quiet, like a library, where everybody is in their own cave, or is there a constant low-key burble while people are having a good time?
  5. Are these people you would want to hang out with?
  6. Does the team talk about important things like helping people, or are they focused on tooling and process? Does most of the conversation revolve around solutions and benefits?
  7. Is everybody automatically “sharpening the saw” — making the build faster, refactoring old code, figuring out how not to have the same config issue twice — or is everybody just doing whatever is directly put in front of them without thinking of the bigger picture?
  8. Does this team look like a team that should be trusted by the people who are paying them? James Bonds or stapler guys?
  9. Does one person dominate everything, or do team members switch off during the day, leading or following as necessary to keep momentum going?
  10. Do team members easily admit ignorance and weakness to each other?

If you’re working with a Cargo Cult team, there’s a ton of literature out there about things to do. No need to rehash that here.

Oddly enough, I’ve seen several PVA teams that eventually turned out to be teams embracing real agility. It kinda goes like this: if they’re honest enough to make a goal of simply making it look good without lying to themselves or others, then they usually end up doing some things like standups or mobbing to make things look good. Funny thing, if you have an open mind and don’t actually expect anything, good or bad, pretty soon good things will happen, and you’ll be in a good mental place to take advantage of them. Whereas if you’re a Cargo Cult team, you’re thinking very rigidly. Many times you are unable to see good stuff right in front of you because you’re focused on the wrong things.

If you’re working with a potential PVA team, you need to take some time to figure out whether or not they’re BSing all outsiders, i.e. a true PVA situation, or whether some folks are posers and some folks are Potemkin Villagers. The two groups have completely different ideas of where they are and what needs to happen — and each group needs to be treated differently. Complicating things is the fact that it’s not unusual to have a mixed team. (A naive suggestion would be something like “Well just sit them down and ask them all to identify with which group they’re in”. Problem: PVA folks, by definition, just want to make it look good to outsiders. They are unlikely to want to confront other team members, much less some outsider, with their belief that it’s all just a charade.) Fun times.

And if you’re on a team full of posers? The trick with being a poser is just to have fun and be honest about it! The more honesty and fun you bring to it, the better chance you might accidentally end up doing some pretty cool stuff.

August 26, 2015

Real World F# Programming Part 2: Types

Ran into a situation last week that showed some more of the differences facing OO programmers moving to F#.

So I’ve got two directories. The program’s job is to take the files from one directory, do some stuff, then put the new file into the destination directory. This is a fairly common pattern.

To kick things off, I find the files. Then I try to figure out which files are in the source directory but not in the destination directory. Those are the ones I need to process. The code goes something like this:

doStuff, the initial version
    let doStuff (opts:RipProcessedPagesProgramConfig) =
        let sourceDir = new System.IO.DirectoryInfo(opts.sourceDirectory.parameterValue)
        let filesThatMightNeedProcessing = sourceDir.GetFiles()
        let targetDir = new System.IO.DirectoryInfo(opts.destinationDirectory.parameterValue)
        let filesAlreadyProcessed = targetDir.GetFiles()
        let filesToProcess = filesThatMightNeedProcessing |> Array.filter(fun x->
            (filesAlreadyProcessed |> Array.exists(fun y->x.Name=y.Name)
            )
        )
        // DO THE "REAL WORK" HERE
        printfn "%i files to process" filesToProcess.Length
        ()

So I plopped this code into a couple of apps I’ll code later, then I went to work on something else for a while. Since it’s all live — but not necessarily visible to anybody — a few days later I took a look to see if the app thought it had any files to process.

It did not.

Now, of course, I can see that my Array.filter is actually backwards. I want to take the filesThatMightNeedProcessing and eliminate the filesAlreadyProcessed. What’s remaining are the filesToProcess. Instead, I check to see if the second set exists in the first. It does not, so the program never thinks there is anything to do. Instead of Array.exists, I really need something like Array.doesNotExist.

So is this a bug?

I’m not trying to be cute here, but I think that’s a matter of opinion. It’s like writing SQL. I described a transform. The computer ran it correctly. Did I describe the correct transform? Nope. But the code itself is acting correctly. I simply don’t know how many files might need processing. There is no way to add a test in here. Tests, in this case, would exist at the Operating System/DevOps level. So let’s put off testing for a bit, because it shouldn’t happen here. If your description of a transform is incorrect, it’s just incorrect.

So I need to take one array and “subtract” out another array — take all the items in the first array and remove those items that exist in the second array. Is there something called Array.doesNotExist?

No there is not.

Meh.

Ok. What kind of array do I have? Intellisense tells me it’s a System.IO.FileInfo[]

My first thought: this cannot be something that nobody else has seen. I’m not the first person doing this. This is just basic set operations. So I start googling. After a while, I come across this beautiful class called, oddly enough, “Set”. It’s in Microsoft.FSharp.Collections. Damn, it’s a sweet-looking class. It’s got superset, subset, contains, difference (which I want). It’s got everything.

So, being the “hack it until it works” kind of guy that I am, I look at what I have: an array of these FileInfo things. I look at what I want: a set. Can’t I just pipe one to the other? Something like this?

[Screenshot: piping the FileInfo array into a Set, and the compiler error about System.IComparable that results]

What the hell? What’s this thing about System.IComparable?

In order for the Set module to work correctly, it needs to be able to compare items inside your set. How can it tell if one thing equals another? All it has is a big bag of whatever you threw in there. Could be UI elements, like buttons. How would you sort buttons? By color? By size? There’s no right way. Integers, sure. Strings? Easy. But complex objects, like FileInfo?

Not so much.
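To make the comparison requirement concrete, here is a minimal sketch. The file names are made up for illustration and it is not the code from my app. Strings already know how to compare themselves, so a Set of names works without any extra ceremony, while the commented-out line shows the kind of thing the compiler rejects.

```fsharp
// Strings support comparison, so a Set of (made-up) file names works out of the box.
let sourceNames = set [ "a.txt"; "b.txt"; "c.txt" ]
let processedNames = set [ "b.txt" ]

// Set.difference keeps what is in the first set but not in the second.
let remainingNames = Set.difference sourceNames processedNames

printfn "%A" remainingNames

// By contrast, this line would not compile, because System.IO.FileInfo
// does not implement System.IComparable:
// let broken = System.IO.DirectoryInfo(".").GetFiles() |> Set.ofArray
```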

As it turns out, this is a common pattern. In the OO world, we start with creating a type, say a Plain Old Java Object, or POJO. It’s got a constructor, 2 or 3 private members, some getters and setters, and maybe a few methods. Life is good.

But then we want to do things. Bad things. Things it was never meant to do. Things involving other libraries. Things like serialize our object, compare it to others, add two objects together. It’s not enough that we have a new type. We need to start fleshing out that type by supporting all sorts of standard methods (interfaces). If we support the right interface, our object will work magically with people who write libraries to do things we want.

Welcome to life in the world of I-want-to-make-a-new-type. Remember that class you had with three fields? Say you want to serialize it? You add in the interface IPersist. Now you have a couple more methods to fill out. Have some resources that must be cleaned up? Gotta add in IDisposable. Now you have another method to complete. Handling a list of something somebody else might want to walk? Plop in IEnumerable. Now you have even more methods to complete.
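As a rough illustration of how quickly the ceremony piles up, here is a sketch of a wrapper type that can live in a Set. NamedFile and its fields are hypothetical names invented for this example. Just to say "two files are the same if their names match", the type needs an Equals override, a GetHashCode override, and an IComparable implementation.

```fsharp
// Hypothetical wrapper type, for illustration only.
[<CustomEquality; CustomComparison>]
type NamedFile =
    { Name : string
      Length : int64 }
    // Two NamedFile values are equal when their names match.
    override this.Equals(other) =
        match other with
        | :? NamedFile as o -> this.Name = o.Name
        | _ -> false
    // Hash codes must agree with Equals, so hash only the name.
    override this.GetHashCode() = hash this.Name
    // Set needs an ordering, which comes from IComparable.
    interface System.IComparable with
        member this.CompareTo(other) =
            match other with
            | :? NamedFile as o -> compare this.Name o.Name
            | _ -> invalidArg "other" "cannot compare values of different types"

// Same name, different lengths: the Set treats them as one item.
let files = set [ { Name = "a.txt"; Length = 1L }; { Name = "a.txt"; Length = 2L } ]
printfn "%i distinct files" (Set.count files)
```

That is three members of plumbing before any actual problem gets solved.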

This is life in OO-land and frankly, I like it. There’s nothing as enjoyable as creating a new type and then fleshing it all out with the things needed to make it part of the ecosystem. Copy constructors, operator overloads, implicit conversion constructors. I can spend, and have spent, all day or a couple of days creating a fully-formed, beautiful new type for the world, as good as any of the CLR types. Rock solid stuff.

But.

Funny thing, I’m not actually solving anybody’s problem while I’m doing this. I’m just fulfilling my own personal need to create order in the world. Might be nice for a hobby, but not so much when I’m supposed to stay focused on value.

There’s also the issue of dependencies which is the basis for much of the pain and suffering in OO world. Now that my simple POJO has a dozen interfaces and 35 methods, what the hell is going on with the class when I create method Foo and start calling it? Now I’ve got all these new internal fields like isDirty or versionNum that are connected to everything else.

You make complex objects, you gotta do TDD. Otherwise, you’re just playing with fire. Try putting a dozen or so of these things together. It works this time? Yay! Will it work next time? Who knows?

This is the bad part of OO — complex, hidden interdependencies that cause the code to be quite readable but the state of the system completely unknown to a maintenance programmer. (Ever go down 12 levels in an object graph while debugging to figure out what state something is in? Fun times.)

So my OO training, my instinct, and libraries themselves, they all want me to create my own type and start globbing stuff on there. This is simply the way things are done.

DO NOT DO THIS.

Instead, FP asks us a couple of questions: First, do I really need to change my data structures? Because that’s going to be painful.

No. Files are put into directories based on filename. You can’t have two files in the same directory with the same name. So I already have the data I need to sort things out. Just can’t figure out how to get to it.

Second: What is the simplest function I can write to get what I need?

Beats me, FP. Why do you keep asking questions? Look, I need to take what I have and only get part of the list out.

I spent a good hour thrashing here. You get used to this. It’s a quiet time. A time of introspection. I stared out the window at a dog licking its butt. I wanted to go online and find somebody who was wrong and get into a flame war, but I resisted. At some point I may have started drooling.

In OO you’re always figuring out where things go and wiring stuff up. Damn you’re a busy little beaver! Stuff has to go places! Once you do all the structuring and wiring? The code itself is usually pretty simple.

In FP you laser directly in on the hard part: the code needed to fix the problem. Aside from making sure you have the data you need, the hell with structure. That’s for refactoring. But this means that all the parts are there at one time. Let me repeat that. THE ENTIRE PROBLEM IS THERE AT ONE TIME. This is a different kind of feeling for an OO guy used to everything being in its place. You have to think in terms of data structure and function structure at the same time. For the first few months, I told folks I felt like I was carrying a linker around in my head. (I still do at times)

Eventually I was reduced to muttering to myself “Need to break up the set. Need to break up the set.”

So I do what I always do when I’m sitting there with a dumb look on my face and Google has failed me: I started bringing up library classes, then hitting the “dot” button, then having the IDE show me what that class could do.

I am not proud of my skills. But they suffice.

Hey look, the Array class also has Array.partition, which splits up an array. Isn’t that what I want? I need to split up an array into two parts: the part I want and the part I do not want. I could have two loops. On the outside loop, I’ll spin through all the files in the input directory. In the inside loop, I’ll see if there’s already a file with the same name in the output directory. The Array.partition function will split my array in two pieces. I only care about those that exist in the input but not the output. Something like this:

Hey! It works!
    let doStuff (opts:RipProcessedPagesProgramConfig) =
        let sourceDir = new System.IO.DirectoryInfo(opts.sourceDirectory.parameterValue)
        let filesThatMightNeedProcessing = sourceDir.GetFiles()
        let targetDir = new System.IO.DirectoryInfo(opts.destinationDirectory.parameterValue)
        let filesAlreadyProcessed = targetDir.GetFiles()
        let filesToProcessSplit = filesThatMightNeedProcessing |> Array.partition(fun x->
            (filesAlreadyProcessed |> Array.exists(fun y->y.Name=x.Name))
            )
        let filesToProcess = snd filesToProcessSplit

        // DO THE "REAL WORK" HERE
        printfn "%i files to process" filesToProcess.Length
        ()

Well I’ll be danged. Freaking A. That’s what I needed all along. I didn’t need a new class and a big honking type system hooked into it. I just needed to describe what I wanted using the stuff I already had available. My instinct to set up structures and start wiring stuff would have led me to OO/FP interop hell. Let’s not go there.
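If the fst/snd business looks opaque, here is a small stand-alone sketch of what Array.partition hands back. The file names are invented for the example. The first array holds the items where the test returned true, the second holds the rest, which is why the code above takes snd.

```fsharp
// Made-up names standing in for the two directories.
let sourceNames = [| "a.txt"; "b.txt"; "c.txt" |]
let processedNames = [| "b.txt" |]

// partition returns (items where the test is true, items where it is false).
let alreadyDone, stillToDo =
    sourceNames
    |> Array.partition (fun name -> processedNames |> Array.exists (fun p -> p = name))

printfn "already done: %A" alreadyDone
printfn "still to do: %A" stillToDo
```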

So if I’m not chasing things down to nail them in exactly one spot, how much should I “clean up”, anyway?

First, there’s Don’t Repeat Yourself, or DRY. Everything you write should be functionally-decomposed. There’s no free ride here. The real question is not whether to code it correctly, it’s how much to genericize it. All those good programming skills? They don’t go anywhere. In fact, your coding skills are going to get a great workout with FP.

I have three levels of re-use.

First, I’ll factor something out into a local structure/function in the main file I’m working with. I’ll use it there for some time — at least until I’m happy it can handle different callers under different conditions. (Remember it’s pure FP. It’s just describing a transform. Complexity is bare bones here. If you’re factoring out 50-line functions, you’re probably doing something wrong.)

Second, once I’m happy I might use it elsewhere, and it needs more maturing, I’ll move it up to my shared “Utils” module, which lives across all my projects. Then it gets pounded on a lot more, usually telling me things like I should name my parameters better, or handle weird OS error conditions in a reasonable way callers would expect. (You get a very nuanced view of errors as an FP programmer. It’s not black and white.)

Finally, I’ll attach it to a type somewhere. Would that be some kind of special FileInfo subtype that I created to do set operations?

Hell no.

As I mature the function, it becomes generic, so I end up with something that subtracts one kind of thing from another. In fact, let’s do that now, at least locally. That’s an easy refactor. I just need a source array, an array to subtract, and a function that can tell me which items match.

subtractArrays. Good enough for now.
    let subtractArrays sourceArray arrayToSubtract f =
        let itemSplit = sourceArray |> Array.partition(fun x->
            (arrayToSubtract |> Array.exists(fun y->(f x y)))
            )
        snd itemSplit

    let doStuff (opts:RipFullPagesProgramConfig) =
        let sourceDir = new System.IO.DirectoryInfo(opts.sourceDirectory.parameterValue)
        let filesThatMightNeedProcessing = sourceDir.GetFiles()
        printfn "%i files that might need processing" filesThatMightNeedProcessing.Length
        let targetDir = new System.IO.DirectoryInfo(opts.destinationDirectory.parameterValue)
        let filesAlreadyProcessed = targetDir.GetFiles()
        printfn "%i files already processed" filesAlreadyProcessed.Length
        let filesToProcess = subtractArrays filesThatMightNeedProcessing filesAlreadyProcessed (fun x y->x.Name=y.Name)
        printfn "%i files to process" filesToProcess.Length
        ()

Note the lack of types. Do I care what kind of array either the source or the one to subtract is? No. I do not. All I care is if I can distinguish the items in them. Hell, for all I care one array can be that System.IO.FileInfo thing, the other array can be filenames. What does it matter to the problem I’m solving?
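To make the "one array of files, one array of filenames" point concrete, here is a minimal sketch. The SourceFile record and the sample data are made up for illustration (a stand-in for System.IO.FileInfo so the snippet runs without touching the disk), and subtractArrays is restated in a slightly more compact form so the block stands alone.

```fsharp
// Same behavior as subtractArrays above, restated so this sketch is self-contained.
let subtractArrays sourceArray arrayToSubtract f =
    sourceArray
    |> Array.partition (fun x -> arrayToSubtract |> Array.exists (fun y -> f x y))
    |> snd

// Hypothetical record standing in for System.IO.FileInfo.
type SourceFile = { Name: string; SizeBytes: int64 }

let source =
    [| { Name = "a.html"; SizeBytes = 10L }
       { Name = "b.html"; SizeBytes = 20L }
       { Name = "c.html"; SizeBytes = 30L } |]

// The other side is nothing but bare file names.
let alreadyProcessed = [| "b.html" |]

// The matcher is the only place that knows the two arrays hold different things.
let remaining =
    subtractArrays source alreadyProcessed (fun file name -> file.Name = name)

printfn "%i files to process" remaining.Length
```

The compiler infers a signature along the lines of 'a[] -> 'b[] -> ('a -> 'b -> bool) -> 'a[], which is exactly the "I don't care what's in them" property.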

What’s that sound? It’s the sound of some other FP guy busy at his computer, sending me a comment about how you could actually do what I wanted in 1 line of code. That’s fine. That’s the way these things work, and it’s why you don’t roll things up into types right away. Give it time. The important thing was that I stayed pure FP: no new data, no mutable fields, no for/next loops. The only things my lambdas capture are immutable arguments. As long as I stay clean, the code will continue to “collapse down” as it matures. Fun stuff. A different kind of fun than OO.

So where would this code end up, assuming it lives to become something useful and re-usable? In the array type, of course. Over time, functions migrate up into CLR types. If I want a random item from an array? I just ask it for one. Here’s the code for that.

Make Arrays give you random items
    type 'a ``[]`` with
        member x.randomItem =
            let rnd = new System.Random()
            let idx = rnd.Next(x.Length)
            x.[idx]

Let me tell you, that was a painful function to work through! Happy I don’t have to ever worry about it again. Likewise, if I need to know how many times one string is inside another? I’ve got a string method for that. Basically anything I need to use a lot, I’ve automated it.
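For the "how many times is one string inside another" case, a sketch of what such a string extension might look like follows. The name CountOccurrences and the choice to count non-overlapping, ordinal matches are my assumptions for illustration; the post doesn't show the real method.

```fsharp
type System.String with
    // Hypothetical helper: counts non-overlapping occurrences of `value` in the string.
    member x.CountOccurrences (value: string) =
        if System.String.IsNullOrEmpty(value) then 0
        else
            // Walk forward from each hit, carrying the running count along.
            let rec loop (start: int) (count: int) =
                let idx = x.IndexOf(value, start, System.StringComparison.Ordinal)
                if idx < 0 then count
                else loop (idx + value.Length) (count + 1)
            loop 0 0

printfn "%i" ("banana".CountOccurrences("an"))
```

Once something like this lives on the type, it shows up in IntelliSense next to the built-in members and becomes one more symbol to think with.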

Over time, this gives me 40-50 symbols to manipulate in my head to solve any kind of problem. So while the coding part makes my brain hurt more with FP, maintenance and understanding of existing code is actually much, much easier. And with pure FP, everything I need is right there coming into the function. No dependency hell when I debug. It’s all right there in the IDE. Not that I debug using the IDE that much.

So does that mean I never create new types? Not at all! But that’s a story for another day…

 

 

September 15, 2014

Real World F# Programming Part 1: Structuring Your Solution

I have several “side project” apps I work on throughout the year. One of those is Newspaper23.com. I have a problem with spending too much time online. Newspaper23 is supposed to go to all the sites I might visit, pull down the headlines and a synopsis of the article text. No ads, no votes, no comments, no email sign-ups. Just a quick overview of what’s happening. If I want more I can click through. (Although I might remove external links at a later date)

Right now newspaper23.com pulls down the headlines from the sites I visit and attempts to get text from the articles. It’s not always successful: it probably gets enough text for my purposes about 70% of the time. There’s some complicated stuff going on. And it doesn’t get the full text or collapse it into a synopsis yet. That’s coming down the road. But it’s a start. It goes through about 1600 articles a day from about a hundred sites. People are always saying F# isn’t a language for production, and you have to have some kind of framework/toolset to do anything useful, but that’s not true. I thought I’d show you how you can do really useful things with a very small toolset.

I’m running F# on Mono. The front-end is HTML5 and jQuery. There is no back-end. Or rather, the back end is text files. Right now it’s mostly static and supports fewer than a thousand users, but I plan on making it interactive and being able to scale up past 100K users. Although I have a brutally-minimal development environment, I don’t see any need to change the stack to scale up massively. Note that this app is a part-time hobby where I code for a few days just a couple of times a year. I don’t see my overall involvement changing as the system scales either. Server cost is about 40 bucks a month.

I come from an OO background so all of this is directed at all you Java/.NET types. You know who you are 🙂

 

Code Snippet
    module Types
        open HtmlAgilityPack
        open System.Text.RegularExpressions
        // Extend every array type with a randomItem property
        type 'a ``[]`` with
            member x.randomItem =
                let rnd = new System.Random()
                let idx = rnd.Next(x.Length)
                x.[idx]
        type System.String with
            // True if the string contains any of the given substrings
            member x.ContainsAny (possibleMatches:string[]) =
                let ret = possibleMatches |> Array.tryFind(fun y->
                    x.Contains(y)
                    )
                ret.IsSome
            // True if the string matches any of the given regex patterns
            member x.ContainsAnyRegex(possibleRegexMatches:string[]) =
                let ret = possibleRegexMatches |> Array.tryFind(fun y->
                    let rg = new System.Text.RegularExpressions.Regex(y)
                    rg.IsMatch(x)
                    )
                ret.IsSome

(Yes, there will be code in this series)

The game here is small, composable executables. FP and small functions mean less code. Less code means fewer bugs. Deploy that smaller amount of code in smaller chunks and that means less maintenance. Little stand-alone things are inherently scalable. Nobody wonders whether they can deploy the “directory function” on multiple servers. Or the FTP program. Big, intertwined things are not. Start playing around sometime with sharding databases and load balancers. Ask some folks at the local BigCorp if they can actually deploy their enterprise software on a new set of servers easily.

I’ve found that you’ll write 4 or 5 good functions and you’re done with a stand-alone executable. In OO world you’ll spend forever wiring things up and testing the crap out of stuff to write the same code spread out over 8 classes. Then, because you’ve already created those classes, you’ll have a natural starting point for whatever new functionality you want. Which is exactly the opposite direction FP takes you in. In OO, the more you try to do, the more structure you have, the more structure, the more places to put new code, the more new code, the more brittle your solution. (And I’m not talking buggy, I’m talking brittle. This is not a testing or architecture issue. Composable things have composable architectures. Static graphs do not. [Yes, you can get there in OO, but many have ventured down this path. Few have arrived.])

The O/S or shell joins it all together. That’s right, you’re writing batch files. Just like a monad handles what happens between the lines of code in an imperative program, the shell is going to handle what happens between the composable programs in your project. A program runs. It creates output. The shell runs it at the appropriate time, the shell moves that output to where it can be processed as input by the next program, the shell monitors that the executable ran correctly. There is no difference between programming at the O/S level and at the application level. You work equally and in the same fashion in both.

Ever delete your production directory? You will. Having it all scripted makes this a non-event. You make this mistake once. Then you never make it again. DevOps isn’t some fashionable add-on; it’s just the natural way to create solutions. Automate early, and continue automating rigorously as you go along. DRY applies at all levels up and down the stack.

Each file has its own script file. There are config files, and cron hooks it all up. This means CPU load, data size, and problem difficulty are all separate issues. Some of you might be thinking, isn’t this just the bad old batch days? Are we killing transactional databases? No. In a way, the “bad old days” never left us. We just had little atomic batches of 1 that took an indeterminate amount of time because they had to traverse various object graphs across various systems. We can still have batches of 1 that run immediately, or batches of 10 that run every minute, or batches of n that run every 5 minutes. It’s just that we’re in control. We’re simply fine-tuning things.

Note that I’m not saying don’t have a framework, database, or transactions. I’m saying be sure that you need one before you add one in. Too often the overhead from our tools outweighs the limited value we get, especially in apps, startups, and hobby projects. If you’re going to have a coder or network guy tweaking this system 3 years from now, it’s a different kind of math than if you’re writing an app on spec to put in the app store. Start simple. Add only the minimum amount you need, and only when you need it.

One of the things you’re going to need that’s non-negotiable is a progressive logging system that’s tweaked from the command line. Basically just a printfn. I ended up creating something with about 20 LOC that works fine. Batch mode? All you need to see is a start/stop/error summary. Command line debug? You might need to see everything the program is doing. Remember, the goal: as much as possible, you should never touch the code or recompile. Every time you open the code up is a failure. If you can make major changes to the way the system works from the shell, you’re doing it right. If you can’t, you’re not.

Many of you might think that you’re turning your code into a mess of printf statements like the bad old days. But if that’s what you’re doing, you’re doing it wrong. Each program shouldn’t have that many flags or things to kick out. Remember: you’re only doing a very small thing. For most apps, I’d shoot for fewer than 300 LOC, and no more than 10-20 code paths. All code paths can be logged off or on from the command line. To make this workable in a command-line format without creating a “War and Peace” of option flags means cyclomatic complexity must be fairly low. Which also keeps bugs down. Natch.
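To give a feel for the size of thing I mean, here is a rough sketch of a command-line-driven logger. This is not my actual 20 lines: the Verbosity names, the "-v:N" flag, and the choice to write to stderr are all assumptions made for the example.

```fsharp
// Verbosity levels, lowest to highest. Names are illustrative.
type Verbosity =
    | Silent = 0
    | Summary = 1   // start/stop/error lines, suited to batch runs
    | Detail = 2    // everything, for command-line debugging

// Read a hypothetical "-v:N" flag; fall back to Summary when it is absent or malformed.
let parseVerbosity (args: string[]) =
    args
    |> Array.tryPick (fun (a: string) ->
        if a.StartsWith("-v:") then
            match System.Int32.TryParse(a.Substring(3)) with
            | true, n when n >= 0 && n <= 2 -> Some (enum<Verbosity> n)
            | _ -> None
        else None)
    |> Option.defaultValue Verbosity.Summary

// Pure decision: should a message at `level` be shown under `configured`?
let shouldLog (configured: Verbosity) (level: Verbosity) =
    level <> Verbosity.Silent && level <= configured

// The "basically just a printfn" part. Writes to stderr so stdout stays clean for piping.
let log (configured: Verbosity) (level: Verbosity) (msg: string) =
    if shouldLog configured level then eprintfn "%s" msg
```

A program would call parseVerbosity once at startup, partially apply log with the result, and pass that function around. Switching from batch summary to full debug output is then a change in the script that launches it, not a recompile.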

Of course Windows and various OO programs and frameworks have logging features too, but they run counter to the way things are done here. Usually you have to plug into a logging subsystem, throw things out, then go to some other tool, read the messages, do some analysis, and so on. In Linux with command-line stuff, the logging, the analysis, and changing the code to stop the problem all happen from the same place, the command line. There’s no context-switching. Remember, most errors should be just configuration issues, not coding issues.

One of the ways I judge whether I’m on-track or not is the degree of futzing around I have to do when something goes wrong. Can I, from a cold start (after weeks of non-programming), take a look at the problem, figure out what’s happening, and fix it — all within 10 minutes or so? I should be able to. And, with only a few exceptions, that’s the way it has been working.

The way I do this is that I have the system monitor itself and report back to me whether each program is running and producing the expected output. Every 5 minutes it creates a web page (which is just another output file) that tells me how things are going. In the FP world, each program is run several different ways with different inputs. Common error conditions have been hammered out in development. So most times, it runs fine in all but one or two cases. So before I even start I can identify the small piece of code and the type of data causing problems. The biggest part of debugging is done by simply looking at a web page while I eat my corn flakes.
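As a sketch of the idea (not the real monitor), the status page can be a pure function from "when did each program last produce output" to a string of HTML. The Check record, the OK/STALE labels, and the staleness rule are invented for this example; in practice the timestamps might come from something like System.IO.File.GetLastWriteTimeUtc on each expected output file.

```fsharp
open System

// Hypothetical description of one program's expected output.
type Check =
    { Program: string
      LastOutputUtc: DateTime option   // None when no output file was found
      MaxAge: TimeSpan }               // how old the output may be before we worry

// A program is healthy when its output exists and is recent enough.
let isHealthy (now: DateTime) (c: Check) =
    match c.LastOutputUtc with
    | Some written -> now - written <= c.MaxAge
    | None -> false

// Render one table row per program. No HTML escaping: names are assumed to be plain text.
let renderStatusPage (now: DateTime) (checks: Check list) =
    let rows =
        checks
        |> List.map (fun c ->
            let status = if isHealthy now c then "OK" else "STALE"
            sprintf "<tr><td>%s</td><td>%s</td></tr>" c.Program status)
        |> String.concat "\n"
    sprintf "<html><body><table>\n%s\n</table></body></html>" rows
```

Because the page is just a string, writing it out is one more file write that cron can trigger, and the function is easy to test by passing in a fixed "now".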

System tests run all the time, against everything. It’s not just test-first development. It’s test-all-the-time deployment. Fixing a bug could easily involve writing a test in F#, writing another test at the shell level, and writing a monitor that shows me if there’s ever a problem again. Whatever you do, don’t trade off monolithic OO apps for monolithic shell apps. FP skills and thinking are really critical up and down the toolchain here.

It’s interesting that there’s no distinction between app and shell programming. This is good for both worlds. Once we started creating silos for application deployment, we started losing our way.

Instead of rock-solid apps, failure is expected, and I’m not just talking about defensive programming. Pattern-matching should make you think about all paths through a program. Failure at the app level should fit seamlessly into the DevOps ecosystem, with logging, fallbacks, reporting, resilience. Can’t open a file? Can’t write to a file? Who cares? Life goes on. There are a thousand other files. Log it and keep rolling. It could be a transient error. 99% of the time we don’t work with all-or-nothing situations.
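Here is one way the "log it and keep rolling" attitude might look in code. It is a sketch under assumptions: the FileOutcome type and the function names are made up, and "processing" is reduced to reading the file and counting characters.

```fsharp
open System
open System.IO

// Outcome of handling one file. Pattern matching makes every caller deal with both cases.
type FileOutcome =
    | Processed of path: string * chars: int
    | Skipped of path: string * reason: string

// Try one file. Expected I/O failures become data instead of killing the run.
let tryProcess (path: string) =
    try
        let text = File.ReadAllText(path)
        Processed (path, text.Length)
    with
    | :? FileNotFoundException
    | :? DirectoryNotFoundException -> Skipped (path, "missing")
    | :? UnauthorizedAccessException -> Skipped (path, "access denied")
    | :? IOException as ex -> Skipped (path, ex.Message)

let describe outcome =
    match outcome with
    | Processed (path, chars) -> sprintf "ok %s (%i chars)" path chars
    | Skipped (path, reason) -> sprintf "skipped %s: %s" path reason

// One bad file is logged and left behind; the rest still get processed.
let processAll (paths: string[]) =
    let outcomes = paths |> Array.map tryProcess
    outcomes |> Array.iter (fun o -> eprintfn "%s" (describe o))
    outcomes
```

Anything not listed in the handler (a genuinely unexpected exception) still bubbles up, which is the point: expected failures are ordinary values, surprises are loud.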

As much as possible, you should check for File System and other common errors before you even start the work. My pattern is to load the command line parameters, load the config file (if there is one), then check to make sure I have all the external pieces I need — the actual files exist, the directories are there, and so on. This is stuff I can check early, before the “real” code, so I do it. That way I’m not in the middle of a function to do something really complex and then forgetting to check whether or not I have an input file. By the time I get to the work, I know that all of my tools are in place and ready. This allows me to structure solutions much better.
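A minimal sketch of that "check the tools before starting work" pattern follows. The Settings record and its field names are hypothetical stand-ins for whatever the command line and config file produce, and the exit codes are just the usual 0/1 convention.

```fsharp
open System.IO

// Hypothetical settings, standing in for parsed command-line and config values.
type Settings =
    { SourceDirectory: string
      DestinationDirectory: string
      InputFile: string }

// Collect every problem up front rather than stopping at the first one.
let preflight (s: Settings) =
    [ if not (Directory.Exists(s.SourceDirectory)) then
          yield sprintf "source directory not found: %s" s.SourceDirectory
      if not (Directory.Exists(s.DestinationDirectory)) then
          yield sprintf "destination directory not found: %s" s.DestinationDirectory
      if not (File.Exists(s.InputFile)) then
          yield sprintf "input file not found: %s" s.InputFile ]

// Returns a process exit code so the shell can see whether the run was even attempted.
let run (s: Settings) =
    match preflight s with
    | [] ->
        // The "real" work would start here, knowing the external pieces are in place.
        0
    | problems ->
        problems |> List.iter (eprintfn "%s")
        1
```

Returning the whole list of problems means one failed run tells me everything that is missing, which matters when I am looking at it cold weeks later.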

[Image: 2014-09 Metrics]

 

For instance I woke up this morning and looked at the stats. Looks like pulling things from medium.com isn’t working — and hasn’t been working for several days. That’s fine. It’s an issue with the WebClient request headers. So what? I’ll fix it when I feel like it. Compare this to waking up this morning with a monolithic server app and realizing the entire thing doesn’t run because of intricate (and cognitively hidden) dependencies between functions that cause the failure of WebClient in one part to prevent the writing of new data for the entire app in another part. Just getting started with the debugging from a cold start could take hours.

Note that a lot of DevOps sounds overwhelming. The temptation is to stop and plan everything out. Wrong answer. It’s an elephant. Planning is great, but it’s critical that you eat a little bit of the elephant at a time. Never go big-bang, especially in this pattern, because you really don’t know the shape of the outcome at the DevOps level. [Although there may be some corporate patterns to follow. Insert long discussion here]

Next time I’ll talk about how the actual programming works: how code evolves over time, how to keep things simple, and how not to make dumb OO mistakes in your FP code.

September 8, 2014

My Agile 2014 Book Report

[Image: 2014-07-Agile-2014-Weasels-For-Or-Against]

I don’t do conferences. The last Agile conference I was at was five years ago, in 2009. So although I’ve been engaged in the work, I haven’t spent much time with large groups of Agile practitioners in some while. I thought it might be useful to folks if I wrote down my observations about the changes in five years.

The Good

  • We’re starting to engage with BigCorp environments in a more meaningful way. There’s still a lot of anger and over-promising going on, but the community is grudgingly accepting the fact that most Agile projects exist inside a larger corporate structure. If we want to have a trusting, healthy work environment, we’re going to need to be good partners.
  • Had one person come up to me and say something like “You know, you’re not the asshole I thought you were from reading you online.” It would do well for all of us to remember that for the most part, folks in the community are there to help others. It’s easy to be misunderstood online. It’s difficult to always assume kindness. Being snarky is just too much fun sometimes, and people don’t like having their baby called ugly. In fact, it’s probably impossible to fully engage with people online the way we do in person. We should know this! 🙂
  • I’m continuing to see creative things emerge from the community. This is the coolest part about the Agile community: because we don’t have it all figured out, there is a huge degree of experimentation going on. Good stuff.

The Bad

  • In many ways, Agile has lost its way. What began as a response by developers to the environments they found themselves in became a victim of its own success. It’s no longer developers finding new ways of developing software. It’s becoming Agile Everything. I don’t have a problem with that (after all, my 2009 session was on Agile in Non-Standard Teams), but there are going to be a lot of growing pains.
  • The dirty secret is that in most cases (except for perhaps the biz track?) the rooms are filling with folks who already agree with the speaker. But speakers spend time justifying their position anyway. For such a large group, there was quite a bit of clanning. Sessions were already full of cheerleaders. It might be good to clearly understand whether we’re presenting something to the community for their consideration — or presenting something they already love and showing how to get others to like it. These are incompatible goals for the same session.
  • Maybe it was just me, but for such a relaxed group of facilitators, there was quite a bit of tension just under the surface. For a lot of folks, the conference meant a big chance to do something: to get the next gig, to meet X and become friends, to hire for the next year, to start a conversation with a key lead. It was all fun and games, but every so often the veil would slip a bit and you’d see the stress involved. I wish all of those folks much luck.

The Culture

  • Dynamic Open Jam areas were awesome. Even though nobody cared about my proposed session on Weasels, I thoroughly enjoyed them.
  • I saw something very interesting in Open Jam on Wednesday. We were all doing presentation karaoke. A big crowd had formed to watch and participate; perhaps 40 folks. But our time was up. So the leader of the freeform session said “Our time is up, we should respect the next person, who is here to talk about X.” The guy gets up and somebody from the crowd says “Hey! Why don’t we just combine the two things?” So we spent another five minutes doing both presentation karaoke and talking about the new topic. That way, we maximized the number of people that stayed involved, while at the same time switching speakers. It was a nice example of both being respectful and adapting to changing conditions.
  • The party on the last night was most enjoyable. I think this was the most relaxed state that I saw folks in. Not sure if the alcohol had anything to do with that 🙂 Lots of great conversations going on.
  • Where did all the developers go? Maybe it was just me, but it seemed like there was a lot more “meta” stuff presented. It didn’t seem like there was as much technical stuff.
Budgeting? Strategic alignment? Huh? Who let the managers into this place?

Good and Bad

  • People really hate SAFe (The Scaled Agile Framework, a detailed guide supposedly describing how to run teams of teams in an Agile manner), to the point that some speakers had a shtick of openly mocking it. I’m process agnostic: I don’t hate anything and all I want is to help folks. SAFe, like anything else, has good and bad parts. Keep the good parts, ditch the bad parts. But for some, SAFe seems like a step backwards.

    What concerns me about watching both sides of this is the emotional investment both groups have in already knowing how things are going to turn out without the necessarily huge sample size it would take to determine this for the industry as a whole. One group might think “Why of course Agile is going to have to evolve into more traditional product management. How else would it work?” The other might think “Why of course we would never put so much structure into what we do. That’s what prompted us to become Agile in the first place.”

    Look, I don’t know. Give me 1,000 examples of SAFe actually being deployed — not some arcane discussion about what the textbook says but how it actually works in the real world — and I can start drawing some conclusions. Until then? This is just a lot of ugliness that I’m not sure serves a greater purpose. Sad.

  • UX, or figuring out what to build, is making waves. Some folks love it, some folks think we’re back to imposing waterfall on the process. I tend to think a) because it takes the team closer to value creation it’s probably the most important thing the community has going right now, and b) it’s just not baked enough yet. At least not for the rest of us. (I don’t mean that practitioners don’t know what they are doing. My point is that it is not formed in such a way that the Agile community can easily digest it.) That’s fine with me, but not so much with others. I’m really excited about seeing more growth in this area.

Summary

We are realizing that any kind of role definition in an organization can be a huge source of impediment for that organization growing and adapting. You’re better off training engineers to do other things than you are bringing in folks who do other things and expecting them to work with engineers. So much of everything can be automated, and whatever your role is, you should be automating it away.

Having said that, I don’t think anybody really knows what to do with this information. We already have a huge workforce with predefined roles. What to do with them? Nobody wants to say it directly, but there it is: we have the wrong workforce for the types of problems we need to be solving.

Finally, it’s very difficult to be excited about new things you’re trying and at the same time be a pragmatist about using only what works. It’s possible, but it’s tough. If Agile is only love and goodness, then you’re probably doing it the wrong way. Agile is useful because the shared values lead us into exploring areas we are emotionally uncomfortable with. Not because it’s a new religion or philosophy to beat other people over the head with. It should be telling you to try things you don’t like. If not, you’re not doing it the right way. Enough philosophy (grin).

August 5, 2014
