Why reviewing great code may make you a star

Discovering the path to becoming a great developer is not difficult.

There is an abundance of great material at our fingertips: websites, blogs, code, books, podcasts, and even free online courses. With access to so much good material, if you are willing to put in the time & effort, you will become a standout developer.

After several years of putting in this time & effort, you will be an expert in the “best practices” and trends in software development. You may also find, however, that most of us are reading the same blogs, reading the same books, following the same practices, and generally, agreeing on all of the same trends. This is probably a good thing – we are reaching some level of consensus of good practices and approaches to software development.

However, using this approach, we are not likely to discover our blind spots. We learn a lot, but we’re learning too much of the same things. How can we counter this problem? Possibly re-stated, how do we move from being a “top 10%” developer to becoming a “great” developer, a “star”.

Cal Newport gives this advice for becoming a star:

“In theory, the people who tend to consistently produce important work seem to be those who consistently take the time to decode the latest, greatest results in their subject area.”

For developers, that means we need to take time to read great code.

Oren Eini (Ayende Rahien) says it this way:

“One of the reasons that I have routinely been going out and searching for difficult codebases to read has been to avoid that. I know that I don’t know a lot, I just don’t know what I don’t know. So I go into an unfamiliar codebase and try to figure out how things work over there.

I have been doing that for quite some time now. And I am not talking about looking at some sample project a poo schlump put out to show how you can do CQRS with 17 projects to create a ToDo app. I am talking about production code, and usually in areas or languages that I am not familiar with.

Some people can learn by reading academic papers, I find that I learn best from having a vague idea about what is going on, then diving into the implementation details and seeing how it all fits together.”

Notice he says “production code, and usually in areas or languages that I am not familiar with.”  This is the same idea Cal Newport stresses – don’t just do the work – do the hard and difficult work that most of our peers do not.

It is clear that Ayende reviews code regularly and then blogs about his findings.  Given his successes with NHibernate and all of his new work at Hibernating Rhinos, I think its clear that he is a “star” developer, and someone we can learn from.

* If you’re convinced, Oren Eini also has more tips on reading large code bases here.

 

How to recruit better developers and other job candidates


Note:  This post focuses on hiring and recruiting software developers, but these principles apply to other specialist job roles as well.


Job recruitment ads are terrible.

We are not hiring
Photo Credit: billsoPHOTO via Compfight cc

The process of creating job/recruitment ads goes something like this:

  1. Hiring Manager needs to hire for a role.
  2. Hiring Manager asks Recruiter to advertise for the position.
  3. Recruiter asks Hiring Manager what essential technologies & skills are required.
  4. Hiring Manager lists all technologies & skills relevant to the team.
  5. Recruiter creates the job ad, listing all of the “required” skills.

So, what’s the problem?

The problem is that we end up with an advertisement that looks something like this:


The MUST HAVE skills experience:

  • ASP.NET Web Development experience
  • SQL Server (2000 – 2008)
  • 2 years in-depth MVC recent experience (design patterns etc..)
  • C#/ASP.NET 4.0
  • Entity Framework
  • TDD, Unit testing and MVC3/4
  • Web based development experience
  • Finance background is a bonus!
  • Agile based projects
  • Enterprise level applications
  • Must be passionate, team player and enthusiastic
  • Flexible to work in various areas of the business
  • Outstanding communication and interpersonal skills
  • IT Degree or relevant qualifications

The excerpt above was extracted verbatim from a job advertisement I found while preparing this post. I (intentionally) didn’t search long – it was the second job ad I encountered. I am highlighting this advertisement specifically because it is typical.

What’s wrong with our job ad?

  • There are exactly 14 “must have” skills.
    • One of the skills mentions it is as a “bonus”, leaving us to assume only 13 are mandatory.
  • Is it really mandatory to have worked with C#/ASP.NET 4.0?
    • If a good developer has “only” worked with version 3.5, then they should not apply?
  • Is it really mandatory to have “2 years in-depth MVC recent experience”?
    • Agreed, experience with ASP.NET MVC is helpful.  But, would deep knowledge of web technologies and web development would be just as (or more) important?
  • Is it really mandatory to have “Entity Framework” experience?
    • Or, would it be more important to understand data access and possibly troubleshoot ORM frameworks? If the applicant knew NHibernate and SQL development well, would that be just as important?
  • How many applicants would say they are not “passionate, team players or enthusiastic”, and do not have “outstanding communication and interpersonal skills”?

Okay, you get the idea. This checklist approach is not an effective way to find good candidates.

Of course, in reality, hiring managers and recruiters will (have to) be flexible on one ore more of the listed items. But that’s my point! Take the time to think about what type of candidate you really want, and list those qualities. It will improve your advertisement, and, more importantly, it will help you clarify what type of person you really need for your team.

“Take the time to think about what type of candidate you really want, and list those qualities. It will improve your advertisement, and, more importantly, it will help you clarify what type of person you really need for your team.”

Possible counter-argument: “We will get too many applicants if we don’t narrow down our criteria!”

Wrong!

Okay, maybe it is true. It will screen out some candidates. But not the ones you think…

Consider this:

Competent individuals rate their skill levels lower than they should, while other less competent individuals rate their skills higher!

Thus, an outstanding developer might review the required skills in the job advertisement and think:

  • “I haven’t worked with C# 4.0″ (even though I’m great with v3.5)
  • “I haven’t worked with Entity Framework” (although I’m an expert with NHibernate and data access)
  • “I only have 1 year of MVC experience” (even though I’ve done web development for 10 years).
  • Therefore, I am not qualified.  I cannot apply for this position.

Whereas, the mediocre (less-skilled) developer might think:

  • “I know C# 4.0″ (I know Java and I heard the syntax is mostly the same)
  • “Yes, I know Entity Framework” (I may have read a blog post about it and I think know how it works)
  • “I have 2 years of MVC experience” (I read about the MVC pattern in school a few years ago, so that counts as 2 years experience).
  • Therefore, I am qualified. I will apply for this position.

If you are not sure about your advertisement, show it to a few great developers that are not part of your current team. (For example, ask your best developers to show it to their friends, the ones they would recommend as great candidates to hire). Would their friends apply to the advertisement? Why would they hesitate? What would encourage them to try?

Possible counter-argument #2:  “But this position is especially critical and highly specialized! I must include these skills!”

You are might be right. But…

Everywhere I have worked has advertised for specialist roles:  some for developers, some for business analysts, some for testers, and some for team leaders.  At least once, they have all written highly restrictive job advertisements, received less than desirable applicants (who may talk a good talk), and then had to settle for the “best” of the candidates that applied. Then, the hired applicant didn’t work out…

So, to be more certain, try asking yourself:

  • How restrictive is this ad? How many people will meet all of these criteria?
  • Are all of these exact skills required?  In other words, are they deal-breakers?
  • Could I think of anyone who would be great for the role, who might not meet all of the criteria?
  • Could I find a good candidate and train them in one or more of the criteria instead?
  • Am I willing to wait for the right person? Or will I have to take one of the applicants anyway?

Conclusions

If time is critical and you just need to hire someone who is “good enough“, then the standard job advertisement might work. You may even find the best candidate anyway.

However, its usually worth investing the additonal time to screen your candidates
in other ways. Make the advertisement clear that you’re looking for candidates who can demonstrate their abilities (e.g. by writing good code, solving hard problems, following good practices…).  Then, enhance your hiring process to allow them to do so.

A few ways to have candidates demonstrate these abilities could be:

  • Submit sample code as part of job applications.
  • Have them write code (again) in their interviews.
  • Give them small applications that are in need of fixing or enhancements
    • Ask them questions (during or afterwards) on why they solved problems they way they did
    • Watch whether they demonstrate the skills you are seeking (do they write good code, unit tests, use TDD, …)

Still need help:  Feel free to contact me


The Gilded Rose Kata and the Approval Tests Library


Note: It may be helpful to read my previous post on the Golden Master Technique before seeing an example of its application here.


The Gilded Rose Kata

I encountered the Gilded Rose Kata after reading this blog post on Why Most Solutions to the Gilded Rose Miss The Bigger Picture. The Gilded Rose code kata is particularly appealing because it requires modification of a poorly written (simulated) code base. It is excellent practice for our typical daily work, yet it is still small enough to learn and jump in quickly.

I encourage you to try the Gilded Rose Kata before reading further. At a minimum, please read the overview and consider how you would approach the problem.

Please give it a try and come back…

Ok, you’re back. How would you start?

The Golden Master Technique

This type of problem is a great example of when the Golden Master technique can be a better starting point than going straight into unit tests.

The overview of Gilded Rose states that the existing application “works”. This describes a “real application”, working in production, that is useful enough to enhance with new features. Thus, it is as important, if not more so, to preserve existing behavior of the application as it is to add new features.

In order to unit test the code properly, we would need to modify the code somewhat significantly (in proportion to the existing code base) to be able to create the seams required for good unit tests. And to get complete coverage of the requirements would require a fair number of tests, some of which may not be obvious or explicit.

Applying the Golden Master Technique

So, let’s get started in creating our “golden master”. We need to execute a sufficient number iterations of the code to generate enough output to create a meaningful baseline. For this example, 50-100 iterations would be more than sufficient coverage.  In a larger code base, we would need to create a more diverse range of input values. In this case, however, the initial setup appears sufficient to cover the necessary code paths.

To generate the Golden Master, we need to:

  1. Open/Create a file for saving the output
  2. Modify code to output state to the file, e.g. write the Item properties (Name/SellIn/Quality) out for each execution of UpdateQuality()
  3. Modify the code to iterate through through a sufficient number of days, 100 days for example
  4. Close/Save the file

Performing the above steps in code would not be difficult. However, this is even easier using the Approval Tests Library. The framework does exactly what we need:

  1. Runs a specified “test”
  2. At the end of the test, asserts/verifies a given state
  3. Compares the resulting execution, with an accepted master. If an accepted master doesn’t already exist, you accept & save it (or fix it until you are happy with it). If the master does exists, but is different, it fails the test. Which can again, either be fixed, or accepted as a new master.

To get started:

  1. Open the GildedRose solution
  2. Add/Install ApprovalTests into our solution.  For .NET, the easiest way, of course, is via NuGet. Otherwise, it can be downloaded from here.
  3. Enhance the application to be able to capture its state, for example using a string representation of the items & their current values
  4. Create a test that will verify the state from the previous step
  5. Run and fix the tests until you are happy with the accepted golden master

How to enhance the application to capture state (step 3)

public string GetSnapshot()
{
    var snapshot = new StringBuilder();

    foreach (var item in Items)
    {
        snapshot.AppendLine(string.Format("Name: {0}, SellIn: {1}, Quality: {2}", item.Name, item.SellIn, item.Quality));
    }

    snapshot.AppendLine("------------------------");

    return snapshot.ToString();
}

How to Create a test that will verify the state in the previous step (step 4)

1.  Let’s create a basic approval test, simply to validate the state from the initial setup (before any iterations):

[Test]
[UseReporter(typeof(DiffReporter))]
public void TestThis()
{
    var app = Program.Initialise();

    var initialSetup = app.GetSnapshot();

    Approvals.Verify(initialSetup);
}

To make our test compile and run…

2. Change the Program class to public…

public class Program

3. … and extract the initial setup into an Initialize() method:

public static Program Initialize()
{
    return new Program
    {
        Items = new List
        {
            new Item {Name = "+5 Dexterity Vest", SellIn = 10, Quality = 20},
            new Item {Name = "Aged Brie", SellIn = 2, Quality = 0},
            new Item {Name = "Elixir of the Mongoose", SellIn = 5, Quality = 7},
            new Item {Name = "Sulfuras, Hand of Ragnaros", SellIn = 0, Quality = 80},
            new Item {Name = "Backstage passes to a TAFKAL80ETC concert", SellIn = 15, Quality = 20},
            new Item {Name = "Conjured Mana Cake", SellIn = 3, Quality = 6}
        }
    };
}

Now, our Main() looks like this:

static void Main(string[] args)
{
    System.Console.WriteLine("OMGHAI!");

    var app = Initialise();

    app.UpdateQuality();

    System.Console.ReadKey();
}

4.  Finally, we can add another test, to execute with a 100 iterations, which creates our “golden master”:

[Test]
[UseReporter(typeof(DiffReporter))]
public void TestThisTimes100()
{
    var app = Program.Initialise();

    var snapshotForHundredIterations = new StringBuilder();

    var initialSnapshot = app.GetSnapshot();

    snapshotForHundredIterations.Append(initialSnapshot);

    for (int i = 0; i < 100; i++ )
    {
        app.UpdateQuality();
        var currentSnapshot = app.GetSnapshot();
        snapshotForHundredIterations.Append(currentSnapshot);
    }

    Approvals.Verify(snapshotForHundredIterations);
}

When, approval tests execute with the [UseReporter(typeof(DiffReporter))] attribute, the framework launches an installed diff tool. If you save the resulting file, formatted as SomeTestName.approved.txt, it becomes the accepted “golden master”.

That’s it! If you find it slightly confusing the first time, try a couple examples yourself. Once you understand the concept of the ApprovalTest framework, it is a simple and effective way to create a “golden master” test quickly.

At this point, you could proceed with the Gilded Rose Kata, making enhancements, or if desired, creating a set of explicit unit tests to more accurately describe the given requirements & for future readability & maintainability.

Other Resources

Using the Golden Master technique to test legacy code

Working with legacy code is a scary proposition. Generally, we lack an understanding of the application and its codebase, and we don’t have automated test coverage. In fact, in his book Working Effectively with Legacy Code, Michael Feathers defines legacy code as “code without tests”.

Therefore, before making changes to legacy code, it is important to guard against unintended changes. These days, developers are often too quick to assume unit tests are the (only) way to do this. However, in a large code base, where requirements are missing or unclear, this may not be a viable option. We could even introduce bugs by “fixing” behavior,  if downstream systems assume an existing (incorrect) behavior. Therefore, it may be important to first capture and lock down existing behavior before writing unit tests or modifying the existing code. Characterization tests are a means of capturing the existing behaviour.

To create the characterization tests, we can generate a large set of diverse inputs and run them against the existing codebase. By recording and saving these outputs, we capture the existing behavior. These outputs from the original code base are called the “Golden Master”. Later, when we need to modify the code, we can replay the same set of inputs and compare them against the original “master” outputs. Any differences between the original and new outputs help to identify unintended behaviour changes (or can be accepted if intentionally changed).

I have used this technique in real life scenarios, which previously, had been difficult to cover sufficiently with tests. The 80/20 rule applies here; we spent 80% of our time trying to cover 20% (less, really) of the fringe cases. In the end, the golden master technique has been more effective. Once in place, this technique can be combined with unit testing and other test methods.

Thanks to jbrains‘s posts on Legacy Code Retreat for introducing me to this technique. In particular, I recommend this for more information.

I will provide more details, examples, and tools for this technique in future posts.


Update: See an example of using this technique with my follow up post, The Gilded Rose Kata and The Approval Tests Library

Primitive Obsession Example # 1 – KeyValuePair

My last post described how removing cases of “Primitive Obsession” is one of the best ways to improve the readability of your code. Replacing primitives with domain specific types helps reveal the intention of the code, which helps others to read, maintain, and use the code we write.

This post will be the first in a series of examples illustrating the primitive obsession code smell. In each post, I will provide a “before” code sample where primitives are used, and then an “after” sample, where the usage of a primitive type is replaced with a domain specific type.

The first example will illustrate using a .NET built-in type, the KeyValuePair.

Disclaimer # 1: It is perfectly acceptable and appropriate to use this type in some cases. However, the type is often overused and can make code less readable. My intention with this post is for you to compare the “before” and “after” versions to see which approach is more readable for your situation.

Now, on to our example…

“Before”- code using the built-in generic type, KeyValuePair<TKey, TValue>:

class Program
{
    static void Main()
    {
        var list = new List<KeyValuePair>();

        list.Add(new KeyValuePair("Chris Melinn", "FAKE123"));
        list.Add(new KeyValuePair("Lisa Simpson", "ABC456"));
        list.Add(new KeyValuePair("Waylan Smithers", "XYZ789"));

        // Outputs (as strings):
        // Chris Melinn, 'FAKE123'
        // Lisa Simpson, 'ABC456'
        // Waylan Smithers, 'XYZ789'
        foreach (var item in list)
        {
            var message = string.Format("{0} , '{1}'",
                        item.Key,
                        item.Value);

            Console.WriteLine(message);
        }
    }
}

Even in this simple example, it can be hard to tell what data is captured by the List of KeyValuePairs. Each KeyValuePair contains a Key of type string, and a Value also of type string. In a real application, hopefully, we would also have better variable names to help us determine what these objects contain. However, in this case, I have intentionally left generic variable names (such as item and list) to highlight the loss of meaning by using the primitive type.

After inspecting the data values in the sample, we could probably guess that our Key’s are people’s names. However, we might not be able to guess what our Value’s hold.


This example will assume that our code intends to print out the company’s computer inventory, with a list of machines issued to various staff members. In other words, in our example, the Key is the employee name and the Value is the machine serial number. Unfortunately, the primitive type does not capture that information. However, we will see this is easy information to convey when we refactor to a domain specific type.


Also, notice that, in this example, our type is created and used within the same block of code. In a real application, it is much more likely that another class would create the list (e.g. by reading from a database), and then return the values it to seperate class to consume the data.  In such a case, we would only see an interface or method such as:

public List<KeyValuePair<string, string>> GetEmployeeMachines()

Even with a well-named method, the developer will probably need to navigate to the class and method and view the source code (if possible) to confirm that the keys and values are what she expects them to be (not to mention, code signatures like these are fairly ugly and unpleasant).

Now, let’s refactor our original sample to remove the primitive KeyValuePair type and replace it with a domain-specific type. It would look something like the following (affected/modified lines are highlighted):

“After”- KeyValuePair<TKey, TValue> replaced with a domain specific type, EmployeeMachine:

class Program
{
    static void Main()
    {
        var list = new List();

        list.Add(new EmployeeMachine("Chris Melinn", "FAKE123"));
        list.Add(new EmployeeMachine("Lisa Simpson", "ABC456"));
        list.Add(new EmployeeMachine("Waylan Smithers", "XYZ789"));

        // Outputs (as strings):
        // Chris Melinn, 'FAKE123'
        // Lisa Simpson, 'ABC456'
        // Waylan Smithers, 'XYZ789'
        foreach (var item in list)
        {
            var message = string.Format("{0} , '{1}'",
                        item.EmployeeName,
                        item.MachineSerialNumber);

             Console.WriteLine(message);
        }
    }
}

class EmployeeMachine
{
    public string EmployeeName { get; private set; }
    public string MachineSerialNumber { get; private set; }

    public EmployeeMachine(string employeeName, 
                           string machineSerialNumber)
    {
        EmployeeName = employeeName;
        MachineSerialNumber = machineSerialNumber;
    }
}

The final result is more intention revealing and better conveys what data is captured in the EmployeeMachine class. Even with other generic, poorly named variables still left in the code (such as item and list), the reader can still understand what the code is doing without navigating and inspecting other classes.

Most importantly, I believe this refactoring is one of the easy “little things” you can do to increase the satisfaction of others who read, maintain, and use your code.


As always, I would love to hear from readers — especially if you have any personal experiences (or even code examples!) to share.

Improving Code – Remove “Primitive Obsession”

One of the best ways to improve code is to remove cases of Primitive Obsession. This particular code smell arises when we use primitive types (such strings or integers) to represent what should be explicit types in our domain.

In some cases, primitive types are quite appropriate. For example, we may only need an instance of some string to display somewhere in our UI. However, in other cases, our “simple string” should be a better abstraction within our domain.

For example, using a string to store my name (“Chris Melinn”) would likely be a loss of an abstraction. This probably needs to be modeled as a something more explicit such as a User or a Person, having a FirstName and a LastName. This is particularly true when other places in the application do things such as parse this string to derive a first name and a last name. These behaivors are telling us that our “name” is trying to be more than just a string.

At first, it may seem silly to create a whole new class that possibly contains just one or two members (like our Person with a Name or FirstName and LastName). However, by creating this “placeholder”, we are defining its place in our domain. More importantly, you may find that other behavior begins to find its way into your “simple” class (such as the conversion and calculation logic that would otherwise be spread around your application). When this happens, you know you did the right thing.

For a much better description and discussion, please take the time to watch this video by Corey Haines and J.B. Rainsberger. They are both excellent craftsmen, and, if you haven’t already, I recommend you take the time to explore their blogs as well.

JB Rainsberger – Primitive Obsession from Corey Haines on Vimeo.

Questions Before Refactoring

I want to discuss a question I found quite some time ago on StackOverflow, which posted a code sample (to follow) along with these questions and comments (bold highlights are mine):

(clipped)  “… this is the perfect time to refactor this puppy and make it more portable, reusable, reliable, easier to configure etc. Now that’s fine and dandy, but then I started feeling a tad overwhelmed regarding to where to start. Should it be a separate library? How should it be configured? Should it use IoC? DI?

So my [admittedly subjective] question is this — given a relatively small, but quite useful class like the one below, what is a good approach to refactoring it? What questions do you ask and how do you decide what to implement or not implement? Where do you draw the line regarding configuration flexibility?

[Note: Please don't bash this code too much okay? It was written a long time ago and has functioned perfectly well baked into an in house application.]”

Before we get into the code sample, let’s address some of the key questions…

“What is a good approach to refactoring it?  What questions do you ask…?”

The first question I would ask is:

1.  Do I need to refactor this code at all?

In this case, the author commented at the end of his question that the code “was written a long time ago and has functioned perfectly well baked into an in house application”.  This comment suggests there may not be a need to refactor at all!  If the code has survived and hasn’t been modified in a long time, it probably doesn’t make sense to change it now.  However, let’s assume we have good reasons to start (e.g. need to start making a changes)…

2.  What is my objective in refactoring the code?

The term “refactoring” is becoming quite commonplace, and it is often used interchangeably with any type of code modification.  However, in Refactoring: Improving the Design of Existing Code, Fowler defined it as

“Refactoring (noun):  a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior.”

With this definition, Fowler highlights that refactoring is not any general modification to the code, but specifically a change with the intention to “make it easier to understand”.

In this episode of Hanselminutes, “Uncle Bob” Martin makes the comment:

“You are constantly mercilessly refactoring systems,
not necessarily because they need to be refactored
but because they need to remain flexible. So it’s
rather like moving the gears on a bicycle or driving
your car for no other reason than to keep it well
lubricated, and in order to make those kinds of
changes you need tests, it all boils down to tests in
the end.”

(Before we move on, I want to highlight another important fact.  You cannot be sure you haven’t changed behavior unless the code is covered by good unit tests.  So, without good unit tests, you are not really refactoring.)

With this context in mind, we can determine how to make the code more flexible, easier to understand, and cheaper to modify.

3.  What should I do?  What should I refactor?

The reason I was attracted to this particular code sample to highlight is that it is actually a fairly decent piece of code.  I can see, as the author points out, that it could have been good enough to support an existing application without many problems.  I also think this sample is reflective of the type of code I generally see in the wild:  functional, but not expressive.

Additionally, even for a small code sample, it contains quite a bit of code duplication. I feel this amount of duplication is typical (unfortunately) of the average application.

The biggest step you can make to improve your code is learning (1) how to remove duplication, and (2) how to make your code more expressive.