Primitive Obsession Example # 1 – KeyValuePair

My last post described how removing cases of “Primitive Obsession” is one of the best ways to improve the readability of your code. Replacing primitives with domain specific types helps reveal the intention of the code, which helps others to read, maintain, and use the code we write.

This post will be the first in a series of examples illustrating the primitive obsession code smell. In each post, I will provide a “before” code sample where primitives are used, and then an “after” sample, where the usage of a primitive type is replaced with a domain specific type.

The first example will illustrate using a .NET built-in type, the KeyValuePair.

Disclaimer # 1: It is perfectly acceptable and appropriate to use this type in some cases. However, the type is often overused and can make code less readable. My intention with this post is for you to compare the “before” and “after” versions to see which approach is more readable for your situation.

Now, on to our example…

“Before”- code using the built-in generic type, KeyValuePair<TKey, TValue>:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Key and Value are both plain strings; nothing here says what they mean.
        var list = new List<KeyValuePair<string, string>>();

        list.Add(new KeyValuePair<string, string>("Chris Melinn", "FAKE123"));
        list.Add(new KeyValuePair<string, string>("Lisa Simpson", "ABC456"));
        list.Add(new KeyValuePair<string, string>("Waylan Smithers", "XYZ789"));

        // Outputs (as strings):
        // Chris Melinn, 'FAKE123'
        // Lisa Simpson, 'ABC456'
        // Waylan Smithers, 'XYZ789'
        foreach (var item in list)
        {
            var message = string.Format("{0}, '{1}'",
                        item.Key,
                        item.Value);

            Console.WriteLine(message);
        }
    }
}

Even in this simple example, it can be hard to tell what data is captured by the List of KeyValuePairs. Each KeyValuePair contains a Key of type string, and a Value also of type string. In a real application, hopefully, we would also have better variable names to help us determine what these objects contain. However, in this case, I have intentionally left generic variable names (such as item and list) to highlight the loss of meaning by using the primitive type.

After inspecting the data values in the sample, we could probably guess that our Keys are people’s names. However, we might not be able to guess what our Values hold.


This example will assume that our code intends to print out the company’s computer inventory, with a list of machines issued to various staff members. In other words, in our example, the Key is the employee name and the Value is the machine serial number. Unfortunately, the primitive type does not capture that information. However, we will see this is easy information to convey when we refactor to a domain specific type.


Also, notice that, in this example, our type is created and used within the same block of code. In a real application, it is much more likely that one class would create the list (e.g. by reading from a database) and then return it to a separate class that consumes the data. In such a case, we would only see an interface or method such as:

public List<KeyValuePair<string, string>> GetEmployeeMachines()

Even with a well-named method, the developer will probably need to navigate to the class and method and view the source code (if possible) to confirm that the keys and values are what she expects them to be (not to mention, code signatures like these are fairly ugly and unpleasant).

Now, let’s refactor our original sample to remove the primitive KeyValuePair type and replace it with a domain-specific type. It would look something like the following:

“After”- KeyValuePair<TKey, TValue> replaced with a domain specific type, EmployeeMachine:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // The element type now names what each entry represents.
        var list = new List<EmployeeMachine>();

        list.Add(new EmployeeMachine("Chris Melinn", "FAKE123"));
        list.Add(new EmployeeMachine("Lisa Simpson", "ABC456"));
        list.Add(new EmployeeMachine("Waylan Smithers", "XYZ789"));

        // Outputs (as strings):
        // Chris Melinn, 'FAKE123'
        // Lisa Simpson, 'ABC456'
        // Waylan Smithers, 'XYZ789'
        foreach (var item in list)
        {
            var message = string.Format("{0}, '{1}'",
                        item.EmployeeName,
                        item.MachineSerialNumber);

            Console.WriteLine(message);
        }
    }
}

class EmployeeMachine
{
    public string EmployeeName { get; private set; }
    public string MachineSerialNumber { get; private set; }

    public EmployeeMachine(string employeeName, 
                           string machineSerialNumber)
    {
        EmployeeName = employeeName;
        MachineSerialNumber = machineSerialNumber;
    }
}

The final result is more intention revealing and better conveys what data is captured in the EmployeeMachine class. Even with other generic, poorly named variables still left in the code (such as item and list), the reader can still understand what the code is doing without navigating and inspecting other classes.
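The same idea carries over to the method signature discussed earlier. The following is only a minimal sketch: MachineInventory is a hypothetical class invented for this illustration, and it returns hard-coded values where a real implementation would probably read from a database. The point is that the return type alone now tells the caller what each entry contains.

```csharp
using System;
using System.Collections.Generic;

class EmployeeMachine
{
    public string EmployeeName { get; private set; }
    public string MachineSerialNumber { get; private set; }

    public EmployeeMachine(string employeeName, string machineSerialNumber)
    {
        EmployeeName = employeeName;
        MachineSerialNumber = machineSerialNumber;
    }
}

// Hypothetical class, invented for this sketch. The values are hard-coded
// here; a real version would likely load them from a data store.
class MachineInventory
{
    // Compare with: List<KeyValuePair<string, string>> GetEmployeeMachines()
    public List<EmployeeMachine> GetEmployeeMachines()
    {
        return new List<EmployeeMachine>
        {
            new EmployeeMachine("Chris Melinn", "FAKE123"),
            new EmployeeMachine("Lisa Simpson", "ABC456")
        };
    }
}

class InventoryDemo
{
    static void Main()
    {
        var inventory = new MachineInventory();

        // The consumer never needs to look inside GetEmployeeMachines
        // to learn which string is the name and which is the serial number.
        foreach (var machine in inventory.GetEmployeeMachines())
        {
            Console.WriteLine("{0}, '{1}'",
                machine.EmployeeName,
                machine.MachineSerialNumber);
        }
    }
}
```

A caller reading only the signature can see that it gets back employee and machine pairs, without opening the source of the method.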

Most importantly, I believe this refactoring is one of the easy “little things” you can do to increase the satisfaction of others who read, maintain, and use your code.


As always, I would love to hear from readers — especially if you have any personal experiences (or even code examples!) to share.

Improving Code – Remove “Primitive Obsession”

One of the best ways to improve code is to remove cases of Primitive Obsession. This particular code smell arises when we use primitive types (such as strings or integers) to represent what should be explicit types in our domain.

In some cases, primitive types are quite appropriate. For example, we may only need an instance of some string to display somewhere in our UI. However, in other cases, our “simple string” should be a better abstraction within our domain.

For example, using a string to store my name (“Chris Melinn”) would likely be a loss of an abstraction. This probably needs to be modeled as something more explicit, such as a User or a Person, having a FirstName and a LastName. This is particularly true when other places in the application do things such as parse this string to derive a first name and a last name. These behaviors are telling us that our “name” is trying to be more than just a string.

At first, it may seem silly to create a whole new class that possibly contains just one or two members (like our Person with a Name or FirstName and LastName). However, by creating this “placeholder”, we are defining its place in our domain. More importantly, you may find that other behavior begins to find its way into your “simple” class (such as the conversion and calculation logic that would otherwise be spread around your application). When this happens, you know you did the right thing.
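As a rough illustration of how that behavior might migrate into the class, here is a minimal sketch of a Person type. The Parse method, the FullName property, and the "first word is the first name" rule are all assumptions made for this example; real names are messier than this.

```csharp
using System;

class Person
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Formatting logic lives with the type instead of at every call site.
    public string FullName
    {
        get { return (FirstName + " " + LastName).Trim(); }
    }

    // Hypothetical helper: assumes a simple "First Last" format and treats
    // everything after the first space as the last name.
    public static Person Parse(string fullName)
    {
        if (string.IsNullOrWhiteSpace(fullName))
        {
            throw new ArgumentException("A name is required.", "fullName");
        }

        var trimmed = fullName.Trim();
        var firstSpace = trimmed.IndexOf(' ');

        if (firstSpace < 0)
        {
            // Only one word was supplied, so there is no last name.
            return new Person(trimmed, string.Empty);
        }

        return new Person(
            trimmed.Substring(0, firstSpace),
            trimmed.Substring(firstSpace + 1).Trim());
    }
}

class PersonDemo
{
    static void Main()
    {
        var person = Person.Parse("Chris Melinn");

        Console.WriteLine(person.FirstName); // Chris
        Console.WriteLine(person.LastName);  // Melinn
        Console.WriteLine(person.FullName);  // Chris Melinn
    }
}
```

The parsing rule is now written once, so the rest of the application can ask for FirstName or LastName rather than splitting strings on its own.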

For a much better description and discussion, please take the time to watch this video by Corey Haines and J.B. Rainsberger. They are both excellent craftsmen, and, if you haven’t already, I recommend you take the time to explore their blogs as well.

JB Rainsberger – Primitive Obsession from Corey Haines on Vimeo.

Is Ubiquitous Language Possible?

The Goal

There have been some interesting discussions recently about the importance of having a “common language” between developers and business domain experts.  DDD refers to this concept as its “Ubiquitous Language”:

“The concept is simple, that developers and the business should share a common language, that both understand to mean the same things, and more importantly, that is set in business terminology, not technical terminology.”

The intention is that once we share a common vocabulary, we can improve our requirements and design.  The class design (the business entities and their relationships) hopefully better reflects the true interactions of the business we are trying to serve.

The Problem / An Example

The difficulty in identifying the “real names” typically lies somewhere between developers and the business domain experts.  However, on some very large projects, I seem to have encountered another variation of this problem.

Let’s consider a large project, divided into multiple teams.  Each team is responsible for designing and implementing separate functionality, but each team has significant overlap with the others.

For example, imagine Really Big Company’s executive management has decided to build an “improved” customized version of their current email system to communicate better with customers.  So, in assembling our project teams, we decide to divide the project into two primary teams: Team A will build the “inbox” functionality, and Team B will build the contact management module.

Since Team A and Team B follow good DDD principles, they begin working on creating their common language (shared names).  Team A discovers that the domain experts for the Inbox system refer to their mail recipients as “contacts”, and, ironically, Team B’s domain experts refer to their contacts as “customers”.

At some point, they realize that “customers” are really just one “type” of contact.  With this shared understanding, the developers create a design where [Customer] will be a subclass/child of [Contact].  The domain experts readily agree to the improved design.
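In code, the agreed design might look roughly like the sketch below. Only the [Customer] and [Contact] names come from the scenario; the members, the Inbox class, and the sample values are all hypothetical and exist purely for illustration.

```csharp
using System;

class Contact
{
    public string Name { get; private set; }

    public Contact(string name)
    {
        Name = name;
    }
}

// A customer is one kind of contact.
class Customer : Contact
{
    // Hypothetical member, added only to give the subclass something of its own.
    public string AccountNumber { get; private set; }

    public Customer(string name, string accountNumber)
        : base(name)
    {
        AccountNumber = accountNumber;
    }
}

// Hypothetical inbox helper for this sketch.
static class Inbox
{
    // Written against Contact, so it accepts customers and other contacts alike.
    public static string AddressLine(Contact recipient)
    {
        return "To: " + recipient.Name;
    }
}

class ContactDemo
{
    static void Main()
    {
        Contact supplier = new Contact("Waylan Smithers");
        Customer customer = new Customer("Lisa Simpson", "ACME-001");

        Console.WriteLine(Inbox.AddressLine(supplier)); // To: Waylan Smithers
        Console.WriteLine(Inbox.AddressLine(customer)); // To: Lisa Simpson
        Console.WriteLine(supplier is Customer);        // False
    }
}
```

Whether a given method should take a Contact or a Customer is exactly the kind of decision that depends on which of the two the domain expert actually meant.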

So now, developers will distinguish contacts from customers.  And the next time they hear domain experts refer to a “customer”, they think they can assume a [Customer].  The business agreed on it, right?!

So, the real question — did the business user really mean [Customer]?  Or did they really mean [Contact]?

Conclusion

It is unlikely that the vocabulary of the business users will change, especially when the business users are not part of the full time project staff.  Remember, the developer staff isn’t the only interaction that these domain experts will have!  So, even when we have “agreement”, it is unlikely (and probably undesirable) to change the language of the business.

So then what?

After all this, we may be tempted to throw out our ideas on common language, shared names, and ubiquitous language.  However, let’s stop to see what we have accomplished.

In our theoretical meeting, when the domain expert mentioned “customer”, the technical team can now ask “Do you mean [Customer] or [Contact]?”.  And, with any luck, the business now understands what we mean, and can give a simple answer.  “Ah, yes, you’re right.  I do mean [Contact]”.

Common language isn’t a cure-all.  But it sure helps.

I’m curious to hear your experiences as well.