The importance of performing spike solutions

First, what is a spike solution? According to The Art of Agile Development [Shore & Warden 2007]:

A spike solution, or spike, is a technical investigation. It’s a small experiment to research the answer to a problem.

The Extreme Programming site defines them as:

A spike solution is a very simple program to explore potential solutions. Build the spike to only addresses the problem under examination and ignore all other concerns.

Spikes

These two definitions line up well, and both sources go on to make one major point that must be stated before continuing:

Spike/Spike solutions should never be committed into the main codebase

They should be treated as throwaway code: once you have your answer, the spike has served its purpose. If you must keep the code under source control, make sure it’s completely separate from the main trunk!

I also want to clarify how I see the difference between spike solutions and prototyping, which can appear very similar. Prototyping usually encompasses a much larger goal, such as putting some quick static front-end screens together to gauge the UX. Spike solutions, in contrast, help to answer a specific technical question, such as "will Entity Framework (EF) be able to map to our legacy Users table correctly?" This means that a spike should take far less time to complete, and we should time-box each spike to make sure it does.

Now that we have defined what a spike solution is, I want to go through part of a real-world example (with the names changed) that demonstrates their effectiveness when certain situations arise.

XMHELL

Back Story

Fubar DIY Stores Ltd has an e-commerce site that lists all the products available to buy. When a product is displayed, its reviews are also shown and new ones can be submitted. This was previously managed in house, but the company has now decided to use a popular global third-party service, Haveyoursay, which has assured them that it can provide a like-for-like match for the data we currently store.

To bring in the reviews, Haveyoursay will export them as an XML file twice a day. We have been tasked with using this exported XML file to import the reviews into the Fubar DIY Stores database, so that they can then be retrieved along with the product data.

Making a start

For this example we will only be concentrating on the import of the XML file and ignoring the subsequent steps.

We have completed a rough architecture of all the moving parts, so we roughly know which objects we need and how they need to collaborate with each other. Essentially there will be a coordinating object that knows which steps need to be performed and in which order. This will be the ReviewImportCoordinator, and it has a method, PerformImport, that starts the import and takes in the XML file path.

We also know that ReviewImportCoordinator will need to collaborate with another object that reads the XML file and brings back a trusty XmlDocument object, which we can then use to parse the data we need and save it to the DB.

So we start by writing unit tests for ReviewImportCoordinator and stubbing our IXmlReviewFileReader. This interface has a method that takes the import file path and returns our XmlDocument object. We then continue unit testing the rest of the import process.
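At this point the design might look roughly like the sketch below. Only the names ReviewImportCoordinator, PerformImport, IXmlReviewFileReader and XmlDocument come from the description above. The Read method name, the int return value, the "review" element name and the hand-rolled stub are assumptions made purely for illustration.

```csharp
using System;
using System.Xml;

// Reader as described above: import file path in, XmlDocument out.
// The method name "Read" is an assumption.
public interface IXmlReviewFileReader
{
    XmlDocument Read(string importFilePath);
}

// Coordinates the import steps; only the read step is sketched here.
public class ReviewImportCoordinator
{
    private readonly IXmlReviewFileReader _reader;

    public ReviewImportCoordinator(IXmlReviewFileReader reader)
    {
        _reader = reader ?? throw new ArgumentNullException(nameof(reader));
    }

    // Returns the number of <review> elements found.
    // Both the return value and the element name are assumptions.
    public int PerformImport(string importFilePath)
    {
        XmlDocument document = _reader.Read(importFilePath);
        return document.GetElementsByTagName("review").Count;
    }
}

// Hand-rolled stub for unit tests, used here instead of a mocking library.
public class StubXmlReviewFileReader : IXmlReviewFileReader
{
    public XmlDocument Read(string importFilePath)
    {
        // Ignores the path and returns a small canned document.
        var document = new XmlDocument();
        document.LoadXml("<reviews><review /><review /></reviews>");
        return document;
    }
}
```

A test can then construct the coordinator with the stub and call PerformImport with any path, without touching the file system.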

Setting ourselves up for a fall

It seems we’re doing everything right: we have broken the responsibilities up into separate objects and are using TDD/BDD against our import process. However, we have jumped the gun here and are making some big assumptions about how we read the XML file, which will have an impact on how our ReviewImportCoordinator does its work.

Just enough design

This is where people new to agile often get it wrong and jump straight into the code rather than doing some design up front. Agile does not tell you to skip design; it tells you not to do big design up front, where we try to guess everything about the system before any code is written.

Our first task should be to get a copy of the XML file, as this will remove our assumptions about how to handle the import. After chatting to the stakeholder we get a copy, and it’s a good job we did, because we hit a potential hurdle: the file is around 450 MB. This discovery should start us asking questions:

  • Is this a one off or are they all going to be around this size?
  • How much memory will be used if we load this into an XML DOM?
  • Will the machines that are running the import be impacted by the extra memory usage?

After asking the stakeholder for some other import files, we find they are also around 450 MB, so this seems to be the expected size for each import. Now we can move on to our crucial question: how much memory will be used if we load this into an XML DOM? Until we have the answer, we have no way of knowing whether there will be an impact on memory usage.

Time for a spike solution

This is the ideal time to write a spike to discover the answer. We knock together a very quick console app with some hacked-together code in the Main method that simply loads an XmlDocument from one of the supplied import files, followed by a Console.ReadLine() so that the process waits for input. That gives us time to open Task Manager and see how much memory the process is using (we only need a ballpark figure; otherwise we could use profiling tools to get more insight).

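// Requires: using System; using System.Xml;
static void Main(string[] args)
{
    // XmlDocument.Load is an instance method, so create a document first.
    var document = new XmlDocument();
    document.Load(@"c:\haveyoursay\imports\import.xml");
    // Pause so the process's memory usage can be checked in Task Manager.
    Console.ReadLine();
    GC.KeepAlive(document); // keep the DOM reachable while we measure
}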
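If you would rather have the spike report a number itself, one option is to compare the managed heap size before and after the load. This is only a sketch: the DomMemorySpike class and MeasureDomLoad method are hypothetical names, and GC.GetTotalMemory reports managed heap bytes only, so the figure will not match what Task Manager shows for the whole process.

```csharp
using System;
using System.Xml;

// Hypothetical helper for a throwaway spike; not part of the original example.
public static class DomMemorySpike
{
    // Loads the file into a DOM and returns the approximate growth
    // of the managed heap in bytes.
    public static long MeasureDomLoad(string path)
    {
        // Force a full collection so the baseline excludes existing garbage.
        long before = GC.GetTotalMemory(forceFullCollection: true);

        var document = new XmlDocument();
        document.Load(path);

        // Measure again while the document is still reachable.
        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(document);

        return after - before;
    }
}
```

Calling DomMemorySpike.MeasureDomLoad with the path to an import file and printing the result in megabytes is enough for a ballpark figure. A profiler remains the better tool for anything more precise.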

Getting Feedback

After running the spike we find that the process uses around 1 GB of memory to load the import XML into a DOM. We now have a concrete number to take back to our stakeholder in order to find out what impact this will have on the machine running the import.

After discussing it with the stakeholder, it turns out this machine is already used for other jobs, which would suffer badly from having that much memory taken away from them. So we have to stream the XML file rather than load it into a DOM, which means using XmlReader rather than XmlDocument. We can now start unit testing with this knowledge, heading down the right path from the start.
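To give a feel for the streaming approach, here is a minimal sketch that walks the file with XmlReader and counts review elements without ever building a DOM. The StreamingReviewCounter class and the "review" element name are assumptions, as the real export format is not shown here.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical example of forward-only reading; not the real import code.
public static class StreamingReviewCounter
{
    // Counts <review> elements one node at a time, so memory use stays
    // roughly constant regardless of the file size.
    public static int CountReviews(TextReader input)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                // "review" is an assumed element name.
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "review")
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Taking a TextReader keeps the method easy to unit test with a StringReader, while the real import could pass in the result of File.OpenText for the import file.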

Summary

I hope this demonstrates how we can use spike solutions, together with a little design up front, to help steer us in the right direction. This example was done at the time of implementation, but you can also use spike solutions as part of estimating when you have to use an unfamiliar technology, library, protocol and so on. A spike can give you a quick way of gauging how difficult it is to perform certain tasks, giving you a bit more confidence in your estimates.

So next time you’re stuck on a technical question, a spike solution could be just what you’re after!
