The importance of idempotent services

Introduction

We all know the importance of performing a unit of work (UoW) in a transaction scope so that if anything goes wrong we can rollback the actions that have taken place and be sure that the state is exactly how it was before the UoW took place, this is were we can use a messaging framework (i.e. MSMQ) that can support transactional capabilities using a transaction coordinator (i.e. MSDTC) so that when we receive a message perform some work, but when saving to the DB there is an issue (i.e. Network problem, transaction deadlock victim) we know that the message received will be put back on the queue and any messages that have been attempted to be sent will be returned (to prevent duplicates):

public void DoWork()
{
    using (var scope = new TransactionScope())
    {
        var command = MsmqInboundWork.NextMessage();
        // *** processing here ***
        DbGateway.Save(workItem);
        MsmqOutboundResult.SendMessage(new ResultMessage { CompletedAt = DateTime.UtcNow; });

        scope.Complete();
    }
} 

If this happens we have options (i.e. retry for a limited amount of goes and then put the message into an error queue) however we don’t always have the option of using a reliable/durable/transactional transport layer such as messaging, for instance if we are integrating with a 3rd party over the internet they are going to provide usually a HTTP based transport layer such as SOAP/REST or like in my case recently integrating with an internal legacy system that only provides a HTTP transport layer (SOAP webservices was the only option in my case).

This post will demonstrate how we can work with services exposed on unreliable transport layer to perform actions but still not end up with duplicates by making the operations idempotent.

What is an idempotent operation?

This basically boils down to being able to call a function with the same arguments and have the same result occur, so if we have a method:

public int Add(int first, int second)
{
    return first + second;
} 

This method is idempotent because if I called it multiple times with Add(2,2) it will always return 4 and there are no side effects, compare this to the next method (demo only!):

public void DebitAccount(string accountNumber, decimal amount)
{
    var account = FindAccount(accountNumber);
    account.Balance -= amount;
    // account saved away
} 

If I call this multiple times like this DebitAccount("12345678", 100.00m) the client of that account is going to be more out of pocket each time a call is made.

How does this relate to my webservice method?

You may be thinking to yourself we have a system running that exposes a webservice with a method to perform a similar action to the example above or that you make a call to a 3rd party to perform an action (i.e. send an SMS) and so far we haven’t had any problems so what’s the big deal!? You’ve been lucky so far but remember the first fallacy of distributed computing

The network is reliable

Once you get network issues this is where everything goes wrong, this may not be a big deal you may decide to retry the call and the user ends up with another SMS sent to their mobile however if you’re performing a journal on an account this is a big problem.

If we look at a typical call to a webservice, in this case we are going to use the example of a façade service of a legacy banking system:

Account Service
fig 1 – If all goes to plan

In this case everything has gone ok and the client has received a response back from the account service so we know that the account has been debited correctly, what if we have the following:

Account Service
fig 2 – Network issue causes loss of response to client

Now we have a problem because the client has no way of knowing whether the debit was performed or not, in the example above the journal has taken place but the machine running the account service could have a problem and not get round to telling the banking system to perform a journal.

Fixing the problem

We established in the last section that the client has no way of knowing whether the debit took place if we do not receive a response from the account service so the only thing we can do would be assume that it did not take place and retry the service call given the scenario from fig 2 this would be very bad as the debit did take place so we would duplicate the debiting of the account.

So we have a few options of how we can deal with this issue:

  1. We don’t perform retries on the client side automatically instead we handle the situation manually and could perform a compensating action before attempting a retry (i.e. perform a credit journal)
  2. We check on the account service whether we have already had a debit for the chosen account and amount in a specific time frame and if so we ignore the call (this would not work in this example as we may have a genuine need to debit the account twice within a specific time frame)
  3. Have the client send something unique (i.e. GUID) for the current debit that you want to take place this can then be used by the account service to check if we have already performed the journal associated with this request.

The last 2 options are to make the account service idempotent my recommendation would be to use the last option and this is the option I will demonstrate for the rest of the post.

Ok so the first change we should make is to add a unique id to the request that gets sent, it is the responsibility of the client application to associate this unique id with the UoW being carried out (i.e. if the client application was allowing an overdrawn account be settled then the UoW would be a settlement and we can then associate this id to the settlement) so the client needs be changed to store this unique id.

The next change is for the account service to perform a check to make sure that we have not already performed a journal for the id passed to it, this is accomplished by storing each id against the result (in this case the journal reference) once the journal has taken place and querying it as the first task, if we find the id then we simply return the journal reference in the response back to the client.

If the id is not found we must not have performed a journal so we need to make the call to the banking system as we were before and then store away the id and the journal reference and return the response back to the client.

Now that this is in place we can have the client perform retries (within a certain threshold) if we receive no response back from the account service safe in the knowledge that we won’t get duplicate debits from the account, here is a sequence diagram to outline the new strategy:

New idempotent account service

Ah the joys of distributed architecture!

Advertisements

6 thoughts on “The importance of idempotent services

  1. I’m studying at Penn State University and I want to voice my affection for your kindheartedness toward individuals who seek help with this one topic. Your real dedication to getting the solution out there looks to be quite beneficial and has permitted college students much like me to reach their goals. Do know that this work means a lot to all of us.

  2. Hello my family member! I want to say that this post is awesome, nice written
    and include approximately all vital infos. I would like to see extra posts like this .

  3. miumiu 靴
    I am curious to find out what blog system you happen to be working with?
    I’m having some small security problems with my latest site and I’d like to find
    something more secure. Do you have any recommendations?

  4. Woah! I’m really loving the template/theme of this site. It’s simple, yet
    effective. A lot of times it’s very difficult to get that “perfect balance” between user friendliness and visual appeal. I must say that you’ve done a amazing job with this.
    Additionally, the blog loads extremely quick for me on Opera.
    Outstanding Blog!

Comments are closed.