Wednesday, February 8, 2012

Tales of the Infinite

Any control structure which goes on and on infinitely wastes server resources and can ultimately crash a server.  At Flex, if we see a remote function call from the Flash workbench lock up and no client error is thrown and there's no exception in the server logs, we usually start to suspect infinite trouble of one kind or another.

In software development there are two main types of control structures subject to the perils of infinity: infinite loops and infinite recursion. 

Infinite Loops

An infinite loop is a loop whose test condition is always true, like these:

while (2 + 2 = 4) {
     //do something
}
          while (true) {
               //do something
          }

We use loops all over the place, usually for each loops or for loops, which are seldom subject to infinite loop problems.  It can happen however:  while loops like this next one are a great way to crawl up a tree or acyclic graph structure without resorting to a recursive function:

TreeNode node = random tree node;
while (node.getParentNode() != null) {
    //do something
   node = node.getParentNode();
}
This is all well and good; in a regulation acyclic tree eventually you'll reach the parent node in the tree, the loop's test condition will be false and the loop will exit.  But what if this isn't true, what if what should be the root node somehow thinks that one of it's descendents in the tree structure is its parent?  In graph theory, this is called a cycle.  If the underlying tree in this example had a cycle, it would be impossible to find a node whose parent was null and thus this loop would continue executing over the same closed path until someone stopped it.

Infinite Recursion

Another technique for traversing directed graphs is recursion, which is what we call it when a function or method calls itself.  We use this technique for traversing data structures throughout Flex.  A stripped down example of recursion looks like this:

public int calculatePricing(FinancialDocument lineItem) {

     int result = lineItem.getQuantity() * lineItem.getPrice();

     if (lineItem.getChildren() != null) {
            for (FinanicalDocumentLineItem childLine : lineItem.getChildren()) {
                   result += calculatePricing(childLine);
            }
     }

     return result;

}

This example takes a line item and uses recursion to drill down into child line items.  If we have a nested line item structure that goes several levels deep, eventually each unique path through the tree will have a leaf node (a node without children) and recursion will stop.  But what if a single node is both an ancestor and a descendant?  What if we have a line item that contains a line item that contains itself?  Again, we have a cycle.  There will be at least one path through the tree for which the conditions for continued recursion would always be met.  Usually this kind of infinity is eventually stopped by the Java Virtual Machine with a Stack Overflow Exception. Even so, infinite recursion is bad and has to be addressed before it gets to the point of overloading the call stack.

Cycle Proof Recursion

Courtney recently found an infinite recursion issue related to cycles in suggestion configuration; where somewhere down the line a suggestion ends up suggesting itself.  If these same suggestions also happened to be auto-include suggestions, the auto include logic would recurse infinitely and eventually crash the session.

Obviously we need to add logic that detects cycles to the validation process when configuring suggestions, but I always like to add an extra measure of safety in situations like this, such that even if there's a cycle in the graph the recursive function can detect the cycle and exit.

This usually means adding a parameter to the function that maintains some information about what's happening elsewhere on the call stack.  In this case, our addSuggestion() function, which is recursive for auto-include suggestions, needed to maintain a list of items for which suggestions had already been processed.  This list would get checked for every method call and ensure that suggestions for the given item had not already been processed.  Here's a snippet from the Flex codebase (edited for clarity):

    protected boolean addSuggestedItem(
         FinancialDocumentLineItem baseLineItem,
         ManagedResource resource,
         Set<String> antiCycle) throws Exception{
       
        if (antiCycle == null) {
            antiCycle = new HashSet<String>();
            if (baseLineItem.getResource() != null) {
                antiCycle.add(baseLineItem.getResource().getObjectIdentifier());
            }
        }
       
        if (antiCycle.contains(resource.getObjectIdentifier())) {
            return false;
        }
        antiCycle.add(resource.getObjectIdentifier());

        //do normal suggestion processing

       //recursive call
       if (resource has auto include child suggestions) {
           addSuggestedItem(baseLineItem, suggestedItem, antiCycle);
      }

    }

This check starts by initializing the antiCycle information if the caller doesn't provide it, which would probably happen if the method call is the first one on the call stack.  The antiCycle list is checked to make sure the method call would not be redundant and if so, short circuits out of the function and breaks recursion.

If the function call is valid, the item for which it's being called adds itself to the antiCycle list to guard against infinite recursion further down the stack.

Bad Data

This method doesn't eliminate the cycle in the underlying data; it merely limits how much damage the bad data can do.  The next step for this particular issue is to add cycle checks to the suggestion configuration screens.  More to come on that front.

No comments:

Post a Comment