Monday, February 11, 2013

Sugiyama Layered Graph Drawings: Cycle Removal

Captains Log: Stardate: uggh........ (Zap Brannigan voice)

It's another day into my graph visualization journey and I've had a lot to think about during my random walks in the graph theory world (joke...get it? ;D ). I've always had an interest in graph visualization, but the real kickstarter for my current journey was a project for visualizing user input graph data that I am currently working on. The first and most basic graph drawing methods I started looking into for the project were circle-based layouts (see previous post on circle-based graph drawings). Soon after that I heard about BioFabric and had my mind totally blown. As it turns out the timing of my BioFabric discovery coincided with the second type of drawing method on my visualization to do list: layered graph drawings! 

I mark it as a special coincidence because of one similarity that the two methods have: layering. Now to be completely clear they ARE NOT the same thing! Layered graph drawing algorithms are meant to convey hierarchical information within a graph, whereas BioFabric is a tool for visualizing large and complex networks to make pinpointing things like clusters and graph structure more easy (i.e. its primary goal is not to clarify hierarchical dependencies). What I was just trying to point out was a parallel that I noticed between the two in that with layered graph drawings the vertices are aligned in layers with edges in between (the vertices in this case are still dots mind you), while in BioFabric the vertices are layered with edges in between layers as well but each vertex constitutes its own layer. (and most importantly, the vertices are no longer dots but LINES! REMEMBER THAT!!!). Perhaps as Adam Smith would say there is an invisible hand guiding me in my graph journey. That or maybe this kooky girl started getting some crazy ideas while out walking her dog! 

Anywho, let's get on to the meat of the post: layered graph drawings! If you do a Google search of layered graph drawings, you're going to see the Japanese name Sugiyama pop up all over the place. That is because the framework associated with it describes the most common and popular method for creating layered/hierarchical graph drawings.It got its name from one of the men behind it, Kozo Sugiyama.

The goal of the Sugiyama approach is to draw a directed graph (or digraph for short) in a hierarchical fashion where you have vertices grouped with other vertices into layers, and most of the directed edges flow according to some direction (could be top to bottom, bottom to top, left to right,..etc.). Visualizing a directed graph in this fashion can give you a sense of hierarchy with the data and also convey dependency relationships more elegantly than if drawn otherwise. The simplest example that I can think of relating to this idea that also happens to show up all the time in computer science involves rooted trees. In many computer science applications you will have a tree with some designated vertex drawn at the top or bottom of the tree that is referred to as its root. From there you have leaves (the nodes at the very bottom of the tree/eh...or top) and internal nodes which are just the rest of the tree's vertices. It is almost always the case that your rooted tree's nodes are all drawn with a layering approach (the different "layers" being called the levels of the tree). The neighbors of the vertices in the graph are appropriately called its ascendants or descendants, implicitly giving the graph some sense of direction if directions are not already assigned to the edges. Here is a sample rooted tree produced by moi via Paint. :-)
 Of course, the real utility of layered graph drawing methods comes when you have something much more than a tree....something with lots of cycles and edges scattered here and there. You can imagine that it is harder to decompose such graphs into layers and then to meet all of the other visual aesthetics (e.g. minimized number of edge crossings). It should come as no surprise then that the classic Sugiyama-based approach requires more than just a couple of steps. In total there are four general steps as listed below: 

1. Remove all directed cycles.
2. Assign vertices to layers. 
3. Reorder vertices within layers to minimize edge crossings. 
4. Reassign x coordinates of vertices as needed to get rid of bends produced by "dummy vertices" in layering step. 

And maybe as a semi last step or four and a halfish one you'll want to reinsert any edges that you may have taken out in step 1. 

Mind you that there is not one fool proof way of tackling any of these steps which are all hard! Many different algorithms and heuristics have been proposed for handling each of them. A case in point regards the first step which I will now discuss: directed cycle removal. 

Directed cycles are just what they sound like: cycles that have directions associated with the edges, all pointing clockwise or counterclockwise so to speak. To eliminate the cycles in a directed graph you can take one of two approaches: remove edges from the cycles, or reverse the direction of some of the edges in each of the cycles. Whichever approach you take though it would be nicest if you didn't have to remove that many edges or change too many edge directions. That leads to the so-called minimum FAS (Feedback arc set) problem. The aim is to find a set of directed edges (arcs) whose removal or edge reversal will make the digraph acyclic and that has minimum cardinality. The complementary problem to this is finding a maximum acyclic subgraph of a directed graph.  

One algorithm that was proposed for this problem was by Berger and Shor. It works like this: Let what is denoted as Ea initially be the empty set. Pick any vertex from the directed graph and examine its in-degree and out-degree. If the out-degree of v is greater than or equal to its in-degree, then add all of v's outgoing edges to Ea. Otherwise add v's incoming edges to Ea. Once Ea has been updated remove v and all of its edges from the graph and repeat until you can't anymore. The edges within the final Ea will induce an acyclic subgraph (if not then you would have an outgoing and incoming edge for a vertex within Ea but that contradicts the construction of Ea). 

Another algorithm that I am currently reading is by Eades et. al. Their algorithm is also easy to explain, though rather than arbitrarily picking vertices the focus is on handling sources and sinks of the graph (vertices that have all outgoing or all incoming edges respectively)and creating so-called vertex sequences out of them. I will not outline the steps of their algorithm as with Berger and Shor but I will explain the intuition behind constructing these "vertex sequences". Suppose that you have labelled a set of digraph vertices as v1, v2,...vn and arranged them horizontally according to that order. Locating the directed edges that go from right to left will pinpoint the edges that finish off directed cycles. Finding a minimal set of edges "against the flow" will then help us in tackling the minimum FAS problem described earlier. 

There are many more ways of handling the cycle removal step but I will stop at these two for now. It might not sound so bad up to this point, but keep in mind that this is just step numero uno in the list of four (or four and a half-ish) steps outlined in the Sugiyama framework. Rather than be three quarters empty though I choose to be one quarter full. :-) (Heck! Even if I hadn't read any of this yet I'd be a cup full cause I just love the material so much! <3 ) Perhaps I will go on another one of my random walks now and let the graph info sink in. ;D   

References: 
Bonnie Berger and Peter W. Shor, approximation algorithms for the maximum acyclic subgraph problem, Proc. First ACM-SIAM Symposium on Discrete Algorithms (1990) 236-243

P. Eades, X. Lin, and w.F. Smyth. A fast and effective heuristic for the feedback arc set problem. Information Processing Letters, 47(6): 319-323,1993