Monday, February 25, 2013

Sugiyama Graph Drawing: Layering

It's another day in my graph visualization journey and I am about halfway up my climb of "Mount Sugiyama". I've passed the first sign on the trail of no cycles, and after a little bit of huffing and puffing I have finally cleared my way through the layers of boulders and rocks scattered across what is claimed to be the longest path in the hike. Phew! With some minutes now to contemplate my journey thus far, I thought that I would describe some of the things encountered during my most recent excursion up Mount Sugiyama.

Now I honestly don't know if there really is a Mount Sugiyama out there. I've gone hiking many times, but have never had the opportunity or gumption to go scaling up one of 'em big ole mountains. Intellectually though I consider myself to be going on a hike of sorts, this hike being through the Sugiyama drawing framework mentioned in an earlier post. In that post I listed the four general steps of that framework which I will repeat now: 

1. remove all directed cycles in your input digraph 

2. partition the vertices into groups/layers. The edges of the graph will run between these layers. 

3. reorder vertices within layers so as to reduce edge crossings 

4. reassign vertex x-coordinates 

I have already discussed the first step in a previous post, so now on to the second step: layering (i.e. the treacherous path of Mount Sugiyama!!!!)

Before going into algorithm details we should first remember the main purpose of the Sugiyama method: to visualize a directed graph in a hierarchical fashion. After giving your digraph the layering treatment it should look something like the following illustration, with vertices bunched into groups/layers and the majority of edges pointing in one direction (up, down, left, right, etc.). It is the layering step that is responsible for the nice partitioned vertex sets and the uniform edge flows. 

[Illustration: a layered drawing of a digraph, with the vertices partitioned into horizontal layers and nearly all of the edges pointing in the same direction]

Now, how exactly is layering accomplished? As with the first task of removing directed cycles, many different algorithms have been proposed for this problem, each with its own set of advantages and disadvantages. Of all of the layering algorithms in existence the simplest is what is often referred to as the longest path algorithm. It basically works by building up the digraph's layers in a bottom-up fashion so that the majority of edges end up pointing down. Here is a summary of the algorithm in pseudocode: 

Input: directed acyclic graph G = (V, E) 
Output: layering 

Initialization: 
create two empty vertex sets V1 and V2 
create an integer variable currentLayer = 1 

while |V1| < |V| (here |V| means the cardinality of V, and ditto for V1): 

Select a vertex v in V\V1 s.t. all of its outgoing neighbors form a subset of V2. If such a vertex exists, then assign it to currentLayer and append it to V1. 

Else increase currentLayer by one and let V2 be the union of V2 and V1. 


The algorithm is simple enough to explain, but it's extremely important to understand exactly what is going on. Let's look at the first selection statement. At the very beginning of each while loop iteration we are trying to find a vertex such that all of its outgoing neighbors form a subset of V2. V2 is initially empty, so the step starts by looking for vertices that have no outgoing edges, namely the sinks. Every time that a sink s is found it is assigned to layer 1 and appended to vertex set V1. Following the algorithm you will continue to do this until no more sinks are left. Note that there will be at least one vertex in layer 1 since every directed acyclic graph has at least one sink (See if you can prove this! Proofs are fun :-D ).

After handling all of the sinks, every vertex left unprocessed has positive outgoing degree. Its set of outgoing neighbors is nonempty and so clearly can not be a subset of the empty set V2. So we increment currentLayer to 2, take the union of V2 with V1 so that V2 now holds all of the graph's sinks, and troop on. 

Again, we check to see if there is a vertex not yet processed that has all of its outgoing neighbors contained in the previous layer(s). Assuming that we just finished handling all of the sinks the question is, is there a vertex that has nothing but sinks as its outgoing neighbors? The answer is yes (again try to prove it! Leave a comment if you would like to see a proof of this and the previous claim ;D ). So we assign that vertex to currentLayer, which is 2, and proceed. If instead every unprocessed vertex has at least one outgoing neighbor that isn't yet in V2 (i.e. not yet assigned to one of the layers below), then we are forced to increment currentLayer and take the union of V2 and V1. The observant reader should note that the topmost layer will contain the source(s) of the digraph.

The very crucial part that I want to point out here is the condition that a vertex cannot be assigned to a layer until all of its outgoing neighbors have been given layers. This is the condition that ultimately creates the uniform edge flows in our hierarchical drawings. The edge flow produced by this algorithm will be from top to bottom. If you assign a vertex to a layer before one of its outgoing neighbors, then you will get an edge pointing up (against the flow) once that particular neighbor is processed. That is not what we want in this kind of drawing, so when programming the longest path algorithm, be sure to check the status of all of the outgoing neighbors of a vertex before giving it the layering treatment. The only time in the Sugiyama method that you will tolerate edges going against the flow is after layering is complete. Once that part is done you can reinsert edges that were deleted during directed cycle removal (Step 1), or flip back the ones whose directions were reversed.
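
To make all of the above concrete, here is a bare-bones Python sketch of longest path layering (my own rough translation of the pseudocode, assuming the DAG arrives as a dictionary mapping each vertex to the set of its outgoing neighbors):

    def longest_path_layering(succ):
        # succ maps each vertex to the set of its outgoing neighbors.
        # Returns a dict mapping each vertex to its layer (layer 1 = the sinks).
        layer = {}
        V1 = set()             # vertices already assigned to some layer
        V2 = set()             # vertices assigned to layers below the current one
        current_layer = 1
        while len(V1) < len(succ):
            # look for an unassigned vertex whose outgoing neighbors all sit in V2
            v = next((u for u in succ if u not in V1 and succ[u] <= V2), None)
            if v is not None:
                layer[v] = current_layer
                V1.add(v)
            else:
                current_layer += 1
                V2 |= V1
        return layer

    # Tiny test: a -> b, a -> c, b -> c. The sink c lands in layer 1,
    # then b in layer 2, then a in layer 3.
    print(longest_path_layering({"a": {"b", "c"}, "b": {"c"}, "c": set()}))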

So yeah...remember that! 

A few last things I thought I would point out about longest path layering concern its computational and aesthetic pros and cons. Pros: the algorithm is simple, runs in linear time, and constructs a hierarchical layout with minimum height. 

Cons: Other aesthetic criteria such as edge densities and drawing width are not met so well by the algorithm. In particular, drawings produced by the longest path algorithm tend to be wider at the bottom. 

Remember that too! :-D

As with all of my graph theory posts I really enjoyed researching the material for this one. It's fascinating that there are so many different ways of approaching graph drawing, a problem that initially sounds easy but becomes much more detailed and complex upon further inspection. I like to think of the algorithms and heuristics that I pick up on my graph visualization journey as pretty little gems nested inside the walls of a deep dark cavern. Because I enjoyed discussing layering so much, it is possible that I may revisit this topic in a future post. If so then I will most likely discuss another algorithm that is commonly used for layering, the so-called Coffman-Graham algorithm. Until then, keep on graphing and don't be afraid to go searching for some gems to add to your algorithms collection. :-) 

Sunday, February 17, 2013

The Square Root of a Matrix

At some point in our lives we're bound to encounter square roots; e.g. what's the square root of 4, or the square root of 9? I'm trying to remember the first time I came across those in school. Middle school? Fifth grade? Do you remember when you first saw square roots? Ahhhh....nostalgia. :-) 

(Snapping out of it) Okay, so uh.....advancing further along the algebra path, we then start to see square roots of not so "perfect" numbers (e.g. 8, 14, 39, 45). Unlike 4 or 9 we can't find whole numbers that when squared will match these exactly. It won't work! Never ones to admit defeat, we roll up our sleeves, manipulate 'em radicals, and figure out how to at least simplify said square roots (or use our calculators if all else fails! not recommended though :-P.....I MEAN IT). That's fine and most of us will then choose to stop our little "land of square roots" family vacation there. There are always a few brave souls though with a taste for danger who will decide to venture on (show-offs). First stop: the "imaginary sulfur pits" (ominous and foreboding music starting to play). Below is an account recorded by one of these adventurers. His name is not mentioned as he would like to protect his anonymity. :-) 


Recording: 

Well there's nothing even trying to be imaginary about the smell. I'm catching a strong whiff of something resembling eggs (makes me think of my Aunt Edna's egg salad...uggh). The smoke's starting to depart now. There appear to be several piles of rocks surrounding the yellowish pit, each inscribed with various symbols: a bunch of lower case i's and -1's, the latter all surrounded by square root symbols. The writing looks pretty old. Were these ruins created by a long lost civilization? If so what was with their fixation on the 9th letter of the alphabet? Even more disturbing though: did the ancients even know how to do their math? They keep on trying to take the square root of a negative number: impossible! Hmmmmm....what to do? I think I'll pull out my trusty math guidebook from the glove compartment. It probably has something to say about this. 

(flipping pages) 

Ah! Here we are! Imaginary numbers (explains the "imaginary" sulfur pits at least). And look here! It says that you can define a whole new set of numbers as an extension to the real numbers, denoting them collectively as C. Within this numbering system it is possible to take the square roots of negative numbers, the very base case being the square root of -1. Rather than writing -1 with a square root symbol over and over again it recommends denoting the square root of -1 as i. Helps to make simplifying complex radicals easier I suppose. Oh and look hun! There's even a cute little example here where it shows you the square roots of -4: 2i and -2i. I get there being two square roots but this still seems a little weird. Hughh.... there's even a coordinate system on the next page where the x axis is real and the y axis is imaginary. How odd! It says for further information either directly ask for a park ranger who knows complex analysis or go read a freakin' textbook! How rude!

This family vacation keeps on getting curiouser and curiouser (Alice in Wonderland moment). The sulfur pits were alright, a little difficult for me to get my mind around. Kind of cool though. What do you say kids? You ready to call it a day or did you still want to check out the museum of petrified matrix roots? (Loud protesting noises and shouts of excitement in the background). Ahhh very well. On to the museum then. 

(fast forwarding tape recorder) 

And here we are! Final destination stop: the museum. It looks nice enough. There are display cases everywhere filled with samples of petrified matrix roots. In the center of each display case is a black and white picture of a diagonalizable matrix. Below said pictures are roots that have supposedly been extracted from the matrices. The museum lady says that they're all from the forest of Diag'ble matrices, donated by generous mathematician patrons from around the world. Let's see how many there are: 1, 2, 3, 4, oh my goodness! 5, 6, 7, and 8!!! I thought that matrices just had two roots, you know, like the stuff at the sulfur pits and the radicals shown on the signs leading up to there. How on earth could something have more than 2 roots? Hmmmnnnn.... the sign by this display case might give a hint; or rather a whole explanation. It reads as follows: 

Square Roots of Matrices: 
The square root of a square matrix A is a matrix B such that when you square B you get A. How does one find these square roots and how many of them are there? The first temptation that many might have is to simply take the square root of each of the entries within matrix A and call it a day. If you test out your answer though you will see that, due to the way matrix multiplication is defined, this approach will not in general give you back your original matrix A. The way to get to the square roots of the matrices shown here is a bit more involved than that. 

The matrix shown in the above picture was taken from the forest of Diag'ble matrices in the year 1895. Although it is quite old, due to the preservation techniques of volunteering mathematicians it has maintained all of its properties since then, namely that it is still diagonalizable. This means that it can be written as the product of three matrices: S*D*(S^-1) (or equivalently as (T^-1)*D*T for some invertible matrix T). Here S is a matrix that has an inverse (denoted as S^-1) and D is a diagonal matrix that just has zero entries off of the main diagonal (diagonal entries may be zero as well). 

It is keeping this form in mind that one can then extract roots like the petrified ones displayed below said picture. To find them you take the following steps: 

1. Find the eigenvalues of matrix A. 
2. Find eigenvectors that correspond to the eigenvalues of #1. 
3. Form S from the eigenvectors of #2. These eigenvectors will be the columns of S.
4. Compute the inverse of S. 

5. The diagonal entries of D will correspond to the eigenvalues of A. Their ordering on the main diagonal will in turn depend on the ordering of the eigenvectors in S. E.g. if eigenvector x1 is put into column 1 of S, then its associated eigenvalue will be the first main diagonal entry of D. If eigenvector x2 is designated as the second column of S, then its eigenvalue will be the second diagonal entry in D,.., and so on. 

6. Find a square root of D and denote it as D*. Calculating a root is very simple in this case: because of the form of diagonal matrices you can simply take square roots of all of D's diagonal entries. Since each positive entry of D has two real square roots, this can potentially create up to 2^n different real D*'s.

7. Each matrix of the form S*(D*)*(S^-1) will qualify as a square root of matrix A. Notice that there will be as many roots of this form as there are square roots of D, explaining how the matrix in this display case came to have at least 8 square roots. 

End of plaque explanation

Well that makes some more sense when explained like that. Pretty amazing that you can have more than 2 square roots for a matrix! And I thought that all of that facebook and twitter stuff was crazy! What do you think about all of this kids?.....Kids? 

Ahhh...looks like they've gone straight to the gift shops. Better go chase 'em down before they try and destroy everything. 

End of Recording 


Well that was errghh..enlightening. I decided to leave the complex roots part to the park rangers and instead provide a bit more explanation of the matrix roots discussion through an example. Here is a 3 by 3 matrix that I grew out of a Diag'ble tree seedling I found myself. :-) Following the steps outlined above let's see if we can spot some of its roots!

First of all here is the lovely 3 by 3 matrix: 

A = [9 -5 3]
    [0 4 3]
    [0 0 1]

Since the matrix is triangular its eigenvalues are the numbers along its main diagonal: 9, 4, and 1. 

With a little bit of calculation involving the equation (cI - A)x = 0 (c denotes one of the above 3 eigenvalues, x is an eigenvector), you will find that the following 3 eigenvectors work (I will denote them as x1, x2 and x3). 

x1 = [1 0 0]

x2 = [1 1 0]

x3 = [-1 -1 1]


For the matrix S I'll let the columns from left to right be the eigenvectors x1, x2 and x3 correspondingly. 

S = [1 1 -1] 
    [0 1 -1] 
    [0 0 1]

I did enough paper and pencil matrix calculations this past week, so I'll just let you know (and trust me it's correct) that the inverse of this matrix S is the following (to find the inverse you can form the augmented matrix from S and the identity and successively do row operations on both sides until the left side is the identity matrix). 

S^-1 = [1 -1 0] 
       [0 1 1] 
       [0 0 1]

We're almost done now! The diagonal matrix D will have the eigenvalues 9, 4 and 1 placed in order according to the order of the eigenvectors within S. As mentioned earlier a square root of D can be obtained by taking the square roots of each of its diagonal entries. 

D = [9 0 0] 
    [0 4 0] 
    [0 0 1]

Example of a square root of D: 

D* = [3 0 0] 
     [0 2 0] 
     [0 0 1] 

Notice that there is not one unique square root of our diagonal matrix D; each of the elements along the main diagonal of our sample square root can be positive or negative. When squared they will all yield the same result D. Thus there are 8 roots of matrix D, which in turn implies that there are at least 8 roots for A (again this is because when you square S*(D*)*(S^-1) you get back S*D*(S^-1), a diagonalization of A). Pretty cool! :-D Now is anyone up for checking out those imaginary sulfur pits? ;D
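
In case you would rather let a machine double check all of this pencil work, here is a quick NumPy sketch of the plaque's recipe applied to our example (a rough sketch of my own that assumes A is diagonalizable with real, nonnegative eigenvalues; it is not a general-purpose matrix square rooter):

    import numpy as np
    from itertools import product

    def diagonalizable_square_roots(A):
        # Steps 1-5: eigenvalues and eigenvectors (as the columns of S), then S^-1.
        eigvals, S = np.linalg.eig(A)
        S_inv = np.linalg.inv(S)
        base = np.sqrt(eigvals)                       # step 6: roots of D's entries
        for signs in product((1.0, -1.0), repeat=len(eigvals)):
            D_star = np.diag(base * np.array(signs))  # one D* per sign pattern
            yield S @ D_star @ S_inv                  # step 7: S*(D*)*(S^-1)

    A = np.array([[9., -5., 3.],
                  [0.,  4., 3.],
                  [0.,  0., 1.]])
    roots = list(diagonalizable_square_roots(A))
    print(len(roots))                                 # 8 sign patterns, 8 roots
    print(all(np.allclose(B @ B, A) for B in roots))  # True: each one squares to A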


Monday, February 11, 2013

Sugiyama Layered Graph Drawings: Cycle Removal

Captain's Log: Stardate: uggh........ (Zapp Brannigan voice)

It's another day in my graph visualization journey and I've had a lot to think about during my random walks in the graph theory world (joke...get it? ;D ). I've always had an interest in graph visualization, but the real kickstarter for my current journey was a project I am working on for visualizing user-input graph data. The first and most basic graph drawing methods I started looking into for the project were circle-based layouts (see previous post on circle-based graph drawings). Soon after that I heard about BioFabric and had my mind totally blown. As it turns out the timing of my BioFabric discovery coincided with the second type of drawing method on my visualization to-do list: layered graph drawings! 

I mark it as a special coincidence because of one similarity that the two methods share: layering. Now to be completely clear they ARE NOT the same thing! Layered graph drawing algorithms are meant to convey hierarchical information within a graph, whereas BioFabric is a tool for visualizing large and complex networks to make pinpointing things like clusters and graph structure easier (i.e. its primary goal is not to clarify hierarchical dependencies). The parallel I noticed is this: in layered graph drawings the vertices are aligned in layers with edges in between (the vertices in this case are still dots, mind you), while in BioFabric the vertices are layered with edges in between as well, but each vertex constitutes its own layer (and most importantly, the vertices are no longer dots but LINES! REMEMBER THAT!!!). Perhaps as Adam Smith would say there is an invisible hand guiding me in my graph journey. That or maybe this kooky girl started getting some crazy ideas while out walking her dog! 

Anywho, let's get on to the meat of the post: layered graph drawings! If you do a Google search for layered graph drawings, you're going to see the Japanese name Sugiyama pop up all over the place. That is because the framework associated with it describes the most common and popular method for creating layered/hierarchical graph drawings. It got its name from one of the men behind it, Kozo Sugiyama.

The goal of the Sugiyama approach is to draw a directed graph (or digraph for short) in a hierarchical fashion where you have vertices grouped with other vertices into layers, and most of the directed edges flow according to some direction (could be top to bottom, bottom to top, left to right, etc.). Visualizing a directed graph in this fashion can give you a sense of hierarchy within the data and also convey dependency relationships more elegantly than if drawn otherwise. The simplest example that I can think of relating to this idea, and one that happens to show up all the time in computer science, involves rooted trees. In many computer science applications you will have a tree with some designated vertex drawn at the top or bottom of the tree that is referred to as its root. From there you have leaves (the nodes at the very bottom of the tree/eh...or top) and internal nodes, which are just the rest of the tree's vertices. It is almost always the case that your rooted tree's nodes are all drawn with a layering approach (the different "layers" being called the levels of the tree). The neighbors of a vertex in such a tree are appropriately called its ancestors or descendants, implicitly giving the graph some sense of direction even if directions are not explicitly assigned to the edges. Here is a sample rooted tree produced by moi via Paint. :-)

[Illustration: a small rooted tree drawn in Paint, with the root at the top and the leaves at the bottom]

Of course, the real utility of layered graph drawing methods comes when you have something much more than a tree....something with lots of cycles and edges scattered here and there. You can imagine that it is harder to decompose such graphs into layers and then to meet all of the other visual aesthetics (e.g. a minimized number of edge crossings). It should come as no surprise then that the classic Sugiyama-based approach requires more than just a couple of steps. In total there are four general steps as listed below: 

1. Remove all directed cycles.
2. Assign vertices to layers. 
3. Reorder vertices within layers to minimize edge crossings. 
4. Reassign x coordinates of vertices as needed to get rid of the bends produced by "dummy vertices" in the layering step. 

And maybe as a semi last step or four and a halfish one you'll want to reinsert any edges that you may have taken out in step 1. 

Mind you that there is no foolproof way of tackling any of these steps, which are all hard! Many different algorithms and heuristics have been proposed for handling each of them. A case in point is the first step, which I will now discuss: directed cycle removal. 

Directed cycles are just what they sound like: cycles in which the directed edges all point the same way around, clockwise or counterclockwise so to speak. To eliminate the cycles in a directed graph you can take one of two approaches: remove edges from the cycles, or reverse the direction of some of the edges in each of the cycles. Whichever approach you take, though, it would be nicest if you didn't have to remove or redirect too many edges. That leads to the so-called minimum FAS (feedback arc set) problem: find a minimum-cardinality set of directed edges (arcs) whose removal or reversal makes the digraph acyclic. The complementary problem to this is finding a maximum acyclic subgraph of a directed graph.  

One algorithm that was proposed for this problem is due to Berger and Shor. It works like this: let Ea initially be the empty set. Pick any vertex v from the directed graph and examine its in-degree and out-degree. If the out-degree of v is greater than or equal to its in-degree, then add all of v's outgoing edges to Ea. Otherwise add v's incoming edges to Ea. Once Ea has been updated, remove v and all of its edges from the graph, and repeat until no vertices remain. The edges within the final Ea will induce an acyclic subgraph (if a cycle survived inside Ea, then the first of its vertices to be processed would have had to contribute both an outgoing and an incoming edge, contradicting the construction of Ea). 
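
Here is a small Python sketch of that procedure as I understand it (my own hedged rendering, not the authors' code; I'm assuming the digraph comes in as a dict mapping each vertex to the set of its out-neighbors, with no self-loops):

    def berger_shor(succ):
        # Build predecessor sets so in-degrees are easy to consult.
        pred = {v: set() for v in succ}
        for v, outs in succ.items():
            for w in outs:
                pred[w].add(v)
        remaining = set(succ)
        Ea = set()
        for v in list(succ):              # process vertices in some fixed order
            outs = succ[v] & remaining    # v's edges still left in the graph
            ins = pred[v] & remaining
            if len(outs) >= len(ins):     # keep the larger bundle of edges
                Ea |= {(v, w) for w in outs}
            else:
                Ea |= {(u, v) for u in ins}
            remaining.discard(v)          # remove v and all of its edges
        return Ea                         # Ea induces an acyclic subgraph

    # On the directed triangle a -> b -> c -> a, two of the three edges survive:
    print(berger_shor({"a": {"b"}, "b": {"c"}, "c": {"a"}}))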

Another algorithm that I am currently reading is by Eades et al. Their algorithm is also easy to explain, though rather than arbitrarily picking vertices the focus is on handling the sources and sinks of the graph (vertices that have all outgoing or all incoming edges, respectively) and building so-called vertex sequences out of them. I will not outline their steps as I did with Berger and Shor, but I will explain the intuition behind constructing these "vertex sequences" (a rough sketch follows below). Suppose that you have labelled a set of digraph vertices as v1, v2,...,vn and arranged them horizontally in that order. Locating the directed edges that go from right to left pinpoints the edges that close off directed cycles. Finding a small set of edges "against the flow" then helps us tackle the minimum FAS problem described earlier. 
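
For the curious, here is roughly what that idea can look like in Python (again my own hedged sketch of the greedy heuristic, not the paper's pseudocode: sinks get prepended to the right end of the sequence, sources get appended to the left end, and when neither exists we fall back on the vertex with the largest out-degree minus in-degree):

    def greedy_vertex_sequence(succ):
        # Returns a vertex ordering; the edges that run right-to-left in it
        # form the candidate feedback arc set.
        pred = {v: set() for v in succ}
        for v, outs in succ.items():
            for w in outs:
                pred[w].add(v)
        succ = {v: set(outs) for v, outs in succ.items()}   # work on copies
        s1, s2 = [], []
        remaining = set(succ)

        def delete(v):
            remaining.remove(v)
            for w in succ[v]:
                pred[w].discard(v)
            for u in pred[v]:
                succ[u].discard(v)

        while remaining:
            changed = True
            while changed:                    # peel off sinks and sources
                changed = False
                for v in list(remaining):
                    if not succ[v]:           # sink: goes to the right end
                        s2.insert(0, v); delete(v); changed = True
                    elif not pred[v]:         # source: goes to the left end
                        s1.append(v); delete(v); changed = True
            if remaining:                     # only cycles remain; pick greedily
                v = max(remaining, key=lambda u: len(succ[u]) - len(pred[u]))
                s1.append(v); delete(v)
        return s1 + s2

    # On the directed triangle one edge has to point "against the flow":
    print(greedy_vertex_sequence({"a": {"b"}, "b": {"c"}, "c": {"a"}}))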

There are many more ways of handling the cycle removal step but I will stop at these two for now. It might not sound so bad up to this point, but keep in mind that this is just step numero uno in the list of four (or four and a half-ish) steps outlined in the Sugiyama framework. Rather than be three quarters empty though I choose to be one quarter full. :-) (Heck! Even if I hadn't read any of this yet I'd be a cup full cause I just love the material so much! <3 ) Perhaps I will go on another one of my random walks now and let the graph info sink in. ;D   

References: 
Bonnie Berger and Peter W. Shor, Approximation Algorithms for the Maximum Acyclic Subgraph Problem, Proc. First ACM-SIAM Symposium on Discrete Algorithms (1990), 236-243.

P. Eades, X. Lin, and W. F. Smyth, A Fast and Effective Heuristic for the Feedback Arc Set Problem, Information Processing Letters, 47(6): 319-323, 1993.

Wednesday, February 6, 2013

BioFabric: Combing the Lines Out of Hairballs!

It was not long ago that I officially proclaimed the start of my graph visualization journey on this blog. Two days into this analytic expedition of sorts I have already come across a branching path in the graph visualization world. One sign is pointing me to the chaotic land of hairballs, visualization cowboys darting back and forth to catch the nasty buggers with their algorithm loops. To the other path I see a sign pointing to a rather different sight, a land in which all species of hairball have gone extinct. In the fields of this unusual land I see what appear to be delicately woven tapestries floating about in space. What is this new breed of graph creature and where could it have possibly come from? 

As it turns out, deep down these tapestries are the same as the hairballs of the first branch's wild wild west. About a year ago a new cowboy came into town, bringing a rather different set of tools to help with the hairball infestation. Hairballs straying too close to the other side of the path are bound to meet defeat. Looking through the cowboy's set of tools you won't find the usual assortment of loop lassos and heuristics ammunition. No. Instead you will find a comb with the word BioFabric written over the handle. 

What is this BioFabric? BioFabric I have found out is a Java-based tool for the visualization of networks. Even those with only a brief exposure to graph visualization will find a plethora of tools throughout the Internet with different algorithms and capabilities for solving their complex data drawing problems. So,..., what makes BioFabric so different? What makes this cowboy stand out from the rest? What's the deal (in Jerry Seinfeld voice) with my unusually fluffy writing and short story-like introduction? I will say one word: lines! 

Now all of us graphsters and graphistas (males and females who like graph theory) are used to seeing lines for edges when we look at graphs. What I'm talking about are not edges though, but vertices. Rather than having the picture be made up of dots and lines we have lines and lines; specifically the vertices get drawn as horizontal lines and the edges in between become vertical lines. Don't get it? Here I've made a few small examples showing what a few graphs drawn according to the traditional approach would look like if given the lines against lines treatment. 
The first example is the simplest of them all. We're just visualizing the graph of one dot (or K1 to be correct in graph lingo) as a horizontal line. What about the graph of two dots with a line in between (i.e. K2)? That will end up looking kind of like an I. The third graph which is the first cycle of the bunch (and K3 incidentally) will look kind of like a wind-chime after straightening things out (yeah...lame joke :-P). 

With those simple examples you should now have a slightly clearer idea of what I mean when I say graphing lines against lines. But these examples are small potatoes. If we are looking at graphs of this size, or ones that are small relative to the hairballs that I keep talking about, then the traditional approach would probably be a lot clearer to understand and eyeball for analysis. However, the focus of the last so many graph theory posts has not been on these "small" manageable graphs, but rather the really scary monstrosities that have been replicating at an alarming rate in the land of hairballs. Those are the kinds of graphs that really benefit from the BioFabric treatment. The real power of the BioFabric approach comes from the fact that your vertices become horizontal lines that can be of ANY length. Because said vertices can be as long as they wish, you won't have to worry about those horrid edge crossings anymore! If adding another edge to the graph runs the risk of introducing crossings to your visualization, just make the vertex longer and tack new edges onto the end of it. The idea is that simple! 
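
To make the lines-against-lines idea a bit more tangible, here is a toy Python/matplotlib sketch of a BioFabric-style drawing of K3 (my own crude imitation of the concept, not BioFabric's actual layout algorithm, which orders the nodes and edges far more cleverly):

    import matplotlib.pyplot as plt

    nodes = ["a", "b", "c"]                          # K3, the little wind-chime
    edges = [("a", "b"), ("a", "c"), ("b", "c")]

    row = {v: i for i, v in enumerate(nodes)}        # each node gets its own row
    col = {e: j for j, e in enumerate(edges)}        # each edge gets its own column

    fig, ax = plt.subplots()
    for v in nodes:                                  # nodes: horizontal lines that
        touching = [col[e] for e in edges if v in e] # span the edges they touch
        ax.hlines(row[v], min(touching), max(touching))
        ax.text(-0.3, row[v], v, va="center")
    for (u, w) in edges:                             # edges: vertical lines between
        ax.vlines(col[(u, w)],                       # the rows of their endpoints
                  min(row[u], row[w]), max(row[u], row[w]))
    ax.invert_yaxis()                                # first node on top
    ax.set_axis_off()
    plt.show()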

Of course we've all been told time and time again that the devil is in the details, and some of you may be suspicious that this simple idea is too good to be true when you try to think how to organize the vertical and horizontal edges throughout this graphical mesh. If you do a little bit of reading up though (reference paper shown below) you will see that there is a very clear and systematic approach taken for this kind of layout scheme. So for my analytically minded readers who are not satisfied with just hearing that something is good without knowing why, I assure you that there is order and method that cleverly does the work just described. 

Now I have just been introduced to this new concept of graph visualization, which is pretty ground-shaking for me, the graph drawing traditionalist who always imagined vertices as dots. I am not a close-minded person though and find this new approach to be intriguing, interesting, and definitely promising! I plan on exploring this topic further, comparing it with the many vertices-as-dots approaches, and seeing what kind of differences I find between the two general methods. I strongly encourage all of you to read up more on BioFabric's take on graph visualization. At the bottom I have provided links to the BioFabric website and to a paper discussing the basic ideas behind it (it even talks about all of those darn details I was just telling you about!) Enjoy! 

..................................................................

Short Story Afterword: 

Long ago people were frightened at the possibility of the earth being round. It's not round! It's supposed to be flat! Hundreds of years later it appears that we have now reached a somewhat similar phase in history. I can hear the graphsters and graphistas saying: "Those vertices aren't supposed to be flat! They're round I say! Round!". Do you ever get the feeling of deja vu? (in Groundhog Day voice) :-) 


Link to BioFabric website: 
http://www.biofabric.org/

Link to paper discussing BioFabric: 
http://www.biomedcentral.com/content/pdf/1471-2105-13-275.pdf 

 

Some Math Trickery

Now when I use the word trickery in this post's title, I am not talking about any sort of drama (as those of you who have seen my Downton Abbey graph may be thinking). Sorry to disappoint but there will be no lies, betrayals, jealousies or schemings. No, my math trickery is much more peaceful and magical than that. I have rolled up my math magician sleeves and have created a neat little mathematical proof with my trusty wand (i.e. my pencil). Below is a proof that I wrote recently for a matrix theory class that I am currently taking. I am pleased with it because it rather gracefully avoids some of the gory details that can crop up in problems like these that involve identifying eigenvalues.

Here is the problem: For the general n by n matrix consisting of all 1's (call it J), show that the only eigenvalues of J are 0 and n. 

Proof: 
One way to go about this proof is by performing row reductions and linearly combining the columns (adding a multiple of one row or column to another doesn't change the determinant used for finding eigenvalues). However, rather than go through all of that calculation business which trips me up sometimes, I'll do more of a direct proof. 

Suppose that x is an n by 1 eigenvector corresponding to an eigenvalue of J. Finding an eigenvalue corresponds to solving the system (tI - J)x = 0 where t is the eigenvalue and 0 is the zero vector. The matrix tI - J has (t-1) along the main diagonal and -1's everywhere else. If we multiply this out with x = [x1, x2,...,xn] we get the following system of equations: 

(t-1)x1 - (x2 + x3 + ... + xn) = 0 
(t-1)x2 - (x1 + x3 + ... + xn) = 0 
... 
(t-1)xn - (x1 + x2 + ... + x(n-1)) = 0  

Equation i above tells us that the sum of all of the entries other than xi equals (t-1)xi, so adding xi to both sides lets us express the sum of x1 through xn in the following form: 

(*) x1 + x2 + ... + xn = (t-1)xi + xi = t*xi for all i = 1,...,n 

Now let's add up all n of the equations that came before (*). Each xi appears once with coefficient (t-1) and (n-1) times with coefficient -1, which gives us: 
(t-1)(x1 + x2 + ... + xn) - (n-1)(x1 + x2 + ... + xn) = 0 

This simplifies to: 
(t-n)(x1 + x2 + ... + xn) = 0 

Substituting in (*) we then get: 
(t-n)*t*xi = 0 which is true for all i 

Now clearly t = n and t = 0 really are eigenvalues: the all 1's vector gives n, and any nonzero vector whose entries sum to zero gives 0. To show that these are the only eigenvalues of the all 1's matrix, assume that there is an eigenvalue t of J not equal to n or 0. Dividing by (t-n) and t we then get that xi = 0 for all i. That is a contradiction though because x is an eigenvector and eigenvectors are assumed to be nonzero. Therefore the only eigenvalues of an n by n all 1's matrix are n and zero.
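
And for the skeptics, a throwaway NumPy snippet to numerically spot-check the claim (here with n = 5):

    import numpy as np

    n = 5
    J = np.ones((n, n))
    # One eigenvalue comes out as 5 (= n) and the other four as 0,
    # give or take floating point dust:
    print(np.round(np.linalg.eigvals(J), 10))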
 
I like this proof because it is clean and gives the result through a little bit of equation manipulation. An advantage of following the messier row reduction method (not shown here) though is being able to see the multiplicities of the eigenvalues as well as the general form of the corresponding eigenvectors. Clearly, both proof approaches have their merits.

Fin :-)


Monday, February 4, 2013

Circle-Based Graph Layouts

Going back to my whole hairball analogy (from previous posts), if you do any generic Google image search for graph theory, you are bound to come across pictures of really complex graphs (from a distance I think that they look like my dogs' hairballs). Analyzing these can be a beast,..., so much so that anything (and I mean ANYTHING) that could lighten the load on our machines and brains is most welcome. That is where the topic of this post, graph visualization, comes in. 

Just as there are lots of different graphs out there (the number of possible graphs on n vertices explodes as n grows), there are lots of different ways of drawing out those graphs, some methods especially made for graphs with a particular structure. The drawing method that I am introducing you all to now falls under the circle-based graph layout category. That's a mouthful! Basically what it means is that you position your graph's nodes along the perimeter of a large invisible circle and then draw the edges/lines in between nodes for all of the connections that you have. Below is an example of a simple circle-based graph visualization. 

[Illustration: nodes spaced evenly around the perimeter of an invisible circle, with straight-line edges drawn between connected nodes]

The idea is simple enough, but can easily be expanded on. What I pictured above is the crudest kind of circle-based visualization that you could have for a graph. Remember that the idea behind any visualization is to make something about the graph's structure clearer to see and interpret. How are we supposed to glean any information from something that looks like an arts and crafts project from grade school? The solution is to impose additional aesthetic criteria on it and see if we can design algorithms to do at least a partial cleaning of the visual mess created. To be more specific about what this "partial cleaning" entails, it means that we would like to minimize the number of edge crossings (points where edges meet in the drawing that aren't vertices).

This is an area that I have just recently begun to explore. So far I have read up on the following approaches discussed by Koren and Gansner (see paper reference below).

1. place the vertices about the circle's perimeter in an order that is less likely to produce edge crossings (see the little counting sketch after this list)

2. take some of the edges that cross on the inside of the circle and have them loop around the circle's exterior instead (a.k.a. exterior routing)

3. find densely bunched groups of edges and rather than drawing all of them individually, have one edge represent all of them. This representative edge will split off into several smaller edges at its very ends so that it is clear what edges were being represented. This edge bundling saves ink and makes the structure of the graph easier to interpret. Another way to think about it is like having a cherry twizzlers rope and you pull at the ends a little bit so that at either end of your rope you have these little "danglies" hanging. The "danglies" were bunched together in the middle which serves as the representative edge in this case. Uggh....hope that that made sense. Not the most sophisticated sounding example I'm afraid. :-P
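
And here is the little counting sketch promised in approach 1 (a rough Python sketch of my own, using the standard fact that two chords of a circle cross exactly when their endpoints interleave around the perimeter):

    import math
    from itertools import combinations

    def circular_positions(order, radius=1.0):
        # Place the vertices evenly around an invisible circle, in the given order.
        n = len(order)
        return {v: (radius * math.cos(2 * math.pi * i / n),
                    radius * math.sin(2 * math.pi * i / n))
                for i, v in enumerate(order)}

    def crossings(order, edges):
        # Count interior edge crossings for this ordering of the vertices.
        idx = {v: i for i, v in enumerate(order)}
        count = 0
        for (a, b), (c, d) in combinations(edges, 2):
            a, b = sorted((idx[a], idx[b]))
            c, d = sorted((idx[c], idx[d]))
            if a < c < b < d or c < a < d < b:   # endpoints interleave
                count += 1
        return count

    # The chords a-c and b-d interleave, so this ordering has one crossing:
    print(crossings(["a", "b", "c", "d"], [("a", "c"), ("b", "d"), ("a", "b")]))

Driving this count down by choosing a better vertex ordering is exactly what approach 1 is after.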

As I find out more of the specifics regarding implementation of these ideas I will discuss them in later posts. Specifically, what I am looking into right now is how to identify those edge bunches that I was just talking so much about. :-) 

The impression that I've gotten so far in my readings on graph visualization is that it is a very experimental field. There are so many different things that can affect the visualization of a graph that I imagine it must be hard to prove things. When this is the case a good approach to tackling the problem is to identify several key parameters and adjust them experimentally to see what matters, what doesn't, and which of your ideas pan out. At the very least I expect that I will be doing quite a bit of experimenting in the "graph theory lab".  Much safer than experimenting in the chemistry lab I will say! :-D

If you are interested in reading more about this post's topic then I recommend looking at the following paper. It goes into detailed discussion of the 3 approaches to improved circle-based visualization listed earlier.

Reference: Yehuda Koren and Emden R. Gansner, Improved Circular Layouts, AT&T Labs Research, Florham Park, NJ 07932, USA