Thursday, January 31, 2013

Downton Abbey Graph

WARNING: Spoiler Alert! This article contains information from episodes of the current third season of Downton Abbey. If you have not seen the episodes shown so far and would like to have everything kept secret until you buy the regular or blue-ray dvds of the show, then think twice before you peruse the following paragraphs. Otherwise, enjoy and I hope that you like the graph! :-)

Valentine's Day is drawing closer and love is in the air! If you're like me and keep up weekly with the new Downton Abbey episodes, you will have noticed that tension is rising among the house servants. With two new footmen (one who is particularly handsome I might add :-) ), a new girl, and a widowed kitchen maid Daisy, the need for companionship has never been greater among the downstairs folk! Unfortunately, none of the feelings expressed so far are really being reciprocated. To make sense of all of the crushes being fostered in the newest season of Downton Abbey, I thought that I would make a digraph to represent the whole situation. As you can see below, the Downton Abbey digraph has a total of five nodes and 5 edges, one of which is a loop. I used nodes to represent the characters bitten by the love bug, and added little heart labels to the edges so that the information indicated by the arrow would be clearer. :-) If you don't remember the names of the characters, Daisy is the widowed kitchen maid. She's been working for the house for quite a while, being a regular on the show since Season 1. This season she's finally gotten the promotion to kitchen maid that she deserves after slaving in the kitchen for hours creating feast after feast for the upstairs folk to nibble at. Alfred and Jimmy are the two new footmen that have just been hired this season. Alfred was hired first after his aunt Miss O'Brien put in a good word for him with Lady and Lord Grantham. Jimmy was hired later on, his looks probably giving him the advantage in the job searching business. I feel lame about this but I haven't paid enough attention to know the actual name of the new girl. All I can say is that she is another servant that has just been hired and that she works as an assistant for Daisy in the kitchen. And finally, who would ever forget Thomas, the slimy scheming yet handsome footman who like Daisy has been on the show since the first season. 

Now that I have "formally" introduced you to all of the characters in the show we can get down to the real gossip! Let's look at the graph now and see what it tells us. Following from the bottom left we have that Daisy likes Alfred, Alfred likes the new girl, the new girl likes Jimmy, and it appears that Thomas also likes Jimmy. I don't know if Jimmy likes anyone, but judging by his appearance I'll say that Jimmy likes Jimmy (hence the loop in the graph). With the exception of that one loop (which technically counts as a cycle of length 1) none of the feelings amongst are the servants are really being returned back. :-( Pity. I suppose that this Valentine's Day they'll all just have to buy themselves a box of chocolates and watch airings of Bridgette Jone's Diary. 

 P.S. I just found out that the new girl's name is Ivy. Sorry about that Ivy!

Wednesday, January 30, 2013

Graph Theory Applications: Databases!

So far I've been discussing some of the more what I will call "theoretical" aspects of graph theory; theoretical in that I haven't really delved into any concrete applications of the topic. Of course I love graph theory for its own sake but the fact that it has so many diverse applications makes it all the better and adds another layer to an already rich area of discrete math. Hence the topic of today's posting: applications of graph theory in databases! :-) 

I partly chose databases as the application focus because I've been giving a lot of lectures on so-called graph databases. These are databases where the information is stored within graph-theoretic entities such as nodes and edges. This contrasts with the traditional SQL approach of having tables of rows and columns hold your data. There are many different graph databases out there, but the one I have been focusing on the past few weeks is Neo4j. It is Java-based, the j standing for Java I believe. If you do a general google search online you'll come across various tutorials on Neo4j as well as links to the rich Neo4j apis. I have not experimented with any other graph databases besides Neo4j. As such I can't say how they compare to each other. However, that does not change my general opinion on the merits of Neo4j, which I believe to have a lot of really nice classes and methods that allow you to quickly create a graph database and add information to it. For instance, if you want to create a so-called embedded graph database, that can be accomplished through the following line of code: 

GraphDatabaseService graph = new GraphDatabaseFactory.newEmbeddedDatabase(pathName); 

The path name just indicates where you are going to store your graph database (i.e. the name of its path). 

After that it's just a matter of adding nodes and edges (in Neo4j lingo the edges are called relationships) to your database! Adding a node is as simple as declaring a Node variable and calling the createNode method of graph from above: 

Node me = graph.createNode(); 
me.setProperty("name", "Tanya"); 

Node myDog = graph.createNode(); 
myDog.setProperty("name", "Argos"); 

In this case I added two nodes, one for myself and one for my dog :-). I've also added names for the two of us through the  setProperty method of the Node class. The entire Neo4j framework follows the property graph model, which means that the nodes and edges in your graph database can have properties associated with them. These properties store the information you want within your database and give your graph more meaning. 

And as for edges, if you would like to create an edge from one node to another in your database, you can do the following:

me.createRelationshipTo(myDog, MyRelations.LOVES);

By invoking this createRelationshipTo method I am creating an edge/relationship from myself to Argos(first parameter in method) that is of type LOVES. The method requires that you specify the type of relationship that exists between two nodes. I won't get into the specifics of how you define your relationship types, but just know that it can involve a special type of class called an enum in Java. 


And that's it! 

Well.....errghh...
To be completely truthful adding the nodes and relationships to the graph takes a little bit more work than just this. In reality you have to do all of the node and relationship creation within the context of a transaction and declare a Transaction object to accomplish that. As with the relationship types I will not go into the details of that in this posting. What I will point out though regarding the sample code is how nice Neo4j is in the way  they've set up their classes and variables. With the exception of the whole Transaction business, adding nodes (such as people or dogs) and relationships between them to a graph is a cinch! 

Now all of this coding is fine and dandy, but what should be emphasized even more is just how powerful and novel the idea is of storing large quantities of information within graphs. If you do a regular image search of graph theory it is most likely that you will stumble across pictures of what look like big colorful hairballs. A zoom-in of that will show that there are actually distinct nodes and edges distributed throughout said hairballs.How does this relate to graph theory? Well, it is becoming more commonplace nowadays to attempt to visualize data in the real world in the form of graphs. The "hairballs" that you see in your image search were made to represent such data which in our current information age is becoming larger and more complex. Trying to visually make out parts of huge graphs can seem daunting, so imagine how hard it must be to try and analyze it analytically and locate something very specific within the graph! It's like trying to find a needle in a haystack! That is where graph databases make their contribution and why I think they are so revolutionary. Having a database store the information of a graph-like entity according to the exact same structure can make things much easier for searching purposes than the traditional relational database approach. The best example that I've seen discussed so far is the following. Say that you would like to determine all of the friends of my friends within our social network. Using Neo4j the search can be accomplished simply by traversing the relevant edges in the graph, starting of course at the node that represents me; just follow all edges pointing to my friends and the edges to pointing to their friends in turn. That's pretty intuitive and probably what many of us would do if given the problem. No need to do complicated and resource taxing join operations in the classic table-based approach! And if you want to be more specific about the friends of friends search, say that I'm only interested in knowing the friends of my friends that happen to live in Puerto Rico. By using a graph database that implements the graph property model,this more specific search can be accomplished by going through the properties of nodes that are traversed and only including those that have a location property of Puerto Rico. Sounds simple enough, but it's pretty amazing! Especially if said social network looks like one of those hairballs in your image search. ;D 

P.S. If any of you have had experiences with other graph databases, then please leave a comment! I am curious to hear how other graph db's handle things like node/edge creation and searches.