Monday, November 12, 2012

The Square of a Graph Continued.....

Article Under Construction:
I just realized that I accidentally posted what I am currently working on but haven't finished yet. I won't delete it, since some readers have already started looking at it. Just note that it is not complete, though I hope that what is here so far is nice! Below you will see some explanation of the different types of graph implementations out there. This post is primarily concerned with the adjacency list implementation. Enjoy!

In my last post I discussed the notion of squaring a graph. To recap, the square of a graph G has the same vertex set as G, and two vertices in the square are connected by an edge if they are at distance at most two in G. Because each vertex is at distance zero from itself, there will be loops in the square of G. A simple way of determining G^2 involves matrix calculations (mainly multiplications) with the adjacency matrix of the graph being squared. Since matrix multiplication is the dominant operation in this simple method, the complexity is around O(n^3), where n is the size of the vertex set of G; not very good if your graph has a lot of vertices to look at.

So....a natural question would be: is there a faster method for squaring a graph? It's a question that I am still mulling over and will continue to address in this blog, though for the time being I have at least thought of an alternative method for determining the square of a graph that does not involve matrices or matrix calculations. My method relies on another popular implementation of graphs: so-called adjacency lists. What I will discuss in the following paragraphs will be sort of a mash-up of linked lists and queues, with the words adjacent and adjacency popping up a lot. Let's see what this is all about.

So...., what's an adjacency list? Well, first of all, let's look at adjacency matrices in a little more detail. If you have a dense graph, with lots of edges relative to vertices, you'll expect to see significantly more ones than zeros in the associated adjacency matrix. That's okay: the matrix takes a lot of space to store the graph's information, but almost all of that space is being used as needed. No problem. However, consider the opposite case: your graph is sparse, with relatively few edges for its vertices. If you draw out your sparse graph with the dots and lines shown in previous posts, it may look kind of flimsy; not a whole lot of edges holding your vertices together. A simple example of a sparse graph is a tree (a tree on n vertices has exactly n-1 edges). Below is a picture of some trees for reference. This same "flimsiness" is reflected in the adjacency matrix of the graph in the form of lots of zeros. Since all of those zeros simply record the lack of a connection between two vertices, using an entire n by n matrix for a sparse graph wastes space (and as n gets larger, a considerable amount of space). That's where adjacency lists come in. Adjacency lists are basically linked lists that store only the information that is necessary for a graph.
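To get a feel for the numbers, here's a quick back-of-the-envelope comparison in Python (the n here is just an example size I picked, and the counts are rough; exact storage costs depend on your data types):

```python
# Rough storage comparison for a tree on n vertices (a very sparse graph).
n = 1000
matrix_entries = n * n        # adjacency matrix: one entry per ordered vertex pair
list_entries = 2 * (n - 1)    # adjacency lists: each of the n-1 edges is stored twice

print(matrix_entries)  # 1000000, almost all of them zeros
print(list_entries)    # 1998
```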

In the adjacency list implementation of a graph, each vertex is assigned a linked list of its own that contains references to all of its neighbors. If vertex A is not adjacent to vertex B, then neither of them will have a reference to the other in its adjacency list. This can save a lot of space, though one drawback worth noting is that checking whether a particular edge exists can take longer, since you may have to walk down a linked list instead of jumping straight to a matrix entry. Readers who are curious about this last statement can refer to any source on linked lists, which should provide details on their search-time complexities. Below is a picture of a graph along with a drawing of its associated adjacency list storage to give you a better idea of how this works.
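To make this concrete, here's a minimal sketch of the idea in Python. A dictionary mapping each vertex to the set of its neighbors plays the role of the adjacency lists, and the little four-vertex graph is just made up for illustration. Since this post is still under construction, the square function below should be taken only as a preview of one natural way to square a graph straight from its lists, not the final word on the method:

```python
# A small undirected graph stored adjacency-list style:
# each vertex maps to the set of its neighbors.
graph = {
    'a': {'b', 'c'},
    'b': {'a'},
    'c': {'a', 'd'},
    'd': {'c'},
}

def square(g):
    """One way to square a graph using only its adjacency lists.

    Follows the definition used in these posts, so every vertex also
    ends up with a loop (it is at distance zero from itself).
    """
    g2 = {}
    for v, nbrs in g.items():
        reach = {v} | nbrs          # everything at distance 0 or 1
        for u in nbrs:              # everything at distance 2, via each neighbor
            reach |= g[u]
        g2[v] = reach
    return g2

print(square(graph)['b'])  # {'a', 'b', 'c'} -- 'c' is two steps away from 'b'
```

Notice that this only ever touches actual edges, which is exactly the sort of thing you'd hope would pay off on sparse graphs.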



Sunday, November 4, 2012

Don't Be Such a Square!

I think that this blog has some good momentum going. Up to this point I've discussed some of the basic terminology of graphs, introduced you to graphs with interesting-sounding names, and in my last post went into a little more detail about one particular graph, the so-called caterpillar graph or caterpillar tree. Rather than delve right into a proof of the claims made at the end of that post, it will be more thorough, and frankly more interesting, to first address some of the other things leading up to that result. I will do so in the order in which they were stated. Remember that the first claim was the following:

The square of a caterpillar graph contains a Hamiltonian path.

A natural question then is, "What exactly is the square of a graph?" You can think of the square of a graph as the original graph plus a little extra. The extra consists of edges that you add between two nodes of the graph if they satisfy a certain relation. In total, the square has the same vertex set as the original graph, and two nodes x and y are neighbors in the square if there is a path of length at most 2 between them in the original graph. That's a bit of a mouthful, so here's a tiny example: in the path a-b-c, the square gains the edge a-c, since a and c are at distance two. Below is an illustration of this whole squaring process. To the left is our original graph, which I am calling G. To the right is its square, called G^2. Notice that if two nodes are already connected in the original graph, then no extra edge needs to be added between them when calculating the square. Also notice that the square of a graph will have a loop from each vertex to itself.
Seems simple enough. In computational terms, though, how difficult is it to compute the square of a graph? From a computational perspective it would be interesting to know whether the square of a graph can be computed efficiently (informally meaning in a "reasonable" amount of time). This is a question that I will explore in the remainder of this blog. It is one that I am currently trying to find information on. There are plenty of papers online that address the question of calculating roots of graphs, but I don't see any that, at least in the title, state that they are looking into algorithms for computing powers of a graph. I don't know what the computational complexity of computing the square of a graph is, but in the following paragraphs I will describe a simple method for determining a graph's square and talk about its complexity.

Method #1:

One very simple way of determining the square of a graph is by performing some calculations on its underlying adjacency matrix. The adjacency matrix of a graph is a matrix representation of the connections between nodes. The rows and columns correspond to the vertices of the graph: row i corresponds to vertex i and column j to vertex j. If an edge exists between vertices i and j, then the i-jth entry of the matrix is a 1; otherwise the entry is a zero. This is the typical definition of an adjacency matrix for a simple, undirected, unweighted graph. Below is an example of such a graph and its corresponding adjacency matrix. In this example I have labeled the vertices a through d, and the columns and rows of the adjacency matrix correspond to those vertices in the same order from left to right and top to bottom. Notice that since the graph is simple, the entries along the main diagonal are all zeros. This just reflects the fact that we are excluding loops.
There is a basic result in graph theory that says that if you square the adjacency matrix of a graph, the i-jth entry of the result is the number of walks of length 2 between vertices i and j. I will not go into the proof of that here but simply state it as fact. It follows from that result that if you 1. add the original adjacency matrix to its square (since the original has zeros on its diagonal, this only affects the non-diagonal entries) and 2. convert every nonzero entry to a one, you will obtain the adjacency matrix of the square of the graph. This gives us a simple way of determining the square of a graph. See if you can follow steps 1 and 2 on the first graph of the post to obtain the adjacency matrix of its square on the right.
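For anyone who wants to try this on a computer, here is a minimal sketch of steps 1 and 2 in Python with NumPy. The matrix below is for a made-up graph on four vertices labeled a through d (with edges a-b, a-c, and c-d; not necessarily the one in my picture), and nothing here is meant to be optimized:

```python
import numpy as np

# Adjacency matrix of a small example graph on vertices a, b, c, d
# (rows and columns in that order): edges a-b, a-c, c-d.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])

# Step 1: A @ A counts walks of length 2; adding A brings in the
# original edges (walks of length 1).
B = A @ A + A

# Step 2: turn every nonzero entry into a 1.
A_square = (B > 0).astype(int)
print(A_square)
```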

Now for the complexity part. In this method the two key steps are 1. matrix multiplication and 2. matrix addition. Of the two, the one that takes more time computationally is the multiplication, which runs in around O(n^3) time (the O here stands for big-O notation). I say "around" because the amount of time really depends on which matrix multiplication algorithm you select for squaring the adjacency matrix. The traditional method of multiplying rows by columns is O(n^3). There is a better algorithm for multiplying matrices, though, that is O(n^(log2 7)), roughly O(n^2.807). It is known as Strassen's algorithm and was notable for being the first to break the O(n^3) barrier. The addition of two square matrices just involves adding corresponding entries and is O(n^2). So, using some basic results involving big-O notation, we can say that this simple method of squaring a graph is O(n^3).

Some of you may wonder if a better method for the problem at hand exists. In terms of complexity, you may wonder if we can improve on this upper bound in any way. One possibility that I will consider in an upcoming post is to look at the incidence matrix representation of the graph rather than the adjacency matrices involved in the above calculations. What exactly incidence matrices are, as well as an elaboration on their uses, will be discussed in that post. In the meantime, try to see if you can find or think of faster ways of computing the square of a graph. As I hinted above, the answer may lie in the underlying implementation of the graph. As a side note, it turns out there are lots of ways of storing the information of a graph besides the simple matrices discussed above. Their setup may seem more complex, but they are well worth the effort in other ways.

So until the next post....keep on graphing! :-)

Wednesday, October 31, 2012

I Spy With My Little Eye.....A Caterpillar!



In my last post I talked a little bit about what graph theory is and some of the different terms and names that pop up when you read about it. From the assortment of names and pictures you might have gotten the impression that graph theory can get pretty diverse. Not only is it varied in the kinds of graphs that you see, but also in the different things that people are interested in studying. A case in point comes when you talk about caterpillar graphs. As with the last post, a few definitions are in order to understand what is to be said.

The degree of a vertex in a graph (here I am using "graph" to refer to the simple graphs introduced in post 1) is simply the number of edges incident to it. This does not come up in simple graphs, but to be complete I will say that a loop contributes twice to the degree of its vertex (i.e. a vertex whose only incident edge is a loop has degree 2).
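As a quick illustration of that counting convention, here's a tiny Python snippet (the edge list is made up; note how the loop at c contributes twice):

```python
# Degree from an edge list, with a loop counting twice toward its vertex.
edges = [('a', 'b'), ('b', 'c'), ('c', 'c')]   # ('c', 'c') is a loop

def degree(v, edge_list):
    # Each endpoint equal to v adds 1; a loop at v therefore adds 2.
    return sum((u == v) + (w == v) for u, w in edge_list)

print(degree('b', edges))  # 2
print(degree('c', edges))  # 3: one from b-c, plus two from the loop
```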

Now for a couple of graphs:

The first one below is an example of a path graph. Its appearance matches the description quite well. All it's really made of is a sequence of vertices and edges which, when followed from left to right, accounts for the entire graph. (A pretty informal definition for the time being, but it will do.) It has two endpoints of degree one, and the remaining internal vertices all have degree two. Apart from the tree that consists of just a single isolated node, it is the simplest type of tree you can have.
And as for trees....

A cycle in graph theory is just what you would visualize a cycle as being in regular life: a series of vertices connected to each other in a circular fashion, as seen below. The smallest cycle is the loop, followed by a cycle of length two made up of two vertices connected by parallel edges. The smallest simple cycle is the triangle, which is made up of three vertices connected by three edges. Similarly there are four-cycles, five-cycles, etc.
How does this relate to trees? Well, a tree is a connected (i.e. if there are at least two vertices, you must be able to reach one from the other by following a sequence of vertices and edges in the graph) simple graph that has no cycles (i.e. it is acyclic). The simplest tree with any vertices at all is the one consisting of a single isolated vertex. After that we of course have the path graphs, which can be built on to make more complex-looking graphs, an example of which is finally (drum roll...) a caterpillar graph! :-D  Below, from left to right, are pictures of trees with a caterpillar graph at the very bottom. Well, okay, technically they're all caterpillars (if you orient them a certain way you can get them to look more like the bottom graph), but visually at least it's easy to recognize the bottom one as a caterpillar.
Many interesting properties can be derived about trees in general. The following are a few basic ones:
1. A tree with at least two nodes must have at least two leaves (vertices in a tree of degree one).
2. A tree consisting of n nodes has exactly n-1 edges.
3. There exists a unique path between any two vertices/nodes of a tree.

(And if you're familiar with computer science you will know that binary trees come up a lot when discussing things like binary search trees, heaps, balanced trees, and the implementation of data structures like sets and maps. )

In addition to the general properties listed above, caterpillar trees have their own interesting  properties specific to their structure. One way to think of a caterpillar tree is as the result of a sequence of operations where the operation consists of two parts:
1. Add an isolated vertex to the graph.
2. Connect the isolated vertex to an internal node of the original path.
*And of course, if none of these operations are performed on the path, we can still say that we have a caterpillar! (A little code sketch of this construction follows below.)
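Here's a small sketch of that construction in Python, again with dictionary-of-sets adjacency lists (the spine length and attachment points are just an example I made up):

```python
# Build a caterpillar: start from a path (the spine) and hang new
# vertices off its internal nodes.
def path_graph(n):
    g = {i: set() for i in range(n)}
    for i in range(n - 1):
        g[i].add(i + 1)
        g[i + 1].add(i)
    return g

def attach_leg(g, spine_vertex):
    leaf = max(g) + 1           # step 1: a brand-new isolated vertex
    g[leaf] = {spine_vertex}    # step 2: connect it to a node of the path
    g[spine_vertex].add(leaf)

caterpillar = path_graph(5)     # spine: 0 - 1 - 2 - 3 - 4
attach_leg(caterpillar, 1)      # one leg at spine vertex 1
attach_leg(caterpillar, 3)      # two legs at spine vertex 3
attach_leg(caterpillar, 3)
print(caterpillar)
```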

What are these properties of caterpillar graphs, you say? Well, here are a couple of them below. Why are they true? That will be the subject of upcoming posts! To derive the first one I will first have to tell you what Hamiltonian graphs and paths are, and I will also have to explain what the square of a graph is, as well as its corresponding line graph. (It turns out that you can also talk about the kth power of a graph in general. You can even talk about the square root of a graph!) Until then, see if you can come up with your own proofs. Draw out your own examples and see if you can notice any patterns. And as always, keep on graphing! :-)

Caterpillar properties:
1. The square of a caterpillar graph with at least two vertices contains a Hamiltonian path.
2. Similarly, they are trees whose line graphs contain a Hamiltonian path.




Tuesday, October 30, 2012

Graphs, Graphs and More Graphs!

For this first post I thought that I'd talk about my favorite of all the mathematical subjects: graph theory. It's appropriate for this blog not just because I like it, but also because it has considerable overlap with computer science. Many algorithms studied in the typical undergraduate CS curriculum involve graphs (e.g. tree-building algorithms, shortest path problems,...., and if you get more advanced, problems in network analysis). Graph theory also overlaps with many other fields, simply because it makes visualizing problems a lot easier. It's not unusual for scientists in certain areas of biology to be looking at problems that are, in graph theory terms, similar to a computer scientist's. Graph-theoretic problems have also been found to pop up in chemistry, archaeology, and sociology. This is a trend that I expect will continue to grow as data becomes bigger and more complex in the information age.

So now that I've whetted your appetite for graphs (maybe..hopefully) I thought I'd get into a little introduction about what exactly graph theory is and acquaint you with a few of the many many graphs that are floating around out there.

A graph is a structure that consists of a set of vertices (also called nodes) connected to each other via edges. The vertices together form the so-called vertex set of the graph, and the edges the edge set. The best way to think of a graph is as a collection of dots connected to each other by lines: the dots are the vertices and the lines the edges. Below is a picture of a very basic graph that I have drawn. As you can see, not all of the vertices have to be connected to each other, and it is possible for a vertex to be connected to itself; when this happens the vertex is contained in its own loop. It is also possible for there to be more than one edge between two vertices. Two edges that have the same endpoints are called parallel or multiple edges. One last thing: I didn't indicate it in this picture, but it is also possible for a vertex to be all by its lonesome, without any edges connecting it to other vertices in the graph. Such a vertex is said to be isolated.
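If you like seeing things written down concretely, here is one very bare-bones way to record such a graph in Python (the labels are made up purely for illustration):

```python
# A graph as a vertex set plus an edge list. Loops and parallel edges
# are allowed, so the edges live in a list rather than a set.
vertices = {'a', 'b', 'c', 'd', 'e'}
edges = [
    ('a', 'b'),
    ('a', 'b'),   # parallel edges: the same endpoints listed twice
    ('b', 'c'),
    ('c', 'c'),   # a loop: c is connected to itself
    ('c', 'd'),
]
# 'e' appears in no edge at all, so it is an isolated vertex.
```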

Many variations of the concept of a graph can be studied. Direction can be assigned to the edges of a graph, creating a digraph, and weights can be assigned to vertices or edges to indicate additional characteristics of the graph. For example, one can easily model the network of streets in a city or town with a graph: the vertices represent the intersections of streets, weights can indicate the streets' respective lengths, and arrows their directions. The graphs that I will show you now are in a sense the simplest type. They are called, not surprisingly, simple graphs.
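As a tiny sketch of that street-network idea (the intersections and lengths here are invented):

```python
# A weighted digraph: intersections as vertices, one-way streets as
# directed edges, and lengths (say, in meters) as edge weights.
streets = {
    'A': {'B': 120, 'C': 300},   # from A you can head to B or to C
    'B': {'C': 80},
    'C': {'A': 250},             # a one-way street back to A
}
print(streets['A']['B'])  # 120, the length of the street from A to B
```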

Simple graphs are ones that have no loops and no multiple edges. The ones I will focus on in this post have the added restriction that the edges have no assigned direction or weight, and all of them have a finite number of vertices. It is possible to have simple graphs that are weighted and directed, but like I said, I'm going to be keeping things "simple" for this post. :-)

Now on to the graphs! Whenever anyone starts studying graph theory they're bound to come across what seems an endless number of definitions. Even as you become more used to the definitions you'll still end up seeing lots of different graphs with cute names. Why so many? My guess is that it's because graph theory is combinatorial in nature. As you increase the number of vertices in a graph, the possibilities increase rapidly, to the point where you might feel lost in a sea of dots and lines. To make sense of this, and to remember the graphs that have important or peculiar properties, it is helpful to come up with descriptive names as a sort of memory aid. So if you look at the first graph in the top left below, we have what looks like a bow, or a couple of triangles merged together. This is commonly known as a butterfly graph. The name makes sense, right? To the right we have a bull graph (if I were drawing this for a group of students I would probably use a red marker to make the bottommost vertex red and call it Rudolph ;D ), below which is an example of a caterpillar graph. I have also seen the fourth graph, in the upper right corner, referred to as a claw graph (it's a simple case of a complete bipartite graph).
If you start to think about it, I'm sure that you can imagine your own graphs with descriptive names attached. The list of different graphs out there goes on and on: theta graphs, cactus graphs, diamonds, snarks (I actually didn't even know there were snarks until a couple of hours ago), cages, wheels, stars, etc. Besides these, there are also many named after people. Examples include the Petersen graph, Moore graphs, Hamiltonian graphs, and Eulerian graphs.

So I think that up to this point we can agree that graphs can have some interesting names, but what makes them special? Well....a lot of things, actually. Depending on the area of graph theory, people may be interested in studying symmetries, degree sequences, the existence or absence of cycles, paths, roots, spanning trees, and so on. In my next series of posts I will delve into more detail on the graphs just mentioned and the kinds of things that people like to study about them. Till then, keep on graphing!
:-)