Assignment #6 Description


Now that we have actually studied graph algorithms, hopefully you can see how the web is a directed graph and the web pages that you have been spidering form a directed graph. The way you are probably storing it is a form of adjacency list. For this assignment you are going to utilize that fact and write a graph algorithm that tells you something, hopefully interesting, about the sites that you spider.

Just to be nice, I'm going to let you decide what language you implement your algorithm in. You will need to write Djikstra's algorithm to find the shortest path from the top page to any other page in the site. Of course, to make this interesting your edges have to have weights. We will assign weights based on the idea that people are more likely to click on links that are near the top of a page. For that reason, the weight of each link will be given by the number of chracters into the page it was found at. This isn't a perfect metric, but it will work for our purposes.

So what your GUI should be able to do is allow the user to somehow select a page (it might be the current top page in your graph) and then run the shortest path algorithm on that. Then you need to display the path length information. You could do this by popping up a new window that has a list of the pages with their path lengths sorted in order, or you could think of a neat way to put that information in your graph drawing.

In addition, I want your program to throw up a dialog box (use JOptionPane.showMessageDialog) that tells me how long it took to do the shortest paths calculation. You can get current time information with the currentTimeMillis() method in System.

Extra Credit: For extra credit, use the same weights and also do an all-pair shortest path algorithm and tell me what page is the "most central". Tell me what metric you are using to define that.