Assignment #6 Description


Now that you can read from the web and, hopefully, keep track of how many times substrings appear on different sites, we can go through the task of calculating what you might want to charge different companies for the substrings they have on their web sites. You will also be writing some code that should make the program more efficient, so that you won't be waiting as long for it to complete its tasks.

Objective 1 - Now is the time when you get to use that extra buffer space you put into your binary file. This is going to test some of your designs. Right now you have a counter in your substrings that is a general-use counter. Put in the ability to store 16 counters in each substring. This will basically be an array of ints. It should be written out to the file and read back in from it. Your substrings should also allow you to read any of those counters as well as store the general counter into one of them. Each counter is going to be associated with a company, as we will see in a second.

Objective 2 - Now that you have some containers that will perform fairly efficiently, you get to keep track of the counts for several different web sites and decide what substrings to charge them for and how much you will get in total. You will use the WebSpider class you wrote last time to traverse different web pages. Write a new class called ChargeFinder that is constructed with either a file object (FILE* or ifstream&) or a string for the name of a file. The file is a text file that has one site URL per line. What ChargeFinder should do is create a WebSpider for each of the sites listed in the file and do substring counts on them. You get to charge each site for 3 substrings at $0.03 for the first one, $0.02 for the second, and $0.01 for the third, each per occurrence. ChargeFinder should also have a findAndOutputCharges method that takes either a file object or a string and writes a file where each line starts with the site URL, just like the input file, but is then followed by the 3 substrings they are being charged for (in order) with a count for each and the total amount charged. The last line of the file should be the total amount charged.
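The charge arithmetic for a single site might be sketched as follows. This assumes that the three substrings with the highest counts are the ones to charge for, and it keeps money in whole cents to avoid floating-point rounding. The names Count, SiteCharge, chargeForSite, and formatCents are hypothetical, and the dollar formatting is just one possible choice.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A substring and how many times it occurred on one site (hypothetical type).
using Count = std::pair<std::string, int>;

struct SiteCharge {
    std::vector<Count> picks;  // up to three substrings, highest count first
    long totalCents = 0;       // total charge for the site, in cents
};

// Pick the (up to) three most frequent substrings and price them at
// 3, 2, and 1 cents per occurrence. Assumes counts are non-negative.
SiteCharge chargeForSite(std::vector<Count> counts) {
    const std::size_t n = std::min<std::size_t>(3, counts.size());
    std::partial_sort(counts.begin(),
                      counts.begin() + static_cast<std::ptrdiff_t>(n),
                      counts.end(),
                      [](const Count& a, const Count& b) { return a.second > b.second; });

    static const int rateCents[3] = {3, 2, 1};
    SiteCharge result;
    for (std::size_t i = 0; i < n; ++i) {
        result.picks.push_back(counts[i]);
        result.totalCents += static_cast<long>(rateCents[i]) * counts[i].second;
    }
    return result;
}

// Format a non-negative amount in cents as dollars, e.g. 180 becomes "$1.80".
std::string formatCents(long cents) {
    const long rem = cents % 100;
    return "$" + std::to_string(cents / 100) + "." + (rem < 10 ? "0" : "") +
           std::to_string(rem);
}
```

For example, a site whose top three counts are 40, 25, and 10 would be charged 40*3 + 25*2 + 10*1 = 180 cents, or $1.80.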


Submission executable - The executable that you create for this assignment should take two command line arguments. The first is the name of the file that it is supposed to input. It should then create a ChargeFinder object with that file, do the work to figure charges, and output it to a file with the name of the second argument. I will put a file called ChargerInput.txt in the /CSCI1321 directory of snowwhite. I want you to run your program with that file and send me the output file with the name ChargerOutput.txt with your homework submission.

Extension - What I'd really like to do, but don't think you will have time for, is to make it so that no two sites are charged for the same substring. As an extension, put in this alteration and decide which company should get charged for each substring so that you make the most money.


Once again I would like the written design handed in to me. The code need only be mailed on the date it is due, but this time I would also like you to return both the original design and the revised design. It would be nice if you could mark in the revised design where you made alterations. This can be done by putting in any character string that stands out. Making sure it isn't common also allows you to do a search for it before the next assignment and remove it easily. I would recommend something like "!ALTERED!" so that it is also clear to me what it means. Of course, this has to be in a comment if it is part of your header file. Also make sure you e-mail me the ChargerOutput.txt file with your submission.