Assignment #6


This is your first assignment "playing" with Perl. First go to GenBank and download the DNA sequence data for some species, or just select a single gene from some species. Don't format it nicely by hand. Use code like what we have written in class to read in the sequence and put it all into a single long string with only the bases in it.
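
As a starting point, here is a minimal sketch of the reading step. It assumes you saved the record in GenBank flat-file format, where the sequence sits between a line starting with ORIGIN and a line starting with //. The subroutine name read_origin_lines is made up for illustration; it is not something from class or from the book.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the raw sequence lines found between
# ORIGIN and // in a GenBank flat file. The lines still contain the
# position numbers and spaces, so they are not clean yet.
sub read_origin_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    my @lines;
    my $in_sequence = 0;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^ORIGIN/) { $in_sequence = 1; next; }
        last if $line =~ m{^//};          # end-of-record marker
        push @lines, $line if $in_sequence;
    }
    close $fh;
    return @lines;
}
```

You would call it with whatever file name you saved, for example my @raw = read_origin_lines('my_gene.gb'); where my_gene.gb is only a placeholder.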

For this assignment I want you to count how many of a few types of amino acid are coded for in that string. This would be a very hard problem to do correctly, but we are fine with doing it incorrectly. Segments of DNA three bases long (codons) code for different amino acids. Your book has a table of them on page 157.
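
To make the idea of three-base segments concrete, here is one possible sketch. It assumes the reading frame starts at the very first base of the string, which is exactly the kind of shortcut that makes the count "incorrect". The name count_by_frame and the choice of amino acids are only for illustration; the four codons listed come from the standard genetic code.

```perl
use strict;
use warnings;

# A few codons from the standard genetic code, written as DNA.
# Only three amino acids are listed to keep the sketch short.
my %codon_to_amino = (
    ATG => 'Met',
    TGG => 'Trp',
    TTT => 'Phe',
    TTC => 'Phe',
);

# Hypothetical subroutine: steps through the string three bases at a
# time, starting at position 0, and tallies the codons in the table.
sub count_by_frame {
    my ($dna) = @_;
    $dna = uc $dna;
    my %count;
    for (my $i = 0; $i + 3 <= length $dna; $i += 3) {
        my $codon = substr($dna, $i, 3);
        $count{ $codon_to_amino{$codon} }++ if exists $codon_to_amino{$codon};
    }
    return %count;
}

my %count = count_by_frame('atgtttttctggtga');
print "$_: $count{$_}\n" for sort keys %count;
# prints:
# Met: 1
# Phe: 2
# Trp: 1
```

Notice that adding a single extra base to the front of the string shifts every codon and changes the counts.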

The code you turn in should have three subroutines. The first should clean up the DNA sequence by getting rid of anything that isn't a base. It should take a list and return the long string. The other two should use two different methods for "counting" amino acids. Include comments describing how they work. After you turn this in, we can discuss your methods and why they might not be scientifically accurate. We'll do more on this later in chapter 8.
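
For the first subroutine, a minimal sketch might look like the following. It assumes the list holds raw lines of sequence text and that only the letters A, C, G, and T count as bases, so ambiguity letters such as N get thrown away too. The name clean_sequence is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical subroutine: takes a list of raw lines and returns one
# long upper-case string containing only A, C, G, and T.
sub clean_sequence {
    my @lines = @_;
    my $dna = uc join('', @lines);   # glue the lines together
    $dna =~ s/[^ACGT]//g;            # drop digits, spaces, anything else
    return $dna;
}

my @raw = ("        1 gatcctccat atacaacggt", "       21 atg");
print clean_sequence(@raw), "\n";    # prints GATCCTCCATATACAACGGTATG
```

Your version does not have to work this way; what matters is that a list goes in and one clean string comes out.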