This is the last entry based on the book. In my opinion, the remaining topics in the book are somewhat redundant and should be easy to work through if you have followed the tutorial so far.
If enough people are interested in the rest of the book, just let me know and I will go back to it. I am also accepting suggestions for topics to cover (send me an email or leave a comment). I already have some in mind and am preparing a couple for the next phase of the website. So, here is the last entry. Last time we saw how to extract the sequence from a GenBank file.
This time we are going to parse some other information from these files. Basically, we will use the same idea as in our last post to extract the organism name, the locus and the accession number of the record. From our last entry we have to remember this
[code listing: sequence extraction from the previous entry]
and modify it to our needs. It looks simple, and it is. Let’s see:
[code listing: the modified script, extracting the organism name, locus and accession number]
Just add a flag for each entry you want to parse and that’s it. For longer entries that span several lines, such as the sequence, we use the same approach as before: a boolean flag is switched on when the entry starts, and the lines are concatenated until another flag is found.
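The idea above can be sketched roughly as follows. This is only a minimal illustration, not the original listing: the function name `parse_genbank`, the dictionary keys and the sample record (including the accession `AB000001`) are made up for the example, and it assumes the usual GenBank flat-file layout where `LOCUS` and `ACCESSION` start a line, `ORGANISM` is indented, and the sequence sits between `ORIGIN` and `//`.

```python
def parse_genbank(lines):
    """Collect locus, accession, organism and sequence from GenBank-style lines.

    Hypothetical helper written for this example; it handles one record only.
    """
    record = {"locus": "", "accession": "", "organism": "", "sequence": ""}
    in_sequence = False  # boolean flag: True while we are inside the ORIGIN block

    for line in lines:
        if line.startswith("LOCUS"):
            # second whitespace-separated item on the LOCUS line
            record["locus"] = line.split()[1]
        elif line.startswith("ACCESSION"):
            record["accession"] = line.split()[1]
        elif line.strip().startswith("ORGANISM"):
            # everything after the keyword is the organism name
            record["organism"] = line.split(None, 1)[1].strip()
        elif line.startswith("ORIGIN"):
            in_sequence = True
        elif line.startswith("//"):
            in_sequence = False
        elif in_sequence:
            # drop the leading position number and the spaces between blocks
            record["sequence"] += "".join(line.split()[1:])

    return record


# Made-up sample record, only to exercise the function.
sample = """LOCUS       AB000001     24 bp    DNA     linear   PRI 01-JAN-2000
DEFINITION  Hypothetical example record.
ACCESSION   AB000001
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
ORIGIN
        1 atgcatgcat gcatgcatgc
       21 atgc
//
""".splitlines()

record = parse_genbank(sample)
print(record)
```

The single-line fields need only one test each, while the sequence relies on the `in_sequence` flag to know when to start and stop concatenating. For real files you would pass the lines read from the file instead of `sample`.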
Well, that’s it. After 46 entries we start a new phase.