We now have an idea on how to create a class in Python. For our FASTA reader we will use a very similar approach as the employee class created before. Based on the simplicity of the FASTA format it is easy to see what attributes we need in our class: sequence title (header) and sequence. That’s it. Speeding up, our class will look like this
1 2 3 4
all we need for now. Of course we could include sequence length,
sequence type, or any other relevant information as a class attribute.
But we will stick with those for the time being. NOw that we have the
class declared, we need to read the file and create instances of the
Fasta class to store all our sequences. We already know how to open
and read a file and create functions. We put everything together in a
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
At first, it looks scary. But is not. There are many ways to create this
loop and to read the sequences, and many ways to make this loop shorter.
We will get there, eventually, but for starters this is OK. Basically,
in the above script we have a
Fasta class, a
read_fasta function and
a couple of lines to read a print the results. The
basically checks all items of a list and see which is the first
character of each item: a
> sign indicates that the item should be
parsed to the temporary name string, another character redirects the
item to the sequence string. Instances are created on the fly and the
attributes assigned as what was file content is being scanned. In the
end, just to be sure of what we accomplished with the function we print
the list and loop through it and print the name of each sequence read.
Notice that the loop counter/index is the one that receives the class
attribute. The biggest advantage of this way of reading FASTA files is
that the class and the function are reusable. Basically we can create a
file (for instance called readfasta.py) and import into another script
and we have a FASTA reader ready to rock. We would need to tweak a
little bit in order to catch exceptions, but with consistent FASTA files
this would work fine.