Zientzilaria

Covering the bioinformatics niche and much more

On Notes to a Young Computational Biologist


Yes, I have never blogged and never liked to. I have always been a tad reclusive when it comes to posting opinions, comments and other things on websites. But recently I have found many nice blogs and personal pages with insightful posts, interesting information and tips and tricks for the daily routine. Pages that are really a good read. I subscribe to the nodalpoint feed, which is more a community than a personal page, and the latest feed brought me to this post written by Bosco on his blog. The post gives a list of advice for young computational biologists.

I cannot agree more with his post, and I wish I had had the same information in 1995, when I started my internship in the lab where I would go on to do my master’s. Back then we did not have the internet the way it is today, with its wealth of information and its large amount of open-source software, nor did we have easy Linux installation, among other things that make anyone’s life easier (sometimes) today. As with any kind of list, one cannot be satisfied until one gives one’s own opinion about it. All eleven of his entries are great, and here I take the liberty of expanding on them with my humble opinion (as pointed out elsewhere, 97% of all advice is worthless; use whatever is good for you and throw away the rest):

1- On scripting language: definitely a must, whichever one you choose. But I would stress some knowledge of shell scripting as a key point too. You don’t have to be an expert in bash scripting, for instance, but some basic stuff might come in handy when nothing else is available (such as on a Solaris machine with no Python, Perl or Ruby). Use this link as a reference guide (or keep any published book within arm’s reach).

2- On programmable statistics package: just one addition, R: free, open source and with a great user base.

5- On delete data that you don’t need: I would do that, but be careful. Back up, compress, or move the data to obscure partitions or directories, whatever is easiest, before deleting it. You might think the data is useless today, until tomorrow proves you wrong. The price per GB is going down every day; we are not in the age of doubledisk anymore. But annotating everything you have is crucial.

8- On command line plotting program: very nice advice, just pick the one that suits you best. This is also a must if you don’t want to publish lame graphs in your papers.

9- On programming editor: I personally use Kate, but any other option is great. Be sure to find something you are comfortable with.
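
To make the "back up and compress before deleting" advice from point 5 a bit more concrete, here is a minimal Python sketch. It is only one possible approach: the function name `archive_then_delete` and the directory names `old_run` and `attic` are made up for illustration, and it assumes the data lives in a single directory of regular files.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def archive_then_delete(data_dir: Path, archive_dir: Path) -> Path:
    """Compress data_dir into archive_dir, check the archive, then delete data_dir."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    archive_path = archive_dir / f"{data_dir.name}.tar.gz"

    # Write a gzip-compressed tarball of the whole directory.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(data_dir, arcname=data_dir.name)

    # Compare the files on disk with the files recorded in the archive.
    expected = {
        p.relative_to(data_dir.parent).as_posix()
        for p in data_dir.rglob("*")
        if p.is_file()
    }
    with tarfile.open(archive_path, "r:gz") as tar:
        archived = {m.name for m in tar.getmembers() if m.isfile()}
    if expected != archived:
        raise RuntimeError(f"{archive_path} is incomplete; nothing was deleted")

    # Only remove the original once the archive lists every file.
    shutil.rmtree(data_dir)
    return archive_path


if __name__ == "__main__":
    # Hypothetical demo in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        run = Path(tmp) / "old_run"
        run.mkdir()
        (run / "reads.txt").write_text("ACGT\n")
        print(archive_then_delete(run, Path(tmp) / "attic"))
```

The check only compares file names, not contents, so it is a sanity check rather than a guarantee. Keeping a short note next to the archive about what the data is covers the annotation part of the advice.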

I cannot find further advice to give on the other points, but I will again take the liberty of adding some points of my own (always follow the 97% rule):

1- As one of the comments on his post pointed out: subversion. Maybe not the greatest invention since sliced bread, but it really gives you a great deal of help when working in different places or on different machines. Say you have a great idea at home and want to update your scripts, but forgot to copy everything to your USB disk. Yep, subversion is there.

2- Be a generalist: learn as many languages as you can. You don’t have to be an expert in any of them [after all you are a biologist, right?(??)], but sometimes one feature that will save your day is available in a language that you never noticed.

3- System administration skills: yep, you will need them. Not the crazy things, such as knowing where every log file is in your system, but at least some handy commands that help you check your system or let you get more out of it. One extra piece of advice here: don’t buy any books on Windows system administration.

4- Three words: Linux, Linux and Mac OS X. That’s what you really need. Yep, you need to know your way around Windows, but when you master the power of the command line, new worlds will open for you. Always remember: you can type faster than you click.

Thanks to Bosco for posting his insightful advice. I just added my two cents of worth(less) advice.