Covering the bioinformatics niche and much more

Paper of the Week: A Quick Guide to Organizing Computational Biology Projects


This is a new series here at the Blind.Scientist headquarters. My team and I will try to feature one scientific publication a week. It might not be hot off the presses, and it might not be only about biology or bioinformatics. We will try to be eclectic and cover different areas. You will also find these posts on Research Blogging.

This week we start with a recent publication that appeared in PLoS Computational Biology, titled A Quick Guide to Organizing Computational Biology Projects, by William Stafford Noble (reference and link are at the bottom of the post). It’s a well-written publication that falls under the journal’s Education scope, as its title suggests. Mainly it teaches some tips and tricks on how to organize your files in directories (folders) and subdirectories (subfolders). Believe me, all the advice in the paper is sound, but it is only useful if you know nothing about computers or are just starting to use them.
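For anyone who has never set up a project this way, here is a minimal sketch of the general idea: one folder per project, a few fixed subfolders, and a dated subfolder for each round of results. The folder names and the `create_project_skeleton` helper are my own illustrative assumptions, not a layout quoted from the paper.

```python
from datetime import date
from pathlib import Path

# Hypothetical top-level folders; the names are illustrative assumptions,
# not taken from the paper.
TOP_LEVEL = ("data", "src", "doc", "results")


def create_project_skeleton(root, run_date=None):
    """Create a project folder with fixed subfolders and one dated results folder.

    Returns the path of the dated results folder.
    """
    root = Path(root)
    for name in TOP_LEVEL:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (root / name).mkdir(parents=True, exist_ok=True)

    # ISO dates (YYYY-MM-DD) sort chronologically when listed alphabetically.
    run_date = run_date or date.today()
    run_dir = root / "results" / run_date.isoformat()
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir
```

Calling `create_project_skeleton("my_project")` creates the four subfolders plus a results folder named after today's date, and calling it again on another day just adds a new dated folder next to the old ones.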

Apparently the author focused on newcomers to the bioinformatics field, mainly biologists learning the skills rather than computer scientists making the jump. In my opinion this is the wrong focus. These days you hardly find a summer student, even among the ones who go directly to the wet lab, with no computer skills. The kids come to the lab knowing how to program, or at least with a good notion of scripting, and they have used computers basically their whole lives. This paper should have focused on old-timers like myself and gone deeper into the subject. We old-timers already carry a lot of baggage and a lot of bad habits in our daily work, and we read this with scorn on our faces. We read this and say (or think) “why do I need to do things this way?”. We need this kind of advice, but I don’t know if it deserves a scientific publication. Any computer scientist fresh from graduation and starting at some multinational company would laugh at this, and maybe wonder why biologists need to put it on paper, or in PDF files, let alone in a scientific journal.

Most of the tips presented in the paper, at least for me, are a mere statement of the common sense that should be applied to managing digital files. I’ve been in the trenches, I suffered from my own disorganization, and I learned from it and from a myriad of blogs and websites devoted to software design and development. These references not only give you a perspective on what the real world of software and project development looks like (or should look like), but they also teach you how to survive in the jungle of “databases” stored in Excel files.

It’s really sad that in 2009 we need to read this kind of paper, or that a paper on this subject even gets published. I know dozens of bioinformaticians who would be able to write an identical or better publication, but who decided not to because this is basically common sense, or maybe because they already wrote the same thing on their blogs. Some of them will scratch their heads and think this would have been a sure thing, an easy publication where they just needed to write about what they do daily. Just browse some bioinformatics and computational biology blogs and you will get more and better advice than you get from PLoS and this paper.

We need more papers like the one my friend Jan Aerts and his colleague published on Ruby and bioinformatics. That’s a paper that makes you want more and gives you a primer to do more. It’s not a paper about common sense. But, hey, not everyone has it.

Noble, W. (2009). A Quick Guide to Organizing Computational Biology Projects. PLoS Computational Biology, 5(7). DOI: 10.1371/journal.pcbi.1000424