Dave’s Logs – Page 33 – And the sign said “Long-haired freaky people need not apply”

What is Dave doing in Bioinformatics? Pt. 2

Fri - 11.6.2009 by Dave·5 Comments

And we are back on the slow crawl toward eventually explaining what I do, out here in the darker recesses of my lab tucked in the remote Kansai countryside.

Aside from breeding deadly mutant monkeys to serve in my army of evil minions when I kickstart the world-domination part of my plot, that is.

Before I go any further, let me remind the casual reader that: 1) it is most likely nice and sunny out there where you live and you would be considerably better off looking at squirrels running through the trees 2) if you have even the slightest inkling of formal mathematical/computer science training, you will be better served forgoing this edulcorated version in favour of one of the 10 million tutorials and entries on bioinformatics available throughout the internets (Wikipedia being a good place to start). The entry written henceforth is geared toward some hypothetical grandparents who would care to know what the fuss with modern Science is all about (for instance mine, were they not already perfectly content in the sole knowledge that the good Lord has put all these tiny amino-acids together in the best possible way of all worlds and that modern genetics is the work of the Devil¹).

In last month’s episode, we laboriously learnt that Biology abounds with really, really, tough problems. Two major points were:

1. For all practical matters, NP-Complete problems are all in the same bag: finding a way to solve one efficiently would mean you can solve any other in roughly the same order of time.

2. Once you have proved that a problem is NP-Complete, trying to find an exact solution for a real-life set of data is about as meaningful as trying to take down Everest with a toothpick. There are however plenty of ways to find an approximate solution. Proving NP-Completeness is your cue to start looking for approximation algorithms; and thus the fun begins.
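To make "approximate solution" a little more concrete, here is a minimal sketch using the traveling salesman from the previous episode. It is only an illustration: the `towns` map and the function name are made up, and the "always go to the closest unvisited town" rule is a simple greedy heuristic (it gives a decent route quickly, with no guarantee of finding the shortest one), not necessarily the kind of approximation algorithm used on real biological data.

```python
from math import dist

def nearest_neighbour_route(cities, start):
    """Greedy heuristic: always travel to the closest city not yet visited."""
    route = [start]
    unvisited = set(cities) - {start}
    while unvisited:
        here = cities[route[-1]]
        # Pick the unvisited city closest to where we currently stand.
        nxt = min(unvisited, key=lambda name: dist(here, cities[name]))
        route.append(nxt)
        unvisited.remove(nxt)
    return route

# Hypothetical towns: name -> (x, y) position in kilometres.
towns = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (2, 0)}
print(nearest_neighbour_route(towns, "A"))  # ['A', 'B', 'D', 'C']
```

The point of the trade: this looks at each remaining town once per step instead of trying every possible ordering, so it stays fast even when the exact search is hopeless.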

Today, instead of going straight onto the myriad fun ways in which mathematicians solve biology problems, and which one of those I am actually connected to, another digression and an illustration everyone has heard of: genome sequencing.

Full genome sequencing (mapping the entire DNA of a given organism) is one of the earliest applications of modern bioinformatics techniques, a seminal example: it starts off as a rather straightforward bio-chemistry problem, soon runs into pesky matters of size, complexity and intractability, goes through a difficult phase of alcohol and substance abuse, but is ultimately saved by the power of Love and Mathematics.

Before I go into the gory details, allow me to dissipate a common misconception about DNA sequencing: it is nowhere near as easy as you might have been led to believe by your TV (most people’s preferred source of Science™ facts). Hearing of “DNA tests”, “DNA crime database” and other everyday life DNA-related techniques might make it sound like sequencing is as easy as sending your saliva swab to the lab and waiting a couple days for the results. In reality, despite serious advances, actual full genome sequencing is still a multi-year, multi-million-dollar affair. When people talk about DNA in a forensics or medical context, they are usually looking at a single nucleotide base, located at a precise location on one gene, out of the entire genome. Even cases that require a larger sample of such observations (e.g. DNA matching, when it actually uses sequencing at all) are still somewhere in the lower hundreds (if that). That’s a mere 100 bases to look at, against 100+ million for the first multicellular organism fully sequenced, roughly 10 years ago (make that 3 billion for humans). Quite a difference in scale. And, of course, this is one of those problems where solving twice the size requires much more than twice the time (hopefully by now, this does not surprise you, otherwise you might want to go back and read episode 1 again).

OK, let’s start:

Talking about remembrance…

Fri - 11.6.2009 by Dave·Comments Off

Yesterday, I completely forgot to remember, remember…

And now it’s already the 6th of November in Japan.

Maybe it’s not too late to go buy some gunpowder and have a celebration on my balcony tonight.

Erinnerung

Thu - 10.29.2009 by Dave·8 Comments

In order to prepare for my upcoming 3-month stay in Berlin, I have started brushing up on my terminally rusty German: buying a couple books and checking out online newspapers somewhat regularly (more than just once every three months when I am curious to know the Frankfurter Allgemeine’s position on some European issue).

Much to my surprise, I not only still remember a sizable chunk of German despite over 10 years with zero practice, but my level has in fact improved since then. That is to say, I am nowhere near fluent, nor able to remember half the vocabulary I once knew. However: turns of phrase and idiomatic expressions that I know would have had me staring painfully for minutes on end back in high school, now seem perfectly natural to me… Most phrases hit the comprehension part of my brain directly, without going through the lengthy “decoding word-by-word and digging up through memory for idiomatic equivalent” phase. In some way I have magically become more “fluent” than I was, when last I studied ten years ago.

At first, I just assumed my memories were being overly modest and that, maybe, I was not the Teutonic classroom failure I remembered being. Then I thought back to the long evenings laboriously spent stringing together 20 lines of homework, endless hours of classroom procrastination, barely coasting by, year after year, and the extremely mediocre A-level (or French equivalent thereof) grade that ensued. There is ample objective evidence that I really sucked as a high school student of German and it appears that I suck ever so slightly less, now that I am resuming ten years later… Which goes squarely against the widely accepted notion that foreign language acquisition skills decrease with age.

In proper logic-obsessed OCD fashion, I tortured my brain for days, trying to come up with a rational explanation for this, which did not involve being abducted, probed and experimented on, by German-speaking aliens.

And I think I found it…

The better half of the years spent studying German were when I lived in Paris. I therefore studied in French. Grammar explanations, bilingual vocabulary lists, chatting with classmates, thinking about the ongoing lesson, were all done in French.

Nowadays: I live in Kyoto and there is very little French language in my life. Lots of Japanese, of course, but I would venture that well over 90% of my thoughts and interactions occur in English. When I read a text in German, that voice in the back of my head, trying to make sense of what I am reading, is speaking English, not French.

Why wear a t-shirt after all…

Sat - 10.24.2009 by Dave·2 Comments

Last night’s program included ample (and unexpected) display of full female topless nudity in a public place. For the second time in less than a week.

I must obviously be doing something right. (or very wrong, depending on which side of the ‘gratuitous boobage action’ moral debate you sit on).

Mediocre Art: a Theory

Thu - 10.22.2009 by Dave·1 Comment

After years of sensing it, without quite putting my finger on it, I have finally uncovered the ultimate truth about mediocre art and its root causes.

It is all about sex.

Sex and sexual desires are solely to blame for every single one of those nights you spent attending overpriced, underwhelming, “art” performances. You know the kind: some friend-of-a-friend-of-an-acquaintance, half naked, banging on pots, ululating while playing the electric guitar with an egg beater and a 2000W amp or just exploring the relation between art, space and materialistic consumerism by slithering in a kiddy pool filled with mashed potatoes while their partner sprays them (and the first two rows of the public) with milk and coke.

To be fair, most art is about sex, great art included. When masterpieces do not straight up depict sex, they are most often about their author hoping to get laid, or consistently failing to.

On the other hand, mediocre art is all about keeping your existing sexual partner(s) happy. Sex is the glue that keeps together delusional twenty-something “experimental” artists, long after the last of their friends have faced up to their talentlessness.

Behind every over-affected improv actress is a bored but madly in love partner. Behind every shitty garage rock band is a dedicated girlfriend ensuring none of her friends ever miss a gig. Behind every pointless expressive dancer’s performance is a poor sap playing a detuned violin with a hammer, too busy checking her ass to wonder if it really was worth enduring 15 years of classical training for this. The fecund fields of experimental artistry are littered with people who would have long given up inflicting their fumbling on a sine-wave generator to the public at large, were it not for a support base, spinelessly ready to dish out all sorts of undeserved praise and support, as long as it grants them VIP pants access.

And please do not come telling me this is a victimless crime: my eardrums and psyche, battered by hours of uninspired pseudo-stream-of-consciousness drivel recited to the sound of glass rim music, beg to differ.

What is Dave doing in Bioinformatics? Pt. 1

Fri - 10.16.2009 by Dave·7 Comments

Two months and countless draft embryos after initially promising it, here is the first part of an unfathomably long rant describing my field of research. I honestly don’t expect anybody to subject themselves to that read, but at least now I have a place to send those who foolishly ask me about it at cocktail parties.

The short answer is that I do research in Bioinformatics, which is where Mathematics (along with Computer Science and a dozen other disciplines) meets with Biology and Genetics in a dark back-alley, and they do all sorts of indescribable things to each other in the hope of: creating a better world, curing cancer, breeding the next race of eugenic übermenschen or making a few bucks for Big Pharma… whichever comes first.

But that sort of answer, while technically correct, does not really tell you why such an unnatural coupling of disciplines was warranted in the first place. Allow me to start at the beginning. Way at the beginning.

[open long semi-relevant digression that can be advantageously replaced by a thorough read on Complexity Theory, if you feel up for the more sciencey and truthy version of things]

The scientific problems of this world tend to fall into one of two categories: those you might eventually solve with a good computer and some time… and those you will never solve exactly, no matter how much crazy sci-fi supercomputing power you throw at them.

This “solvable” vs. “not solvable” demarcation might sound like a tautology, until you understand the full meaning of “never” in the above statement: these are not problems that might be solved one day, when science progresses far enough or computers get ten, a hundred or a million times faster. These are problems whose solutions require calculations of a complexity that is proven (or, in many cases, very strongly believed) to be beyond the reach of any conventional means of computation in any foreseeable future (“unconventional means” would begin with the discovery of heretofore unknown laws of Physics: in other words, unlikely in your lifetime, at best¹).

By and large, the mathematical complexity of a problem is the order of time (or computing power) it will take to solve it, relative to its size.

Without calculating the result of a certain task, it is often possible to predict whether producing this result could or could not be done in a reasonable amount of time (where “reasonable” usually means “in less than the age of the universe, assuming the use of every single computer on earth”, or somesuch).

There are countless examples of tasks falling in the first category, “easy” tasks that can be solved quickly, regardless of how big they are. For example, anybody past kindergarten age can presumably add two numbers of practically any size with a piece of paper and a pen. You just add each digit one by one (and, yes, carry the one) and adding two 100-digit numbers will take barely more time than adding two 3-digit numbers.
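For the curious, here is a small sketch of that pen-and-paper addition, written so that it also counts how many columns it had to process. The function name and the step counter are mine, purely for illustration; the takeaway is that the amount of work grows in step with the number of digits, and no faster.

```python
def add_digit_by_digit(a, b):
    """Add two non-negative integers given as decimal strings.

    Returns the sum (as a string) and the number of columns processed.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with zeros
    carry = 0
    steps = 0
    digits = []
    # Walk the columns from right to left, exactly like on paper.
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10  # "carry the one"
        steps += 1
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits)), steps

print(add_digit_by_digit("999", "1"))  # ('1000', 3)
```

Two 100-digit numbers take 100 such steps against 3 for two 3-digit numbers: more work, but only in proportion to the length of the numbers.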

Now consider a different task: say you are a traveling salesman who needs to plan their next sales route. You have a map of the region, with the towns you must visit and all the distances between them, given in kilometers. How do you find the absolute shortest route that will take you to each city at least once without wasting gas or time?

More to the point: how difficult do you think finding that route will be?

Sure, it sounds easy enough: pick a starting point, follow every road that goes from that city to another one, then onto the next etc. Keep the shortest distance you’ve found. Can’t be that tough, right?

Let’s say there are five cities: you pick a city to start from, then check all remaining four, and from each of those four, go on to one of the remaining three etc. etc. In total, that’s 5x4x3x2x1 = 120 different paths to compare (that product can also be written using the factorial function: n! = n x (n-1) x … x 3 x 2 x 1. e.g. 5! = 5x4x3x2x1). Not so bad.
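Here is roughly what that "try everything" approach looks like as a program. Consider it a sketch: the five cities and their coordinates are invented, distances are straight lines rather than actual roads, and a route is simply an order in which to visit the cities (no return trip).

```python
from itertools import permutations
from math import dist

# Hypothetical map: city name -> (x, y) position in kilometres.
CITIES = {
    "A": (0, 0),
    "B": (3, 0),
    "C": (3, 4),
    "D": (0, 4),
    "E": (6, 2),
}

def route_length(route):
    """Total length of a route visiting the cities in the given order."""
    return sum(dist(CITIES[a], CITIES[b]) for a, b in zip(route, route[1:]))

def shortest_route_brute_force(names):
    """Try every possible ordering and keep the shortest one."""
    checked = 0
    best_route, best_length = None, float("inf")
    for route in permutations(names):
        checked += 1
        length = route_length(route)
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length, checked

route, length, checked = shortest_route_brute_force(list(CITIES))
print(checked)  # 120 orderings for 5 cities
```

The answer is guaranteed to be the best one, precisely because nothing was skipped; the catch is the size of that `checked` counter once the map grows.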

What if there are a few more cities… for instance, two times more: 10 cities. That’s 10! = 10x9x8x7x6x5x4x3x2x1 = 3,628,800 paths to look at. Huh, that might take a bit longer to do by hand. No worries: somebody will write a computer program that gives you the answer in a couple seconds.

Except that, you guessed it, each time I double the number of cities, the difficulty does way more than just double.

For 20 cities, the number of paths to look at is: 20! = 2,432,902,008,176,640,000.

For 70 cities, there are 70! (that’s factorial of 70: 70x69x68x…x3x2x1) possible paths to check one by one. That number has 101 digits. This is vastly more than the estimated number of particles in the observable universe (a number with roughly 80 digits). Assuming you were to put every single computer in the world to work on this, you likely would not be done by the time the Sun explodes.
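If you would rather not take these numbers on faith, the sketch below recomputes them and turns each into a rough waiting time. The speed of one billion paths checked per second is an assumption picked out of thin air for illustration, not a measurement of any real machine.

```python
from math import factorial

CHECKS_PER_SECOND = 10**9               # assumed speed, purely illustrative
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_cost(n_cities):
    """Return (number of orderings, digits in that number, years needed)."""
    paths = factorial(n_cities)
    years = paths / CHECKS_PER_SECOND / SECONDS_PER_YEAR
    return paths, len(str(paths)), years

for n in (5, 10, 20, 70):
    paths, digits, years = brute_force_cost(n)
    print(f"{n:>2} cities: {digits:>3}-digit path count, ~{years:.3g} years")
```

At that assumed speed, 10 cities are done in a blink, 20 cities already take the better part of a human lifetime, and 70 cities produce a number of years that is itself over 90 digits long.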

Tschüss…

Wed - 9.30.2009 by Dave's Keitai·Comments Off

Stop screwing with me, Gas Monkey…

Wed - 9.30.2009 by Dave·2 Comments

You know what is worse than waking up to a water-heater that refuses to work when you go for your morning shower?

[…]

Having the fucking thing finally work, after you finished taking your cold shower.

There’s a poltergeist in my house, and it has a really stupid sense of humour.

Evicting the previous tenants…

Tue - 9.22.2009 by Dave·1 Comment

My new apartment comes equipped with a pigeon coop: fresh pigeon eggs for breakfast every morning, straight from my balcony…

Note to the genius realtors who spruced-up the place before I moved in: enclosing the entire balcony in a metallic net to protect it from these flying rats, was a very good idea with laudable intent.

It would have been considerably more effective, had it not resulted in trapping an entire pigeon family on my balcony, inside that net.

When everything else has failed…

Thu - 9.10.2009 by Dave's Keitai·Comments Off

An apparently distraught software project manager, turning to the gods of Meiji-Jingu with a slightly unorthodox ema request…