An antimalarial drug

If I were to design a new antimalarial, I would probably try my luck with a quinoline derivative.

Current common sense seems to go in precisely the opposite direction: since Chloroquine resistance is widespread, it is better to avoid quinolines altogether, as the resistance mechanism would probably be the same.

Maybe the resistance mechanism is the same but, maybe, it could select in the opposite direction. Crazy idea? Actually it is not mine at all: it is based on studies of the spread of Mefloquine resistance. Drugs with the same resistance mechanism, but forcing selection in opposite directions, could be deployed simultaneously or in alternating periods.

Speculations of a pen and pencil theoretician sitting in a country where a mosquito would freeze to death in seconds (barring some sporadic cases of airport malaria in the summer).

Filed in: drug development, malaria

by: tiago

No Comments

Splitting the blog

This blog has been a mishmash of things related to my interests: computational biology, tropical diseases, software engineering, cinema, music.

I do believe that I have very different readerships: an academic one for the computational biology stuff, another made up of computer geeks for my programming content and yet another, highly unexpected, for my free time content (most visitors coming from Google searches are after a photograph that I have on my blog).

I also believe that more in-depth content in one area alienates the readership of the others. Furthermore, I am very inclined to beef up my “free time” content.

As such I have decided to split Perfect Storm in 3:

  • Meet Cognitive Consonance, my new blog about software engineering. I will try to maintain the same tone as here: longer, informative/tutorial posts.
  • Serendipity will be my personal blog: cinema, music, human rights, practicing sports. Focused on aesthetics and form as much as on substance. Expect also a certain confrontational style.
  • Last, but not least, Perfect Storm will continue to be the home for the computational biology content.

I am currently setting up all the infrastructure (apologies in advance for any initial setup problems). There will be a certain amount of cross-posting, as sometimes there are clear overlaps in content.

Filed in: meta

by: tiago

No Comments

DSLs: specification and behavior

One of the interesting applications of a DSL lies in how easily it separates an abstract (domain-level) specification from its possible applications. Let’s make this a bit more concrete with an example (taken from my malaria domain).

As is becoming a pattern in my recent posts, I start with a smallish explanation of the biological and pharmacological background and then go deep into the technical DSL/Groovy design and implementation part.

Antimalarial drugs have effects on parasites (the desired effect being the killing of lots of parasites). Roughly speaking, a malaria infection can be seen as a progression in time of parasite loads: parasites are multiplying (growing) and this growth is balanced by both the natural response of the human immune system and the effect of the drugs taken (strictly speaking, the effect of a drug on the parasite is pharmacodynamics, while pharmacokinetics is how the drug concentration evolves in the body; I will loosely file both under PK). Malaria parasite loads in humans can go up to 10^12 (10 to the power of 12, no typo).

PK is modeled by a function (I won’t go into details here) which is parametrized by drug concentration and parasite response (resistant parasites tolerate drugs better). As an example for Chloroquine in Groovy:

formula: {3.8 / (1 + 1/(K * CQ))}

This (for now) magic formula, represented as a closure, has 2 parameters: 1/K, which is 68 micrograms/liter for non-resistant parasites, and CQ, the concentration of the drug in the blood.

This is the specification of the problem. Now, what do we do with this formula? The obvious answer is to use it to do calculations (i.e. given a certain drug concentration, what is the value of the PK function?). But in reality we might want to do many other things with it, like generating documentation (say, by creating a Word or LaTeX document) or converting the formula into a faster language (e.g. Fortran) for simulation purposes. I actually do both things.

So, one thing is the formula as a specification. Another thing is what you do with it. And we can do truckloads of different things with this specification.

Let’s see how we could do some of the different tasks described above:

Calculating the value of the function

Let’s imagine that we want to print the values of the function for concentrations from 1 to 1800 (1800 ng/mL being a reported maximum concentration of Chloroquine in the blood). The solution could be:

//formula is the closure holding the formula
formula.K = 1/68.0 //The fixed parameter: 1/K is 68, so K is 1/68
(1..1800).each { concentration ->
    formula.CQ = concentration  //Varying CQ concentration
    println formula() //Execute closure
}

So, in this approach we take the closure, set the parameters (setting closure properties in Groovy is very simple as the example above shows), and execute the closure repeatedly.

I actually think that this example is of the worst kind possible, because it blends specification with execution. That is, we specify our effects formula without any behavior and then we take the specification itself and execute it. So we are tying specification and behavior together. Pedagogical and philosophical considerations aside, this works OK, is easy to code and is efficient.
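One way to loosen that coupling a little, sketched here under my own assumptions rather than taken from the model code, is to leave the closure untouched and resolve its free variables against a parameter map through the closure delegate. The helper name evaluate is hypothetical, and the formula is written in the km1/cq form.

```groovy
// Specification only: a closure whose free variables are km1 and cq.
def formula = { 3.8 / (1 + km1 / cq) }

// Hypothetical helper: run a formula against one set of named parameters.
def evaluate = { Closure spec, Map params ->
    def runnable = spec.clone()                        // leave the specification untouched
    runnable.resolveStrategy = Closure.DELEGATE_FIRST  // look names up in the delegate first
    runnable.delegate = params                         // km1 and cq are read from this map
    runnable()
}

// Concentration equal to km1, so the denominator is exactly 2.
def atKm1 = evaluate(formula, [km1: 68.0, cq: 68])
println "cq = 68: ${atKm1}"

[1, 68, 1800].each { c ->
    println "cq = ${c}: ${evaluate(formula, [km1: 68.0, cq: c])}"
}
```

The specification never receives state of its own here; each evaluation works on a clone bound to a throwaway map.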

Generating Fortran code

The formula above is also used to generate Fortran code with the formula representation, which is plugged into a malaria epidemiology simulator. In that case executing the closure with arithmetic semantics is useless, so another strategy has to be used.

The current solution gets the code AST representation through the meta class. Before I present the solution, I will show the full representation of the (slightly altered) formula and effect:

cqEffect = effect(
    name:       "General Chloroquine effect",
    formula:    {3.8 / (1 + km1/cq) },
    parameters: [km1: 68.0] //Hoshen98 microg/l
)
//effect creates an Effect object

(So km1 is a fixed parameter for the effect and cq - drug concentration - is variable).

The Effect object has a property called code, which holds the Abstract Syntax Tree (AST) for the formula. The AST is accessed in the Effect constructor in this way:

this.code = formula.getMetaClass().getClassNode().getMethods("doCall")[0].code

Short story: it gets the meta class for the closure, gets the closure class AST, and then gets the AST for the code of the method doCall, which holds the formula code for the closure. Whew, big, long train.

Caveat: Because Groovy is compiled, and for memory and performance reasons, getClassNode might sometimes return null :( . If that happens to you, google for “getClassNode groovy”, as that issue is out of the scope of this post (I could get around it in my cases, up to now).
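A possible fallback when getClassNode returns null, and this is an assumption of mine rather than the route taken in the model, is to keep the formula as source text as well and parse it with Groovy's AstBuilder. The sketch below only parses (CONVERSION phase), so the free variables km1 and cq never need to exist.

```groovy
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.control.CompilePhase

// Assumption: the formula is available as a plain string.
def source = '3.8 / (1 + km1 / cq)'

// statementsOnly = true: we get the statement block without the script class.
def nodes = new AstBuilder().buildFromString(CompilePhase.CONVERSION, true, source)

def block = nodes[0]                              // the block of statements
def expression = block.statements[0].expression   // the single arithmetic expression

println expression.getClass().simpleName
println expression.operation.text
```

The resulting expression has the same shape as the one obtained through the meta class, so the same traversal code should apply to it.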

So, now we have to traverse the AST. In the most general case, this would mean creating a full interpreter for the Groovy AST, a breathtaking task (but a good way to learn all about Groovy ;) ). In our malaria case we will only process arithmetic expressions (and if constructs, but I will not discuss those here for brevity), so we expect the users of our DSL to be careful and pass just an arithmetic expression. As such the formula is a block of statements which happens to have only a single statement composed of an arithmetic formula:

def expression = it.code.getStatements()[0].getExpression()
println expression

The first line traverses the AST to get the formula. It only works because the closure code is of the form defined above (a single arithmetic formula). println results in:

org.codehaus.groovy.ast.expr.BinaryExpression@186d484[
  ConstantExpression[3.8]
  ("/" at 22:22:  "/")
  org.codehaus.groovy.ast.expr.BinaryExpression@ea48be[
    ConstantExpression[1]
    ("+" at 22:27:  "+" )
    org.codehaus.groovy.ast.expr.BinaryExpression@14dd758[
      org.codehaus.groovy.ast.expr.VariableExpression@174d93a[variable: km1]
      ("/" at 22:32:  "/" )
      org.codehaus.groovy.ast.expr.VariableExpression@61a907[variable: cq]]]]

Although it looks dreadful at first, a second inspection reveals that we have what we need.

A vanilla expression processor for the AST above could be:

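import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.expr.VariableExpression

def drillExpression
drillExpression = { expr ->
    switch (expr.class) {
        case BinaryExpression:
            //Fully parenthesize both operands around the operator
            return "(" + drillExpression(expr.leftExpression) + ")" +
                     expr.operation.text +
                     "(" + drillExpression(expr.rightExpression) + ")"
        case ConstantExpression:
        case VariableExpression:
            return expr.text
        default: return "" //Any other node type is silently ignored
    }
}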

This would return the string: “(3.8)/((1)+((km1)/(cq)))”

From here I think it is quite easy to see how one could take an expression and convert it to LaTeX or Fortran code (the remaining work is really just LaTeX/Fortran syntax).
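To make that remaining work concrete, here is a minimal sketch of a LaTeX variant of the processor. It is an illustration under assumptions: the AST is obtained from a source string with AstBuilder so that the snippet is self-contained, toLatex is a hypothetical name, and only binary operations, constants and variables are handled.

```groovy
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.expr.VariableExpression
import org.codehaus.groovy.control.CompilePhase

def toLatex
toLatex = { expr ->
    switch (expr) {
        case BinaryExpression:
            String left = toLatex(expr.leftExpression)
            String right = toLatex(expr.rightExpression)
            String op = expr.operation.text
            if (op == '/') {
                // A fraction groups its operands by itself, no parentheses needed.
                return '\\frac{' + left + '}{' + right + '}'
            }
            // Keep the grouping of nested non-division operations explicit.
            def wrap = { child, text ->
                (child instanceof BinaryExpression && child.operation.text != '/') ?
                    '\\left(' + text + '\\right)' : text
            }
            return wrap(expr.leftExpression, left) + ' ' + op + ' ' +
                   wrap(expr.rightExpression, right)
        case ConstantExpression:
        case VariableExpression:
            return expr.text
        default:
            throw new IllegalArgumentException("Unsupported node: ${expr.getClass().simpleName}")
    }
}

def ast = new AstBuilder().buildFromString(CompilePhase.CONVERSION, true, '3.8 / (1 + km1 / cq)')
def latex = toLatex(ast[0].statements[0].expression)
println latex   // \frac{3.8}{1 + \frac{km1}{cq}}
```

Unlike the vanilla processor, this version throws on unknown node types instead of returning an empty string, which seems safer when the output feeds a document.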

There are 2 drawbacks to this approach: traversing the AST requires work, and supporting all AST node types would be a daunting task. At least in my malaria case the amount of work required is very manageable.

A completely different strategy would be to Monkey Patch numbers and variables (i.e. massively alter the definition of their classes) in a radical way: not to produce arithmetic results but to, say, generate LaTeX sources. That is probably possible, but it would be one of the worst examples of monkey patching that I could think of. Monkey business indeed!

There is also the Groovy Code Visitor pattern, which I did not explore… It would probably be a variation of the AST traversal strategy presented here.
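For the curious, a visitor-based variation might look like the sketch below, which emits a Fortran-style expression. It builds on Groovy's CodeVisitorSupport class; the FortranEmitter name, the d0 suffix for decimal constants and the use of AstBuilder to obtain the AST are my own assumptions for illustration.

```groovy
import org.codehaus.groovy.ast.CodeVisitorSupport
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.expr.VariableExpression
import org.codehaus.groovy.control.CompilePhase

// Hypothetical emitter: walks an arithmetic expression and appends Fortran text.
class FortranEmitter extends CodeVisitorSupport {
    final StringBuilder out = new StringBuilder()

    @Override
    void visitBinaryExpression(BinaryExpression e) {
        out << '('
        e.leftExpression.visit(this)
        out << ' ' << e.operation.text << ' '
        e.rightExpression.visit(this)
        out << ')'
    }

    @Override
    void visitConstantExpression(ConstantExpression e) {
        // Decimal literals become double precision constants (3.8 -> 3.8d0).
        out << (e.value instanceof BigDecimal ? "${e.value}d0" : "${e.value}")
    }

    @Override
    void visitVariableExpression(VariableExpression e) {
        out << e.name
    }
}

def ast = new AstBuilder().buildFromString(CompilePhase.CONVERSION, true, '3.8 / (1 + km1 / cq)')
def emitter = new FortranEmitter()
ast[0].statements[0].expression.visit(emitter)
def fortran = emitter.out.toString()
println fortran   // (3.8d0 / (1 + (km1 / cq)))
```

The dispatch on node type moves from a switch into overridden methods; node types that are not overridden fall back to the default traversal of CodeVisitorSupport.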

Filed in: bioinformatics, declarative programming, groovy, malaria

by: tiago

1 Comment

Chloroquine malaria treatment and Groovy (DSL tactics in Groovy 2)

Chloroquine was, for many years, the workhorse against P. falciparum malaria. Around the fifties (give or take a decade) resistance appeared in Cambodia and spread around the globe (if my memory serves me right there are at most 4 independent sources of malaria Chloroquine (CQ) resistance, the Cambodian one being the first to appear). Currently CQ clinical efficacy is deemed too low and CQ use is frowned upon. CQ is extremely cheap, and therefore economically sustainable in Africa. The more recent Artemisinin (ART) based drugs (ART is a short-lived drug commonly used in combination with other, longer-lived, drugs) are too expensive for most countries where malaria is a public health threat (thus requiring subsidies from external sources).

CQ is still used as a first-line drug at least in Guinea-Bissau (on Google Scholar, search for “kofoed bissau chloroquine”), even in the presence of resistance. A change of drug regimen (i.e. how the drug is used) seems to make its clinical efficacy go up without increasing the spread of resistance. This is interesting from both a theoretical and practical point of view (being able to reuse CQ would be great given its price and wide availability). This is roughly the scope of my current theoretical study.

I am developing a Groovy model to specify CQ resistance. The fundamental concepts are:

On the drug side there are Compounds (e.g., Chloroquine) and Drugs. A drug is composed of one or more compounds: for instance, the widely used SP is composed of Sulfadoxine and Pyrimethamine, while Chloroquine (as a drug) is composed of… Chloroquine, making it a single-compound drug.

On the parasite side there are enzyme (protein) mutations. A mutation might help the parasite in tolerating a certain drug.

So here is my current piece of Groovy code to model CQ resistance:

cq = compound(name: "Chloroquine", abbreviation: "cq", halfLife: 45.d)
 
CQ = drug(name: "Chloroquine", abbreviation: "CQ")
CQ.includes cpd: cq, qty: 300.mg, bioavail: 1.2
 
regimen = regimen()
regimen.take drug: CQ, qty: 2, at: 0.h
regimen.take drug: CQ, qty: 1, at: 6.h
regimen.take drug: CQ, qty: 1, at: 1.d
regimen.take drug: CQ, qty: 1, at: 2.d
 
CRT = protein("CRT")
CRT.mutatingAmino 76, Lys, Thr
 
cqEffect = effect(
    name:       "General",
    formula:    {3.8 / (1 + km1/cq) },
    parameters: [km1: 68.0]
)
 
cqResistance = resistance(
    effect:     cqEffect,
    mutations:  [CRT.mutation(76)],
    parameters: [km1: 204.0]
)

Chloroquine has a terminal half life (roughly the time that the body takes to eliminate half of the drug concentration) of 45 days (line 1). Actually, it is quite difficult to estimate half lives (and they vary from case to case). For CQ, estimates range between 1 and 2 months (extremely long).
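To get a feeling for what a 45 day half life means, here is a small worked sketch. It assumes simple first-order elimination, C(t) = C0 * 0.5^(t / halfLife), which is a simplification of mine and not a statement about the real behavior of CQ in the body; fractionRemaining is a hypothetical helper name.

```groovy
// Fraction of the initial concentration left after `days`,
// assuming first-order elimination with the given half life.
def fractionRemaining = { double days, double halfLifeDays ->
    Math.pow(0.5d, days / halfLifeDays)
}

[0, 45, 90, 180].each { day ->
    def percent = (100 * fractionRemaining(day, 45)).round(1)
    println "day ${day}: ${percent}% left"
}
```

Under that assumption a quarter of the concentration would still be around after three months, which is why the half life matters so much for the selection of resistance.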

A typical CQ pill has 300 mg of the substance (line 4).

A possible CQ regimen for an adult is 2 pills at the start, 1 pill 6 hours later, and 1 pill on each of the following 2 days (lines 6-10).

Resistance is related, among many other things, to codon 76 of CRT, the Chloroquine resistance transporter (lines 12-13).

Looking at the code up to line 13, I would say that it is pretty readable and an elegant representation of the problem. From line 13 onwards I think the same holds, but for now I will not discuss pharmacokinetics (I also refrained from explaining the simplistic bioavailability parameter on line 4).

In the next posts I will concentrate on line 17, a formula for what I loosely call the pharmacokinetics (PK) of CQ: mainly the killing effect of the drug on the parasite, which strictly speaking is pharmacodynamics. Sometimes I will be more of a computer geek and concentrate on the Groovy side of things, sometimes I will discuss more the underlying biology and pharmacology.
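As a preview of how the resistance block could interact with the effect block, the sketch below overrides km1 the way the two parameter maps in the listing suggest. The map merge and the effectAt helper are assumptions for illustration; the DSL may well combine them differently.

```groovy
// The effect formula, here taking its parameters explicitly as a map.
def formula = { Map p -> 3.8 / (1 + p.km1 / p.cq) }

def sensitiveParams = [km1: 68.0]
// Map addition: entries from the right-hand map win on duplicate keys.
def resistantParams = sensitiveParams + [km1: 204.0]

// Hypothetical helper: effect at a given drug concentration.
def effectAt = { Map params, cq -> formula(params + [cq: cq]) }

def sensitive = effectAt(sensitiveParams, 204)
def resistant = effectAt(resistantParams, 204)
println "sensitive: ${sensitive}, resistant: ${resistant}"
```

At the same concentration the mutated parasite sees a smaller effect, which is exactly what tripling km1 is meant to express.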

By the way, and going in the geek direction, why do optional parentheses become mandatory inside a list? I.e., I can do

DHFR.mutation 108

But I need parentheses here:

[DHFR.mutation(108)]

The same seems to happen when calling functions scoped inside a script (in the DSL example above, line 1 requires parentheses).

By the way, that DHFR thingy above? DHFR is an enzyme involved in malarial resistance to SP, the other widely deployed cheap drug. SP acts in a less obvious way, and that will require changes to the DSL (to have relationships among effects), but that is further down the road.

Appendix:

One interesting Scala syntactic goodie that Groovy could plagiarize is this:

import org.jfree.chart.plot.{PlotOrientation, XYPlot}

From the snippet above you might infer that charts will be appearing in future posts ;)

Filed in: bioinformatics, declarative programming, groovy, malaria

by: tiago

3 Comments

DSL tactics in Groovy (1/many)

I am learning Groovy while implementing a DSL to study malaria resistance. This is the first post, of hopefully many, where I lay out some tactics for implementing DSLs in Groovy. Take my suggestions with a grain of salt, as I am nothing more than a newbie. I will address small issues with each post. Expect a strong technical leaning, discussing picky details…

Most of what you read here is actually not my own creation or ideas: it comes from the extremely helpful Groovy users mailing list, especially from Guillaume Laforge.

Numbers, script name space and named parameters

Our first objective is very simple: to model a drug compound (to treat malaria), say:

Ronald.init(this)
 
cq = compound(name: "Chloroquine", abbreviation: "cq", halfLife: 83.h)
//I am not sure about 83 hours at all

Let’s start with the compound half life parameter (the time that it takes to eliminate half of the drug concentration from the body). Notice the 83.h? It means 83 hours. How do we do this, considering that numbers are not time? A first implementation used Categories (an option you might want to explore), but my current one uses the ExpandoMetaClass:

Integer.metaClass.getH << { ->
    new Time(delegate, Time.HOURS)
}

There are several details worth noticing:

  1. First, we are adding a new method to the Integer class, i.e., the Integer class will be changed in behavior, a pretty nuclear change eh…
  2. We are adding a getter method, so when you write 4.h Groovy will call the getter method getH, following the typical Java naming convention.
  3. Notice that we don’t return an Integer but a Time object (a vanilla object defined elsewhere). This caused a click in my mind: calling an integer to get something else.
  4. Notice the delegate word, a way to access the object being manipulated.
  5. On the clumsy side, you will probably have to do the same thing for the BigDecimal meta class if you want to support decimal values (Groovy decimal literals such as 1.5 are BigDecimals). Integer and BigDecimal are plain Java classes whose only common ancestor is java.lang.Number, so here each one is patched separately.
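Putting the points above together, a self-contained sketch could look like this. The Time class shown is a hypothetical stand-in for the real one (which is defined elsewhere), and I use = instead of << so that an existing definition is replaced rather than causing an error.

```groovy
// Hypothetical stand-in for the Time class that the DSL defines elsewhere.
class Time {
    static final String HOURS = 'h'
    final Number amount
    final String unit

    Time(Number amount, String unit) {
        this.amount = amount
        this.unit = unit
    }

    String toString() { "${amount} ${unit}" }
}

// Integer and BigDecimal each get their own getter; delegate is the number itself.
Integer.metaClass.getH = { -> new Time(delegate, Time.HOURS) }
BigDecimal.metaClass.getH = { -> new Time(delegate, Time.HOURS) }

def whole = 83.h          // Integer literal
def fractional = 1.5.h    // BigDecimal literal
println whole
println fractional
```

The two closures are identical, which is the clumsiness mentioned in point 5.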

Now, let’s go back to line 1 of the first code snippet, the initialization. One of the things init does is changing Integer and BigDecimal, but it is not the only one: it also changes what is available in the script name space. There are actually a few ways of doing that, but the two others recommended by Guillaume required some boot-up code in Java, and for now, in the prototype phase, I am too lazy to do that. Just for information, google for CompilerConfiguration#setScriptBaseClass() or check the Groovy Java-side Binding class. As for me, I just passed the script environment to the init function:

    static init(env) {
        env.compound = {Map args ->
            Compound c = new Compound(args['name'])
            c.abbreviation = args['abbreviation'] ?: null
            c.halfLife = args['halfLife'] ?: null
            return c
        }
        //...
    }

So, we just added a closure named compound (a function, eh) to the script name space. The closure simply creates a new Compound. Nice syntactic sugar…
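For comparison, the script base class route mentioned earlier might look roughly like the sketch below, with the boot-up code written in Groovy instead of Java. MalariaScript is a hypothetical name, and compound returns a plain map here only to keep the example short.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration

// Hypothetical base class: its methods become callable from every script compiled with it.
abstract class MalariaScript extends Script {
    Map compound(Map args) {
        [name: args.name, abbreviation: args.abbreviation, halfLife: args.halfLife]
    }
}

def config = new CompilerConfiguration()
config.scriptBaseClass = MalariaScript.name   // goes through setScriptBaseClass(String)

// The DSL script is evaluated by a shell that knows about the base class.
def shell = new GroovyShell(this.getClass().classLoader, new Binding(), config)
def cq = shell.evaluate('compound(name: "Chloroquine", abbreviation: "cq")')
println cq
```

The DSL script itself then needs no init call at all, at the price of having a host program that evaluates it.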

By the way, did you notice that Map thingy on line 2? Let’s talk a bit about named and optional parameters (and how, in my view, they suck in Groovy). Let’s take a more explicit example:

writeFile (file: 'out.txt', mode: 'append')

Now, the way writeFile is supposed to be written is quite strange:

void writeFile (Map args) {
    def file = args['file']
    //...
}

Am I making some kind of newbie mistake here? This is a strange way of attaining the result. Typing information is completely lost from the function signature (that is bad for smart IDEs, automated code tools and introspection). What is wrong with the good old way of Python (where you list all parameters in the signature, assigning default values to the optional ones)?
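One possible workaround, offered as a sketch and not as the blessed Groovy idiom, is to keep the named-argument call syntax but funnel the map into a small typed holder, so that names, types and defaults are visible in one place. WriteOptions and describeWrite are hypothetical names.

```groovy
// Hypothetical options holder: the types and defaults live in one visible place.
class WriteOptions {
    String file
    String mode = 'overwrite'
}

// Typed variant: this is what IDEs and tools can reason about.
String describeWrite(WriteOptions options) {
    "${options.mode} -> ${options.file}"
}

// Thin named-parameter front end: Groovy collects file: and mode: into a Map,
// and the map constructor copies matching keys onto the properties.
String describeWrite(Map args) {
    describeWrite(new WriteOptions(args))
}

def appended = describeWrite(file: 'out.txt', mode: 'append')
def defaulted = describeWrite(file: 'out.txt')
println appended
println defaulted
```

A misspelled key fails at run time when the holder is built, which is still later than a Python-style signature would catch it. Groovy also accepts default values for positional parameters, as in void writeFile(String file, String mode = 'overwrite'), but then the call site loses the names.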

Filed in: groovy

by: tiago

2 Comments

Groovy/Scala/Ruby/Python on JVM

There seems to be some competition in the field that can be vaguely defined as “The next Java”(TM).

I don’t know if there will be a “next Java” to start with. Things seem to be shaping up in a way where the JVM is our common interoperability platform and on top of it we have an ecology of JVM-based languages.

I have used Jython quite a lot but have several doubts about it, not only because of the current status of Jython (it lags a bit behind CPython) but also because I dislike Python (when compared with the other languages discussed here). As such I decided to evaluate the others: Scala, Ruby and Groovy.

I have done a couple of small projects in Scala (A prototype DSL for modeling malaria resistance is available here) and JRuby. I am now starting with Groovy, and I think I’ve found my new love. Here I will try to explain why, among Groovy, Scala and JRuby, I have chosen Groovy. To preempt any religious war idea, I would like to say that I have full respect for Scala and Ruby, which are, with Caml and Prolog, among my favorite languages (for a true crusade and flame ask me for my opinion about Perl or Visual Basic 6 ;) ).

Steven Devijver suggests that Groovy is the language with the most syntactic similarities with Java. I would say that, not only in syntax but in semantics and everything else, Groovy is the closest language to Java. And that is a good thing. The world (both in programming languages and all the rest) is never revolutionary. Revolutions, when they rarely happen, are either a disgrace or are not that much of a big change below the surface. People normally prefer (for good and bad reasons) the path of least short-term pain. Groovy delivers that: almost zero cost in starting to code coming from a Java background. Most importantly, Groovy does that but still delivers most of the new goodies. This is actually the cornerstone of my argument: path of least pain while delivering the good stuff (in some cases better than the competition, as we will see).

Let me start with the fundamental reasons why I dismiss JRuby (which is, nonetheless, my second option after Groovy). First, I would like to say, very honestly, that the work of the JRuby guys is nothing short of outstanding! But I have 3 problems:

  1. One, by definition, JRuby is based on Ruby, a language from outside the JVM. That means semantic hurdles and coupling issues between the two worlds (think, e.g., libraries).
  2. Most importantly (but connected with the first point): typing. I am a bit far away from computing issues currently (I work with malaria now, so excuse me if I mix up strong/explicit typing and such) but clearly the typing system of Ruby makes life hard for IDEs (think of the IDEs needed to tame those over-engineered Java APIs) and for automated tools around code. Debugging without explicit typing is also a pain in a big program (I actually suffered my first typing-related debugging nightmare with Caml, arguably the mother of Scala). Some might say that Scala type inference and Groovy duck typing are also problematic in this respect; while the argument might be correct, both languages have mechanisms to support typical Java explicit/strong typing and as such profit from IDEs and automated analysis tools.
  3. Ugly perlisms. Although I have read somewhere that those might be deprecated in the future.

Ah… Scala… Mats Henricson argues that Scala is the only option because of its elegance regarding multicore computing. I fundamentally disagree with his point: multicore programming is fundamental but Scala is not really a good solution. But before we get there, let’s talk about other Scala issues.

Type inference. I have some experience with the “mother” of Scala, Caml. Type inference in Caml is really elegant: I don’t remember a single case of it failing and requiring the programmer’s help in discovering the type of a parameter. That is not the case with Scala: several times the compiler seems to be “lost in translation”. Some might say that this is because of JVM-imposed constraints, but if that is the case then it raises the question of bringing a language with foreign semantics to the JVM, and of the ugliness attached to the process.

My biggest peeve? Metaprogramming. I won’t give you my opinion about it because it really doesn’t exist. It is on the Scala wiki in the section “future”. I am sorry, but a 21st century language where meta programming is absent can only be called in “beta stage”. As a side note, there seems to be something lost in the ML branch of functional programming from Lisp in this regard (no introspection and such), that is a shame (How is Haskell in that respect?).

Ok, multicore computing. This is an area where I have some experience in the JVM: [Shameless plug] I invite you to have a look at my Java Web Start, Jython based, multicore aware evolutionary biology workbench LOSITAN. Furthermore I have written tutorials for the multicore paradigm and bioinformatics:

Bioinformatics, multi-core CPUs and grid computing: Introduction (1/4)


Bioinformatics, multi-core CPUs and grid computing: User perspective (2/4)

Most importantly in this context: Bioinformatics, multi-core CPUs and grid computing: developer perspective (3/4)

Mats argues that Scala Actors and immutable data types provide a simple and elegant solution to the extremely complex problem (I am calling it extremely complex, because I think it really is) of concurrent programming. Immutable data types… Does anyone believe that the hordes of existing Java developers/programmers are ready and willing to make the radical conceptual jump to immutable data types? The change from C++ to Java was minor in terms of semantics, and even the change from C to C++ was much less radical than a change requiring you to “get rid of all variables”. How do you think the majority of programmers will react when you say: “Forget variables”? Moreover, as Scala allows for an imperative style of programming, what do you think most programmers’ idiom will be: imperative or functional? To make things worse, in Scala an immutable is called a “val” and a mutable a “var”. Am I the only one picturing hordes of developers, with tight deadlines, just swapping L’s for R’s?

I speak for myself here: in spite of having probably more experience with “immutable” languages (Prolog a lot, Caml a bit) than most developers, when I wrote Scala code, my reasoning was so tainted by “real world” imperative languages that it was really hard to write in a functional dialect. I have the background, enough free time, and the motivation to write functional code, but it was hard to get back in that mindset.

Scala only apparently solves the multi core problem. Give it to a typical developer and he will write imperative code, unless you put a functional zealot behind him (and give the said zealot a strong, resistant whip).

How to address the multicore issue? Clearly we have a problem here. A few ideas:

  • In many applications there is no big need to go multicore. In some cases let’s not try to solve a problem that doesn’t exist in the first place.
  • Many multicore applications can survive very well with simple concurrency management. Not all applications require a PhD in concurrent programming.
  • Scala and the like. For those who can and are willing to go functional, why not? I have nothing against that. My only argument is that it won’t be mainstream.
  • The way of PAIN. Most developers will continue to use old languages and paradigms and SUFFER with them. Only after much suffering will there be motivation to try out new things and, say, endure the pain of learning a new paradigm. That suffering still hasn’t happened; only after this becomes a big problem will there be interest in accepting new solutions.
  • A silver bullet that can be attached to the current programming paradigm. Sometimes it happens. Don’t misunderestimate (silly Bushism intended) the power of a “Black Swan” (A reference to Taleb’s book where he discusses the impact of the unexpected important events).

To finalize, I would like to say that I am not sticking with Groovy out of being conservative. Groovy seems to beat the competition in many areas (the biggest example is metaprogramming) and strikes a very good balance between being a “small evolutionary step” and delivering the goodies.

To really finalize, a caveat: my Groovy knowledge is still limited, one of these days you might read a post where I apologize for having written this ;)

Filed in: Ruby, Scala, groovy

by: tiago

9 Comments

Poverty is Poison

An article by Paul Krugman:


“Poverty in early childhood poisons the brain.” That was the opening of an article in Saturday’s Financial Times, summarizing research presented last week at the American Association for the Advancement of Science.

Filed in: politics

by: tiago

No Comments

Reductionism and simplification

This post starts with what might seem like a discussion about computing, but it is actually a poor man’s discussion about philosophy of science. It has little to do with computing and is much more applicable to biology, economics and sociology.

Let’s be honest: computer scientists are trained to work for banks and insurance companies, to make web sites, software for cars and things like that. Those domains are actually very simple. A bank might be a gigantic institution, but it is possible to capture, granted, with a lot of effort, all its processes inside a computer program. This creates a mental setting: everything that we need to know can be known; we just decide when to stop.

Now think about those simplistic (this is an understatement) mathematical and computational models for scientific problems (differential equations, Monte Carlo processes, Markov Chains, …). They model the “important parts” of the issue under study. These models are much simpler than the models working in computers to sustain day-to-day banking chores. Somehow it strikes me as strange that something as mechanical as a bank needs a more complex model than “nature”.

In the context of nature and making mathematical and computational models about it, I have a few things in mind:

First of all, in many problems in the natural world we don’t know what the important parts are to start with. This is very different from the “bank mentality”, where you can know everything if you try hard. In my personal case, when I model malarial artesunate resistance, I am modeling something whose workings people only speculate about, and even if the speculation is correct most of the fundamental parameters are unknown. I have yet to read a paper modeling something related to malarial drug use that doesn’t have a phrase like: “the relation between this value and reality is assumed to be this (no citation - or citing something unpublished - or rationale provided)”.

But the cornerstone of my reasoning is that, in complex processes, the devil is in the details and in the interactions between participating factors (most of which we are unaware of). Soft sciences are holistic by nature. The properties of the whole system come from everything and everywhere. The “banking” and “hard science” mentalities are no good here: we cannot know everything, what we know is probably not enough, and most simplifications will lose something fundamental.

Does this mean that I am suggesting that we should stop modeling and all theoretical work? By no means, but we should refocus:

  • This is not hard science, don’t try to mask it as such. Hard rules and sensitivity analysis are mostly artifacts to make things look more “serious” and more “demonstrated”. This is biology (or even “worse”, economics or sociology): you don’t Q.E.D. here.
  • Think you can forecast the future? If you think you can… then bring me an always-correct forecast of the weather 2 months from now and I will listen to you. Most models that exist to forecast the future are there because they are very hard to disprove TODAY: climate (as opposed to weather) models, epidemiology, … . The vast majority of models that can be tested fail (think mathematical finance and the current subprime crisis in the USA, think weather predictions…).
  • Theoretical work, although not able to predict the future (or explain the past), might help create a cognitive and linguistic framework for discussion: present the fundamental concepts and narratives underlying the research process, make the discourse clearer, less cloudy, point out dangerous imprecisions. This is actually the inverse of what happens now: theoreticians speak in a language that most people struggle to understand.
  • Theoretical work can create interesting questions for field scientists to try to answer: It is the precise inversion of what happens now: We don’t want models that are cheated to look realistic. We want reasonable models that fail miserably so that we can ask field scientists: This is failing, why do you think this happens? Have you considered this other hypothesis? What about testing it?

<sarcasm>
The existing modeling culture suits the current scientific setting quite well: it makes theoreticians look intelligent with all those complicated mathematics and computer programs (and associated publications) and excuses “practical” scientists from even trying to use their brains: they just apply the existing theory to their research questions in a process that is more industrial than creative. The biggest example that I know of is phylogenetic analysis: get data from the field, compute a mutation model from the premise that a small genetic distance is better, burn CPU cycles, publish. You don’t even need a human for this - a trained monkey is probably enough.
</sarcasm>

In economics things are a bit worse: elaborate game theories and the like are presented as a “hard, undisputed” justification for an economic theory serving some nice agenda. Nothing more than an argument from authority.

PS - If you work in a hard science like physics or chemistry you might be thinking that I am smoking something very strong. I don’t think that this post applies to hard sciences; that is a different game altogether.

Filed in: biology, science

by: tiago

1 Comment

Malarial drugs and the economics of (human) languages

There is an interesting lack of precision, to the point of “error”, in the way human language deals with some concepts.

Take, for instance, the concept of drug half-life, i.e. the time that it takes for the concentration of a drug to drop to half (drug concentrations in the blood are normally modeled through exponential decay). It is conceived as a property of the drug - people say that drug D has a half-life of H hours - but it is really a property of both drugs and individuals (actually it is much more complicated than that, and we could repeat the argument at the next level).
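For reference, the usual textbook form of the exponential decay mentioned above is sketched below. The symbols are my own labels, not taken from any particular study: C_0 is the initial concentration and k is an elimination rate constant.

```latex
C(t) = C_0 \, e^{-k t}
\qquad\Longrightarrow\qquad
C(t_{1/2}) = \tfrac{1}{2} C_0
\quad\text{when}\quad
t_{1/2} = \frac{\ln 2}{k}
```

Nothing in the formula says whether k belongs to the drug or to the person taking it, which is exactly where the ambiguity creeps in.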

And no, this is not just a matter of statistical deviations around a value that the drug alone approximates acceptably.

As an example, there is a study about the pharmacokinetic properties of Sulfadoxine-Pyrimethamine (a widely used cheap antimalarial). In this study, there is a big deviation in half-life (and other parameters) for children between 2 and 5 years. The study concludes that “dose recommendations need revision” for that group. To put it another way, half-life (and other parameters) is not (only) a function of the drug.
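A minimal sketch of the point, assuming the standard one-compartment textbook relation (half-life = ln 2 × volume of distribution / clearance). The function names and all the numbers are hypothetical and made up for illustration; they are not taken from the study mentioned above.

```python
import math


def half_life_hours(volume_l: float, clearance_l_per_h: float) -> float:
    """Half-life in a one-compartment model: t1/2 = ln(2) * V / CL.

    Both V (volume of distribution) and CL (clearance) depend on the
    individual as well as on the drug.
    """
    return math.log(2) * volume_l / clearance_l_per_h


def concentration(c0: float, t_hours: float, half_life: float) -> float:
    """Exponential decay expressed directly in terms of the half-life."""
    return c0 * 0.5 ** (t_hours / half_life)


# Same (imaginary) drug, two imaginary individuals: made-up parameters.
adult = half_life_hours(volume_l=20.0, clearance_l_per_h=0.10)
child = half_life_hours(volume_l=5.0, clearance_l_per_h=0.05)

print(f"adult half-life: {adult:.1f} h")
print(f"child half-life: {child:.1f} h")
```

With these made-up values the “same” drug ends up with two different half-lives, so a single number quoted per drug is already a simplification.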

Now, I am not suggesting that the concept of half-life tied just to the drug should be thrown away. I am just speculating about why it is framed as a function of the drug only when clearly that is not the case.

First, there is probably historical inertia: the concept was first framed that way at a time when it seemed that half-life depended only on the drug, and it stuck by “memetic” inertia.

But, much more importantly, it is still there because it is both less expensive (it is easier to express half-life as a function of just the drug than to also account for other parameters, which might still be crucial in some situations) and still meaningful enough in many contexts (for instance, expressed as a function of the drug it is still useful for comparing the half-life of Artemether - short - against Sulfadoxine - long - in many kinds of reasoning). Even when the most economical concept entails some errors it might still be practical. The problem only arises when its simplicity has bad consequences (in this case, wrong drug doses)… and, in certain contexts, it can be a serious problem (see my previous text about the notions of resistance, tolerance and sensitivity for an example).

It all depends on the discourse context, but one should be careful.

As an anecdotal example, if you are seriously ill and a doctor prescribes you a pill, would you prefer to hear “this will cure you” or “this will drop the parasite load at a rate of 1 order of magnitude per hour starting 3 (90% CI of 2.5 - 3.5) hours after intake. Parasite load is expected to drop to 0 in 10 hours”?

The problem arises when the cognitive bias of the simplicity of “this will cure you” gets into more rigorous contexts.

This has implications for the computational modeling of concepts. The tradition in computer science is to “dig down” to the “real meaning” of concepts. In that sense simpler explanations are deemed “wrong” (and should be rewritten in terms of “correct” conceptualizations). Maybe a different strategy is needed, one that brings some linguistic and cognitive economy to computational systems (while still maintaining rigorous and precise reasoning and conceptualization when that is needed - as human languages can do).

I am going to stop here, but I think that one of the problems that impairs mathematical modeling is the application of the “certainty of numbers and formulas” to non-rigorous concepts. Then you have the worst of both worlds: an argument from authority (mathematics is a foundation for authority: “The numbers prove it”) based on modeling vague, imprecise and wrong concepts. But that is a topic for another post.

Filed in: bioinformatics, cognition, malaria, science

by: tiago

No Comments

Carlos Paredes and Tocá rufar

Portuguese guitar

For lovers of world music this is the best you can get from Portugal.

A more recent project is Tocá rufar, a youth project based mainly in the southern part of Lisbon and targeting mostly kids in less well-off areas. It is based on the local tradition of mobile bass drums. In this video you only see a few of them, but they are an “army” of hundreds and sometimes they all play together!

Filed in: timeout

by: tiago

No Comments