HTA Code Challenge: Discussion


General thoughts

This was the first time that we’ve tried something like this. The idea was to make it a relatively simple set of problem to tackle but also sophisticated enough that there’d be enough for some of the more advanced R users to tackle. This is clearly a tricky balance to strike and, judging by the relatively low number of submissions, perhaps not one we got right. There are other considerations of course such as simply people not having enough time or just not wanting to do extra work. That said, the submissions we did have were of a good standard. I’ll be talking about these in another blog post but in the following I will expand on the some of the thinking behind my solutions.

C1

This question is really just setting the scene and calculating one-cycle decision tree QALYs.

An issue that will come up time and again in this set of questions is how to represent the data. This is not a trivial decision since the choices made early on will impact how we do the calculation later.

I simply assign individual variables to the probabilities and a named vector to the utilities. This is partly because I use the probabilities inside a matrix later on.

I chose the tribble function which is like a transposed sort of tibble which itself is a nicer version of the ubiquitous data.frame. It has some nice properties like when printing to screen it gives column types and uses colour, amongst other things.

Within a single list, named entries by the start state are these tribbles. I recommend using extra spaces when specifying these so that the column entries all line up and its easier to read.

Note also that I had to distinguish between starting in a particular state and transitioning to it, which I do by appending a zero such as well0. The compliment of an event is prepended with n such as ndead.

C2

This is when some of the work in C1 pays off. I decided to get estimates via simulating some population. In the original paper there is only a single death state. I decided to separate types of death depending on what the preceding state was so death from AIDS is diea, death from cancer is diec and everything else is dead.

We look over each member of the cohort. In each iteration an individual starts in the well state and is simulated until they enter a death state.

Because these are binary trees we can simple compare the event probability with a uniform random number between 0 and 1. The result of following a branch gives us the probability for the next random event in the same way.

Finally, each utility and state trajectory are saved in a list. We can then sum each individual and then take the mean. I use map from the purrr package but base apply would do just as well. Unfortunately, for n = 1000 its not a very good estimate. This is prime for some parallelisation but that is for another time.

C3

Once we’ve worked out the transition matrix this is moving towards something that should be familiar to regular Markov modellers. I put the probabilities for each state in a list and then applied do.call using rbind to create a matrix. There are lots of different ways of doing this down to personal preference. To demonstrate a more robust way of creating objects, I use a constructor type function which can then check for errors. The actual calculation is again done by individual level simulation. This time though we can use the multinomial random sampling function in base R rmultinom instead of the uniform method for the binary probabilities. Again the accuracy isn’t great though.

C4

So this is the regular Markov model case. We wrap the calculation in to a function. The actual work is done on the two lines where the current state populations are multiplied by the transition probabilty matrix or state utilities. The key thing is the use of %*% which is the dot product operator.

The total expected QALYs are much more accurate.

C5

The rolling-back of the Markov-cycle tree is done via recursion. It may be worth noting that there are more than one way to represent the Bellman equation for this problem depending on the order of operations.

This equation does have an exact solution for an infinite number cylces. This would depend on the transtion matrix being invertible. Because the probability transition matrix can be written in upper traingular form we can write the equivalent calculation in recursive form.

Summary and lessons learned

Finding the right problem to base this challenge on was an initial challenge in itself. I think that perhaps the questions were not clearly defined enough but this is always going to happen to some degree. I was happy that they allowed for the demonstration of some more interesting programming approaches. In future, if we do something like this again I think we may focus more of data manipulation and function structure rather than different models. If its a standard model but we include additional complications that may be more interesting and useful to potential participants, like the input data in different formats or calculating different statistics.