Trust in Excel or R

Trustworthiness

David Spiegelhalter has written in several places about trust in algorithms, e.g. David Spiegelhalter, “Should We Trust Algorithms?”, Harvard Data Science Review, Jan 31, 2020, DOI: 10.1162/99608f92.cb91a35a.

The general idea is that we shouldn’t think only about trust in algorithms but also about their trustworthiness. He provides a checklist of questions he would like to ask:

  1. Is it any good when tried in new parts of the real world?
  2. Would something simpler, and more transparent and robust, be just as good?
  3. Could I explain how it works (in general) to anyone who is interested?
  4. Could I explain to an individual how it reached its conclusion in their particular case?
  5. Does it know when it is on shaky ground, and can it acknowledge uncertainty?
  6. Do people use it appropriately, with the right level of skepticism?
  7. Does it actually help in practice?

Excel vs R

A common criticism, which I’ve heard from both sides of the argument, is that R or Excel cannot be trusted because it is not clear how the model has been implemented. Excel users claim the WYSIWYG interface is transparent, while R users claim that it is exactly this interface that makes interrogating and testing the model difficult, and so not transparent.

Can we use David Spiegelhalter’s ideas to compare Excel and R for doing health technology assessment (HTA)?

The previous list is to do with the underlying algorithms, the data they’re used on and how the results are used. There is no mention of the implementation, which is what we are interested in here.

So, borrowing from above, a possible Excel vs R checklist could be:

  1. Is it able to simply implement a given model?
  2. Can someone easily understand the implementation (against the mathematical description)?
  3. Is the flow through the model clear?
  4. Are there tests and checks built in to the model?
  5. Are the inputs constrained or errors produced for bad values?
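
To make items 4 and 5 concrete on the R side, here is a minimal sketch in base R, assuming a simple two-state Markov cohort model. The names check_transition_matrix and run_cycle, and the numbers used, are hypothetical and only for illustration. In Excel, the closest equivalent would probably be Data Validation rules plus check cells.

```r
# Hypothetical helper (item 5): stop with an error on bad input values.
check_transition_matrix <- function(p, tol = 1e-8) {
  stopifnot(is.matrix(p), is.numeric(p), nrow(p) == ncol(p))
  if (any(is.na(p))) stop("transition matrix contains missing values")
  if (any(p < 0 | p > 1)) stop("transition probabilities must lie in [0, 1]")
  if (any(abs(rowSums(p) - 1) > tol)) stop("each row must sum to 1")
  invisible(p)
}

# Hypothetical model step: move the cohort forward by one cycle.
# The input check runs every time the model is used.
run_cycle <- function(state, p) {
  check_transition_matrix(p)
  drop(state %*% p)
}

# Illustrative values only: rows are "from" states, columns are "to" states.
p_good <- matrix(c(0.9, 0.1,
                   0.0, 1.0), nrow = 2, byrow = TRUE)
state_next <- run_cycle(c(1, 0), p_good)

# Built-in check (item 4): nobody enters or leaves the cohort.
stopifnot(isTRUE(all.equal(sum(state_next), 1)))

# A matrix whose first row sums to 1.1 is rejected rather than silently used.
p_bad <- matrix(c(0.9, 0.2,
                  0.0, 1.0), nrow = 2, byrow = TRUE)
result <- tryCatch(run_cycle(c(1, 0), p_bad),
                   error = function(e) conditionMessage(e))
```

Because the checks sit inside the function that runs the model, they can’t be skipped by accident, and the same assertions can be reused in an automated test script.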

Of course, these elements are interrelated. If a model is easy to implement in a piece of software, then the implementation is more likely to be easy to understand and its pipeline easy to follow. So there is an assumption that the model builder has implemented the model in the best(ish) way appropriate for that software.

There are other benefits to using R, such as speed, extensibility and reuse, but these aren’t directly linked to trustworthiness.
