HIV-TRePS - HIV Treatment Response Prediction System RDI

Frequently Asked Questions

1. Predictions made from data including a genotype

What models does the system use to make its predictions?

The predictions of treatment response are made by a committee of 10 random forest models, the outputs of which are averaged to give the final prediction.
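As an illustration of the averaging step described above, here is a minimal sketch in Python. The function name and the example probabilities are hypothetical and are not taken from the RDI's own software; the sketch only shows how the outputs of a committee of models can be averaged into a single estimate.

```python
from statistics import mean


def committee_prediction(model_outputs):
    """Average the per-model probabilities into one committee estimate."""
    if not model_outputs:
        raise ValueError("at least one model output is required")
    return mean(model_outputs)


# Hypothetical outputs from a 10-model committee for one candidate regimen.
# These numbers are invented for illustration only.
outputs = [0.62, 0.58, 0.71, 0.66, 0.60, 0.64, 0.69, 0.57, 0.63, 0.70]
print(round(committee_prediction(outputs), 2))
```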

How reliable are the predictions made by the system when the genotype is included?

During cross-validation, the 10 random forest models that provide the system's predictions of treatment response achieved an area under the receiver operating characteristic curve (AUC) of 0.82. When the models were tested with an independent test set of 375 treatment change episodes, the AUC was 0.87 and overall accuracy was 91%.
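For readers unfamiliar with the AUC, the sketch below shows one standard way to compute it: as the proportion of responder/non-responder pairs in which the responder received the higher predicted probability, with ties counted as half. The function name and the toy data are hypothetical, and this is not necessarily how the RDI computes its figures.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of responders (1) and failures (0)."""
    responders = [s for y, s in zip(labels, scores) if y == 1]
    failures = [s for y, s in zip(labels, scores) if y == 0]
    if not responders or not failures:
        raise ValueError("need at least one response and one failure")
    wins = 0.0
    for r in responders:
        for f in failures:
            if r > f:
                wins += 1.0   # responder ranked above failure
            elif r == f:
                wins += 0.5   # tie counts as half
    return wins / (len(responders) * len(failures))


# Invented example: 1 = virological response, 0 = failure.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 1.0 would mean every responder was ranked above every failure, while 0.5 corresponds to ranking no better than chance.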

How do you classify the system’s outputs as response or failure?

The output of the models is an estimate of the probability of the HIV viral load going below 50 copies HIV RNA/ml following a change of antiretroviral treatment. During the development and cross-validation of the models we identified the optimum operating point (OOP), a cut-off value for response and failure that provides the best overall accuracy of the system. In this case the OOP was 44%: any value below this is classified as a prediction of failure and any value above as a prediction of success.
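The classification rule can be sketched in a few lines of Python. The names below are hypothetical. The text does not say how a value exactly equal to the cut-off is treated, so counting it as a response is an assumption made here for illustration.

```python
OOP_WITH_GENOTYPE = 0.44  # cut-off stated in the text for the genotype models


def classify(probability, cutoff=OOP_WITH_GENOTYPE):
    """Label a predicted probability as 'response' or 'failure'."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # Assumption: a value exactly at the cut-off is counted as a response;
    # the source only describes values below and above the cut-off.
    return "response" if probability >= cutoff else "failure"


print(classify(0.43))  # below the cut-off
print(classify(0.64))  # above the cut-off
```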

How much data was used to train the models used by the system?

The RDI database contains data from approximately 85,000 patients. The models currently used by the system to make predictions from data including the genotype were trained on 7,263 complete treatment change episodes that are representative of contemporary HIV clinical practice and meet all our stringent quality criteria.

What data are the predictions based on?

The system makes its predictions of response to a new antiretroviral treatment based on the individual patient’s HIV genotype, treatment history, viral load, CD4 count and the time to follow-up, as entered by the healthcare professional.

 

2. Predictions made without a genotype

What models does the system use to make its predictions?

The predictions of treatment response are made by 11 random forest models, the outputs of which are combined to give the final prediction.

How reliable are the predictions made by the system when a genotype is not available?

During cross-validation, the random forest models that provide the predictions of treatment response for the system achieved an area under the ROC curve (AUC) of 0.77 and overall accuracy of 75%. When the models were tested with an independent test set of 800 treatment change episodes, the AUC was 0.77 and overall accuracy was 71%.

How do you classify the system’s outputs as response or failure?

The output of the models is an estimate of the probability of the HIV viral load going below 400 copies HIV RNA/ml following a change of antiretroviral treatment. During cross-validation of the models we identified the optimum operating point (OOP), a cut-off value for response and failure that provides the best overall accuracy of the system. In this case the OOP was 55%: any value below this is classified as a prediction of failure and any value above as a prediction of success.
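One simple way to find a cut-off of this kind is to scan candidate values and keep the one that gives the highest overall accuracy on the cross-validation outputs. The sketch below assumes that approach; the function names, the step size and the toy data are hypothetical, and the RDI's actual procedure may differ. Where several cut-offs tie, this sketch keeps the lowest.

```python
def accuracy(labels, probabilities, cutoff):
    """Fraction of cases where the thresholded prediction matches the outcome."""
    hits = sum(
        1 for y, p in zip(labels, probabilities) if (p >= cutoff) == bool(y)
    )
    return hits / len(labels)


def find_oop(labels, probabilities, step=0.01):
    """Scan cut-offs from 0 to 1 and return (cutoff, accuracy) with best accuracy."""
    best_cutoff, best_accuracy = 0.0, -1.0
    n_steps = int(round(1 / step))
    for i in range(n_steps + 1):
        cutoff = i * step
        acc = accuracy(labels, probabilities, cutoff)
        if acc > best_accuracy:  # strict '>' keeps the lowest cut-off on ties
            best_cutoff, best_accuracy = cutoff, acc
    return best_cutoff, best_accuracy


# Invented example: 1 = virological response, 0 = failure.
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]
oop, oop_accuracy = find_oop(labels, probabilities)
```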

How much data was used to train the models used by the system?

The RDI database contains data from approximately 85,000 patients. The models currently used by the system to make predictions without a genotype were trained on approximately 15,000 treatment change episodes that are representative of contemporary HIV clinical practice and meet all our stringent quality criteria.

What data are the predictions based on?

The system makes its predictions of response to a new antiretroviral treatment based on the individual patient’s treatment history, viral load, CD4 count and the time to follow-up, as entered by the healthcare professional.

 

3. General questions about HIV-TRePS

Why can’t I get predictions for maraviroc and tipranavir from your system?

When we developed the models currently in use, we had not received enough data from clinical practice involving these drugs to train the models to make reliable predictions. We intend to include these drugs in the next version of the system, using new models, once the next round of data collection is complete.

 

The RDI

What is the RDI?

The HIV Resistance Response Database Initiative is a not-for-profit group set up in 2002 as a wholly independent international body to:

  1. Be a global repository for HIV resistance and outcome data
  2. Use these data to develop computational models to predict patients' response to antiretroviral treatment
  3. Make such models available free-of-charge over the internet

Who are the RDI?

The RDI consists of a small research team based in the UK and a large global network of advisors, research partners and data donors.

Where does your data come from?

The data is donated to the RDI by hospitals, clinics, research programmes, pharmaceutical companies and other institutions and groups around the world.