The predictions of treatment response are made by a committee of 10 random forest models, the outputs of which are averaged to give the final prediction.
During cross-validation, the 10 random forest models that provide the predictions of treatment response for the system achieved an area under the receiver operating characteristic curve (AUC) of 0.82. When the models were tested with an independent test set of 375 treatment change episodes, the AUC was 0.87 and overall accuracy was 91%.
The output of the models is an estimate of the probability of the HIV viral load going below 50 copies HIV RNA/ml following a change of antiretroviral treatment. During the development and cross-validation of the models we identified the optimum operating point (OOP), a cut-off value for response and failure that provides the best overall accuracy of the system. In this case the OOP was 44%: any value below it is classified as a prediction of failure and any value above it as a prediction of success.
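As a rough illustration of the two steps described above (averaging the committee's outputs, then comparing the result with the OOP), here is a minimal Python sketch. The function names and the ten example probabilities are hypothetical and are not taken from the system itself. The text does not say how a value exactly equal to the cut-off is treated, so the sketch assumes it counts as a response.

```python
from statistics import mean

# Cut-off reported in the text for the models that use the genotype.
OOP = 0.44


def committee_prediction(model_outputs):
    """Average the per-model probabilities into a single committee estimate."""
    return mean(model_outputs)


def classify(probability, oop=OOP):
    """Label a probability as a predicted response or failure.

    Assumption: a value exactly equal to the cut-off counts as a response.
    """
    return "response" if probability >= oop else "failure"


# Made-up outputs from ten models for one hypothetical treatment change.
outputs = [0.52, 0.47, 0.61, 0.39, 0.55, 0.48, 0.50, 0.44, 0.58, 0.46]
probability = committee_prediction(outputs)
print(f"{probability:.2f}", classify(probability))
```

Individual models may fall on either side of the cut-off; only the averaged value is compared with the OOP in this sketch.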
The RDI database contains data from approximately 85,000 patients. The models currently used by the system to make predictions from data that include the genotype were trained on 7,263 complete treatment change episodes that are representative of contemporary HIV clinical practice and meet all our stringent quality criteria.
The system makes its predictions of response to a new antiretroviral treatment based on the individual patient’s HIV genotype, treatment history, viral load, CD4 count and the time to follow-up, as entered by the healthcare professional.
The predictions of treatment response are made by 11 random forest models, the outputs of which are combined to give the final prediction.
During cross-validation, the random forest models that provide the predictions of treatment response for the system achieved an area under the ROC curve (AUC) of 0.77 and overall accuracy of 75%. When the models were tested with an independent test set of 800 treatment change episodes, the AUC was 0.77 and overall accuracy was 71%.
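For readers unfamiliar with the two figures quoted above, the sketch below shows one common way to compute them from predicted probabilities and observed outcomes. It is an illustration only, not the procedure used to evaluate the system: the function names, the example data and the cut-off passed to `accuracy` are all assumptions. The AUC is computed as the fraction of (responder, non-responder) pairs in which the responder received the higher score, with ties counted as half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one response and one failure")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, cutoff):
    """Fraction of episodes whose thresholded prediction matches the outcome."""
    correct = sum((s >= cutoff) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


# Made-up outcomes (1 = response, 0 = failure) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.8, 0.2, 0.5]
print(auc(labels, scores), accuracy(labels, scores, cutoff=0.55))
```

The AUC does not depend on any cut-off, whereas overall accuracy does, which is why the two figures are reported separately.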
The output of the models is an estimate of the probability of the HIV viral load going below 400 copies HIV RNA/ml following a change of antiretroviral treatment. During cross-validation of the models we identified the optimum operating point (OOP), a cut-off value for response and failure that provides the best overall accuracy of the system. In this case the OOP was 55%: any value below it is classified as a prediction of failure and any value above it as a prediction of success.
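One simple way to find a cut-off that gives the best overall accuracy is to try each observed score as a candidate and keep the one that classifies the most episodes correctly. The sketch below does this on made-up data. It is an assumption about how such a search could be done, not a description of the method actually used, and `find_oop` is a hypothetical name.

```python
def find_oop(labels, scores):
    """Return (cutoff, accuracy) for the candidate cut-off with best accuracy.

    Candidates are the distinct observed scores; a score >= cutoff is treated
    as a predicted response. On ties, the lowest such cut-off is kept.
    """
    best_cutoff, best_accuracy = None, -1.0
    for cutoff in sorted(set(scores)):
        correct = sum((s >= cutoff) == (y == 1) for y, s in zip(labels, scores))
        acc = correct / len(labels)
        if acc > best_accuracy:
            best_cutoff, best_accuracy = cutoff, acc
    return best_cutoff, best_accuracy


# Made-up outcomes (1 = response, 0 = failure) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.8, 0.2, 0.5]
cutoff, acc = find_oop(labels, scores)
print(cutoff, acc)
```

In practice the search would be run on cross-validation predictions rather than on the data used to fit the models, so that the chosen cut-off is not overly optimistic.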
The RDI database contains data from approximately 85,000 patients. The models currently used by the system to make predictions without a genotype were trained on approximately 15,000 treatment change episodes that are representative of contemporary HIV clinical practice and meet all our stringent quality criteria.
The system makes its predictions of response to a new antiretroviral treatment based on the individual patient’s treatment history, viral load, CD4 count and the time to follow-up, as entered by the healthcare professional.
These drugs are not yet included because, when we developed the models currently in use, we had not received enough data from clinical practice involving them to train the models to make reliable predictions. We intend to include these drugs in the next version of the system, using new models, once the next round of data collection is complete.
The HIV Resistance Response Database Initiative is a not-for-profit group set up in 2002 as a wholly independent international body to:
The RDI consists of a small research team based in the UK and a large global network of advisors, research partners and data donors.
The data is donated to the RDI by hospitals, clinics, research programmes, pharmaceutical companies and other institutions and groups around the world.