An Evaluation of Classification Algorithms for Prediction of Drug Interactions: Identification of the Best Algorithm
Introduction: Drug interactions, which occur when one drug decreases or increases the effect of another, are one of the main causes of medical errors. They result from changes in pharmacodynamics, pharmacokinetics, or a combination of both. Given the problems these errors cause, the lack of an efficient system for automatically detecting drug interactions, and the fact that many of these interactions can be prevented, we aimed to extract drug interactions from medical texts, classify them, and identify the best-performing classification algorithm.
Methods: A two‑stage classification was used to address the imbalanced distribution of data across the drug interaction classes. A subset of the most suitable features was identified for classification. In the first stage, a binary classifier separated pairs of drugs that interact with each other from those that do not. In the second stage, the interacting pairs were assigned to one of four classes: effect, advice, mechanism, and int. Different algorithms were used in both stages, selected based on the type of data and expert opinion. To validate the first‑stage model, 90% of the data were used for training and the remaining 10% for testing. To validate the second‑stage model, we used the difference verification method. The models were designed and the classification was performed in the Weka data analysis software.
Results: The most appropriate features were mutual information (obtaining a score of 1000) and parts of speech. J48 achieved the highest F‑measure in the stage separating drug pairs with and without interaction (F‑measure = 0.914), and bagging achieved the highest F‑measure in the multiclass stage (F‑measure = 0.915). ZeroR required the shortest time to build the model (less than half a second) in both stages.
Conclusion: Based on the results of the J48 and random forest algorithms, decision trees are the most appropriate approach for the automatic extraction and classification of drug interactions using features derived from text, with a view to application in clinical decision support systems.
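
The two‑stage design and the F‑measure used to compare algorithms can be illustrated with a minimal Python sketch. It is not the Weka workflow used in the study. The function names, the "none" label for non‑interacting pairs, and the dictionary feature representation are assumptions made for illustration only, and the two stage classifiers are passed in as arbitrary callables standing in for trained models such as J48 or bagging.

```python
from typing import Callable, Mapping, Sequence

# The four interaction classes named in the Methods.
INTERACTION_CLASSES = ("effect", "advice", "mechanism", "int")

# Hypothetical label for drug pairs rejected by the first stage.
NO_INTERACTION = "none"


def two_stage_predict(
    features: Mapping[str, object],
    is_interaction: Callable[[Mapping[str, object]], bool],
    interaction_type: Callable[[Mapping[str, object]], str],
) -> str:
    """Classify one drug pair with a binary stage followed by a multiclass stage.

    `is_interaction` and `interaction_type` are placeholders for trained
    first-stage and second-stage models.
    """
    # Stage 1: filter out pairs that do not interact.
    if not is_interaction(features):
        return NO_INTERACTION
    # Stage 2: assign the interacting pair to one of the four classes.
    label = interaction_type(features)
    if label not in INTERACTION_CLASSES:
        raise ValueError(f"unexpected second-stage label: {label!r}")
    return label


def f_measure(gold: Sequence[str], predicted: Sequence[str], positive: str) -> float:
    """Return the F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because the second stage only sees pairs accepted by the first, the large number of non‑interacting pairs does not dominate the four‑class problem, which is the motivation given for the two‑stage approach. Note that `f_measure` scores a single class; how the per‑class scores were averaged in the reported results is not stated here.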