ORIGINAL RESEARCH ARTICLE
Year: 2018 | Volume: 8 | Issue: 2 | Page: 92-99
An evaluation of classification algorithms for prediction of drug interactions: Identification of the best algorithm
Rita Rezaee1, Reza Akbari2, Milad Nasiri3, Farzaneh Foroughinia4, Nasrin Shokrpour5
1 Health Human Resources Research Center, Clinical Education Research Center, School of Management and Information Science, Shiraz University of Medical Sciences, Shiraz, Iran
2 Department of Computer Engineering and Information Technology, Shiraz University of Technology, Shiraz, Iran
3 Clinical Pharmacy Department, Shiraz University of Medical Sciences, Shiraz, Iran
4 HIT Department, Faculty of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
5 English Department, Faculty of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
Date of Web Publication: 25-Sep-2018
Dr. Nasrin Shokrpour
Faculty of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz
Source of Support: None, Conflict of Interest: None
Introduction: Drug interactions, which occur when one drug decreases or increases the effect of another, are one of the main causes of medical errors. They result from changes in pharmacodynamics, pharmacokinetics, or a combination of both. Given the problems these errors cause, the lack of an efficient system for automatically detecting drug interactions, and the fact that many interactions are preventable, we aimed to extract drug interactions from medical texts, classify them, and identify the best classification algorithm.
Methods: A two-stage classification was used to address the unbalanced distribution of data across drug interaction classes. A subset of the most suitable features was identified for classification. In the first stage, a binary classifier separated pairs of drugs that interact with each other from those that do not. In the second stage, the interacting pairs were classified into one of four classes: effect, advice, mechanism, and int. Different algorithms were used in both stages, chosen based on the type of data and expert opinion. To validate the first-stage model, 90% of the data was used as training data and the rest as test data. To validate the second-stage model, we used the difference verification method. Weka data analysis software was used to design the model and perform the classification.
Results: The most appropriate features were mutual information (obtaining a score of 1000) and parts of speech. J48 achieved the highest F-measure (0.914) in the stage separating drug pairs with and without interaction, and bagging achieved the highest F-measure (0.915) in the multiclass stage. ZeroR required the shortest time to build the model (less than half a second) in both stages.
Conclusion: Given the results of the J48 and random forest algorithms, decision trees appear to be the most appropriate approach for the automatic extraction and classification of drug interactions from text-derived features, for use in clinical decision support systems.
Keywords: Classification algorithm, drug interaction, f-measure, machine learning, natural language processing
How to cite this article:
Rezaee R, Akbari R, Nasiri M, Foroughinia F, Shokrpour N. An evaluation of classification algorithms for prediction of drug interactions: Identification of the best algorithm. Int J Pharma Investig 2018;8:92-9
How to cite this URL:
Rezaee R, Akbari R, Nasiri M, Foroughinia F, Shokrpour N. An evaluation of classification algorithms for prediction of drug interactions: Identification of the best algorithm. Int J Pharma Investig [serial online] 2018 [cited 2019 Feb 22];8:92-9. Available from: http://www.jpionline.org/text.asp?2018/8/2/92/241977
Introduction
Failure to manage drug interactions is considered a medical error. There are several reports that discuss the lack of knowledge and awareness about drug interactions as one of the most common errors. In fact, health-care providers do not sufficiently know about drug interactions and their associated factors.
One of the main causes of medical errors is drug interactions. Evaluations of medical errors estimate that about 44,000–98,000 deaths occur yearly in the United States due to errors. Other reports point to similar patterns throughout the world, indicating that this problem is not specific to the United States. Many of the medical errors related to drugs are predictable. According to a study conducted in Australia, 75% of medication errors are preventable. Hospitalized patients are exposed to potential drug interactions, and drug interactions are the cause of many referrals to emergency departments. Gurwitz et al., in a study on the side effects of drugs, found that 13.3% of preventable medication errors were related to the interactions of well-known drugs. Nearly 70% of the side effects of medications experienced by residents of two elderly care homes during a 9-month period were attributed to drug interactions. Various studies report an increased likelihood of hospitalization due to complications of drug interactions. There is no single definitive source of drug interaction information; instead, multiple sources provide information, evaluate and update drug interaction data, and label pharmaceutical products. In hospitals, this is restricted to financial issues of medications. Moreover, the clinical classifications of medications are not well implemented due to special attention to financial issues and the view of medicine as a “service.” On the other hand, a standard and up-to-date structure that incorporates the dynamic knowledge of pharmaceutical information and the massive medical literature is essential in the field of drug interactions. Therefore, there is an urgent need for a system that would complement human activities and perform these tasks automatically. Drug prescription systems should include drug interaction control as one of the most basic forms of decision support.
Public views of hospital pharmacy have changed significantly in Iran, due to the advancement of medical services and drug distribution and the advent of concepts such as clinical governance. The most important of these changes is the concept of pharmaceutical care, under which clinical pharmacology is no longer viewed as drug distribution but as the knowledge associated with all drug services, including drug interactions. The management of drug-related services and drug interactions should be optimized.
In recent years, researchers have attempted to use computational methods to cope with the challenges of drug interactions. For example, Sun et al. proposed a Hadoop-based method to improve the scalability of prediction methods for drug combinations. Their study showed that big data techniques were more promising for drug combination prediction than the classic prediction methods. They used support vector machine (SVM) and Naïve Bayes under 10-fold cross-validation and leave-one-out cross-validation to compare performance.
In another work, Xu et al. used a stochastic gradient boosting (SGB) algorithm to identify drug combinations in the pharmacology domain. Their study aimed to improve the process of identifying drug combinations. It used 352 positive samples and 732 biological, pharmacological, and chemical features. The Naïve Bayes, SVM, and SGB methods were used for prediction under 10-fold cross-validation. The best result was obtained by the SGB method.
Bai et al. proposed an improved version of Naïve Bayes for predicting drug combinations. Different features such as drug targets, protein pathways, side effects of drugs, and metabolic enzymes were used. This study showed that the improved Naïve Bayes performed better than Naïve Bayes, SVM, and K-Nearest Neighbor. 10-fold cross-validation and leave-one-out cross-validation were also used.
The present study aimed to evaluate methods of drug interaction classification in medical texts, using machine-learning algorithms and natural-language processing methods. We also compared the classification algorithms on several measures (precision, recall, F-measure, and receiver operating characteristic [ROC] curve analysis). The results of this study can be useful for clinical decision support systems.
Methods
Because DDI2013 emphasizes the use of both a pharmaceutical database and MEDLINE for validating results and comparing different studies, 142 MEDLINE abstracts and 572 DrugBank documents were used in this study. This set contains 6976 annotated sentences with 4 types of pharmaceutical entities and 5 classes of drug interactions [Table 1].
Table 1: Pairs of drug entities in positive and negative superclasses and subclasses
First, all possible pairs of drugs found in the identified sentences were taken as the study samples and entered into the ARFF data file. The presence or absence of an interaction was recorded as true or false, respectively.
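As an illustration of this file layout, the following minimal sketch renders drug pairs as ARFF text. The relation name `ddi_pairs`, the feature names `mi_score` and `pos_tag`, and the helper `to_arff` are hypothetical and are not taken from the study; only the `@relation`, `@attribute`, and `@data` keywords belong to the ARFF format itself.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the string
    "numeric" or a list of nominal values. The last attribute is the class.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            lines.append(f"@attribute {name} {kind}")
        else:
            # Nominal attributes list their allowed values in braces.
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical features for two drug pairs; the class label is the last column.
attributes = [
    ("mi_score", "numeric"),
    ("pos_tag", ["NN", "VB", "JJ"]),
    ("interaction", ["true", "false"]),
]
rows = [(0.82, "VB", "true"), (0.05, "NN", "false")]
arff_text = to_arff("ddi_pairs", attributes, rows)
print(arff_text)
```

Each data row carries one value per feature, with the true/false interaction label in the final position.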
Pairs of drugs marked with 1 were placed in the positive class, and those marked with 0 in the negative class, for training. In the next stage, positive samples were placed in one of four classes numbered 0–3 (advice, effect, mechanism, and int). Each sample contained a pair of drug entities, each of type drug, group, brand, or nonhuman. For instance, a sentence containing four drug entities yielded six samples, each of which could be a potential drug interaction. Hence, the dataset of 6976 sentences included 24,891 samples.
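The count of six samples for four entities is simply the number of unordered pairs, C(4, 2). A minimal sketch of this enumeration follows; the drug names and the helper `candidate_pairs` are hypothetical examples, not data from the corpus.

```python
from itertools import combinations


def candidate_pairs(entities):
    """Return every unordered pair of drug entities found in one sentence."""
    return list(combinations(entities, 2))


# Hypothetical sentence with four annotated drug entities.
entities = ["aspirin", "warfarin", "ibuprofen", "heparin"]
pairs = candidate_pairs(entities)
print(len(pairs))  # 6 candidate pairs, i.e. C(4, 2)
```

A sentence with fewer than two entities yields no pairs, which is consistent with excluding such sentences.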
The first step in extracting drug interactions from the texts was to identify drug entities. There are three approaches in this regard: rule-based, machine learning-based, and dictionary-based. In this study, we attempted to detect and extract drug entities in the text using a machine learning-based approach. Finally, to keep errors in drug entity extraction from affecting the final evaluation of the classification algorithms, we used the DrugBank pharmaceutical database, which contains target information on about 4300 drug entities. To extract and classify drug interactions and identify the best algorithm, we classified each pair of drugs into one of the following five classes: without interaction, effect, advice, mechanism, and int. The int class applies when the text only mentions that a relationship exists between the two drugs, without describing it or specifying its type. The main challenge was the asymmetric distribution of the classes. When the positive class was set against the negative class, there were only 4037 positive samples (pairs of drugs) against 20,854 negative samples, a ratio of 19.3% (16.2% of all 24,891 samples). Moreover, the distribution among the positive classes was also asymmetric: only 4.6% (188/4037) of the positive samples belonged to the int class. Diagram 1 shows the overall steps involved.
Before classification, in the preprocessing phase, all drug entities in the sentences were listed, cleaned, and normalized. Natural language processing techniques were used in the preprocessing and feature extraction stages as follows:
- All letters were changed to lowercase
- All drugs were marked as primary or secondary
- All numbers were replaced with letter strings
- Sentences containing fewer than two drugs were excluded
- Stop words and punctuation were deleted
- The words were stemmed
- Negations including negative statements and negation markings such as “not” were identified.
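The preprocessing steps listed above can be sketched as follows. This is an illustrative approximation, not the authors' implementation: the placeholder tokens `drug_primary`, `drug_secondary`, and `num`, the small stop-word and negation sets, and the function `preprocess` are all assumptions, and stemming is omitted because a Porter stemmer would need an external library.

```python
import re

# Illustrative word lists only; a real system would use fuller lists.
STOP_WORDS = {"the", "of", "a", "an", "and", "with", "may", "be", "is", "should"}
NEGATIONS = {"not", "no", "never", "without"}


def preprocess(sentence, drug_a, drug_b):
    """Return (tokens, has_negation) for one candidate drug pair."""
    text = sentence.lower()
    # Mark the two drugs of the pair under consideration.
    text = text.replace(drug_a.lower(), "drug_primary")
    text = text.replace(drug_b.lower(), "drug_secondary")
    # Replace every number with a single placeholder string.
    text = re.sub(r"\d+(?:\.\d+)?", "num", text)
    # Drop punctuation, then split on whitespace.
    tokens = re.sub(r"[^a-z_\s]", " ", text).split()
    # Record negation before stop words are removed.
    has_negation = any(t in NEGATIONS for t in tokens)
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens, has_negation


tokens, has_negation = preprocess(
    "Aspirin should not be given with 5 mg of warfarin.", "Aspirin", "warfarin"
)
print(tokens, has_negation)
```

Negation words are deliberately kept out of the stop-word set so that the negation marker survives filtering.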
Parts of speech were extracted using Stanford's natural language processing tool. The syntactic parse tree was obtained with the same tool, and some of the parse tree information was investigated for potential use as a feature.
To train the algorithms and find a set of best features for classification, we investigated various features such as similarity measures, parts of speech tags, stemmed words, mutual information, verb list, and parse tree information. Mutual information and parts of speech were selected as the main features.
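The article does not state how mutual information was computed. As a rough sketch, the textbook definition for a discrete feature (for example, whether a word occurs in the sentence) and the class label can be written as below; the function name `mutual_information` and the toy data are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between a discrete feature and the class."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    f_counts = Counter(feature_values)
    c_counts = Counter(labels)
    mi = 0.0
    for (f, c), n_fc in joint.items():
        p_fc = n_fc / n
        # p(f, c) * log2( p(f, c) / (p(f) * p(c)) )
        mi += p_fc * log2(p_fc / ((f_counts[f] / n) * (c_counts[c] / n)))
    return mi


labels = [1, 1, 0, 0]
informative = [1, 1, 0, 0]    # feature that mirrors the class exactly
uninformative = [1, 0, 1, 0]  # feature independent of the class
print(mutual_information(informative, labels))
print(mutual_information(uninformative, labels))
```

A feature that determines a balanced binary class carries one bit of information, while an independent feature carries none, so ranking features by this score favors the more predictive ones.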
Applying and evaluating the classes
In this study, the following eight algorithms were used to take advantage of the unique benefits of each.
- Bayes (Naïve Bayes and Multinomial Naïve Bayes)
- Rules (ZeroR, JRip)
- Functions (LibSVM)
- Trees (J48 and Random Forest)
- Meta (Bagging).
Since the purpose of the system was to prevent the prescription of interacting drugs for patients, placing a negative sample in the positive class (a false positive) was preferable to placing a positive sample in the negative class (a false negative). Therefore, we attempted to minimize the possibility of false negatives in this study.
To evaluate the performance of the classification algorithms, the confusion matrix [Table 2] was used, and the concepts of precision, recall, accuracy, and true negative rate were first reviewed. Finally, four measures were used for the final evaluation: precision, recall, F-measure, and ROC analysis.
Recall
This measure is the proportion of truly positive drug interactions that the algorithm reports as positive: Recall = TP/P = TP/(TP + FN).
Precision
This measure is the proportion of drug pairs reported as positive by the algorithm that are truly positive: Precision = TP/(TP + FP).
F-measure
F-measure is the harmonic mean of precision and recall. It is a type of average, used because neither of the two measures has a special advantage over the other. It is calculated from the following equation:
$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
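Precision, recall, and F-measure can be computed directly from confusion-matrix counts. The sketch below uses hypothetical counts that are not taken from the study, and the function name `precision_recall_f` is an assumption.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts: 80 true positives, 20 false positives, 10 false negatives.
p, r, f = precision_recall_f(tp=80, fp=20, fn=10)
print(round(p, 3), round(r, 3), round(f, 3))
```

Because the harmonic mean is pulled toward the smaller of the two values, a classifier cannot reach a high F-measure by maximizing only precision or only recall.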
Receiver operating characteristic
This analysis determines the accuracy of each test and combination of tests. The main reason for using it is that it captures the trade-off between benefits and costs. It also makes it possible to compare the tests graphically, which shows the efficiency of the classification systems; the greater the ROC area for a classifier, the more efficient its final performance.
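A common single-number summary of a ROC curve is the area under it (AUC). Treating that as the quantity being compared is an assumption here. One way to sketch it without plotting is the rank interpretation: AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The function name `roc_auc` and the scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for four drug pairs.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))  # 0.75
```

A value of 1.0 means every positive pair outranks every negative pair, while 0.5 corresponds to random ordering.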
Since the ARFF file contained 1447 extracted features and a class label feature, CfsSubsetEval (Weka) and PCA (RapidMiner) were used for feature reduction in both classification stages. CfsSubsetEval selected only 38 features for the first-stage file (positive and negative set) and 35 features for the positive samples. This reduced the time taken to build and analyze the model. However, it did not increase the F-measure.
Solving asymmetry problem
To address the asymmetric distribution of drug interactions across classes, we investigated different approaches such as SMOTE and other resampling algorithms. SMOTE increased the number of samples in the ARFF file from 22,268 to 25,728 by injecting synthetic samples. Despite the increase in file size and in the time required to build the model, system efficiency increased in both stages of drug interaction classification [Table 3] and [Table 4]. The Porter stemmer was used for stemming, and the parse tree was used in information preprocessing. Stanford NLP Tools, WordNet, and Dragon tools were utilized to identify the entities and extract the features. Finally, Weka was applied to build the model and train the algorithms. The Weka input file was provided as a text file in the ARFF format, with the desired features as columns and the pharmaceutical samples as rows. In each sample, a value was recorded for each feature, forming part of the sample vector. The final value of the sample vector represents the sample class label. Finally, as shown in [Figure 1] and [Figure 2], the drug interaction classification model was extracted.
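The core idea of SMOTE is to create each synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea, not Weka's implementation and not the exact procedure used in the study; the function name `smote`, its parameters, and the toy feature vectors are assumptions.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from a list of minority feature vectors.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
new_samples = smote(minority, n_new=4, k=2)
print(len(new_samples))  # 4
```

Because every synthetic point lies between two existing minority samples, oversampling enlarges the minority class without copying rows verbatim.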
Table 3: System efficiency for different algorithms in two-classes stage after applying the Synthetic Minority Over-sampling Technique
Table 4: System efficiency for different algorithms in multiclasses stage after application of the Synthetic Minority Over-sampling Technique
Figure 1: Final model classification of drug interactions in two-classes stage
Figure 2: Final model classification of drug interactions in multiclasses stage
In this study, mutual information and parts of speech were selected as the best features extracted from the sentences to train the classifiers. According to [Table 5] and [Table 6], setting aside bagging, which was used as a performance improvement algorithm, the best F-measure was obtained by J48 in the two-class stage and by random forest in the four-class stage. Therefore, J48 and random forest can be considered the best classification algorithms.
Table 5: Final system efficiency for different algorithms in two-classes stage
Table 6: Final system efficiency for different algorithms in multi-classes stage
As shown in [Table 5] and [Table 6], J48, random forest, and bagging ranked first, second, and third in efficiency in the two-class stage. ZeroR required the shortest time to build the model and JRip the longest. [Table 7] and [Table 8] reveal that bagging, random forest, and J48 ranked first, second, and third in the four-class stage. Again, ZeroR and JRip required the shortest and the longest time to build the model, respectively.
Table 7: Details of the final implementation of the algorithms in two-classes stage
Table 8: Details of the final implementation of the algorithms in multi-classes stage
Discussion and Conclusion
In this study, a system was developed to extract and classify drug interactions from text. The key point in applying machine learning to unbalanced data was the use of a two-stage classification (2-class, then 4-class) instead of a single 5-class classification, which suited the unbalanced classes better. The two-stage classification had advantages over one-stage classification. In addition to classifying the negative class rapidly, it also created a true negative rate between positive classes. By turning the first stage into a two-class problem, we can take advantage of binary classification techniques, allow the classifier to separate positive from negative samples according to their specific characteristics, and avoid a bias toward the majority class. A two-stage classifier also allows different classifiers to be analyzed at each stage, so that the binary stage and the multiclass stage can each yield their best results separately.
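The two-stage design can be sketched as a simple cascade in which only samples accepted by the binary stage reach the four-class stage. In the study each stage is a classifier trained in Weka; the keyword rules `toy_binary` and `toy_type` below are hypothetical stand-ins used only to make the control flow runnable.

```python
def two_stage_predict(tokens, binary_clf, type_clf):
    """Stage 1 decides interaction vs. none; stage 2 types the positives only."""
    if not binary_clf(tokens):
        return "none"
    return type_clf(tokens)


# Hypothetical stand-ins for the trained stage-1 and stage-2 classifiers.
def toy_binary(tokens):
    cues = {"increase", "avoid", "inhibit", "interact"}
    return any(t in cues for t in tokens)


def toy_type(tokens):
    if "avoid" in tokens:
        return "advice"
    if "inhibit" in tokens:
        return "mechanism"
    if "increase" in tokens:
        return "effect"
    return "int"


sample = ["drug_primary", "inhibit", "metabolism", "drug_secondary"]
print(two_stage_predict(sample, toy_binary, toy_type))  # mechanism
```

Keeping the stages as separate callables mirrors the point made above: a different algorithm can be chosen for each stage, and the large negative class never reaches the four-class classifier.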
Our research shows that an appropriate classification is required for initial extraction of the features. This classification results in better organization of a large amount of data. Without preprocessing, the outputs may be skewed and error-prone, and desirable features for the classification inputs will not be generated.
As revealed in this study, different classification algorithms behave differently in time and output after the data are processed, and their classification results differ. Although ZeroR built the drug interaction extraction model in the shortest time, the other evaluation measures show that the fastest algorithm does not always provide the best results.
In this study, algorithms based on Bayes theory, rules, functions, and decision trees were compared in terms of their performance in classifying drug interactions.
Bayes theory-based algorithms
At the beginning of the study, we had high expectations of these algorithms because they are based on statistics; however, the F-measure (0.642) after the first stage of classification (separating pairs of interacting drugs from those without interaction) showed that statistical algorithms failed to meet our expectations. This supports the view that drug interactions are empirical and that this experimental science should be implemented through machine-learning methods.
The F-measure of the ZeroR algorithm (0.231) showed that it had the worst performance in the second stage (four-class stage). However, given the results of the JRip algorithm and the fact that a rule-based algorithm (ZeroR) had the shortest modeling time, it can be concluded that rule-based algorithms did not perform that badly.
LibSVM failed to meet expectations in both stages (two-class and four-class classifications). Weka itself assigned the initial values of the LibSVM parameters; since SVM is sensitive to its parameters and requires appropriate values, this may have contributed to the result.
The results showed that decision tree algorithms had the best performance, probably because the features used were extracted from the sentence and its parse tree structure, which was preserved during data preprocessing.
Meta (bagging) algorithm
In the first stage (two-class classification), bagging was in third place, 0.1 behind the best algorithm. In the second stage (four-class classification), however, bagging was in first place, 0.08 ahead of the third algorithm. This confirms the good performance of tree algorithms. Moreover, given that bagging had the best results based on ROC analysis (in both two-class and multi-class classifications), it can be concluded that bagging met the expectations of a performance improvement algorithm.
Since the best algorithms belong to the same group in both the two-class and four-class stages (regardless of the bagging algorithm), it can be concluded that dividing the five-class drug interaction classification into a two-class and a four-class classification is a good approach with a certain degree of uniformity.
According to the results, J48 and random forest are the most suitable algorithms for differentiating between negative and positive drug interactions, and bagging and random forest perform well in classifying positive drug interactions into the effect, advice, mechanism, and int classes. We can, therefore, say that the best algorithms for classifying the interactions extracted from medical and pharmacy texts are within the group of decision tree algorithms.
[Table 9] displays the efficiency of the bagging classification algorithm for the various classes of drug interactions; although the int class had the smallest sample size in the training dataset, it had the highest F-measure (0.93). Since recall (mechanism) measures the share of samples with a mechanism interaction that are classified as such, it can be concluded that all samples of this class were classified in the right class (100%). However, because this class also had the lowest F-measure, its precision must be lower, meaning that samples from other classes were wrongly assigned to it.
Financial support and sponsorship
None.
Conflicts of interest
There are no conflicts of interest.
References
Aspden P, Wolcott J, Bootman JL, Cronenwett LR. Preventing medication error. Washington DC: The National Academies Press; 2006. p. 118-25.
Chen YF, Avery AJ, Neil KE, Johnson C, Dewey ME, Stockley IH. Incidence and possible causes of prescribing potentially hazardous/contraindicated drug combinations in general practice. Drug Saf Int J Med Toxicol Drug Exp 2005;28:67-80.
Second National Report on Patient Safety: Improving Medication Safety. Australian Council for Safety and Quality in Health Care; 2002. p. 20-8.
Rosholm JU, Bjerrum L, Hallas J, Worm J, Gram LF. Polypharmacy and the risk of drug-drug interactions among Danish elderly. A prescription database study. Dan Med Bull 1998;45:210-3.
Pirmohamed M, James S, Meakin S, Green C, Scott AK, Walley TJ, et al. Adverse drug reactions as cause of admission to hospital: Prospective analysis of 18 820 patients. BMJ 2004;329:15-9.
Runciman WB, Roughead EE, Semple SJ, Adams RJ. Adverse drug events and medication errors in Australia. Int J Qual Health Care 2003;15 Suppl 1:i49-59.
Nebeker JR, Barach P, Samore MH. Clarifying adverse drug events: A clinician's guide to terminology, documentation, and reporting. Ann Intern Med 2004;140:795-801.
Magro L, Moretti U, Leone R. Epidemiology and characteristics of adverse drug reactions caused by drug-drug interactions. Expert Opin Drug Saf 2012;11:83-94.
Gurwitz JH, Field TS, Harrold LR, Rothschild J, Debellis K, Seger AC, et al. Incidence and preventability of adverse drug events among older persons in the ambulatory setting. JAMA 2003;289:1107-16.
Gurwitz JH, Field TS, Judge J, Rochon P, Harrold LR, Cadoret C, et al. The incidence of adverse drug events in two large academic long-term care facilities. Am J Med 2005;118:251-8.
Hines LE, Murphy JE. Potentially harmful drug-drug interactions in the elderly: A review. Am J Geriatr Pharmacother 2011;9:364-77.
Hines LE, Malone DC, Murphy JE. Recommendations for generating, evaluating, and implementing drug-drug interaction evidence. Pharmacotherapy 2012;32:304-13.
Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, et al. DrugBank 3.0: A comprehensive resource for 'omics' research on drugs. Nucleic Acids Res 2011;39:D1035-41.
Sun Y, Xiong Y, Xu Q, Wei D. A hadoop-based method to predict potential effective drug combination. Biomed Res Int 2014;2014:196858.
Xu Q, Xiong Y, Dai H, Kumari KM, Xu Q, Ou HY, et al. PDC-SGB: Prediction of effective drug combinations using a stochastic gradient boosting algorithm. J Theor Biol 2017;417:1-7.
Bai LY, Dai H, Xu Q, Junaid M, Peng SL, Zhu X, et al. Prediction of effective drug combinations by an improved naïve Bayesian algorithm. Int J Mol Sci 2018;19. pii: E467.
Segura-Bedmar I, Martínez P, Herrero-Zazo M. SemEval-2013 Task 9: Extraction of Drug-Drug Interactions from Biomedical Texts (DDI Extraction 2013). Spain; 2013.
Powers DM. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. J Mach Learn Technol 2011;2:37-63.