Classification Tree

Classification Tree for Court Decision

Classification trees have advantage over logistic regression models in that they are easy to interpret.

In [1]:
frame1 <- read.csv("Stevens.csv")
In [2]:
str(frame1)
'data.frame':	566 obs. of  9 variables:
 $ Docket    : Factor w/ 566 levels "00-1011","00-1045",..: 63 69 70 145 97 181 242 289 334 436 ...
 $ Term      : int  1994 1994 1994 1994 1995 1995 1996 1997 1997 1999 ...
 $ Circuit   : Factor w/ 13 levels "10th","11th",..: 4 11 7 3 9 11 13 11 12 2 ...
 $ Issue     : Factor w/ 11 levels "Attorneys","CivilRights",..: 5 5 5 5 9 5 5 5 5 3 ...
 $ Petitioner: Factor w/ 12 levels "AMERICAN.INDIAN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Respondent: Factor w/ 12 levels "AMERICAN.INDIAN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ LowerCourt: Factor w/ 2 levels "conser","liberal": 2 2 2 1 1 1 1 1 1 1 ...
 $ Unconst   : int  0 0 0 0 0 1 0 1 0 0 ...
 $ Reverse   : int  1 1 1 1 1 0 1 1 1 1 ...
In [51]:
library(caTools)
library(caret)
library(e1071)
In [23]:
set.seed(3000)
spl <- sample.split(frame1$Reverse, SplitRatio=0.7)
train <- subset(frame1, spl == TRUE)
test <- subset(frame1, spl == FALSE)
dim(train)
dim(test)
Out[23]:
  1. 396
  2. 9
Out[23]:
  1. 170
  2. 9
In [ ]:
In [11]:
library(rpart)
install.packages("rpart.plot")
library(rpart.plot)
Installing package into 'C:/Users/Desta/Documents/R/win-library/3.1'
(as 'lib' is unspecified)
Error in contrib.url(repos, "source"): trying to use CRAN without setting a mirror
Error in library(rpart.plot): there is no package called 'rpart.plot'
In [12]:
tree_Model <- rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = train,
                    method = "class", minbucket = 25)
In [13]:
prp(tree_Model)
Error in eval(expr, envir, enclos): could not find function "prp"
In [24]:
# Prediction on the test dataset
CART_Prediction <- predict(tree_Model, newdata = test, type = "class")
In [25]:
table(test$Reverse, CART_Prediction)
Out[25]:
   CART_Prediction
     0  1
  0 41 36
  1 22 71
In [30]:
Accuracy <- (41+71)/sum(table(test$Reverse, CART_Prediction))
round(Accuracy,3)
Out[30]:
0.659
In [32]:
# Plot the ROC curve
library(ROCR)
In [36]:
Prediction_ROC <- predict(tree_Model, newdata = test)
pred <- prediction(Prediction_ROC[,2], test$Reverse)
perf <- performance(pred, "tpr", "fpr")
plot(perf, colorize = TRUE)
In [38]:
#Calculate the area under the curve
performance(pred, "auc")@y.values
Out[38]:

    0.693

Rondom Forest Model

In [40]:
library(randomForest)
In [42]:
train$Reverse <- as.factor(train$Reverse)
test$Reverse <- as.factor(test$Reverse)
forest_Model <- randomForest(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = train,
                nodesize = 25, ntrees = 200)
In [43]:
forest_Prediction <- predict(forest_Model, newdata = test)
table(test$Reverse, forest_Prediction)
Out[43]:
   forest_Prediction
     0  1
  0 41 36
  1 19 74
In [45]:
Accuracy_forest <- (41+74)/sum(table(test$Reverse, forest_Prediction))
round(Accuracy_forest, 3)
Out[45]:
0.676

The random forest model has improved the accuracy a little bit.

k-fold Cross Validation

In [54]:
numFolds <- trainControl(method="cv", number=10)
cpGrid <- expand.grid(.cp = seq(0.01, 0.5, 0.01)) # defines the cp parameter to test as numbers from 0.01 to 0.5 in increment of 0.01.
In [57]:
train(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = train,
                method = "rpart", trControl = numFolds, tuneGrid = cpGrid)
Out[57]:
CART

396 samples
  8 predictor
  2 classes: '0', '1'

No pre-processing
Resampling: Cross-Validated (10 fold)

Summary of sample sizes: 356, 356, 357, 356, 357, 356, ...

Resampling results across tuning parameters:

  cp    Accuracy  Kappa   Accuracy SD  Kappa SD
  0.01  0.644     0.2655  0.07412      0.1556
  0.02  0.639     0.2572  0.05265      0.1110
  0.03  0.626     0.2373  0.05089      0.1074
  0.04  0.626     0.2410  0.05089      0.1108
  0.05  0.644     0.2833  0.04053      0.0795
  0.06  0.644     0.2833  0.04053      0.0795
  0.07  0.644     0.2833  0.04053      0.0795
  0.08  0.644     0.2833  0.04053      0.0795
  0.09  0.644     0.2833  0.04053      0.0795
  0.10  0.644     0.2833  0.04053      0.0795
  0.11  0.644     0.2833  0.04053      0.0795
  0.12  0.644     0.2833  0.04053      0.0795
  0.13  0.644     0.2833  0.04053      0.0795
  0.14  0.644     0.2833  0.04053      0.0795
  0.15  0.644     0.2833  0.04053      0.0795
  0.16  0.644     0.2833  0.04053      0.0795
  0.17  0.644     0.2833  0.04053      0.0795
  0.18  0.644     0.2833  0.04053      0.0795
  0.19  0.644     0.2833  0.04053      0.0795
  0.20  0.626     0.2398  0.04371      0.1029
  0.21  0.611     0.2004  0.04114      0.1123
  0.22  0.571     0.0855  0.03326      0.1113
  0.23  0.545     0.0000  0.00596      0.0000
  0.24  0.545     0.0000  0.00596      0.0000
  0.25  0.545     0.0000  0.00596      0.0000
  0.26  0.545     0.0000  0.00596      0.0000
  0.27  0.545     0.0000  0.00596      0.0000
  0.28  0.545     0.0000  0.00596      0.0000
  0.29  0.545     0.0000  0.00596      0.0000
  0.30  0.545     0.0000  0.00596      0.0000
  0.31  0.545     0.0000  0.00596      0.0000
  0.32  0.545     0.0000  0.00596      0.0000
  0.33  0.545     0.0000  0.00596      0.0000
  0.34  0.545     0.0000  0.00596      0.0000
  0.35  0.545     0.0000  0.00596      0.0000
  0.36  0.545     0.0000  0.00596      0.0000
  0.37  0.545     0.0000  0.00596      0.0000
  0.38  0.545     0.0000  0.00596      0.0000
  0.39  0.545     0.0000  0.00596      0.0000
  0.40  0.545     0.0000  0.00596      0.0000
  0.41  0.545     0.0000  0.00596      0.0000
  0.42  0.545     0.0000  0.00596      0.0000
  0.43  0.545     0.0000  0.00596      0.0000
  0.44  0.545     0.0000  0.00596      0.0000
  0.45  0.545     0.0000  0.00596      0.0000
  0.46  0.545     0.0000  0.00596      0.0000
  0.47  0.545     0.0000  0.00596      0.0000
  0.48  0.545     0.0000  0.00596      0.0000
  0.49  0.545     0.0000  0.00596      0.0000
  0.50  0.545     0.0000  0.00596      0.0000

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was cp = 0.19. 
In [58]:
tree_ModelCV <- rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = train,
                    method = "class", cp = 0.19)
treeCV_Prediction <- predict(tree_ModelCV, newdata=test, type = "class")
table(test$Reverse, treeCV_Prediction)
Out[58]:
   treeCV_Prediction
     0  1
  0 59 18
  1 29 64
In [60]:
Accuracy_treeCV <- (59+64)/sum(table(test$Reverse, treeCV_Prediction))
round(Accuracy_treeCV, 3)
Out[60]:
0.724

Using cross validation helped us to select a good parameter which evenually helped to improve the classification accuracy.

In [ ]: