R package for ordinal data classification and preprocessing, with the algorithms implemented in Scala.
Included algorithms are the classifiers SVMOP, POM, KDLOR and WKNNOR, plus a feature selector and an instance selector.
Before installing OCAPIS, make sure Python (>=2.7), Scala (>=2.11) and libsvm-weights-3.17 are installed on your system.
On Linux, you can install Python from the command line. On Ubuntu, for example, type:
$ sudo apt-get install python3
On other distributions, use the equivalent command for your package manager. If you are not using Linux, or prefer not to install Python from the command line, see the official Python installation guide.
Similarly, on Linux you can install Scala from your distribution's repository. On Linux Mint, for example, type:
$ sudo apt install scala
Otherwise, see the "Other ways to install Scala" section of the official Scala installation guide.
libsvm-weights-3.17 is required by the SVMOP method. To install it, follow the instructions in the "Installation and Data Format" section of the libsvm-weights README.
After installing the external dependencies, the latest version of OCAPIS can be installed from GitHub via:
devtools::install_github("cristinahg/OCAPIS/OCAPIS")
The remaining dependencies, reticulate and rscala, are installed automatically.
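To confirm the setup from within R, a quick check along the following lines may help. This is only a sketch: `check_prereqs` is a hypothetical helper, not part of OCAPIS. It assumes the executables are named `python3`, `python` and `scala` on your PATH and that the installed package is named `OCAPIS`. It does not verify version numbers or the libsvm-weights installation.

```r
# Hypothetical helper (not part of OCAPIS): report which prerequisites
# are visible from the current R session.
check_prereqs <- function(tools = c("python3", "python", "scala"),
                          packages = c("reticulate", "rscala", "OCAPIS")) {
  # Sys.which() returns "" for executables that are not on the PATH.
  tools_found <- setNames(nzchar(Sys.which(tools)), tools)
  # system.file(package = p) returns "" when package p is not installed;
  # unlike requireNamespace(), it does not load the package.
  pkgs_found <- vapply(packages,
                       function(p) nzchar(system.file(package = p)),
                       logical(1))
  list(tools = tools_found, packages = pkgs_found)
}

status <- check_prereqs()
print(status)
```

Any `FALSE` entry points at a prerequisite that R cannot find. Only one of `python3` and `python` needs to be present.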
The examples below show how to use every classification and preprocessing method on an ordinal dataset named balance-scale.
# Data reading
library(OCAPIS)
# In both files the last column holds the ordinal label; the rest are features
dattrain<-read.table("train_balance-scale.0", sep=" ")
traindata<-dattrain[,-ncol(dattrain)]
trainlabels<-dattrain[,ncol(dattrain)]
dattest<-read.table("test_balance-scale.0", sep=" ")
testdata<-dattest[,-ncol(dattest)]
testlabels<-dattest[,ncol(dattest)]
# SVMOP
modelstrain<-svmofit(traindata,trainlabels,TRUE,0.1,0.1)
predictions<-svmopredict(modelstrain,testdata)
# Second element holds the predicted labels; compute test accuracy
sum(predictions[[2]]==testlabels)/nrow(dattest)
# POM
fit<-pomfit(traindata,trainlabels,"logistic")
predictions<-pompredict(fit,testdata)
projections<-predictions[[1]]
predictedLabels<-predictions[[2]]
# Test accuracy
sum(predictedLabels==testlabels)/nrow(dattest)
# KDLOR
myfit<-kdlortrain(traindata,trainlabels,"rbf",10,0.001,1)
# Prediction needs the training data as well as the test data
pred<-kdlorpredict(myfit,traindata,testdata)
# First element holds the predicted labels; compute test accuracy
sum(pred[[1]]==testlabels)/nrow(dattest)
# WKNNOR
# Trains and predicts in a single call; returns the predicted labels directly
predictions<-wknnor(traindata,trainlabels,testdata,5,2,"rectangular",FALSE)
sum(predictions==testlabels)/nrow(dattest)
mae(testlabels,predictions)
# Feature Selector
# Returns the indices of the selected feature columns
selected<-fselector(traindata,trainlabels,2,2,8)
trainselected<-traindata[,selected]
# Instance Selector
# Returns the retained instances with their labels in the last column
selected<-iselector(traindata,trainlabels,0.02,0.1,5)
trainselected<-selected[,-ncol(selected)]
trainlabels<-selected[,ncol(selected)]
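The examples above score predictions by accuracy, and the WKNNOR example also calls `mae`, which this README does not define. The base-R sketch below shows one way both metrics could be computed. The helper names `accuracy` and `ordinal_mae` are hypothetical, and the code assumes the labels are numeric class ranks (1, 2, 3, ...), which may not match how your data is encoded.

```r
# Hypothetical helpers (not part of OCAPIS), assuming numeric class ranks.

# Fraction of labels predicted exactly.
accuracy <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  mean(truth == pred)
}

# Mean absolute error between class ranks: a prediction is penalised
# more the further it lands from the true class.
ordinal_mae <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  mean(abs(as.numeric(truth) - as.numeric(pred)))
}

truth <- c(1, 2, 3, 3)
pred  <- c(1, 3, 3, 1)
accuracy(truth, pred)     # 2 of 4 correct: 0.5
ordinal_mae(truth, pred)  # (0 + 1 + 0 + 2) / 4 = 0.75
```

For ordinal problems the two metrics complement each other: accuracy treats every mistake alike, while the mean absolute error distinguishes a near miss from a prediction several classes away.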
For details on each method's parameters, see the OCAPIS documentation.