Define a nominal attribute in an ARFF file using Java
I am using the J48 classifier in Weka. Below is a sample of my training data (.arff):
@relation l4_tbl_final
@attribute MouseVariance numeric
@attribute EyeValue numeric
@attribute SocialTime numeric
@attribute KeyWords numeric
@attribute InvolvedTime numeric
@attribute grade {B,A,C}
@data
2731.35,87,47.55,0,49.7,B
864.891,55,0,0,94.33,B
2495.8,1386,0,2,71.75,A
1104.04,4490,0,0,61.91,B
The first five values are the input parameters, and the last value is the class: "A", "B", or "C".
Now I need to provide a set of test data and predict the class of each row. For this I have to provide a testdata.arff file as follows (with ? marks in the class column):
@attribute MouseVariance numeric
@attribute EyeValue numeric
@attribute SocialTime numeric
@attribute KeyWords numeric
@attribute InvolvedTime numeric
@attribute grade {B,A,C}
@data
2731.35,87,47.55,0,49.7,?
864.891,55,0,0,94.33,?
2495.8,1386,0,2,71.75,?
1104.04,4490,0,0,61.91,?
I used the following Java code to export the SQL data to CSV; after that, the CSV is converted to ARFF:
while (resultSet.next()) {
    // One spreadsheet row per database record
    row = spreadsheet.createRow(i);
    cell = row.createCell(0);
    cell.setCellValue(resultSet.getString("MouseVariance"));
    cell = row.createCell(1);
    cell.setCellValue(resultSet.getString("EyeValue"));
    cell = row.createCell(2);
    cell.setCellValue(resultSet.getString("SocialTime"));
    cell = row.createCell(3);
    cell.setCellValue(resultSet.getString("KeyWords"));
    cell = row.createCell(4);
    cell.setCellValue(resultSet.getString("InvolvedTime"));
    // The class is unknown for test data, so write "?" in the grade column
    cell = row.createCell(5);
    cell.setCellValue("?");
    i++;
}
But when I create the ARFF file this way, the class attribute is declared as
@attribute grade {numaric}
and therefore the expected class is not predicted. If the attribute were declared as follows instead, the problem would be fixed:
@attribute grade {B,A,C}
How can I solve this?
It looks like the attribute declaration doesn't include the list of available nominal values.
Perhaps the AddValues filter can help add these labels to the list. You can add the values A, B, and C to the nominal attribute, thereby making it consistent with the training data.
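If the filter route turns out to be awkward, another option (not the AddValues filter, just a plain-Java alternative) is to skip the CSV step and write the test ARFF text yourself, so the class attribute is declared with the same labels as in the training file. Below is a minimal sketch of that idea. The class name TestArffWriter and the method buildTestArff are made up for illustration, and it assumes each row has already been read as strings (for example via resultSet.getString) in the same attribute order as the training data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a test ARFF whose class attribute is nominal.
class TestArffWriter {

    // Returns the full ARFF text; every row gets "?" as its (unknown) class value.
    static String buildTestArff(String relation, List<String> numericAttributes,
                                String classAttribute, List<String> classLabels,
                                List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append('\n');
        for (String name : numericAttributes) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // Declare the class with the same labels, in the same order, as the training file.
        sb.append("@attribute ").append(classAttribute)
          .append(" {").append(String.join(",", classLabels)).append("}\n");
        sb.append("@data\n");
        for (String[] row : rows) {
            sb.append(String.join(",", row)).append(",?\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> attrs = List.of("MouseVariance", "EyeValue", "SocialTime",
                                     "KeyWords", "InvolvedTime");
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"2731.35", "87", "47.55", "0", "49.7"});
        rows.add(new String[] {"864.891", "55", "0", "0", "94.33"});
        System.out.print(buildTestArff("l4_tbl_final", attrs, "grade",
                                       List.of("B", "A", "C"), rows));
    }
}
```

In your loop you would collect one String[] per record instead of filling spreadsheet cells, then save the returned text as testdata.arff. The important part is that the grade line matches the training header exactly.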
If that doesn't solve the problem, please provide more code and the generated output and I'll look a little further.
Hope it helps!