Define an attribute in an ARFF file using Java

I am using the J48 classifier in Weka. Below is a sample of my training data (.arff):

    @relation l4_tbl_final

    @attribute MouseVariance numeric
    @attribute EyeValue numeric
    @attribute SocialTime numeric
    @attribute KeyWords numeric
    @attribute InvolvedTime numeric
    @attribute grade {B,A,C}

    @data
    2731.35,87,47.55,0,49.7,B
    864.891,55,0,0,94.33,B
    2495.8,1386,0,2,71.75,A
    1104.04,4490,0,0,61.91,B

      

The first five values are the features, and the last value is the class: "A", "B", or "C".

Now I need to supply a set of test data and predict the class of each row. For this I have to provide a testdata.arff file as follows (with ? in place of the class):

    @attribute MouseVariance numeric
    @attribute EyeValue numeric
    @attribute SocialTime numeric
    @attribute KeyWords numeric
    @attribute InvolvedTime numeric
    @attribute grade {B,A,C}

    @data
    2731.35,87,47.55,0,49.7,?
    864.891,55,0,0,94.33,?
    2495.8,1386,0,2,71.75,?
    1104.04,4490,0,0,61.91,?

      

I use the following Java code to export the rows from the SQL database to CSV; the CSV is then converted to ARFF:

    while (resultSet.next()) {
        row = spreadsheet.createRow(i);
        cell = row.createCell(0);
        cell.setCellValue(resultSet.getString("MouseVariance"));
        cell = row.createCell(1);
        cell.setCellValue(resultSet.getString("EyeValue"));
        cell = row.createCell(2);
        cell.setCellValue(resultSet.getString("SocialTime"));
        cell = row.createCell(3);
        cell.setCellValue(resultSet.getString("KeyWords"));
        cell = row.createCell(4);
        cell.setCellValue(resultSet.getString("InvolvedTime"));
        cell = row.createCell(5);
        cell.setCellValue("?");

        i++;
    }

      

But when I create the ARFF file this way, the class attribute is declared with a numeric type instead of the nominal values:

    @attribute grade {numeric}

      

As a result, the expected class is not predicted. If the attribute were declared as follows instead, the problem would be fixed:

    @attribute grade {B,A,C}

      

How can I solve this?



1 answer


It looks like the attribute doesn't know the list of available nominal values.

Perhaps the AddValues filter can help by adding these labels to the attribute. You can add the values A, B, and C to the nominal attribute, making it consistent with the training data.
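If the filter does not fit, another option might be to skip the CSV step for the test set and write the ARFF text directly, declaring the class attribute by hand with the same labels, in the same order, as the training file. The following is only a minimal sketch under that assumption: the class `ArffTestWriter`, the method `buildArff`, and the relation name `l4_tbl_final_test` are hypothetical names chosen for illustration, and it uses only the Java standard library, not the Weka API.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper (not part of Weka): builds the text of an unlabeled test ARFF file.
final class ArffTestWriter {

    // Renders numeric feature rows as ARFF text. The class attribute is declared
    // as nominal with the given labels, and every row gets '?' as its class value.
    static String buildArff(String relation, List<String> featureNames,
                            String classAttribute, List<String> classLabels,
                            List<double[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : featureNames) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // Same labels, same order as in the training header, e.g. {B,A,C}.
        sb.append("@attribute ").append(classAttribute)
          .append(" {").append(String.join(",", classLabels)).append("}\n\n");
        sb.append("@data\n");
        for (double[] row : rows) {
            StringJoiner line = new StringJoiner(",");
            for (double value : row) {
                line.add(Double.toString(value));
            }
            line.add("?"); // unknown class, to be predicted
            sb.append(line).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> features = List.of(
            "MouseVariance", "EyeValue", "SocialTime", "KeyWords", "InvolvedTime");
        List<double[]> rows = List.of(
            new double[] {2731.35, 87, 47.55, 0, 49.7},
            new double[] {864.891, 55, 0, 0, 94.33});
        System.out.print(buildArff("l4_tbl_final_test", features, "grade",
                                   List.of("B", "A", "C"), rows));
    }
}
```

In the question's loop, the rows would come from `resultSet` rather than from hard-coded arrays, and the returned string could be written to testdata.arff. Because the header is written explicitly, the type of `grade` no longer depends on what a CSV converter infers from a column that contains only `?`.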



If that's not the problem, please provide more code and the generated output and I'll look a little further.

Hope it helps!
