GATE machine learning problems
I am using the GATE (gate.ac.uk) GUI for text mining, and now I am trying to use its machine learning module. To do this, I created some XML schemas to load into GATE. Here's one example:
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema">
  <!-- XSchema definition for Condition -->
  <element name="Condition">
    <complexType>
      <attribute name="attrb_ConditionStatus" use="optional" value="other">
        <simpleType>
          <restriction base="string">
            <enumeration value="value_condition"/>
          </restriction>
        </simpleType>
      </attribute>
    </complexType>
  </element>
</schema>
I created a similar schema for every attribute that I want to annotate. These are the steps I follow after creating the schemas:
1. I load the Schema Annotation Editor, and then load the custom schemas through the Language Resources menu item.
2. I load the documents and the corpus.
3. I run ANNIE.
4. I see the custom schemas in the Annotations tab of the document.
5. I annotate terms with my custom annotations.
Now I want to run machine learning through the "Learning-Batch Learning PR" plugin. I have added the processing resource to my application pipeline. My problem is generating the machine learning configuration file: I searched the web but couldn't figure out how to create it correctly. I've looked at various examples; here's my attempt:
<?xml version="1.0"?>
<ML-CONFIG>
  <VERBOSITY level="1"/>
  <SURROUND value="true"/>
  <PARAMETER name="thresholdProbabilityEntity" value="0.2"/>
  <PARAMETER name="thresholdProbabilityBoundary" value="0.4"/>
  <multiClassification2Binary method="one-vs-others"/>
  <EVALUATION method="holdout" ratio="0.66"/>
  <ENGINE nickname="PAUM" implementationName="PAUM"
          options="-p 50 -n 5 -optB 0.3"/>
  <DATASET>
    <INSTANCE-TYPE>Token</INSTANCE-TYPE>
    <ATTRIBUTELIST>
      <NAME>ManType</NAME>
      <SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Manufactuer</TYPE>
      <FEATURE>category</FEATURE>
      <RANGE from="-2" to="2"/>
    </ATTRIBUTELIST>
    <ATTRIBUTELIST>
      <NAME>ModelType</NAME>
      <SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Model</TYPE>
      <FEATURE>orth</FEATURE>
      <RANGE from="-2" to="2"/>
    </ATTRIBUTELIST>
    <ATTRIBUTE>
      <NAME>Class1</NAME>
      <SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Manufacturer</TYPE>
      <FEATURE>majorType</FEATURE>
      <POSITION>0</POSITION>
    </ATTRIBUTE>
    <ATTRIBUTE>
      <NAME>Class2</NAME>
      <SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Model</TYPE>
      <FEATURE>type</FEATURE>
      <POSITION>0</POSITION>
      <CLASS/>
    </ATTRIBUTE>
  </DATASET>
</ML-CONFIG>
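As a quick sanity check on the file above, here is a minimal Python sketch that parses the DATASET section and lists which annotation types and features it refers to, and which attribute carries the CLASS marker. It only uses the standard library; it does not involve GATE at all and says nothing about what GATE itself accepts. The `CONFIG` string and the `summarize` helper are hypothetical names made up for this illustration (in practice one would parse the real file instead of an inline copy).

```python
import xml.etree.ElementTree as ET

# Inline copy of the DATASET section from the config above (illustration only;
# the real file could be read with ET.parse(path).getroot() instead).
CONFIG = """
<ML-CONFIG>
  <DATASET>
    <INSTANCE-TYPE>Token</INSTANCE-TYPE>
    <ATTRIBUTELIST>
      <NAME>ManType</NAME><SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Manufactuer</TYPE><FEATURE>category</FEATURE>
      <RANGE from="-2" to="2"/>
    </ATTRIBUTELIST>
    <ATTRIBUTELIST>
      <NAME>ModelType</NAME><SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Model</TYPE><FEATURE>orth</FEATURE>
      <RANGE from="-2" to="2"/>
    </ATTRIBUTELIST>
    <ATTRIBUTE>
      <NAME>Class1</NAME><SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Manufacturer</TYPE><FEATURE>majorType</FEATURE>
      <POSITION>0</POSITION>
    </ATTRIBUTE>
    <ATTRIBUTE>
      <NAME>Class2</NAME><SEMTYPE>NOMINAL</SEMTYPE>
      <TYPE>Model</TYPE><FEATURE>type</FEATURE>
      <POSITION>0</POSITION><CLASS/>
    </ATTRIBUTE>
  </DATASET>
</ML-CONFIG>
"""


def summarize(config_xml):
    """Return one dict per ATTRIBUTE / ATTRIBUTELIST element in DATASET."""
    root = ET.fromstring(config_xml)  # raises ParseError if not well-formed
    rows = []
    for elem in root.find("DATASET"):
        if elem.tag in ("ATTRIBUTE", "ATTRIBUTELIST"):
            rows.append({
                "kind": elem.tag,
                "name": elem.findtext("NAME"),
                "type": elem.findtext("TYPE"),
                "feature": elem.findtext("FEATURE"),
                "is_class": elem.find("CLASS") is not None,
            })
    return rows


if __name__ == "__main__":
    rows = summarize(CONFIG)
    for row in rows:
        print(row)
    # Distinct annotation type names, to spot spelling mismatches by eye.
    print(sorted({row["type"] for row in rows}))
```

On the config above, the final line lists both `Manufactuer` and `Manufacturer` as annotation types, so that spelling is worth checking against the annotation names actually present in the documents.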
I want the machine learning algorithm to annotate the manufacturer and the model (types), which are also custom annotations that I created with a schema. My first question is: does the ML config structure look correct? I create a new Corpus Pipeline, add a Batch Learning PR, select the Evaluation mode, and then run the application on my training document. This is the result:
The number of threads used is 1
** Evaluation mode started:
Hold-out test: runs=1, ratio of training docs is 0.66
Split, k=1, trainingNum=0.
HOLDOUT Fold 0: (correct, partialCorrect, spurious, missing)= (0.0, 0.0, 0.0, 0.0); (precision, recall, F1)= (0.0, 0.0, 0.0); Lenient: (0.0, 0.0, 0.0)
*** Averaged results for each label over 1 runs as:
Results of single label:
Overall results as:
(correct, partialCorrect, spurious, missing)= (0.0, 0.0, 0.0, 0.0); (precision, recall, F1)= (0.0, 0.0, 0.0); Lenient: (0.0, 0.0, 0.0)
This learning session finished!
The result shows that something is not configured correctly: either the ML config file or the pipeline I created for this purpose. If anyone can share some thoughts on this, I would be grateful. Again, I've scoured the internet and read a few machine learning tutorials and slides at gate.ac.uk, but it still seems pretty confusing to me.
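One detail in the log is `trainingNum=0` next to a training ratio of 0.66. The following is a minimal sketch of the arithmetic that could produce this, under two assumptions that are not confirmed against the GATE source: that the hold-out split truncates `ratio × number of documents` to an integer, and that the corpus holds a single document. `holdout_split` is a hypothetical helper written for this illustration, not a GATE function.

```python
def holdout_split(num_docs, ratio):
    """Return (training_docs, test_docs).

    ASSUMPTION: the training count is ratio * num_docs truncated toward
    zero. This mirrors the log line only under that assumption.
    """
    training = int(num_docs * ratio)
    return training, num_docs - training


if __name__ == "__main__":
    for num_docs in (1, 2, 3, 10):
        training, test = holdout_split(num_docs, 0.66)
        print(f"docs={num_docs} training={training} test={test}")
```

If that assumption holds, a one-document corpus leaves nothing to train on, which would be consistent with every count in the evaluation output being 0.0; a corpus with several annotated documents would be needed to get a non-empty training set.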