The results of the semantic role labeling shared task are out

Now that everyone has submitted their reports on Project B, we are publishing the results of the SRL competition (or shared task, if you prefer). We have 12 student scores (3 participants did not want their results to be shown) and one baseline system score. The gold test set is published as well, together with the evaluation script.

Five submissions were made after the first official deadline; they are marked as 'late'.

The submissions were evaluated with four metrics (a minimal sketch of how each could be computed is given after the list):

  1. Macro-averaged F1 score.
  2. Accuracy.
  3. Correlation between the frequency of each role in the gold test set and in the submitted predictions.
  4. Ratio of predicates with non-unique core arguments (for example, two A0 arguments). In the gold test set, this ratio is 0.005.
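For concreteness, here is a minimal sketch of how these four metrics could be computed with scikit-learn and SciPy. The input format, the use of Pearson correlation, and the set of core argument labels (A0 through A5) are assumptions for illustration; the actual evaluation script may differ.

```python
# Sketch of the four evaluation metrics, assuming gold and predicted
# role labels come as parallel lists of strings, and that each
# predicate's predicted core arguments are grouped in a list.
# This is an illustration, not the official evaluation script.
from collections import Counter

from scipy.stats import pearsonr
from sklearn.metrics import accuracy_score, f1_score

def evaluate(gold_labels, pred_labels, pred_predicates):
    """gold_labels, pred_labels: parallel lists of role labels.
    pred_predicates: list of lists, each inner list holding the
    argument labels predicted for one predicate."""
    # 1. Macro F1: per-role F1, averaged without frequency weighting.
    macro_f1 = f1_score(gold_labels, pred_labels, average="macro")

    # 2. Plain accuracy over all argument instances.
    acc = accuracy_score(gold_labels, pred_labels)

    # 3. Correlation between role frequencies in gold and predictions
    #    (Pearson here; the post does not specify which correlation).
    roles = sorted(set(gold_labels) | set(pred_labels))
    gold_counts = Counter(gold_labels)
    pred_counts = Counter(pred_labels)
    corr, _ = pearsonr([gold_counts[r] for r in roles],
                       [pred_counts[r] for r in roles])

    # 4. Share of predicates whose core arguments are not unique,
    #    e.g. two A0 arguments for the same predicate.
    core = {"A0", "A1", "A2", "A3", "A4", "A5"}  # assumed core set
    non_unique = sum(
        1 for args in pred_predicates
        if any(c > 1 for lbl, c in Counter(args).items() if lbl in core)
    )
    ratio = non_unique / len(pred_predicates)

    return macro_f1, acc, corr, ratio
```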

Our baseline system used a very simple decision tree classifier with a basic feature set, extended with the grammatical function of the predicate and the position of the argument relative to the predicate (before or after), without any re-ranking. Many of the submitted systems performed much better, as can be seen from the results table.
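To make that description concrete, a baseline of this shape could be put together roughly as follows with scikit-learn. The feature names and the instance dictionary are assumptions for illustration, not the actual baseline code.

```python
# Rough sketch of a baseline like the one described above: a plain
# decision tree over a basic feature set, plus the predicate's
# grammatical function and the argument's position relative to the
# predicate. Feature extraction here is hypothetical.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

def extract_features(instance):
    """instance: a dict describing one candidate argument (assumed keys)."""
    return {
        "pred_lemma": instance["pred_lemma"],
        "arg_pos": instance["arg_pos"],              # POS tag of the argument head
        "pred_gram_func": instance["pred_gram_func"],  # grammatical function of the predicate
        # Binary feature: does the argument precede the predicate?
        "arg_before_pred": instance["arg_index"] < instance["pred_index"],
    }

model = make_pipeline(DictVectorizer(), DecisionTreeClassifier())
# Usage, given training data as a list of instance dicts and role labels:
# model.fit([extract_features(x) for x in train_instances], train_roles)
# predictions = model.predict([extract_features(x) for x in test_instances])
```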

For the highest-ranking submissions, we have also added brief descriptions of the algorithms and features they used. The winner will be officially announced and awarded a prize during the class on December 11.

Thanks to everyone for participating!
