Commit 3601cff

initial version (v7.0)
1 parent f93adc3 commit 3601cff

6 files changed

+1797
-0
lines changed

README.txt

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
NAME
CorScorer: Perl package for scoring coreference resolution systems
using different metrics.


VERSION
v7.0 -- reference implementations of MUC, B-cubed and CEAF metrics.


INSTALLATION
Requirements:
1. Perl: downloadable from http://perl.org
2. Algorithm-Munkres: included in this package and downloadable
   from CPAN http://search.cpan.org/~tpederse/Algorithm-Munkres-0.08

USE
This package is distributed with two scripts to execute the scorer from
the command line.

Windows (tm): scorer.bat
Linux: scorer.pl


SYNOPSIS
use CorScorer;

$metric = 'ceafm';

# Scores the whole dataset
&CorScorer::Score($metric, $keys_file, $response_file);

# Scores a single document, selected by name
&CorScorer::Score($metric, $keys_file, $response_file, $name);


INPUT
metric: the metric used to score the results:
    muc: MUCScorer (Vilain et al., 1995)
    bcub: B-Cubed (Bagga and Baldwin, 1998)
    ceafm: CEAF (Luo, 2005) using mention-based similarity
    ceafe: CEAF (Luo, 2005) using entity-based similarity
    all: uses all the metrics to score

keys_file: file with expected coreference chains in SemEval format

response_file: file with output of coreference system (SemEval format)

name: [optional] the name of the document to score. If name is not
given, all the documents in the dataset are scored. If the given
name is "none", all the documents are scored but only the total
results are shown.
52+
53+
54+
OUTPUT
55+
The score subroutine returns an array with four values in this order:
56+
1) Recall numerator
57+
2) Recall denominator
58+
3) Precision numerator
59+
4) Precision denominator
60+
61+
Also recall, precision and F1 are printed in the standard output when variable
62+
$VERBOSE is not null.
63+
64+
Final scores:
65+
Recall = recall_numerator / recall_denominator
66+
Precision = precision_numerator / precision_denominator
67+
F1 = 2 * Recall * Precision / (Recall + Precision)
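
As a minimal sketch of the formulas above, the four returned values can be
turned into final scores as follows. The helper name final_scores and the
sample counts are hypothetical and not part of CorScorer; returning 0 for a
zero denominator is an assumption, not the package's documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CorScorer): derives recall, precision
# and F1 from the four values described above.
# Assumption: a zero denominator yields a score of 0.
sub final_scores {
    my ($rec_num, $rec_den, $pre_num, $pre_den) = @_;
    my $recall    = $rec_den ? $rec_num / $rec_den : 0;
    my $precision = $pre_den ? $pre_num / $pre_den : 0;
    my $sum       = $recall + $precision;
    my $f1        = $sum ? 2 * $recall * $precision / $sum : 0;
    return ($recall, $precision, $f1);
}

# Made-up counts, for illustration only.
my ($r, $p, $f1) = final_scores(6, 8, 6, 12);
printf "Recall: %.2f  Precision: %.2f  F1: %.2f\n", $r, $p, $f1;
```

In real use the four arguments would be the array returned by the Score
subroutine.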

Identification of mentions
A scorer for the identification of mentions (recall, precision and F1) is also included.
Mentions from the system response are compared with key mentions. There are two kinds of
positively matching response mentions:

a) Strictly correct identified mentions: the tokens in the response mention are exactly
the same as the tokens in the key mention.

b) Partially correct identified mentions: the response mention tokens include the head token
of the key mention and no new tokens are added (i.e. the key mention bounds are not
exceeded).
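
The two matching criteria can be sketched as below. This is an illustration
only: the representation of a mention as a contiguous span of token indices
with a head index, and the name classify_mention, are assumptions and do not
reflect how CorScorer stores or compares mentions internally.

```perl
use strict;
use warnings;

# Hypothetical representation: a mention is a contiguous span of token
# indices (inclusive); a key mention also carries its head token index.
sub classify_mention {
    my ($response, $key) = @_;

    # a) Strict: identical boundaries, hence exactly the same tokens.
    return 'strict'
        if $response->{start} == $key->{start}
        && $response->{end}   == $key->{end};

    # b) Partial: stays inside the key bounds and covers the key head.
    my $inside   = $response->{start} >= $key->{start}
                && $response->{end}   <= $key->{end};
    my $has_head = $response->{start} <= $key->{head}
                && $key->{head}       <= $response->{end};
    return 'partial' if $inside && $has_head;

    return 'none';
}

my $key = { start => 3, end => 7, head => 6 };
print classify_mention({ start => 3, end => 7 }, $key), "\n";  # strict
print classify_mention({ start => 5, end => 7 }, $key), "\n";  # partial
print classify_mention({ start => 2, end => 7 }, $key), "\n";  # none
```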


The partially correct mentions can be given some credit (for
example, a weight of 0.5) to get a combined score as follows,
where a and b are the numbers of strictly and partially correct mentions:

Recall = (a + 0.5 * b) / #key mentions
Precision = (a + 0.5 * b) / #response mentions
F1 = 2 * Recall * Precision / (Recall + Precision)
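
A minimal sketch of this combined score, assuming the counts of strict and
partial matches are already known. The helper name mention_scores, its
argument order and the sample counts are hypothetical; returning 0 when
there are no key or response mentions is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: combined mention-identification scores.
#   $strict, $partial : counts of type (a) and type (b) response mentions
#   $n_key, $n_resp   : numbers of key and response mentions
#   $weight           : credit for a partial match (defaults to 0.5)
# Assumption: an empty key or response set yields a score of 0.
sub mention_scores {
    my ($strict, $partial, $n_key, $n_resp, $weight) = @_;
    $weight = 0.5 unless defined $weight;
    my $credit    = $strict + $weight * $partial;
    my $recall    = $n_key  ? $credit / $n_key  : 0;
    my $precision = $n_resp ? $credit / $n_resp : 0;
    my $sum       = $recall + $precision;
    my $f1        = $sum ? 2 * $recall * $precision / $sum : 0;
    return ($recall, $precision, $f1);
}

# Made-up counts: 8 strict and 4 partial matches,
# 20 key mentions and 16 response mentions.
my ($r, $p, $f1) = mention_scores(8, 4, 20, 16);
printf "Recall: %.3f  Precision: %.3f  F1: %.3f\n", $r, $p, $f1;
```

Passing a weight of 0 gives no credit to partial matches, so only mentions
with exact boundaries count.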

For the official CoNLL evaluation, however, we will only consider
mentions with exact boundaries as being correct.

SEE ALSO

1. http://stel.ub.edu/semeval2010-coref/

2. Marta Recasens, Lluís Màrquez, Emili Sapena, M. Antònia Martí, Mariona Taulé,
Véronique Hoste, Massimo Poesio, and Yannick Versley. 2010. SemEval-2010 Task 1:
Coreference Resolution in Multiple Languages. In Proceedings of the ACL International
Workshop on Semantic Evaluation (SemEval-2010), pp. 1-8, Uppsala, Sweden.


AUTHOR
Emili Sapena, Universitat Politècnica de Catalunya
http://www.lsi.upc.edu/~esapena
esapena <at> lsi.upc.edu


COPYRIGHT AND LICENSE
Copyright (C) 2009-2011, Emili Sapena esapena <at> lsi.upc.edu

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. This program is distributed in the hope that
it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Modified in 2013 for v1.07 by Sebastian Martschat,
sebastian.martschat <at> h-its.org
