Dataset | OpenDataMonitor

Reuters-128 NIF NER Corpus

Dataset Profile

Odm ID	ce089a95-a81d-4599-9d98-818ee0003237
Title	Reuters-128 NIF NER Corpus
Notes	This English corpus is based on the well-known Reuters-21578 corpus, which contains economic news articles. In particular, we chose 128 articles containing at least one named entity (NE). Compared to the News-100 corpus, the documents of Reuters-128 are significantly shorter and thus carry a smaller context. To annotate NEs with URIs, we implemented a supporting judgement tool. The input for the tool was a subset of more than 150 Reuters-21578 news articles sampled randomly. First, FOX (Ngonga Ngomo et al., 2011) was used to recognize a first set of NEs. This reduced the amount of manual work to a feasible portion given the size of this dataset. Afterwards, the domain experts manually corrected the mistakes of FOX using the annotation tool. To support this, the tool highlighted the entities in the texts and added initial URI candidates via simple string matching algorithms. Two scientists manually determined the correct URI for each named entity, with an initial voter agreement of 74%. This low initial agreement rate hints at the difficulty of the disambiguation task. In some cases the judges did not agree initially but came to an agreement shortly after reviewing the cases. While annotating, we left out ticker symbols of companies (e.g., GOOG for Google Inc.), abbreviations and job descriptions because those are always preceded by the full company name or a person's name, respectively.
Author	Ricardo Usbeck
Author Email	ricardo.usbeck@googlemail.com
Catalogue URL	http://datahub.io/
Dataset URL	http://thedatahub.org/dataset/reuters-128-nif-ner-corpus
Metadata Updated	2015-09-15 13:07:00
Tags
Publication Date
Update Date
Update Frequency
Organization	AKSW
Country
Status
Platform	ckan
Language	en
Version	(not specified)
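
The notes above report an initial voter agreement of 74% between the two annotators but do not say how that figure was computed. As a minimal sketch, assuming plain percent agreement (the share of entity mentions for which both annotators picked the same URI), the calculation could look like the following. The function name and the example URIs are hypothetical and are not taken from the corpus.

```python
def percent_agreement(labels_a, labels_b):
    """Return the fraction of items on which two annotators chose the same label.

    Assumption: agreement is simple percent agreement, not a chance-corrected
    measure; the dataset notes do not state which measure was used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical URIs chosen by two annotators for the same four entity mentions.
annotator_1 = [
    "http://example.org/resource/CompanyA",
    "http://example.org/resource/PersonB",
    "http://example.org/resource/CityC",
    "http://example.org/resource/CompanyD",
]
annotator_2 = [
    "http://example.org/resource/CompanyA",
    "http://example.org/resource/PersonB",
    "http://example.org/resource/CityC_(disambiguation)",
    "http://example.org/resource/CompanyD",
]

agreement = percent_agreement(annotator_1, annotator_2)
print(f"Initial agreement: {agreement:.0%}")
```

Percent agreement does not correct for agreement by chance; with a large URI vocabulary such as this one, chance agreement is small, which is one reason the simple measure may be adequate here.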

OPENDATAMONITOR

This platform provides an overview of available open data resources. It enables the analysis and visualisation of existing data catalogues using innovative technologies.

Co-funded by the EC

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement No. 611988.

LICENCE

Licensed under a Creative Commons Attribution 4.0 International License.