Proceedings
The IberLEF 2019 proceedings are published in the CEUR Workshop Proceedings series: http://ceur-ws.org/Vol-2421/
Our overview’s version in the proceedings has been adjusted due to certain inconsistencies found in Table 14. The updated version can be found here.
Overview
The main objective of these tasks is to propose to participants the challenge of applying their systems/solutions to the activities of Named Entity Recognition (NER) and/or Relation Extraction (RE) in Portuguese texts. For this, three independent tasks have been organized, and participants are free to apply for any combination of activities, be it only one, two or all of them.
These tasks will contribute to the progress of Portuguese natural language processing, as there is a demand in the area for the development of new methods, tools and specific resources such as annotated data. These tasks are part of IberLEF 2019.
Task 1: Named Entity Recognition
Description
The first task we propose is NER, the task of identifying proper nouns within a given text and classifying them into one of several relevant categories or into a default category known as Miscellaneous. Our objective with this task is to evaluate the proposed systems across many textual genres. For datasets whose main textual genres are news, memorandums, e-mails, interviews and magazine articles, we will evaluate the following categories: PER – Person, PLC – Place, ORG – Organization, VAL – Value and TME – Time. For clinical notes and legal texts, on the other hand, we will evaluate only the PER – Person category.
The coordinators will be responsible for:
- Evaluating the systems;
- Reviewing working notes;
- Camera ready submissions.
The participants will be responsible for:
- Development and training of systems;
- Submission of systems;
- Submissions of working notes.
Activities
The NER task consists of the following steps:
- Development Phase: For this phase, participants are required to develop a computational approach to NER. This approach, hereby referred to as system, must be capable of solving NER tasks for many textual genres. Participants are free to develop their solution however they see fit, so long as they comply with the requirements described in the training and test phases;
- Training Phase: The objective of this phase is that participants choose their training datasets. Participants are free to choose any datasets they so desire for training in the various types of textual genres;
- Test Phase: In this phase the coordinators will evaluate the reproducibility and performance of the submitted systems:
  - Reproduction Stage: For this stage, the participants' proposed systems will be executed by the coordinators. Should the coordinators be unable to execute a system, said system will not be evaluated;
  - Evaluation Stage: Inputs composed of corpora in different textual genres will be fed into all systems that passed the Reproduction Stage. The output is expected in the “.txt” format, so that it may be evaluated via script following CoNLL-2002 metrics.
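The Evaluation Stage scores system output with CoNLL-2002 metrics, which are entity-level precision, recall and F1 computed over exactly matching entity spans. The following is a minimal sketch of that computation, not the coordinators' evaluation script. It assumes the output has already been read into IOB2 tag sequences (`B-PER`, `I-PER`, `O`) using this task's category names; the helper names `extract_spans` and `conll_scores` and the example tags are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, category)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from IOB2 tags such as B-PER, I-PER, O (assumed format)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel closes a trailing entity
        prefix, _, cat = tag.partition("-")
        # An open entity continues only on an I- tag with the same category.
        if start is not None and not (prefix == "I" and cat == label):
            spans.add((start, i, label))
            start, label = None, None
        # A B- tag, or an I- tag with no open entity, starts a new entity.
        if prefix == "B" or (prefix == "I" and start is None):
            start, label = i, cat
    return spans


def conll_scores(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1; a span counts only on exact match."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: the second entity has the right boundaries but the wrong category.
gold_tags = ["B-PER", "I-PER", "O", "B-PLC", "O", "B-ORG"]
pred_tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-ORG"]
print(conll_scores(gold_tags, pred_tags))
```

Because matching is exact, an entity with correct boundaries but the wrong category counts as both a false positive and a false negative.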
Schedule
Below is the activity schedule for this task:
Activities | Dates | Responsible Party | Resources |
---|---|---|---|
Development and Training Phases | 18/03 to 22/04 | Participants | Task 1 - Input and Output Format Examples |
Submission of Systems | 25/04 to 06/05 | Participants | System Submission Instructions; Revised System Submission |
Evaluation of Systems | 01/05 to 10/06 | Coordinators | Dataset Explanation and System Evaluations |
Working Notes Delivery | 11/06 to | Participants | Paper Publication Instructions |
Working Notes Review | | Coordinators | |
Camera Ready Submissions | 03/07 | Coordinators | |
IberLEF Workshop | 24/09 | | |
Task 2: Relation Extraction for Named Entities
Description
We propose an RE task that involves the automatic extraction of any relation descriptor expressing any type of relation between a pair of Named Entities of the Person, Place and Organization categories in Portuguese language texts.
The coordinators will be responsible for:
- Providing examples (seeds);
- Providing test datasets;
- Evaluating results;
- Reviewing working notes;
- Camera ready submissions.
The participants will be responsible for:
- Testing data;
- Delivering the results;
- Delivering working notes.
Activities
This RE task consists of the following steps:
- Systems Development Phase: In this phase, the coordinators will make a small annotated dataset (seeds) available for the participants’ use in developing their RE systems;
- Test Phase: The test phase includes two options for participants:
  - Test 1: For this test, participants must extract relation descriptors between NE pairs (of the Person, Place or Organization categories) from data provided by the coordinators. This data will already be annotated with NE information, so participants will not need to apply an NER system;
  - Test 2: For this test, the data provided will not be annotated with NE information. Participants must therefore first extract and classify the NEs in the test sentences (as Person, Place or Organization), and then extract the relation descriptors between pairs of the recognized NEs;
- Evaluation Phase: In this phase the participants will send their results from the Test Phase. They may submit results from Test 1, Test 2 or both for evaluation by the coordinators. Afterwards, the evaluated results will be sent back to the participants. The metrics used in the evaluation phase will be Precision, Recall and F-measure.
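The Precision, Recall and F-measure named in the Evaluation Phase can be illustrated for relation extraction as follows. This is a sketch under stated assumptions, not the coordinators' procedure: it assumes each result is a triple (first argument, relation descriptor, second argument) and that a prediction is correct only if all three parts match the gold triple after case and whitespace normalization. The actual matching criterion is not stated here, and the names `Relation`, `normalize` and `score`, as well as the example triples, are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Relation(NamedTuple):
    arg1: str        # first entity of the pair
    descriptor: str  # text chunk expressing the relation
    arg2: str        # second entity of the pair


def normalize(rel: Relation) -> Relation:
    """Lowercase each part and collapse whitespace (an assumed matching rule)."""
    return Relation(*(" ".join(part.lower().split()) for part in rel))


def score(gold: Iterable[Relation], pred: Iterable[Relation],
          beta: float = 1.0) -> Tuple[float, float, float]:
    """Precision, recall and F-measure over exactly matching triples."""
    gold_set = {normalize(r) for r in gold}
    pred_set = {normalize(r) for r in pred}
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = beta ** 2 * precision + recall
    f_measure = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Made-up gold annotations and system output for illustration only.
gold = [
    Relation("Maria Silva", "trabalha na", "Empresa X"),
    Relation("Empresa X", "tem sede em", "Lisboa"),
]
pred = [
    Relation("maria silva", "trabalha na", "Empresa X"),  # matches after normalization
    Relation("Empresa X", "fica em", "Lisboa"),           # different descriptor
    Relation("Maria Silva", "mora em", "Lisboa"),         # not in the gold set
]
print(score(gold, pred))
```

With the default `beta = 1.0` the F-measure is the harmonic mean of precision and recall. A more lenient criterion, such as partial overlap of the descriptor, would change only the comparison step.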
Schedule
Below is the activity schedule for this task:
Activities | Dates | Responsible Party | Resources |
---|---|---|---|
Release of Examples | 18/03 | Coordinators | Task 2 - Examples |
Release of Data | 01/04 | Coordinators | Task 2 - Test Corpora |
Test Phase | 01/04 to 06/05 | Participants | |
Results Delivery (Test 1 and/or Test 2) | 06/05 to 20/05 | Participants | Task 2 - Results Submission |
Evaluation | 20/05 to 10/06 | Coordinators | Task 2 Evaluation |
Working Notes Delivery | 10/06 to | Participants | Paper Publication Instructions |
Working Notes Review | | Coordinators | |
Camera Ready Submissions | 03/07 | Coordinators | |
IberLEF Workshop | 24/09 | | |
Task 3: General Open Relation Extraction
Description
The task of general open relation extraction aims to identify structured representations of the information contained in unstructured sources, such as textual documents. This task faces many challenges, considering the generality of the problem, as well as the required linguistic knowledge to automatically perform such a task.
This task involves the automatic extraction of any relation descriptor expressing any type of semantic relation between a pair of entities or concepts mentioned in Portuguese sentences. In this task, we consider a relation description as a text chunk that describes the explicit semantic relation, occurring between two entities in a sentence. This task is a generalization of Task 2 by removing the requirement of the entities being named in the text, meaning that any relation between two Noun Phrases (NP) is to be considered.
The coordinators will be responsible for:
- Providing examples (seeds);
- Providing test datasets;
- Evaluating results;
- Reviewing working notes;
- Camera ready submissions.
The participants will be responsible for:
- Testing data;
- Delivering the results;
- Delivering working notes.
Activities
This RE task consists of the following steps:
- Systems Development Phase: In this phase, the coordinators will make a small annotated dataset (seeds) available for the participants’ use in developing their RE systems;
- Test Phase: The test phase includes two options for participants:
  - Test 1: For this test, participants must extract relation descriptors between NP pairs from data provided by the coordinators. This data will already be annotated with NP information, so participants will not need to run their own NP identification step;
  - Test 2: For this test, the data provided will not be annotated with NP information. Participants must therefore first extract and classify the NPs in the test sentences, and then extract the relation descriptors between pairs of the recognized NPs;
- Evaluation Phase: In this phase the participants will send their results from the Test Phase. They may submit results from Test 1, Test 2 or both for evaluation by the coordinators. Afterwards, the evaluated results will be sent back to the participants. The metrics used in the evaluation phase will be Precision, Recall and F-measure.
Schedule
Below is the activity schedule for this task:
Activities | Dates | Responsible Party | Resources |
---|---|---|---|
Release of Examples | 18/03 | Coordinators | Task 3 - Examples |
Release of Test Data | 01/04 | Coordinators | Task 3 - Test Corpora |
Test Phase | 01/04 to 06/05 | Participants | |
Results Delivery (Test 1 and/or Test 2) | 06/05 to 20/05 | Participants | Task 3 - Results Submission |
Evaluation | 20/05 to 10/06 | Coordinators | Task 3 - Rectification Results |
Working Notes Delivery | 10/06 to | Participants | Paper Publication Instructions |
Working Notes Review | | Coordinators | |
Camera Ready Submissions | 03/07 | Coordinators | |
IberLEF Workshop | 24/09 | | |
Registration
Registration for all tasks is now CLOSED.
Organizers
- Grupo de Processamento de Linguagem Natural da PUCRS – PLN-PUCRS
Bernardo Consoli
Fabio Moreira Freitas Da Silva
Joaquim Santos
Juliano Terra
Renata Vieira
Sandra Collovini
- Departamento de Informática – Universidade de Évora
Paulo Quaresma
- Grupo de Formalismo e Aplicações Semânticas (FORMAS) – UFBA
Clarissa Castellã Xavier
Daniela Barreiro Claro
Marlo Souza
Rafael Glauber