Proceedings

IberLEF 2019 Proceedings are published in CEUR Workshop Proceedings Series; http://ceur-ws.org/Vol-2421/

Our overview’s version in the proceedings has been adjusted due to certain inconsistencies found in Table 14. The updated version can be found here.

Overview

The main objective of these tasks is to challenge participants to apply their systems to Named Entity Recognition (NER) and/or Relation Extraction (RE) in Portuguese texts. To this end, three independent tasks have been organized, and participants are free to take part in any combination of them: one, two or all three.
These tasks will contribute to the progress of Portuguese natural language processing, as there is a demand in the area for the development of new methods, tools and specific resources such as annotated data. These tasks are part of IberLEF 2019.

Task 1: Named Entity Recognition

Description

The first task we propose is NER, the task of identifying proper nouns within a given text and classifying them into one of several relevant categories or into a default category known as Miscellaneous. Our objective with this task is to evaluate the proposed systems across many textual genres. For datasets whose main textual genres are news, memorandums, e-mails, interviews and magazine articles, we will evaluate the following categories: PER – Person, PLC – Place, ORG – Organization, VAL – Value and TME – Time. For clinical notes and legal texts, we will evaluate only the PER – Person category.
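The split between genres and evaluated categories can be written down as a small lookup. This is only an illustrative sketch: the genre keys and the `filter_entities` helper are hypothetical names chosen here, not identifiers used by the task's evaluation scripts.

```python
from typing import Dict, FrozenSet, List, Tuple

# Categories evaluated for the general genres, as listed in the description.
GENERAL_CATEGORIES: FrozenSet[str] = frozenset({"PER", "PLC", "ORG", "VAL", "TME"})

# Hypothetical genre keys; only the category sets come from the task text.
EVALUATED_CATEGORIES: Dict[str, FrozenSet[str]] = {
    "news": GENERAL_CATEGORIES,
    "memorandums": GENERAL_CATEGORIES,
    "e-mails": GENERAL_CATEGORIES,
    "interviews": GENERAL_CATEGORIES,
    "magazine_articles": GENERAL_CATEGORIES,
    "clinical_notes": frozenset({"PER"}),
    "legal_texts": frozenset({"PER"}),
}


def filter_entities(entities: List[Tuple[str, str]], genre: str) -> List[Tuple[str, str]]:
    """Keep only the (category, text) pairs whose category is evaluated for the genre."""
    allowed = EVALUATED_CATEGORIES[genre]
    return [entity for entity in entities if entity[0] in allowed]


# Invented example entities, for illustration only.
found = [("PER", "Maria"), ("ORG", "PUCRS")]
clinical = filter_entities(found, "clinical_notes")
news = filter_entities(found, "news")
```

With this mapping, an ORG entity found in a clinical note would simply be left out of scoring, while the same entity in a news text would count.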

The coordinators will be responsible for:

Evaluating the systems;
Reviewing working notes;
Camera ready submissions.

The participants will be responsible for:

Development and training of systems;
Submission of systems;
Submissions of working notes.

Activities

The NER task consists of the following steps:

- Development Phase: In this phase, participants are required to develop a computational approach to NER. This approach, hereafter referred to as the system, must be capable of solving NER tasks for many textual genres. Participants are free to develop their solution however they see fit, so long as they comply with the requirements described in the training and test phases;
- Training Phase: In this phase, participants choose their training datasets. They are free to use any datasets they wish to train for the various textual genres;
- Test Phase: In this phase, the coordinators will evaluate the capacity and reproducibility of the submitted systems:
  - Reproduction Stage: In this stage, the participants' proposed systems will be executed by the coordinators. Should the coordinators be unable to execute a system, that system will not be evaluated;
  - Evaluation Stage: Inputs composed of corpora in different textual genres will be fed to all systems that passed the Reproduction Stage. The expected output is to be in the “.txt” format, so that it may be evaluated via script following CoNLL-2002 metrics.
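CoNLL-2002 style scoring counts an entity as correct only when both its span and its category match the gold annotation exactly, and reports precision, recall and F1 over entities. The sketch below illustrates that idea under the assumption that tags follow a BIO scheme (`B-PER`, `I-PER`, `O`); the actual file layout is the one given in the task's input and output format examples, and the function names here are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Entity = Tuple[str, int, int]  # (category, start token, end token exclusive)


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (category, start, end) spans from a BIO tag sequence (assumed format)."""
    entities: Set[Entity] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel closes a trailing entity
        # Close the open entity unless this tag continues it.
        if label is not None and tag != f"I-{label}":
            entities.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label is None:
            # Stray I- tag with no open entity: treat it as the start of one.
            start, label = i, tag[2:]
    return entities


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level exact-match precision, recall and F1."""
    gold_entities = extract_entities(gold)
    pred_entities = extract_entities(pred)
    correct = len(gold_entities & pred_entities)
    precision = correct / len(pred_entities) if pred_entities else 0.0
    recall = correct / len(gold_entities) if gold_entities else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented five-token example: the second entity has the right span but the wrong category.
gold_tags = ["B-PER", "I-PER", "O", "B-ORG", "O"]
pred_tags = ["B-PER", "I-PER", "O", "B-PLC", "O"]
scores = precision_recall_f1(gold_tags, pred_tags)
```

In this example one of the two predicted entities matches, so precision, recall and F1 all come out at 0.5; a span that is right but labelled with the wrong category earns no credit.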

Schedule

Below is the activity schedule for this task:

Activities	Dates	Responsible Party	Resources
Development and Training Phases	18/03 to 22/04	Participants	Task 1 - Input and Output Format Examples
Submission of Systems	25/04 to 06/05	Participants	System Submission Instructions ~~System Submission Form~~ Revised System Submission
Evaluation of Systems	01/05 to 10/06	Coordinators	Dataset Explanation and System Evaluations
Working Notes Delivery	11/06 to ~~24/06~~ 27/06	Participants	Paper Publication Instructions
Working Notes Review	~~25/06~~ 28/06 to 01/07	Coordinators
Camera Ready Submissions	03/07	Coordinators
IberLEF Workshop	24/09

Task 2: Relation Extraction for Named Entities

Description

We propose an RE task that involves the automatic extraction of any relation descriptor expressing any type of relation between a pair of Named Entities of the Person, Place and Organization categories in Portuguese-language texts.

The coordinators will be responsible for:

Providing examples (seeds);
Providing test datasets;
Evaluating results;
Reviewing working notes;
Camera ready submissions.

The participants will be responsible for:

Testing data;
Delivering the results;
Delivering working notes.

Activities

This RE task consists of the following steps:

Systems Development Phase: In this phase, the coordinators will make a small annotated dataset (seeds) available for the participants’ use in developing their RE systems;
Test Phase: The test phase includes two options for participants:
- Test 1: For this test, participants must extract relation descriptors between NE pairs (of Person, Place or Organization categories) from data provided by the coordinators. This data will already be annotated with NE information when provided, and as such will not necessitate the application of a NER system by participants;
- Test 2: For this test, the data provided will not be annotated with NE information. As such, participants must first extract and classify (with Person, Place or Organization categories) the NEs from the test sentences, and then extract the relation descriptors between pairs of the recognized NEs;
Evaluation Phase: In this phase the participants will send their results from the Test Phase. They may submit results from Test 1, Test 2 or both for evaluation by the coordinators. Afterwards, the analyzed results will be sent back to the participants. The metrics used in the evaluation phase will be Precision, Recall and F-measure.
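As a rough illustration of the three metrics applied to relation extraction, the sketch below scores a set of predicted (entity 1, relation descriptor, entity 2) triples against a gold set. It assumes exact matching after case and whitespace normalization, which may differ from the matching criteria used by the coordinators; the `Triple` type and function names are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (entity 1, relation descriptor, entity 2)


def normalize(triple: Triple) -> Triple:
    """Collapse whitespace and ignore case (an assumption of this sketch)."""
    first, descriptor, second = (" ".join(part.split()).casefold() for part in triple)
    return (first, descriptor, second)


def score_triples(gold: Set[Triple], predicted: Set[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F-measure over normalized triples."""
    gold_norm = {normalize(t) for t in gold}
    pred_norm = {normalize(t) for t in predicted}
    correct = len(gold_norm & pred_norm)
    precision = correct / len(pred_norm) if pred_norm else 0.0
    recall = correct / len(gold_norm) if gold_norm else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Invented triples, for illustration only.
gold_triples = {
    ("Maria", "trabalha na", "PUCRS"),
    ("PUCRS", "fica em", "Porto Alegre"),
    ("Maria", "mora em", "Porto Alegre"),
}
predicted_triples = {
    ("maria", "trabalha  na", "PUCRS"),
    ("PUCRS", "fica em", "Porto Alegre"),
}
precision, recall, f_measure = score_triples(gold_triples, predicted_triples)
```

Here both predictions are correct but one gold relation is missed, so precision is 1.0, recall is about 0.67 and the F-measure is about 0.8.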

Schedule

Below is the activity schedule for this task:

Activities	Dates	Responsible Party	Resources
Release of Examples	18/03	Coordinators	Task 2 - Examples
Release of Data	01/04	Coordinators	Task 2 - Test Corpora
Test Phase	01/04 to 06/05	Participants
Results Delivery (Test 1 and/or Test 2)	06/05 to 20/05	Participants	Task 2 - Results Submission
Evaluation	20/05 to 10/06	Coordinators	Task 2 Evaluation
Working Notes Delivery	10/06 to ~~24/06~~ 27/06	Participants	Paper Publication Instructions
Working Notes Review	~~24/06~~ 28/06 to 01/07	Coordinators
Camera Ready Submissions	03/07	Coordinators
IberLEF Workshop	24/09

Task 3: General Open Relation Extraction

Description

The task of general open relation extraction aims to identify structured representations of the information contained in unstructured sources, such as textual documents. This task faces many challenges, considering the generality of the problem, as well as the required linguistic knowledge to automatically perform such a task.

This task involves the automatic extraction of any relation descriptor expressing any type of semantic relation between a pair of entities or concepts mentioned in Portuguese sentences. In this task, we consider a relation descriptor to be a text chunk that describes the explicit semantic relation occurring between two entities in a sentence. This task generalizes Task 2 by removing the requirement that the entities be named in the text, meaning that any relation between two Noun Phrases (NP) is to be considered.
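One way to picture a relation descriptor as a text chunk is to store it, together with the two noun phrases it connects, as character spans over the sentence. The sketch below is a hypothetical representation for illustration, not the annotation format of the task corpora; `Relation` and `span_of` are names introduced here.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start offset, end offset exclusive)


def span_of(sentence: str, chunk: str) -> Span:
    """Locate the first occurrence of a chunk in the sentence."""
    start = sentence.find(chunk)
    if start < 0:
        raise ValueError(f"{chunk!r} does not occur in the sentence")
    return (start, start + len(chunk))


@dataclass(frozen=True)
class Relation:
    """Two noun phrases and the descriptor linking them, all as spans of one sentence."""
    sentence: str
    arg1: Span
    descriptor: Span
    arg2: Span

    def text(self, span: Span) -> str:
        return self.sentence[span[0]:span[1]]

    def as_triple(self) -> Tuple[str, str, str]:
        return (self.text(self.arg1), self.text(self.descriptor), self.text(self.arg2))


# Invented sentence, for illustration only.
sentence = "A Maria trabalha na universidade."
relation = Relation(
    sentence,
    arg1=span_of(sentence, "A Maria"),
    descriptor=span_of(sentence, "trabalha na"),
    arg2=span_of(sentence, "universidade"),
)
triple = relation.as_triple()
```

Keeping spans rather than bare strings ties every part of the relation back to the sentence, which reflects the requirement that the descriptor be an explicit chunk of the text. Note that "universidade" is a common noun phrase, not a named entity, which is the kind of argument Task 3 admits and Task 2 does not.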

The coordinators will be responsible for:

Providing examples (seeds);
Providing test datasets;
Evaluating results;
Reviewing working notes;
Camera ready submissions.

The participants will be responsible for:

Testing data;
Delivering the results;
Delivering working notes.

Activities

This RE task consists of the following steps:

Systems Development Phase: In this phase, the coordinators will make a small annotated dataset (seeds) available for the participants’ use in developing their RE systems;
Test Phase: The test phase includes two options for participants:
- Test 1: For this test, participants must extract relation descriptors between NP pairs from data provided by the coordinators. This data will already be annotated with NP information when provided, and as such will not necessitate the application of a NER system by participants;
- Test 2: For this test, the data provided will not be annotated with NP information. As such, participants must first extract and classify the NPs from the test sentences, and then extract the relation descriptors between pairs of the recognized NPs;
Evaluation Phase: In this phase the participants will send their results from the Test Phase. They may submit results from Test 1, Test 2 or both for evaluation by the coordinators. Afterwards, the analyzed results will be sent back to the participants. The metrics used in the evaluation phase will be Precision, Recall and F-measure.
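In Test 2, relation descriptors are sought between pairs of the NPs a system has recognized. A minimal sketch of that pairing step is shown below, assuming each NP is given as a (start offset, text) tuple and that pairs are considered in textual order; both choices are assumptions of this example, and `candidate_pairs` is a hypothetical helper.

```python
from itertools import combinations
from typing import List, Tuple

NounPhrase = Tuple[int, str]  # (start offset in the sentence, NP text)


def candidate_pairs(noun_phrases: List[NounPhrase]) -> List[Tuple[str, str]]:
    """Return every pair of NPs, with the NP that appears first in the text on the left."""
    ordered = sorted(noun_phrases)  # tuples sort by start offset first
    return [(left[1], right[1]) for left, right in combinations(ordered, 2)]


# NPs for the invented sentence "A Maria trabalha na universidade em Porto Alegre."
recognized = [(20, "universidade"), (0, "A Maria"), (36, "Porto Alegre")]
pairs = candidate_pairs(recognized)
```

For three recognized NPs this yields three candidate pairs; a system would then decide, for each pair, whether the sentence contains a descriptor relating them.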

Schedule

Below is the activity schedule for this task:

Activities	Dates	Responsible Party	Resources
Release of Examples	18/03	Coordinators	Task 3 - Examples
Release of Test Data	01/04	Coordinators	Task 3 - Test Corpora
Test Phase	01/04 to 06/05	Participants
Results Delivery (Test 1 and/or Test 2)	06/05 to 20/05	Participants	Task 3 - Results Submission
Evaluation	20/05 to 10/06	Coordinators	Task 3 - Rectification Results
Working Notes Delivery	10/06 to ~~24/06~~ 27/06	Participants	Paper Publication Instructions
Working Notes Review	~~24/06~~ 28/06 to 01/07	Coordinators
Camera Ready Submissions	03/07	Coordinators
IberLEF Workshop	24/09

Registration

Registration for all tasks is now CLOSED.

Organizers

- Grupo de Processamento de Linguagem Natural da PUCRS – PLN-PUCRS
  Bernardo Consoli
  Fabio Moreira Freitas Da Silva
  Joaquim Santos
  Juliano Terra
  Renata Vieira
  Sandra Collovini
- Departamento de Informática – Universidade de Évora
  Paulo Quaresma
- Grupo de Formalismo e Aplicações Semânticas (FORMAS) – UFBA
  Clarissa Castellã Xavier
  Daniela Barreiro Claro
  Marlo Souza
  Rafael Glauber

Contact

E-mail