This repository contains code that supports several parts of our automatic event detection pipeline. Most of it processes our data into the formats needed for different experiments. It also contains small preliminary experiments, and their documentation, that we used mainly for orientation. This repo is still being changed and updated. Finished experiments and pipeline modules live in several separate repos, namely:
- repo for our lexical approach to event detection
- repo for our paper on annotation strategies
- repo introducing our cross-validation on document-level approach
- repo for building and evaluating binary event detection models
Some folders in this repo have their own README for further information.
Probably the most important folder here. It contains code that converts data exported from INCEpTION in cas_xmi format to different file formats for different tasks. Currently these are JSON for event detection and event classification, and CoNLL-U for semantic role labelling.
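To give a rough idea of the two target formats, here is a minimal sketch that is not taken from the code in this folder. It assumes a sentence has already been read from the export as a list of tokens with one event label per token. The function names, the JSON field names and the choice to store the label in the CoNLL-U MISC column are all hypothetical.

```python
import json


def to_json_record(doc_id, tokens, labels):
    """Serialise one sentence as a JSON string with token-level event labels."""
    # Field names are hypothetical, chosen for illustration only.
    return json.dumps(
        {"doc_id": doc_id, "tokens": tokens, "labels": labels},
        ensure_ascii=False,
    )


def to_conllu(tokens, labels):
    """Render one sentence as CoNLL-U: ten tab-separated columns per token."""
    lines = []
    for index, (token, label) in enumerate(zip(tokens, labels), start=1):
        # Assumption: the event label goes into MISC (column 10); "O" means no event.
        misc = f"Event={label}" if label != "O" else "_"
        columns = [str(index), token, "_", "_", "_", "_", "_", "_", "_", misc]
        lines.append("\t".join(columns))
    # A blank line terminates each sentence in CoNLL-U.
    return "\n".join(lines) + "\n\n"


tokens = ["The", "ship", "sailed"]
labels = ["O", "O", "EVENT"]
json_line = to_json_record("doc1", tokens, labels)
conllu_block = to_conllu(tokens, labels)
```

Unused CoNLL-U columns are filled with `_`, so the output keeps the ten-column layout even when only the token and its label are known.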
Converts files exported from INCEpTION in XMI 1.0 to JSON and back, using adapted code by Sophie Arnoult, and pre-annotates them using a lexicon.
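Lexicon-based pre-annotation can be pictured roughly as follows. This is a simplified sketch, not the code in this folder: it assumes the lexicon is a plain mapping from lowercased word forms to event types, and the lexicon entries and the `pre_annotate` helper are hypothetical.

```python
def pre_annotate(tokens, lexicon):
    """Return one label per token: the lexicon's event type, or "O" if absent."""
    # Assumption: matching is done on single lowercased tokens only.
    return [lexicon.get(token.lower(), "O") for token in tokens]


# Hypothetical lexicon entries, for illustration only.
lexicon = {"sailed": "Leaving", "attacked": "Attacking"}
tokens = ["The", "ship", "sailed", "at", "dawn"]
labels = pre_annotate(tokens, lexicon)
```

Annotators can then correct these suggested labels in INCEpTION instead of starting from an empty document.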
This folder stores code and data for an inter-annotator agreement (IAA) analysis of event mention detection and classification. For a more thoroughly documented overview of our IAA data and analysis, see our separate repo on this. annotated_data_processing_for_IAA also contains data and information on how we adjudicate our annotated data to create test data.
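As a rough illustration of what an agreement score looks like, the sketch below computes Cohen's kappa over token-level labels from two annotators. Cohen's kappa is used here as a common choice; it is an assumption that it matches the measures in our analysis, and the `cohen_kappa` helper and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty token sequence")
    n = len(labels_a)
    # Observed agreement: share of tokens with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


annotator_1 = ["EVENT", "O", "O", "EVENT"]
annotator_2 = ["EVENT", "O", "EVENT", "EVENT"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

The two annotators above agree on three of four tokens, while half of the tokens would be expected to match by chance.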
Contains thoroughly documented experiments performed in the orientation phase of establishing our annotation strategy.
Contains several versions of a simplified version of our event ontology in Turtle format.
Contains some results from an orientation phase of fine-tuning models for binary event detection.