This repository contains the scripts and data used to tag lung-specific genes in a set of abstracts. It is divided into five parts:
- Data-sets: raw and processed data for 456 abstracts
- Codes: scripts to process the data and to train and test the model using conditional random fields
  - Includes the three versions of the training code and the script that evaluates the effect of each function
- Models: models generated with each version of the code
- Reports: reports and statistics for each model
- Testing-outputs: output generated by the training scripts

Authors:
- Lucia Ramirez Navarro
- Gilberto Durán Bishop
- Luis Fernando Altamirano