Identifying gene mentions in text using Conditional Random Fields

This repository contains the scripts and data used to tag lung-specific genes in a set of abstracts. It is divided into five parts:

  • Data-sets: raw and processed data for 456 abstracts
  • Codes: scripts that process the data and train and test the model using conditional random fields
    • Includes the three versions of the training script and a script to evaluate the effect of each function
  • Models: the models generated by each version of the code
  • Reports: reports and statistics for each model
  • Testing-outputs: output generated by the training scripts
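
As a rough illustration of the kind of input a conditional random field tagger works with, the sketch below turns a tokenized sentence into one feature dictionary per token, paired with BIO labels. Everything in it is an assumption made for illustration: the feature names, the `B-GENE`/`O` label scheme, the example sentence, and the helper names `token_features` and `sentence_features` are hypothetical and are not taken from the scripts in Codes. The `disabled` argument shows one simple way the effect of each feature function could be measured, by retraining with one function switched off.

```python
import re

# Hypothetical feature functions (assumed for illustration; the repository's
# actual feature set may differ). Each maps a token string to a feature value.
FEATURE_FUNCTIONS = {
    "lower": lambda tok: tok.lower(),
    "has_digit": lambda tok: any(ch.isdigit() for ch in tok),
    "is_upper": lambda tok: tok.isupper(),
    "suffix3": lambda tok: tok[-3:].lower(),
    # Word shape: digits -> 0, uppercase -> A, lowercase -> a (e.g. "p53" -> "a00").
    "shape": lambda tok: re.sub(
        r"[a-z]", "a", re.sub(r"[A-Z]", "A", re.sub(r"\d", "0", tok))
    ),
}


def token_features(tokens, i, disabled=()):
    """Build the feature dict for tokens[i], skipping any disabled functions."""
    tok = tokens[i]
    feats = {
        name: fn(tok)
        for name, fn in FEATURE_FUNCTIONS.items()
        if name not in disabled
    }
    # Context features: neighbouring tokens, with sentence-boundary markers.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def sentence_features(tokens, disabled=()):
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i, disabled) for i in range(len(tokens))]


# Made-up example sentence with BIO labels (one label per token).
tokens = ["Mutations", "in", "EGFR", "are", "common", "in", "lung", "adenocarcinoma", "."]
labels = ["O", "O", "B-GENE", "O", "O", "O", "O", "O", "O"]

features = sentence_features(tokens)
# Ablation variant: the same sentence with the "shape" function switched off.
features_no_shape = sentence_features(tokens, disabled=("shape",))
```

A sequence of per-token dictionaries paired with a label sequence is a layout that CRF toolkits commonly accept for training; whether the scripts here use that exact layout is not stated in this README.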

Authors:

  • Lucia Ramirez Navarro
  • Gilberto Durán Bishop
  • Luis Fernando Altamirano