This repository contains code and data to reproduce the results of our research project about Self-Admitted Technical Debt in Pull Requests for the course Evidence-Based Software Engineering.
$ pip install -U requests lizard scipy scikit-learn
$ cd Preprocessing
$ python preprocess.py
$ cp new.csv ../QualitativeAnalysis/Round1/sampling_input.csv
$ cd ../QualitativeAnalysis/Round<x>
$ python ../subsample.py
$ cp nonsampled.csv ../Round<x+1>/sampling_input.csv
$ cd ../Round3
In subsample.py, set NUM_SAMPLES to 172 (one third of the total).
Then, sample for each of the researchers (Lars, Germán, Koen):
$ python ../subsample.py
Use the nonsampled.csv output as the input file for the next researcher.
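The chained sampling described above can be sketched as follows. This is a minimal illustration, not the contents of subsample.py: the helper `split_sample`, the fixed seed, and the placeholder rows are assumptions, and the pool size of 3 × 172 is inferred from the "one third of the total" remark.

```python
import random

NUM_SAMPLES = 172  # value given in the text above


def split_sample(rows, num_samples, seed=None):
    """Split rows into (sampled, nonsampled), drawing without replacement.

    Hypothetical helper; the real subsample.py may work differently.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(rows)), num_samples))
    sampled = [row for i, row in enumerate(rows) if i in chosen]
    nonsampled = [row for i, row in enumerate(rows) if i not in chosen]
    return sampled, nonsampled


# Placeholder pool standing in for the rows of sampling_input.csv.
pool = [f"row-{i}" for i in range(3 * NUM_SAMPLES)]

samples = {}
for researcher in ("Lars", "German", "Koen"):
    # The rows left over after one researcher's draw become the
    # input for the next researcher, mirroring nonsampled.csv.
    samples[researcher], pool = split_sample(pool, NUM_SAMPLES, seed=0)

print({name: len(rows) for name, rows in samples.items()}, len(pool))
```

Because each draw removes its rows from the pool, no row can be assigned to more than one researcher.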
Set NUM_SAMPLES to 17. Then sample from Round3/sampled_{Lars,German,Koen}.csv:
$ cd ../Round4
$ python ../subsample.py
$ cd Agreement
$ python kappa.py
$ cd ../../QuantitativeAnalysis
$ python codechanges.py
$ cd Languages
$ python languages.py
$ cd ../Data_analysis
$ python merge_categories.py
$ python generate_metrics.py
$ cd ../Statistical_Significance
$ python levene.py
$ python ttest.py
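For readers unfamiliar with the last two steps, the sketch below shows one common way a Levene test and a t-test fit together, using scipy (installed in the first step). It is an illustration under assumptions, not the code in levene.py or ttest.py: the data values are made up, the 0.05 threshold is assumed, and using the Levene result to choose between Student's and Welch's t-test is a conventional workflow that this repository may or may not follow.

```python
from scipy import stats

ALPHA = 0.05  # assumed significance level

# Made-up metric values for two hypothetical groups of pull requests.
group_a = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 17.0, 13.0]
group_b = [22.0, 5.0, 31.0, 8.0, 40.0, 3.0, 27.0, 12.0]

# Levene's test: the null hypothesis is that both groups have equal variance.
levene_stat, levene_p = stats.levene(group_a, group_b)
equal_var = bool(levene_p >= ALPHA)

# Student's t-test if the variances look equal, Welch's t-test otherwise.
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

print(f"Levene: W={levene_stat:.3f}, p={levene_p:.3f}, equal_var={equal_var}")
print(f"t-test: t={t_stat:.3f}, p={t_p:.3f}")
```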