linuswagner/literatureStudy

Literature Study

This is the replication package for our literature study.

It contains the files we used to generate graphs and to classify elements. This README also describes the accompanying files in Google Drive.

Google Drive

The Google Drive folder contains the following files:

  • Literature Study: contains all raw data for the literature study
    • Query: All papers we identified through our search query. The sheet includes the selection criteria and whether we accepted the paper or not. A legend at the bottom of the sheet explains the row colors. The colors mark papers that we excluded later in the process and are more relevant for us than for you.
    • Snowballing Forward Iteration 1: All papers we identified from the kept papers of the Query sheet through forward snowballing. The sheet includes selection criteria and whether we accepted the paper or not.
    • Snowballing Back Iteration 1: All papers we identified from the kept papers of the Query sheet through backward snowballing.
    • Data Extraction Back: The selection criteria for backward snowballing and whether we accepted the paper or not. We keep this in a separate sheet because the other sheet was too messy.
    • RQ1_Languages: Extracted information for RQ1
    • RQ2_XLLs: Extracted information for RQ2
    • RQ3_Methods: Extracted information for RQ3
    • RQ4_Requirements: Extracted information for RQ4
    • Originally, we also planned to assess the quality of the papers. Therefore, for some papers, Quantitative_Final_Try_Query and Qualitative_Final_Try_Query contain quality information. This information is neither complete nor used in the paper.
  • LiteratureStudyGraph: Graph used to show the high-level process of the literature study
  • rq2.drawio: Graph used to form the categories in the paper. We started out with one bubble per row and then merged the bubbles iteratively.
  • rq3.drawio: Graph used to form the categories in the paper. We started out without colors and then grouped the bubbles iteratively through coloring.
  • rq4.drawio: Graph used to form the categories in the paper. The process was essentially the same as for the others. Note that the tab abstraction contains the graph used in the paper, while the tab classification contains our detailed version.

Note that all IDs in the graphs correspond to the row numbers in the corresponding Literature Study tab.

This repository

General Structure

On the top level, you find Jupyter notebooks that contain the code needed to transform data and generate plots. We describe them in detail below.

The folders next to them serve the following purposes:

  • data: 1:1 export of the RQ1 and RQ4 data from the Literature Study file to be able to process it here
  • generated: data generated by the notebooks. Not modified by the authors. For RQ1, this also contains the other two graphs mentioned in the literature study.
  • annotated_from_generated: Manual classification of the languages based on the generated file in generated/rq1. This data exists only here and not in Google Drive.
  • util: Two scripts used to manipulate generated LaTeX files to our liking (only styling and comments).

RQ1

Order of execution:

  1. rq1_studiedLanguages: Parses the raw file from Google Drive, performs some sanity checks, and outputs a list of XLLs we can use later.
  2. rq1_edgeGeneration: Transforms the XLLs into a properly annotated dataframe. It uses the content of annotated_from_generated for that.
  3. rq1_buildHeatMap: Generates the heatmaps used in the paper.

Note that you can change the variable category in rq1_edgeGeneration to change the grouping of the heatmap created in the next step.
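To illustrate how such a grouping variable can change the heatmap input, here is a minimal sketch. It does not reproduce the notebook code: the annotation table, the dimension names (`paradigm`, `typing`), the example edges, and the helper `count_pairs` are all hypothetical.

```python
from collections import Counter

# Hypothetical annotations: language -> {grouping dimension -> group}.
# The real annotations live in annotated_from_generated.
ANNOTATIONS = {
    "Java": {"paradigm": "object-oriented", "typing": "static"},
    "C": {"paradigm": "procedural", "typing": "static"},
    "Python": {"paradigm": "object-oriented", "typing": "dynamic"},
}

# Hypothetical language pairs ("edges"), one per cross-language link.
EDGES = [("Java", "C"), ("Python", "C"), ("Java", "Python")]


def count_pairs(edges, annotations, category):
    """Count edges per (source group, target group) for the chosen category."""
    counts = Counter()
    for source, target in edges:
        key = (annotations[source][category], annotations[target][category])
        counts[key] += 1
    return counts


# Changing this value regroups the cells of the resulting heatmap.
category = "paradigm"
heatmap_cells = count_pairs(EDGES, ANNOTATIONS, category)
```

Each key of `heatmap_cells` would correspond to one heatmap cell, and the count to its intensity.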

Also note that we originally planned to include some descriptive statistics in the paper. We also wanted to use R to create a hierarchical edge bundling graph instead of the heatmap. Much of the code you see is a remnant of those plans, so there may be many unused files and structures. Sorry for the confusion.

RQ2

This script represents rq2.drawio in Python code. We manually transferred all the IDs from the graph into Python. We then use the script to generate a textual representation of the sets, which we pass to deepvenn.com to generate our Venn diagram.

Note that we played around with the granularity of the diagram. The code therefore again expresses more detail than the paper shows, because we found that Venn diagrams become unreadable with too many details.
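The idea of turning manually transferred IDs into text can be sketched as follows. The category names, the IDs, and the helper `format_sets` are hypothetical, and the output layout (a name line followed by one ID per line) is an assumption, not a documented deepvenn.com input format.

```python
# Hypothetical category sets; each ID stands for a row number in the
# corresponding Literature Study tab.
CATEGORIES = {
    "category_a": {3, 7, 12},
    "category_b": {7, 12, 20},
    "category_c": {12, 25},
}


def format_sets(categories):
    """Render each set as a name line followed by one sorted ID per line."""
    blocks = []
    for name, ids in categories.items():
        lines = [name] + [str(i) for i in sorted(ids)]
        blocks.append("\n".join(lines))
    # Separate the sets with a blank line so they are easy to copy one by one.
    return "\n\n".join(blocks)


text = format_sets(CATEGORIES)
print(text)
```

Coarsening the diagram then amounts to merging entries of `CATEGORIES` before formatting, for example with a set union.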

RQ4

We were too lazy to type out all the requirements in DrawIO. Therefore, we generated the information in a format DrawIO accepts as input, uploaded it there, and built our graph from it.

This notebook just transforms the data.
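A transformation of this kind might look like the sketch below. It targets the CSV import of draw.io, where lines starting with `#` configure how rows become shapes. Whether the notebook uses this exact format is an assumption, and the requirement texts, the column names, and the helper `to_drawio_csv` are hypothetical.

```python
import csv
import io

# Hypothetical requirements: (row ID, short requirement text).
REQUIREMENTS = [
    (4, "Detect broken links"),
    (9, "Support navigation, e.g. jump to definition"),
]


def to_drawio_csv(requirements):
    """Build text for draw.io's CSV import: config lines, header, then rows."""
    buffer = io.StringIO()
    # Configuration lines: the label refers to CSV columns via %column%.
    buffer.write("# label: %id%: %text%\n")
    buffer.write("# style: ellipse;whiteSpace=wrap;html=1;\n")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "text"])
    writer.writerows(requirements)
    return buffer.getvalue()


drawio_input = to_drawio_csv(REQUIREMENTS)
print(drawio_input)
```

Using `csv.writer` rather than manual string joins keeps requirement texts that contain commas intact, because such fields are quoted automatically.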
