Uses web scraping and social network analysis (SNA) to analyze the research of an arbitrary Iranian professor.


tekboart/SNA-thesis


Web Scraping and Social Network Analysis (SNA) to analyze the research interest(s) of arbitrary Iranian university professors

Python igraph Sklearn Pandas Matplotlib seaborn

This repository contains the code (and a sample of the data) for a research project in the field of Social Network Analysis (SNA). We used the requests module (in Python) to scrape information about the theses supervised by a given university professor, then used SNA to analyze different aspects of the research done under that professor.

The research proceeds in the following steps:

  1. Scraping the information of a professor from irandoc.
  2. Conducting Exploratory Data Analysis (EDA) on the crawled data.
  3. Cleaning the data.

    e.g., cleaning keywords and titles, removing unsupported characters and duplicate records, etc.

  4. Converting data into an Adjacency matrix.
  5. Converting the Adjacency matrix into a graph (using igraph).
  6. Analyzing and visualizing the network by calculating different centrality measures.
  7. Performing community detection (CD) to cluster theses into similar groups.

    CD methods: Label Propagation, Eigenvector, Infomap, and Components
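
As a rough illustration of steps 3, 4, and 7, here is a minimal sketch using only the Python standard library. It is not the repository's actual code. It assumes that each node is a thesis and that two theses are linked by the number of keywords they share. That edge definition, the separator characters, the function names, and the sample records are all hypothetical.

```python
import re
from itertools import combinations


def clean_keywords(raw):
    """Split a raw keyword string and normalize each keyword."""
    # Assumed separators: semicolon, comma, and the Persian comma.
    parts = re.split(r"[;,،]", raw)
    cleaned = set()
    for part in parts:
        keyword = re.sub(r"\s+", " ", part).strip().lower()
        if keyword:
            cleaned.add(keyword)
    return cleaned


def build_adjacency(records):
    """Turn (title, raw_keywords) records into (titles, adjacency matrix).

    Duplicate records (same title after normalization) are dropped,
    keeping the first occurrence. Entry [i][j] is the number of
    keywords shared by thesis i and thesis j (an assumed edge weight).
    """
    unique = {}
    for title, raw in records:
        key = re.sub(r"\s+", " ", title).strip().lower()
        if key not in unique:
            unique[key] = (title.strip(), clean_keywords(raw))

    titles = [title for title, _ in unique.values()]
    keyword_sets = [keywords for _, keywords in unique.values()]

    size = len(titles)
    matrix = [[0] * size for _ in range(size)]
    for i, j in combinations(range(size), 2):
        shared = len(keyword_sets[i] & keyword_sets[j])
        matrix[i][j] = matrix[j][i] = shared
    return titles, matrix


def connected_components(matrix):
    """Group node indices into connected components (depth-first search)."""
    size = len(matrix)
    unvisited = set(range(size))
    components = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in range(size):
                if matrix[node][neighbor] > 0 and neighbor in unvisited:
                    unvisited.remove(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


# Hypothetical sample records, including one duplicate title.
sample = [
    ("Thesis A", "Social Network Analysis; Centrality"),
    ("Thesis B", "centrality ,  Web  Scraping"),
    ("thesis a", "duplicate record"),
    ("Thesis C", "Machine Learning"),
]
titles, matrix = build_adjacency(sample)
components = connected_components(matrix)
```

In this sample, "Thesis A" and "Thesis B" share the keyword "centrality" and end up in one component, while "Thesis C" stays isolated. A matrix like this could then be handed to igraph (for example via `Graph.Weighted_Adjacency`) for step 5, with igraph's own centrality and community detection routines replacing the hand-written component search.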

Requirements

Python Pandas

Project Dir Structure

.
├── data
├── images
│   └── logos
├── logs
├── outputs
│   ├── csv
│   ├── json
│   └── plots
│       ├── 1. networks visualization
│       ├── 2. network with centrality measures
│       └── 3. community detection visualization
│           └── gephi
├── reports
└── utils

14 directories

If you have any questions, feel free to contact TekBoArt @tekboart.

License

Shield: CC BY-NC-SA 4.0

  • Refer to the file LICENSE for more information regarding the license of this repository.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

