Project Overview

This project was completed as part of the Data Science and AI course at Becode in 2024. For the project, we built a dataset with information on at least 10,000 properties across Belgium. The work has two steps:

  1. Scraping the Immoweb website to gather property data.
  2. Saving the data in CSV format for further processing.

Prerequisites

Make sure you have the following:

  1. Python 3.x installed.
  2. pip for managing Python packages.
  3. The required libraries listed in requirements.txt; install them with the command pip install -r utils/requirements.txt

Usage

This script will:

  1. Retrieve a list of properties from the HTML page source of the website.
  2. Extract each property's information from Immoweb.
  3. Save the output in a CSV file that can be used for further analysis.
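
The first and last steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it uses the standard library's html.parser and csv modules as a stand-in for Requests and BeautifulSoup so that it runs without extra packages, and the sample HTML, the example.com URLs, the "/classified/" marker, and the names LinkCollector, extract_links, and links_to_csv are all assumptions made for the example.

```python
import csv
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that look like property pages."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker  # substring a property URL is assumed to contain
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep each matching link once, preserving page order.
        if href and self.marker in href and href not in self.links:
            self.links.append(href)


def extract_links(html, marker="/classified/"):
    """Return the unique property links found in a page source."""
    parser = LinkCollector(marker)
    parser.feed(html)
    return parser.links


def links_to_csv(links):
    """Serialize the links as CSV text with a single 'url' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    writer.writerows([link] for link in links)
    return buffer.getvalue()


# Made-up page source standing in for a search results page.
SAMPLE_HTML = """
<ul>
  <li><a href="https://example.com/classified/house/1001">House</a></li>
  <li><a href="https://example.com/classified/apartment/1002">Apartment</a></li>
  <li><a href="https://example.com/classified/house/1001">House (duplicate)</a></li>
  <li><a href="https://example.com/about">About</a></li>
</ul>
"""

links = extract_links(SAMPLE_HTML)
print(links_to_csv(links))
```

In a real run the page source would come from an HTTP request rather than a string, and the CSV text would be written to a file such as property_links.csv.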

Structure

The project has the following core components:

  1. utils: a directory containing the data files property_links.csv and all_properties_output.csv

  2. main.py: the entry point; run the project with python main.py

    - fetch_links(): Uses Requests and BeautifulSoup to get a list of property URLs.
    - get_property_data(): Uses BeautifulSoup to scrape each property's data from the list of URLs and saves it to a CSV file.
    - clean_save_dataset(): Uses Pandas to clean the dataset and saves it to another CSV file.

  3. requirements.txt: contains the list of dependencies for the project.
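
As an illustration of the kind of cleaning that clean_save_dataset() is described as doing with Pandas, here is a minimal sketch. It is not the project's implementation: the column names (url, price, locality), the sample rows, the helper name clean_dataset, and the specific cleaning rules (dropping duplicate URLs and rows without a numeric price) are assumptions chosen for the example.

```python
import io

import pandas as pd

# Made-up scraped rows; the column names are assumptions for illustration.
RAW_CSV = """url,price,locality
https://example.com/classified/house/1001,350000,Gent
https://example.com/classified/house/1001,350000,Gent
https://example.com/classified/apartment/1002,,Brussels
https://example.com/classified/house/1003,275000,Leuven
https://example.com/classified/house/1004,on request,Namur
"""


def clean_dataset(raw):
    """Drop duplicate listings and rows without a usable price."""
    cleaned = raw.drop_duplicates(subset="url").copy()
    # Coerce price to a number; values that cannot be parsed become NaN.
    cleaned["price"] = pd.to_numeric(cleaned["price"], errors="coerce")
    cleaned = cleaned.dropna(subset=["price"])
    return cleaned.reset_index(drop=True)


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean_dataset(raw)

# to_csv without a path returns the CSV text; pass a file path to save it.
cleaned_csv = cleaned.to_csv(index=False)
print(cleaned_csv)
```

Coercing with errors="coerce" before dropping missing values means that text entries such as "on request" are treated the same way as empty cells, which keeps the price column numeric for later analysis.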

Contributors

This project was done by:

  1. Tumi
  2. Karthika
  3. Fatemeh