Open Online Course Recommender

This repository contains code for an Open Online Course recommender system. The recommender suggests online courses based on the content of the courses using natural language processing techniques.

Project Overview

The Open Online Course Recommender is designed to help users discover suitable online courses by analyzing the content of course descriptions, summaries, instructors, and subjects. The recommender uses a combination of text processing, vectorization, and cosine similarity to provide relevant course recommendations.

Natural Language Processing Techniques

The Open Online Course Recommender employs various natural language processing (NLP) techniques to process and analyze course data:

Text Preprocessing: The course descriptions, summaries, instructors, and subjects are preprocessed by tokenizing the text into individual words. This step helps in breaking down the text into its fundamental units for further analysis.
Stemming:: Stemming is applied to the words to reduce them to their root forms. This helps in reducing variations of words to a common form, improving the efficiency of subsequent analyses.
Vectorization: The text data is converted into numerical vectors using techniques like Count Vectorization. This process represents each course as a vector of word frequencies, allowing mathematical operations to be performed on the text data.
Cosine Similarity: Cosine similarity is used to measure the similarity between courses based on their vectorized representations. Courses with similar content will have a higher cosine similarity score, indicating their relevance to each other.

Dataset

The dataset used in this project is sourced from Kaggle. The dataset contains information about 976 courses that are currently available on the edx.org platform. It provides insights into various online university-level courses across different disciplines.

About the Dataset

Context: edX is a massive open online course (MOOC) provider founded by Harvard and MIT. It hosts a wide range of online university-level courses in different disciplines.
Content: The dataset contains information about 976 courses that are currently available on the edx.org platform.

Web Application

The project also has a web application using Streamlit that allows you to interact with the Open Online Course Recommender in a user-friendly way. The web app allows you to select a course and receive recommendations for similar courses based on the course tags.

Discussion

In the process of developing the Open Online Course Recommender, I modified the recommendation algorithm to provide more focused results within the same subject area. The recommend function was modified as follows:

def recommend(mooc_title):
    index = newdf[newdf['title'].str.lower() == mooc_title.lower()].index[0]
    distances = sorted(enumerate(similarity[index]), reverse=True, key=lambda x: x[1])
    selected_course = newdf.loc[index]
    
    # Filter recommended courses by the same subject as the selected course
    similar_courses = [
        newdf.iloc[i[0]] for i in distances[1:6]
        if selected_course.displayed_subject in newdf.iloc[i[0]].displayed_subject
    ]
    
    return similar_courses

However, it was observed that this modification led to limited recommendations. This limitation is attributed to the relatively small size of the dataset, resulting in a scarcity of courses within the same subject area. Consequently, this leads to a reduced number of recommendations. This situation underscores the critical importance of utilizing a diverse and comprehensive dataset to achieve higher accuracy in recommendations.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
artifacts		artifacts
assets		assets
src		src
Course_data.csv		Course_data.csv
README.md		README.md
WebApp.py		WebApp.py
design.css		design.css
mooc recommender system.ipynb		mooc recommender system.ipynb
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Open Online Course Recommender

Project Overview

Natural Language Processing Techniques

Dataset

About the Dataset

Web Application

Discussion

About

Uh oh!

Releases

Packages

Uh oh!

Languages

fizamusthafa/mooc-recommender-nlp

Folders and files

Latest commit

History

Repository files navigation

Open Online Course Recommender

Project Overview

Natural Language Processing Techniques

Dataset

About the Dataset

Web Application

Discussion

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages