- NLP_Project (Home)
- NLP
- RecommEng
  - NLP_RecommEng.ipynb --> Jupyter Notebook showing text analytics (NLP preprocessing, LDA topic modelling, etc.) and recommender engine designs.
  - NLP_Repord.pdf --> Report explaining the rationale and results of the recommender engines explored.
- Web Scraping
  - Selenium_Webscrape.pdf --> Report explaining the rationale for scraping White House Press Briefings, plus the web scraper design spec.
  - Selenium_Webscrape.ipynb --> Jupyter Notebook for the web scraper design.
  - white_house_news.xlsx --> Text data scraped from White House Press Briefings.
- RecommEng
- NLP
I designed and evaluated two types of recommender engine: collaborative filtering (CF) and content-based. For each type, I explored three variations to see which would best recommend university courses and reading titles. To do this, I applied NLP techniques including stop-word removal, stemming, lemmatisation, and n-gram dependency grammar, as well as topic modelling using Latent Dirichlet Allocation (LDA). The recommender systems were assessed quantitatively using descriptive statistics of the cosine similarity scores of recommendations above a 0.4 threshold.
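To make the evaluation idea concrete, here is a minimal sketch of a content-based recommender that keeps only recommendations whose cosine similarity exceeds 0.4 and then summarises those scores. It is not the notebook's implementation: the raw term-count vectors, the tiny stop-word list, the catalogue entries, and the function names (`tokenize`, `recommend`, `summarise`) are all assumptions made for illustration. Only the 0.4 threshold and the use of cosine similarity come from the description above.

```python
import math
import statistics
from collections import Counter

# Hypothetical stop-word list; the notebook's actual list is not shown here.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query, catalogue, threshold=0.4):
    """Return (title, score) pairs scoring above the threshold, best first."""
    q = Counter(tokenize(query))
    scored = [
        (title, cosine_similarity(q, Counter(tokenize(desc))))
        for title, desc in catalogue.items()
    ]
    kept = [(title, score) for title, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def summarise(scores):
    """Descriptive statistics of the retained similarity scores."""
    if not scores:
        return {"count": 0}
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Made-up catalogue, for illustration only.
catalogue = {
    "Intro to Machine Learning": "introduction to machine learning and statistical models",
    "Medieval History": "history of medieval europe",
    "Deep Learning": "deep learning neural networks machine learning",
}

recs = recommend("machine learning models", catalogue)
stats = summarise([score for _, score in recs])
print(recs)
print(stats)
```

Comparing `stats` across engine variations (for example, the mean score or the number of recommendations that clear the threshold) gives one simple way to rank them, under the assumptions stated above.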
A Selenium WebDriver scraper was designed to collect text from the official US White House Press Briefings website. The aim was to gather data for future text analytics and/or topic modelling that could help evaluate the administration's priorities and agenda.