
Justinbenfit23/podium_url_d2v


A web scraper and doc2vec project that identifies company similarity based on website text data. I have included a requirements.txt file listing the dependent libraries and their versions, so as long as you have Python installed you should be able to open this repo in your IDE and run `pip install -r requirements.txt` from the root level.

The datasets are only a small sample of the originals so that they fit on GitHub. Feel free to add a larger list of URLs to train the model and make it more robust and accurate.
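
As a rough illustration of the scrape-then-embed idea, here is a minimal sketch of how page HTML might be turned into token lists before training. It is not the code in this repo: the names `VisibleTextParser`, `html_to_tokens`, and `build_corpus` are hypothetical, and it uses only the Python standard library on HTML that has already been downloaded.

```python
from html.parser import HTMLParser
import re


class VisibleTextParser(HTMLParser):
    """Collects text that is not inside <script> or <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_tokens(html):
    """Hypothetical helper: lowercase alphanumeric tokens from visible text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return re.findall(r"[a-z0-9]+", text)


def build_corpus(pages):
    """Hypothetical helper: map {url: html} to a list of (url, tokens) pairs."""
    return [(url, html_to_tokens(html)) for url, html in pages.items()]
```

If gensim is among the installed requirements, each `(url, tokens)` pair could then be wrapped as `TaggedDocument(words=tokens, tags=[url])` and passed to `Doc2Vec` for training. Whether this repo tags documents by URL is an assumption here.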
