This repository was archived by the owner on Jul 23, 2020. It is now read-only.

Experimenting autoencoder based approaches for NPM insights #2002

Closed
7 of 11 tasks
rootAvish opened this issue Jan 24, 2018 · 2 comments

Comments

rootAvish commented Jan 24, 2018

User Story

As an OSIO/IDE-extensions user I should be able to pass in a stack so that I can get companion/alternate and outlier insights for its components.

Description

The previous experiment toward this goal made it clear early on that simply porting the existing model to a different library or framework is not a good way to handle large ecosystems. Because so much data is available for the NPM ecosystem, we can instead consider deep learning approaches, which generally lead to higher prediction accuracy. We started researching such models, and the research is documented in this document.

This spike and the related issue cover the autoencoder-based approaches (CVAE and supervised autoencoder learning). The task list here is incomplete on its own; the complementary tasks are tracked in the related issue.

Task List

  • Implement the custom layer required for the supervised autoencoder approach
  • Implement the custom loss function required for the supervised autoencoder approach
  • Set up the CVAE code on article recommendation to get a feel for the workflow
  • Change the CVAE code so that it can accommodate our data
  • Fit the CVAE model to the NPM data, in the form of the content matrix and the rating matrix created in the related issue
  • Once the framework for the supervised autoencoder is coded in the related issue, put everything together and fit the data to it
  • Document the findings
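
For orientation, the rating matrix and content matrix mentioned above could be laid out roughly as follows. This is only a sketch under assumptions the issue does not confirm: rows of the rating matrix are taken to be stacks (manifests), columns are packages, and the content matrix is a bag of keyword tags per package. The toy `stacks` and `package_tags` data and all function names are hypothetical, and the actual matrices are built in the related issue.

```python
# Hypothetical toy data: each stack is the package list of one manifest.
stacks = [
    ["express", "lodash", "body-parser"],
    ["react", "lodash"],
    ["express", "body-parser"],
]
# Hypothetical keyword tags per package (assumed source of "content").
package_tags = {
    "express": ["web", "server"],
    "body-parser": ["web", "middleware"],
    "lodash": ["utility"],
    "react": ["ui", "web"],
}


def build_index(names):
    """Map each distinct name to a stable column/row index."""
    return {name: i for i, name in enumerate(sorted(set(names)))}


def build_rating_matrix(stacks, package_index):
    """Binary stack x package matrix: 1 if the stack uses the package."""
    matrix = [[0] * len(package_index) for _ in stacks]
    for row, stack in enumerate(stacks):
        for pkg in stack:
            matrix[row][package_index[pkg]] = 1
    return matrix


def build_content_matrix(package_tags, package_index, tag_index):
    """Binary package x tag matrix: 1 if the package carries the tag."""
    matrix = [[0] * len(tag_index) for _ in package_index]
    for pkg, tags in package_tags.items():
        for tag in tags:
            matrix[package_index[pkg]][tag_index[tag]] = 1
    return matrix


package_index = build_index(p for s in stacks for p in s)
tag_index = build_index(t for tags in package_tags.values() for t in tags)
rating_matrix = build_rating_matrix(stacks, package_index)
content_matrix = build_content_matrix(package_tags, package_index, tag_index)
```

With real NPM data both matrices would be very sparse, so a sparse representation would likely replace the nested lists used here.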

Optimizations

  • Code the evaluation metrics for accuracy
  • Tune the learning rate
  • Tune the number of latent factors (hidden layer nodes)
  • Tune the momentum factor and dropout regularization
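
The issue does not say which accuracy metrics are meant. As one possible sketch, recall@k is a common choice for recommenders trained on implicit feedback: hide some packages from each stack, rank the remaining packages by predicted score, and check how many hidden packages appear in the top k. The function name, arguments, and sample values below are hypothetical.

```python
def recall_at_k(scores, held_out, known, k):
    """Fraction of held-out packages that appear in the top-k predictions.

    scores:   predicted score per package index (list of floats)
    held_out: set of package indices hidden from the model
    known:    set of package indices given as input (excluded from ranking)
    """
    if not held_out:
        return 0.0
    # Only packages the model was not shown are eligible recommendations.
    candidates = [i for i in range(len(scores)) if i not in known]
    top_k = sorted(candidates, key=lambda i: scores[i], reverse=True)[:k]
    return len(held_out.intersection(top_k)) / len(held_out)


# Hypothetical scores for five packages; package 0 was part of the input.
scores = [0.9, 0.1, 0.8, 0.4, 0.7]
recall_2 = recall_at_k(scores, held_out={2, 3}, known={0}, k=2)
recall_3 = recall_at_k(scores, held_out={2, 3}, known={0}, k=3)
```

Averaging this value over a held-out set of stacks gives a single number that can be compared across learning rates, latent factor counts, and dropout settings.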

Related issue: (#2004)

EPIC: #1809

@rootAvish (Author)

/cc @sivaavkd @krishnapaparaju - this is the issue for the work I'll be doing around the autoencoder approaches. The HPF work will be documented in a separate issue.

@rootAvish (Author)

The CVAE approach worked, and its results are documented here: https://docs.google.com/spreadsheets/d/1OKq1BvPHUzKwYrL8XFWZAgkKC7jcaW0QyY0vleq0wV0/edit?usp=sharing. We therefore did not go forward with implementing the supervised autoencoder approach. Work on post-filter business logic for the CVAE approach will be carried out in this sprint.
