topicModelling.py : Add script arguments #6

Open
LaChapeliere opened this issue Jun 4, 2020 · 1 comment
Labels
enhancement New feature or request on-hold Issue needs another issue to be solved first

Comments

@LaChapeliere
Contributor

splitDate, topicNumbersToTry and the LDA args (chunksize, passes, iterations and eval_every) should become script arguments.
splitDate should not have a default. It might be necessary to also add a date format argument so users can specify the format of splitDate.
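
A rough sketch of what this could look like with `argparse`. The flag spellings, the `--dateFormat` name, the `parse_args` helper and every default value below are assumptions for illustration, not what topicModelling.py currently uses. Only `splitDate` is required, since it should not have a default.

```python
import argparse
from datetime import datetime


def parse_args(argv=None):
    """Parse the proposed script arguments (hypothetical names and defaults)."""
    parser = argparse.ArgumentParser(description="Topic modelling over LDA")

    # splitDate has no default, so the user must always provide it.
    parser.add_argument("--splitDate", required=True,
                        help="date at which to split the dataset")
    # Assumed extra argument: lets the user describe how splitDate is written.
    parser.add_argument("--dateFormat", default="%Y-%m-%d",
                        help="strptime-style format of splitDate")
    # One or more topic counts, e.g. --topicNumbersToTry 5 10 15
    parser.add_argument("--topicNumbersToTry", type=int, nargs="+",
                        default=[5, 10, 15],
                        help="numbers of topics to try")

    # LDA args named in this issue; the defaults are placeholders.
    parser.add_argument("--chunksize", type=int, default=2000)
    parser.add_argument("--passes", type=int, default=1)
    parser.add_argument("--iterations", type=int, default=50)
    parser.add_argument("--eval_every", type=int, default=10)

    args = parser.parse_args(argv)

    # Fail early, with a usage message, if splitDate does not match dateFormat.
    try:
        args.splitDate = datetime.strptime(args.splitDate, args.dateFormat)
    except ValueError as err:
        parser.error(f"could not parse --splitDate: {err}")
    return args
```

With these assumed flags, a call might look like `python topicModelling.py --splitDate 04/06/2020 --dateFormat %d/%m/%Y --topicNumbersToTry 5 10`. Converting `splitDate` inside the parser keeps the rest of the script working with a `datetime` rather than a raw string.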

@LaChapeliere LaChapeliere added the enhancement New feature or request label Jun 4, 2020
LaChapeliere added a commit that referenced this issue Jun 4, 2020
Data formatting + preprocessing + processing + stats extraction pipelines for 4 types of analyses: freq of given words in the dataset over time, classification of racist hate speech (by ethnicity) and freq in the dataset over time, topic modelling over LDA, new words detection and topic modelling using word embedding (word2vec). Some steps are common to several pipelines, others are not
@LaChapeliere
Contributor Author

Implementation can be found in the resiliency_challenge-branch

@LaChapeliere LaChapeliere added the on-hold Issue needs another issue to be solved first label Oct 1, 2020