Amy/package v2 #326
…to amy/package_v2
* Add Examples Notebook (#294) * Urgent fix to remove LIWC lexicons from public repo (#279) * delete small test lexicons * move .pkl files to assets and remove from GH * filesystem cleanup * update certainty pickle location * remove unpickling certainty * remove lexicons from pyproject * change lexical pkl path * add error handling when lexicons are not found * update warning message * add legal caveat and update name of certainty pkl to be correct * ensure lexicons are ignored * Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289) * new docs * lexicons hotfix * emilys doc edits * update deprecated github actions to latest * update last remaining text features * update index * update docs * update index * update docs * update docs and the feature dictionary * add basics.rst * add new basics page * update docs --------- Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> * update torch requirements to resolve compatibility issue on torch end (#290) * Update Website (#291) * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink 
./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team --------- Co-authored-by: amytangzheng <[email protected]> * update documentation for clarity and correct typos in positivity z-score and information exchange and liwc * add demo notebook * update notebook and add information to docs * update documentation --------- Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> * Bump path-to-regexp and express in /website Bumps [path-to-regexp](https://github.com/pillarjs/path-to-regexp) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together. Updates `path-to-regexp` from 0.1.7 to 0.1.10 - [Release notes](https://github.com/pillarjs/path-to-regexp/releases) - [Changelog](https://github.com/pillarjs/path-to-regexp/blob/master/History.md) - [Commits](pillarjs/path-to-regexp@v0.1.7...v0.1.10) Updates `express` from 4.19.2 to 4.21.0 - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md) - [Commits](expressjs/express@4.19.2...4.21.0) --- updated-dependencies: - dependency-name: path-to-regexp dependency-type: indirect - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add Examples Notebook (#294) * Urgent fix to remove LIWC lexicons from public repo (#279) * delete small test lexicons * move .pkl files to assets and remove from GH * filesystem cleanup * update certainty pickle location * remove unpickling certainty * remove lexicons from pyproject * change lexical pkl path * add error handling when lexicons are not found * update warning message * add legal caveat and update name of certainty pkl to be correct * ensure lexicons are ignored * Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289) * new docs * lexicons hotfix * emilys doc edits * update deprecated github actions to latest * update last remaining text features * update index * update docs * update index * update docs * update docs and the feature dictionary * add basics.rst * add new basics page * update docs --------- Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> * update torch requirements to resolve compatibility issue on torch end (#290) * Update Website (#291) * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink 
./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team --------- Co-authored-by: amytangzheng <[email protected]> * update documentation for clarity and correct typos in positivity z-score and information exchange and liwc * add demo notebook * update notebook and add information to docs * update documentation --------- Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> * Bump nltk from 3.8.1 to 3.9 Bumps [nltk](https://github.com/nltk/nltk) from 3.8.1 to 3.9. - [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog) - [Commits](nltk/nltk@3.8.1...3.9) --- updated-dependencies: - dependency-name: nltk dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update pyproject.toml * Update requirements.txt * Update download_resources.py --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add Examples Notebook (#294) * Urgent fix to remove LIWC lexicons from public repo (#279) * delete small test lexicons * move .pkl files to assets and remove from GH * filesystem cleanup * update certainty pickle location * remove unpickling certainty * remove lexicons from pyproject * change lexical pkl path * add error handling when lexicons are not found * update warning message * add legal caveat and update name of certainty pkl to be correct * ensure lexicons are ignored * Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289) * new docs * lexicons hotfix * emilys doc edits * update deprecated github actions to latest * update last remaining text features * update index * update docs * update index * update docs * update docs and the feature dictionary * add basics.rst * add new basics page * update docs --------- Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> * update torch requirements to resolve compatibility issue on torch end (#290) * Update Website (#291) * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink 
./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team --------- Co-authored-by: amytangzheng <[email protected]> * update documentation for clarity and correct typos in positivity z-score and information exchange and liwc * add demo notebook * update notebook and add information to docs * update documentation --------- Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> * Bump body-parser and express in /website Bumps [body-parser](https://github.com/expressjs/body-parser) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together. Updates `body-parser` from 1.20.2 to 1.20.3 - [Release notes](https://github.com/expressjs/body-parser/releases) - [Changelog](https://github.com/expressjs/body-parser/blob/master/HISTORY.md) - [Commits](expressjs/body-parser@1.20.2...1.20.3) Updates `express` from 4.19.2 to 4.21.0 - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md) - [Commits](expressjs/express@4.19.2...4.21.0) --- updated-dependencies: - dependency-name: body-parser dependency-type: indirect - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add Examples Notebook (#294) * Urgent fix to remove LIWC lexicons from public repo (#279) * delete small test lexicons * move .pkl files to assets and remove from GH * filesystem cleanup * update certainty pickle location * remove unpickling certainty * remove lexicons from pyproject * change lexical pkl path * add error handling when lexicons are not found * update warning message * add legal caveat and update name of certainty pkl to be correct * ensure lexicons are ignored * Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289) * new docs * lexicons hotfix * emilys doc edits * update deprecated github actions to latest * update last remaining text features * update index * update docs * update index * update docs * update docs and the feature dictionary * add basics.rst * add new basics page * update docs --------- Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]> * update torch requirements to resolve compatibility issue on torch end (#290) * Update Website (#291) * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink 
./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team --------- Co-authored-by: amytangzheng <[email protected]> * update documentation for clarity and correct typos in positivity z-score and information exchange and liwc * add demo notebook * update notebook and add information to docs * update documentation --------- Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]> * update check embeddings with tqdm loading bar and BERT tokenization update * (1) allow BERT sentiments to be generated from the messages with punctuation, rather than the preprocessed messages; (2) batch BERT sentiment generation to make it more efficient; (3) add loading bar for generation of chat-level features --------- Co-authored-by: Shruti Agarwal <[email protected]> Co-authored-by: amytangzheng <[email protected]>
* website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink ./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team * gh actions and custom domain * deploy to custom url * deploy to custom url * updates to cname * changes to cname * cname updates * testing github actions * updates to github-actions-website * testing github actions * updates to gh actions * updates to github-actions * update home for testing gh actions * updates CNAME * update testing email * updates username/email * updates to email in github-actions-website * testing gh actions for feature_dict * testing github-actions feature_dict * updates to github-actions-feature_dict * Update github-actions-feature_dict.yaml * testing updates to feature_dict.py * testing feature_dict updates * testing updates to feature_dict.py * testing feature_dict deployment * Update github-actions-feature_dict.yaml * testing feature_dict updates * testing updates to feature_dict.py * updates to feature_dict * updates to github actions feature_dict * testing auto updates to feature_dict * Update feature_dict.py * testing 
feature_dict auto updates * testing feature_dict auto updates * Update feature_dict.py * testing feature_dict auto updates * remove commented code in feature_dict.py * Delete src/team_comm_tools/filtered_dict.json delete test json file * Update github-actions-website.yaml to deploy on update to dev * put 'dev' in quotes * Update github-actions-feature_dict.yaml to update upon dev * re-add filtered dict --------- Co-authored-by: Xinlan Emily Hu <[email protected]> Co-authored-by: Xinlan Emily Hu <[email protected]>
* website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * deployed website * copyright and team * team headshots and footer * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * whitespace edits * homepage updates * feature table * website updates * renaming tpm-website to website * deploying via gh-pages * changed from tpm-website to website * edits to the pages * website updates * updated links * updated homepage * link updates * mobile compatibility * mobile adjustments * navbar mobile updates * homepage updates * add table of features * updated team page titles * include flask in requirements.txt * updates to table of features * load pages from top * fix to 404 issues * moved build under website folder * updates to package launch * hyperlink ./setup.sh * fix nav bar sizing and hamburger logo * include preprint * updates to "getting started" * update team * gh actions and custom domain * deploy to custom url * deploy to custom url * updates to cname * changes to cname * cname updates * testing github actions * updates to github-actions-website * testing github actions * updates to gh actions * updates to github-actions * update home for testing gh actions * updates CNAME * update testing email * updates username/email * updates to email in github-actions-website * testing gh actions for feature_dict * testing github-actions feature_dict * updates to github-actions-feature_dict * Update github-actions-feature_dict.yaml * testing updates to feature_dict.py * testing feature_dict updates * testing updates to feature_dict.py * testing feature_dict deployment * Update github-actions-feature_dict.yaml * testing feature_dict updates * testing updates to feature_dict.py * updates to feature_dict * updates to github actions feature_dict * testing auto updates to feature_dict * Update feature_dict.py * testing 
feature_dict auto updates * testing feature_dict auto updates * Update feature_dict.py * testing feature_dict auto updates * remove commented code in feature_dict.py * Delete src/team_comm_tools/filtered_dict.json delete test json file * Update github-actions-website.yaml to deploy on update to dev * put 'dev' in quotes * Update github-actions-feature_dict.yaml to update upon dev * re-add filtered dict * update packages for website --------- Co-authored-by: amytangzheng <[email protected]>
* Update package-lock.json * Update package.json
…ges" This reverts commit d04037d.
* address #306 * fix hedges reference and update dictionary
Please update the package requirements in requirements.txt and pyproject.toml and make sure the tests pass!
examples/featurize.py
Outdated
@@ -18,6 +18,9 @@
juries_df = pd.read_csv("./example_data/full_empirical_datasets/jury_conversations_with_outcome_var.csv", encoding='utf-8')
csop_df = pd.read_csv("./example_data/full_empirical_datasets/csop_conversations_withblanks.csv", encoding='utf-8')
csopII_df = pd.read_csv("./example_data/full_empirical_datasets/csopII_conversations_withblanks.csv", encoding='utf-8')
test_df = pd.read_csv("C:/Users/amyta/Documents/GitHub/team_comm_tools/tests/data/cleaned_data/test_package_aggregation.csv", encoding='utf-8')
@amytangzheng I think this is a reference to a local path and it needs to be updated!
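One way to avoid the machine-specific path flagged above is to build the path relative to the repository root. The following is only a minimal sketch: the helper name `repo_data_path` is hypothetical, and it assumes the calling script sits one directory below the repo root (as `examples/featurize.py` appears to).

```python
from pathlib import Path


def repo_data_path(script_file: str, *parts: str) -> Path:
    """Build a path relative to the repository root.

    Assumption: the calling script lives one directory below the repo root
    (for example <repo>/examples/featurize.py).
    """
    repo_root = Path(script_file).resolve().parent.parent
    return repo_root.joinpath(*parts)


# Made-up script location for illustration; a real script would pass __file__.
csv_path = repo_data_path(
    "/tmp/team_comm_tools/examples/featurize.py",
    "tests", "data", "cleaned_data", "test_package_aggregation.csv",
)
print(csv_path.as_posix())
```

The resulting `csv_path` could then be handed to `pd.read_csv(csv_path, encoding='utf-8')`, so the example no longer depends on one contributor's home directory.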
self.chat_features = list(itertools.chain(*[feature_dict[feature]["columns"] for feature in self.feature_names if feature_dict[feature]["level"] == "Chat"]))
self.conv_features_base = list(itertools.chain(*[feature_dict[feature]["columns"] for feature in self.feature_names if feature_dict[feature]["level"] == "Conversation"]))
self.conv_features_all = [col for col in self.conv_data if col not in self.orig_data and col != 'conversation_num']
Note: check this. We likely want the last line (`self.conv_features_all`) to appear AFTER we actually generate the features, so moving this line of code up may not work.
UPDATE: fixed
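The ordering concern in this thread can be illustrated with a small sketch. Everything here is a stand-in: the feature dictionary is hypothetical (only the `"columns"` and `"level"` keys come from the snippet above), and plain dicts replace the DataFrames, relying on the fact that iterating a pandas DataFrame and using `in` on it both operate on column labels, just as they operate on keys for a dict.

```python
import itertools

# Hypothetical, heavily simplified feature dictionary.
feature_dict = {
    "Feature A": {"columns": ["a_1", "a_2"], "level": "Chat"},
    "Feature B": {"columns": ["b_1"], "level": "Conversation"},
}
feature_names = list(feature_dict)

# Same flattening pattern as in the reviewed code: collect the columns of
# every selected feature whose level is "Chat".
chat_features = list(itertools.chain(*[
    feature_dict[f]["columns"] for f in feature_names
    if feature_dict[f]["level"] == "Chat"
]))

# Dicts keyed by column name stand in for orig_data and conv_data.
orig_data = {"conversation_num": [1, 2], "message": ["hi", "ok"]}
conv_data = dict(orig_data)


def new_conv_columns(conv_data, orig_data):
    """Columns present in conv_data but not in the original input."""
    return [c for c in conv_data if c not in orig_data and c != "conversation_num"]


before = new_conv_columns(conv_data, orig_data)  # computed before feature generation
conv_data["b_1"] = [0.5, 0.7]                    # stand-in for generating a feature
after = new_conv_columns(conv_data, orig_data)   # computed after feature generation
```

Under these assumptions `before` is empty while `after` contains the generated column, which is why the list has to be built once the features exist.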
:type regenerate_vectors: bool, optional

:param compute_vectors_from_preprocessed: If true, computes vectors using preprocessed text (that is, with capitalization and punctuation removed). This was the default behavior for v.0.1.3 and earlier, but we now default to computing metrics on the unpreprocessed text (which INCLUDES capitalization and punctuation). Defaults to False.
:type compute_vectors_from_preprocessed: bool, optional
:param custom_vect_path: If provided, features will be generated using custom vectors rather than default SBERT. Defaults to None.
Note: need to update documentation
RESOLVED
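To make the `compute_vectors_from_preprocessed` docstring above concrete, here is a rough sketch of the two text variants it distinguishes. The `preprocess` function is an illustrative stand-in (lowercasing and stripping punctuation, as the docstring describes), not the package's actual preprocessing code, and `text_for_vectors` is a hypothetical helper.

```python
import re


def preprocess(text: str) -> str:
    """Illustrative stand-in: lowercase the text and remove punctuation."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)       # drop punctuation characters
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace


def text_for_vectors(message: str, compute_vectors_from_preprocessed: bool = False) -> str:
    """Pick which version of a message would be passed to the embedding step."""
    if compute_vectors_from_preprocessed:
        return preprocess(message)
    return message


raw_version = text_for_vectors("Great idea, Sam!")
preprocessed_version = text_for_vectors("Great idea, Sam!", compute_vectors_from_preprocessed=True)
```

With the default (`False`), the message keeps its capitalization and punctuation; setting the flag to `True` reproduces the older behavior the docstring attributes to v.0.1.3 and earlier.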
@@ -358,7 +392,24 @@ def __init__(
if not re.match(r"(.*\/|^)output\/", self.output_file_path_user_level):
self.output_file_path_user_level = re.sub(r'/user/', r'/output/user/', self.output_file_path_user_level)

self.vect_path = vector_directory + "sentence/" + ("turns" if self.turns else "chats") + "/" + base_file_name
if custom_vect_path is not None:
Note: it seems like this PR builds in some of the initial infrastructure for custom vectors (document this)
RESOLVED -- custom vector infrastructure has been removed
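Separately from the custom-vector discussion, the path check at the top of the hunk above may be easier to follow in isolation. This sketch reuses the two regular expressions from the diff; the wrapper function `ensure_output_dir` and the sample paths are hypothetical.

```python
import re


def ensure_output_dir(path: str) -> str:
    """Insert an output/ directory before /user/ unless the path already has one."""
    # Matches when "output/" appears at the start of the path or right after a "/".
    if not re.match(r"(.*\/|^)output\/", path):
        path = re.sub(r"/user/", r"/output/user/", path)
    return path


rewritten = ensure_output_dir("./data/user/results.csv")
unchanged = ensure_output_dir("./output/user/results.csv")
no_user_dir = ensure_output_dir("results.csv")
```

One consequence of this pattern is that a path without a `/user/` segment is returned as-is, even though it contains no `output/` directory.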
Pull Request Template:
If you are merging in a feature or other major change, use this template to check your pull request!
Basic Info
What's this pull request about?
My PR Adds or Improves Documentation
If your feature is about documentation, ensure that you check the boxes relevant to you.
Docstrings
Feature Wiki
`.._` and `:`) in the header, as this is important for referencing the feature in the Table of Contents!
General Documentation
`make clean` and `make html` do not generate breaking errors.
My PR is About Adding a New Feature to the Code Repository
Adding Feature to the Feature Dictionary
`feature_dictionary.py` file with an appropriate entry for my feature. Below is a sample entry; I confirm that all fields are accurately filled out.
`### Conversation Level`).
Documentation
Did you document your feature? You should follow the same requirements as above:
Code Basics
`my_feature`, NOT `myFeature` (camel case).
`NAME_features.py`, where NAME is the name of my feature.
`src/features/`.
Testing
`tests/` folder.
The location of my tests is here:
If you check all the boxes above, then you are ready to merge!
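As a rough illustration of the conventions the checklist refers to (snake_case names, a `NAME_features.py` file under `src/features/`, a feature dictionary entry, and a test under `tests/`), here is a hypothetical sketch. The feature name, function, and entry are invented for illustration; only the `"columns"` and `"level"` keys are taken from the code reviewed earlier in this PR, and a real entry likely needs more fields.

```python
# Hypothetical contents of src/features/word_count_features.py
def get_word_count(text: str) -> int:
    """Count whitespace-separated tokens in a message (snake_case name)."""
    return len(text.split())


# Hypothetical feature dictionary entry; real entries may require extra fields.
sample_entry = {
    "Word Count": {
        "columns": ["word_count"],
        "level": "Chat",
    }
}


# Hypothetical test, as it might appear in a file under tests/
def test_get_word_count():
    assert get_word_count("hello there team") == 3
    assert get_word_count("") == 0


test_get_word_count()
```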