|
66 | 66 | </ul>
|
67 | 67 | </li>
|
68 | 68 | <li class="toctree-l3"><a class="reference internal" href="#additional-featurebuilder-considerations">Additional FeatureBuilder Considerations</a></li>
|
| 69 | +<li class="toctree-l3"><a class="reference internal" href="#inspecting-generated-features">Inspecting Generated Features</a><ul> |
| 70 | +<li class="toctree-l4"><a class="reference internal" href="#feature-information">Feature Information</a></li> |
| 71 | +<li class="toctree-l4"><a class="reference internal" href="#feature-column-names">Feature Column Names</a></li> |
| 72 | +</ul> |
| 73 | +</li> |
69 | 74 | </ul>
|
70 | 75 | </li>
|
71 | 76 | </ul>
|
@@ -373,6 +378,53 @@ <h3>Additional FeatureBuilder Considerations<a class="headerlink" href="#additio
|
373 | 378 | </ul>
|
374 | 379 | </div></blockquote>
|
375 | 380 | </section>
|
| 381 | +<section id="inspecting-generated-features"> |
| 382 | +<h3>Inspecting Generated Features<a class="headerlink" href="#inspecting-generated-features" title="Link to this heading"></a></h3> |
| 383 | +<section id="feature-information"> |
| 384 | +<h4>Feature Information<a class="headerlink" href="#feature-information" title="Link to this heading"></a></h4> |
| 385 | +<p>Every FeatureBuilder object exposes a property called <strong>feature_dict</strong>, which holds metadata and references for the features included in the toolkit. Assuming that <strong>jury_feature_builder</strong> is the name of your FeatureBuilder, you can access the feature dictionary as follows:</p> |
| 386 | +<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">feature_dict</span> |
| 387 | +</pre></div> |
| 388 | +</div> |
| 389 | +<p>The keys of this dictionary are the formal feature names, and each value is a dictionary of metadata about that feature or collection of features. A more readable version of this dictionary is also available on our <a class="reference external" href="https://teamcommtools.seas.upenn.edu/HowItWorks">website</a>.</p> |
| 390 | +<p><strong>New in v.0.1.4</strong>: To access a list of the formal feature names that a FeatureBuilder will generate, you can use the <strong>feature_names</strong> property:</p> |
| 391 | +<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">feature_names</span> <span class="c1"># a list of formal feature names included in featurization (e.g., "Team Burstiness")</span> |
| 392 | +</pre></div> |
| 393 | +</div> |
| 394 | +<p>You can also use the <strong>feature_names</strong> property in tandem with the <strong>feature_dict</strong> to learn more about a specific feature; for example, the following code will show the dictionary entry for the first feature in <strong>feature_names</strong>:</p> |
| 395 | +<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">feature_dict</span><span class="p">[</span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">feature_names</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span> |
| 396 | +</pre></div> |
| 397 | +</div> |
| 398 | +<p>Here is some example output (for the RoBERTa sentiment feature):</p> |
| 399 | +<div class="highlight-text notranslate"><div class="highlight"><pre><span></span>{'columns': ['positive_bert', 'negative_bert', 'neutral_bert'], |
| 400 | + 'file': './utils/check_embeddings.py', |
| 401 | + 'level': 'Chat', |
| 402 | + 'semantic_grouping': 'Emotion', |
| 403 | + 'description': 'The extent to which a statement is positive, negative, or neutral, as assigned by Cardiffnlp/twitter-roberta-base-sentiment-latest. The total scores (Positive, Negative, Neutral) sum to 1.', |
| 404 | + 'references': '(Hugging Face, 2023)', |
| 405 | + 'wiki_link': 'https://conversational-featurizer.readthedocs.io/en/latest/features_conceptual/positivity_bert.html', |
| 406 | + 'function': <function team_comm_tools.utils.calculate_chat_level_features.ChatLevelFeaturesCalculator.concat_bert_features(self) -> None>, |
| 407 | + 'dependencies': [], |
| 408 | + 'preprocess': [], |
| 409 | + 'vect_data': False, |
| 410 | + 'bert_sentiment_data': True} |
| 411 | +</pre></div> |
| 412 | +</div> |
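<p>Because each entry carries <strong>columns</strong> and <strong>level</strong> keys, the dictionary can be used to look up which output columns belong to which level. The following is a minimal sketch, not part of the toolkit: it uses a plain dictionary as a stand-in for <strong>jury_feature_builder.feature_dict</strong>, trimmed to a few of the keys shown in the example output. The feature names, the second entry, and the helper <strong>columns_by_level</strong> are hypothetical and exist only for illustration.</p>

```python
from collections import defaultdict

# Stand-in for jury_feature_builder.feature_dict. The keys (feature names) and
# the second entry are hypothetical; only the entry structure follows the
# example output shown above.
feature_dict = {
    "Hypothetical Sentiment Feature": {
        "columns": ["positive_bert", "negative_bert", "neutral_bert"],
        "level": "Chat",
        "semantic_grouping": "Emotion",
    },
    "Hypothetical Conversation Feature": {
        "columns": ["example_conv_column"],
        "level": "Conversation",
        "semantic_grouping": "Pace",
    },
}


def columns_by_level(feature_dict):
    """Group every feature's output columns under its 'level' value."""
    grouped = defaultdict(list)
    for info in feature_dict.values():
        grouped[info["level"]].extend(info["columns"])
    return dict(grouped)


grouped_columns = columns_by_level(feature_dict)
print(grouped_columns["Chat"])
```

<p>With a real FeatureBuilder, you would pass <strong>jury_feature_builder.feature_dict</strong> instead of the stand-in dictionary, assuming its entries follow the structure shown in the example output.</p>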
| 413 | +</section> |
| 414 | +<section id="feature-column-names"> |
| 415 | +<h4>Feature Column Names<a class="headerlink" href="#feature-column-names" title="Link to this heading"></a></h4> |
| 416 | +<p>After you call <strong>.featurize()</strong>, you can also obtain lists of the feature columns generated by the toolkit:</p> |
| 417 | +<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">chat_features</span> <span class="c1"># a list of the feature columns generated at the chat (utterance) level</span> |
| 418 | +<span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">conv_features_base</span> <span class="c1"># a list of the base (non-aggregated) feature columns at the conversation level</span> |
| 419 | +<span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">conv_features_all</span> <span class="c1"># a list of all feature columns at the conversation level, including aggregates</span> |
| 420 | +</pre></div> |
| 421 | +</div> |
| 422 | +<p>These lists are useful if you’d like to identify which columns in the output dataframe were generated by the FeatureBuilder; for example:</p> |
| 423 | +<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">jury_output_chat_level</span><span class="p">[</span><span class="n">jury_feature_builder</span><span class="o">.</span><span class="n">chat_features</span><span class="p">]</span> |
| 424 | +</pre></div> |
| 425 | +</div> |
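<p>Selecting columns with a list raises a <strong>KeyError</strong> in pandas if any label in the list is absent from the dataframe, so it can help to compare the two sets of names first. The sketch below uses plain Python lists in place of the dataframe's columns and the <strong>chat_features</strong> list. The helper <strong>split_columns</strong> and all column names other than the three sentiment columns are hypothetical.</p>

```python
def split_columns(output_columns, feature_columns):
    """Partition column names into generated, original, and missing.

    generated: output columns that appear in the feature list
    original:  output columns that do not appear in the feature list
    missing:   feature-list entries that are absent from the output
    """
    feature_set = set(feature_columns)
    output_set = set(output_columns)
    generated = [c for c in output_columns if c in feature_set]
    original = [c for c in output_columns if c not in feature_set]
    missing = [c for c in feature_columns if c not in output_set]
    return generated, original, missing


# Hypothetical stand-ins for list(jury_output_chat_level.columns) and
# jury_feature_builder.chat_features.
output_columns = [
    "conversation_num", "speaker_nickname", "message",
    "positive_bert", "negative_bert", "neutral_bert",
]
feature_columns = [
    "positive_bert", "negative_bert", "neutral_bert",
    "hypothetical_missing_feature",
]

generated, original, missing = split_columns(output_columns, feature_columns)
print(generated)
print(original)
print(missing)
```

<p>With real objects, you could pass <strong>list(jury_output_chat_level.columns)</strong> and <strong>jury_feature_builder.chat_features</strong>, then index the dataframe with the <strong>generated</strong> list only.</p>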
| 426 | +</section> |
| 427 | +</section> |
376 | 428 | </section>
|
377 | 429 | </section>
|
378 | 430 |
|
|