
[backend][app] Always send generic email result as best GPT on the transcript, other workflows yolo #2

Merged

3 commits merged into main on Sep 12, 2024

Conversation

petercsiba (Owner) commented Sep 11, 2024

PR Description (by Cal)

This PR introduces a new feature to always send a generic email result as the best GPT on the transcript. It includes changes to various backend files to support this feature, such as modifying email sending functions, updating workflow handling, and adding new utility functions. Additionally, some CSS styles have been updated.

Diagram of code changes

```mermaid
sequenceDiagram
    participant User
    participant API
    participant GPTClient
    participant EmailService
    participant AWS

    User->>API: GET /upload/voice
    API->>User: Returns presigned URL

    API->>GPTClient: Transcribe audio with prompt hint
    GPTClient->>API: Returns transcription

    API->>EmailService: Send networking result email
    EmailService->>API: Email sent confirmation

    API->>AWS: Check if running in AWS
    AWS-->>API: Returns status

    API->>GPTClient: Process generic prompt
    GPTClient->>API: Returns processed result

    API->>EmailService: Send generic result email
    EmailService->>API: Email sent confirmation
```

Key Issues

  • Possible Bug: In /backend/app/app.py, the variable `GPTo_MODEL` looks like a typo and should likely be `GPT_MODEL` or similar. If the variable is not defined elsewhere, this will cause a runtime error.
  • Performance Concern: The `poor_mans_token_counter` function in /backend/app/contacts_dump.py may not count tokens accurately, which could lead to incorrect processing of transcripts. Consider a more reliable token-counting method.
  • Possible Bug: In /backend/app/gsheets.py, a TODO comment flags a potential `AttributeError` that should be addressed to prevent runtime errors.

Files Changed

File: /backend/api/app.py Added a TODO comment to ensure compatibility with file uploads.
File: /backend/app/app.py Modified email sending functions and workflow handling to support the new feature. Added a new function `process_generic_prompt`.
File: /backend/app/contacts_dump.py Replaced `num_tokens_from_string` with `poor_mans_token_counter` for token counting.
File: /backend/app/emails.py Updated email body formatting and added a new function `send_generic_result`.
File: /backend/app/gsheets.py Added a TODO comment regarding a potential `AttributeError`.
File: /backend/database/email_log.py Added a return type hint to `get_email_reply_params_for_account_id`.
File: /backend/deploy_lambda_container.sh Added a TODO comment to move a command into the `/app` directory.
File: /backend/input/app_upload.py Added `prompt_hint` and `use_cache_hit` parameters to `transcribe_audio` calls.
File: /backend/input/call.py Added `prompt_hint` and `use_cache_hit` parameters to `transcribe_audio` calls.
File: /backend/input/common.py Added TODO comments regarding video format support and file size limitations.
File: /backend/input/email.py Added `prompt_hint` and `use_cache_hit` parameters to `transcribe_audio` calls.
File: /webapp/styles/app.css Added new CSS classes for margin, padding, and background color.

vercel bot commented Sep 11, 2024

The latest updates on your projects:

dumpsheet: ✅ Ready (preview updated Sep 12, 2024 4:50am UTC)

callstackai-action bot left a comment

No issues found.

callstackai-action bot left a comment

PR Review Summary

This pull request has been reviewed. Please check the comments and suggestions provided below.

callstackai-action bot commented on a `gpt_client.transcribe_audio` call:

```python
gpt_client.transcribe_audio(
    audio_filepath=audio_filepath,
    prompt_hint="voice memo",
    use_cache_hit=not is_running_in_aws(),
)
```

🐛 Bug
The `use_cache_hit` parameter in the `gpt_client.transcribe_audio` call is set to `not is_running_in_aws()`. This logic assumes that caching should be disabled only when running in AWS. If the environment-variable checks in `is_running_in_aws()` are not comprehensive, or if the environment changes, caching could be enabled or disabled unexpectedly. Consider setting `use_cache_hit` explicitly from a more reliable configuration or environment setting.

Suggested change:

```diff
- use_cache_hit=not is_running_in_aws(),
+ use_cache_hit=False
```
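For context, a minimal sketch of what an `is_running_in_aws()` helper typically looks like. This is an assumption about the helper's internals, not the repo's actual code; AWS Lambda does set `AWS_LAMBDA_FUNCTION_NAME` in its runtime environment.

```python
import os

def is_running_in_aws() -> bool:
    # Hypothetical implementation: AWS Lambda injects AWS_LAMBDA_FUNCTION_NAME
    # into the runtime environment, so its presence signals "running in AWS".
    return "AWS_LAMBDA_FUNCTION_NAME" in os.environ

# The reviewed call site then enables the transcription cache only when
# running outside AWS, i.e. during local development:
use_cache_hit = not is_running_in_aws()
```

The review's point is that this couples caching behavior to an environment heuristic rather than an explicit setting.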

petercsiba (Owner, Author) replied:

i guess fair

petercsiba (Owner, Author) replied:

the suggested change is quite bad though

callstackai-action bot commented on /backend/app/app.py:

```
add paragraphs if needed. Especially make sure you keep all mentioned facts and details.
"""
# CHEAPEST_MODEL leads to overly short answers.
transcription = gpt_client.run_prompt(transcript_prompt + data_entry.output_transcript, model=GPTo_MODEL)
```

🐛 Bug
The variable `GPTo_MODEL` used in the `gpt_client.run_prompt` call on line 377 appears to be a typo or an undefined variable; it should likely be `BEST_MODEL` or another defined model variable. If `GPTo_MODEL` is not defined elsewhere in the code, this will raise a runtime error.

Suggested change:

```diff
- transcription = gpt_client.run_prompt(transcript_prompt + data_entry.output_transcript, model=GPTo_MODEL)
+ transcription = gpt_client.run_prompt(transcript_prompt + data_entry.output_transcript, model=BEST_MODEL)
```

petercsiba (Owner, Author) replied:

Wrong, this is defined on line 62

callstackai-action bot commented on /backend/app/gsheets.py:

```
@@ -228,6 +228,7 @@ def add_form_datas_to_spreadsheet(self, form_datas: List[FormData]):
        sheet_cache = {}

        for form_data in form_datas:
            # TODO: AttributeError: 'str' object has no attribute 'value'
```

🐛 Bug
The line `form_name = form_data.form.form_name.value` assumes that `form_data.form.form_name` is an object with a `value` attribute. The TODO comment, however, indicates an `AttributeError` occurs because `form_data.form.form_name` is a `str`, which has no `value` attribute, leading to a runtime error. To fix this, access `form_data.form.form_name` directly as a string, without `.value`.

Suggested change:

```diff
- # TODO: AttributeError: 'str' object has no attribute 'value'
+ form_name = form_data.form.form_name
```
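The failure mode is easy to reproduce with stand-in types. The `Form` and `FormData` classes below are hypothetical simplifications of the real models, just enough to show the bug and its fix:

```python
from dataclasses import dataclass

@dataclass
class Form:
    form_name: str  # a plain str here, not an Enum member with a .value

@dataclass
class FormData:
    form: Form

form_data = FormData(form=Form(form_name="NETWORKING"))

# Buggy access pattern flagged by the TODO: assumes an Enum member.
try:
    form_name = form_data.form.form_name.value
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'value'

# Fix: the attribute is already a string, so use it directly.
form_name = form_data.form.form_name
```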

petercsiba (Owner, Author) replied:

Correct, done in a follow up diff

callstackai-action bot commented on /backend/app/contacts_dump.py:

```
@@ -17,6 +16,13 @@
MAX_TRANSCRIPT_TOKEN_COUNT = 2500  # words


# https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
def poor_mans_token_counter(text: str) -> int:
```

🐛 Bug
The calculation in `poor_mans_token_counter` may produce incorrect token counts due to integer division: the expression `int(by_character + by_words) // 2` converts the sum to an integer first, losing precision before the division. This can yield inaccurate counts, especially for short texts. Perform the division before converting to an integer.

Additionally, the formula could be simplified or its rationale documented: the function estimates tokens heuristically, and it is not obvious why it divides by 4 and multiplies by 3/4. Consider renaming `by_character` and `by_words` to more descriptive names such as `character_estimate` and `word_estimate` to improve readability.

Suggested change (inside `poor_mans_token_counter`):

```python
return int((by_character + by_words) / 2)
```
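The linked OpenAI help article gives the rules of thumb behind this heuristic: roughly 1 token per 4 characters of English text, and roughly 3/4 of a word per token (so tokens are about words times 4/3). The sketch below averages the two estimates and divides before truncating; the repo function's exact body is not shown in the diff, so this is an approximation:

```python
def poor_mans_token_counter(text: str) -> int:
    """Rough token estimate without a tokenizer, per OpenAI's heuristics:
    ~4 characters per token, and ~3/4 of a word per token."""
    character_estimate = len(text) / 4         # 1 token ~= 4 chars
    word_estimate = len(text.split()) * 4 / 3  # 1 token ~= 0.75 words
    # Average the two estimates; divide first, then truncate, so no
    # precision is lost ahead of the division.
    return int((character_estimate + word_estimate) / 2)
```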

petercsiba (Owner, Author) replied:

There is a link explaining this formula; it's an estimate.

petercsiba (Owner, Author) replied:

The comment is too long I didn't read it

Co-authored-by: callstackai[bot] <177242706+callstackai[bot]@users.noreply.github.com>
petercsiba merged commit 6d21c70 into main on Sep 12, 2024
3 checks passed
callstackai-action bot left a comment

No issues found.

petercsiba (Owner, Author) left a review:

Callstack Review
  • 1 bug discovered with `params.body_text`, but the proposed change was incorrect code
  • 1 reasonable suggestion on `is_running_in_aws`, but the suggested change was wrong
  • 1 comment was just plainly wrong


callstackai-action bot commented on /backend/app/emails.py:

```
@@ -179,7 +179,7 @@ def create_raw_email_with_attachments(params: EmailLog):
    <head></head>
    <body>
    """
+ params.body_text
+ + (params.body_text.replace("\n", "<br />") if params.body_text else "")
```

petercsiba (Owner, Author) replied:

I have accepted your suggestion to use the above instead of just `params.body_text`, BUT it is actually syntactically bad code leading to this production error (I have tested the version from before this change was accepted):

```
bad operand type for unary +: 'str'

Traceback (most recent call last):
  File "/home/dumpsheet/app/app.py", line 453, in lambda_handler
    second_lambda_handler_wrapper(data_entry)
  File "/home/dumpsheet/app/app.py", line 415, in second_lambda_handler_wrapper
    process_generic_prompt(gpt_client, data_entry)
  File "/home/dumpsheet/app/app.py", line 385, in process_generic_prompt
    send_generic_result(
  File "/home/dumpsheet/app/emails.py", line 536, in send_generic_result
    return send_email(params=email_params)
  File "/home/dumpsheet/app/emails.py", line 236, in send_email
    raw_email = create_raw_email_with_attachments(params)
  File "/home/dumpsheet/app/emails.py", line 182, in create_raw_email_with_attachments
    + + (params.body_text.replace("\n", "<br />") if params.body_text else "")
TypeError: bad operand type for unary +: 'str'
```
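Why the accepted suggestion blew up: the email body is built as a multi-line string concatenation, and the suggested replacement line began with its own `+`, producing `+ +` in the source. Python parses the second `+` as a unary plus applied to a `str`, which raises exactly the `TypeError` in the traceback. A minimal standalone reproduction (not the repo's code):

```python
body_text = "line one\nline two"

# What the accepted suggestion produced: a stray second "+", parsed as
# unary plus on the str expression, raising a TypeError at runtime.
try:
    html = ("<body>"
            + + (body_text.replace("\n", "<br />") if body_text else ""))
except TypeError as exc:
    print(exc)  # bad operand type for unary +: 'str'

# Correct version: a single binary "+" concatenation.
html = ("<body>"
        + (body_text.replace("\n", "<br />") if body_text else ""))
```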
