
Commit 655a352

fix supported prefixes
1 parent c2fb77f commit 655a352

File tree

3 files changed: +132 -2 lines changed
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
models:
  - type: main
    engine: nim
    model: meta/llama-3.1-70b-instruct

  - type: content_safety
    engine: nim
    parameters:
      base_url: http://10.63.135.193:8000/v1
      model_name: "llama-3.1-nemoguard-8b-content-safety"
  - type: topic_control
    engine: nim
    parameters:
      base_url: http://10.34.0.68:8000/v1
      model_name: "llama-3.1-nemoguard-8b-topic-control"

rails:
  input:
    flows:
      - topic safety check input $model=topic_control
  output:
    flows:
      - content safety check output $model=content_safety
  dialog:
    single_call:
      enabled: False
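
This new config registers the two NemoGuard NIM endpoints as separate models and wires them into the input and output rails. A minimal sketch of how such a config might be loaded and exercised with the NeMo Guardrails Python API (the "./config" path and the sample question are assumptions, not part of the commit):

    from nemoguardrails import LLMRails, RailsConfig

    # Load the directory holding the config and prompts files
    # (the "./config" path is illustrative).
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    # The input rail runs the topic-control check; the output rail runs
    # the content-safety check before the answer is returned.
    response = rails.generate(messages=[
        {"role": "user", "content": "How do I return a damaged order?"}
    ])
    print(response["content"])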
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
prompts:
  - task: content_safety_check_input $model=content_safety
    content: |
      Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.

      <BEGIN UNSAFE CONTENT CATEGORIES>
      S1: Violence.
      S2: Sexual.
      S3: Criminal Planning/Confessions.
      S4: Guns and Illegal Weapons.
      S5: Controlled/Regulated Substances.
      S6: Suicide and Self Harm.
      S7: Sexual (minor).
      S8: Hate/Identity Hate.
      S9: PII/Privacy.
      S10: Harassment.
      S11: Threat.
      S12: Profanity.
      S13: Needs Caution.
      S14: Other.
      S15: Manipulation.
      S16: Fraud/Deception.
      S17: Malware.
      S18: High Risk Gov Decision Making.
      S19: Political/Misinformation/Conspiracy.
      S20: Copyright/Trademark/Plagiarism.
      S21: Unauthorized Advice.
      S22: Illegal Activity.
      S23: Immoral/Unethical.
      <END UNSAFE CONTENT CATEGORIES>

      <BEGIN CONVERSATION>

      user: {{ user_input }}

      <END CONVERSATION>

      Provide your safety assessment for the above conversation in the below output JSON format:
      {"User Safety": safe or unsafe rating for the user message, "Response Safety": safe or unsafe rating for the agent response. Omit if no agent response present. "Safety Categories": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}

      Do not include anything other than the output JSON in your response.
      Output JSON:
    output_parser: nemoguard_parse_prompt_safety
    max_tokens: 50

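The task asks the model for a strict JSON verdict, which the registered output_parser (nemoguard_parse_prompt_safety, shipped with NeMo Guardrails) turns into a rail decision. The sketch below only illustrates the shape of that parsing under the prompt's stated format; it is not the library's implementation:

    import json

    def parse_prompt_safety(llm_output: str) -> dict:
        # Hypothetical stand-in for nemoguard_parse_prompt_safety:
        # map the model's JSON verdict to an allow/block decision.
        try:
            verdict = json.loads(llm_output)
        except json.JSONDecodeError:
            # Fail closed if the model strays from the requested format.
            return {"allowed": False, "categories": []}
        raw = verdict.get("Safety Categories", "")
        categories = [c.strip() for c in raw.split(",") if c.strip()]
        return {
            "allowed": verdict.get("User Safety") == "safe",
            "categories": categories,
        }
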
  - task: content_safety_check_output $model=content_safety
    content: |
      Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.

      <BEGIN UNSAFE CONTENT CATEGORIES>
      S1: Violence.
      S2: Sexual.
      S3: Criminal Planning/Confessions.
      S4: Guns and Illegal Weapons.
      S5: Controlled/Regulated Substances.
      S6: Suicide and Self Harm.
      S7: Sexual (minor).
      S8: Hate/Identity Hate.
      S9: PII/Privacy.
      S10: Harassment.
      S11: Threat.
      S12: Profanity.
      S13: Needs Caution.
      S14: Other.
      S15: Manipulation.
      S16: Fraud/Deception.
      S17: Malware.
      S18: High Risk Gov Decision Making.
      S19: Political/Misinformation/Conspiracy.
      S20: Copyright/Trademark/Plagiarism.
      S21: Unauthorized Advice.
      S22: Illegal Activity.
      S23: Immoral/Unethical.
      <END UNSAFE CONTENT CATEGORIES>

      <BEGIN CONVERSATION>

      user: {{ user_input }}

      response: agent: {{ bot_response }}

      <END CONVERSATION>

      Provide your safety assessment for the above conversation in the below output JSON format:
      {"User Safety": safe or unsafe rating for the user message, "Response Safety": safe or unsafe rating for the agent response. Omit if no agent response present. "Safety Categories": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}

      Do not include anything other than the output JSON in your response.
      Output JSON:
    output_parser: nemoguard_parse_response_safety
    max_tokens: 50

  - task: topic_safety_check_input $model=topic_control
    content: |
      You are to act as a customer service agent, providing users with factual information in accordance to the knowledge base. Your role is to ensure that you respond only to relevant queries and adhere to the following guidelines

      Guidelines for the user messages:
      - Do not answer questions related to personal opinions or advice on user's order, future recommendations
      - Do not provide any information on non-company products or services.
      - Do not answer enquiries unrelated to the company policies.
      - Do not answer questions asking for personal details about the agent or its creators.
      - Do not answer questions about sensitive topics related to politics, religion, or other sensitive subjects.
      - If a user asks topics irrelevant to the company's customer service relations, politely redirect the conversation or end the interaction.
      - Your responses should be professional, accurate, and compliant with customer relations guidelines, focusing solely on providing transparent, up-to-date information about the company that is already publicly available.
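
With this topic rail active, questions outside the guidelines above should be deflected before they reach the main model. A hypothetical probe, reusing the rails object from the earlier sketch (the refusal wording depends on the rails configuration):

    # Expected to be deflected by the topic_control input rail.
    off_topic = rails.generate(messages=[
        {"role": "user", "content": "What do you think about the upcoming election?"}
    ])
    print(off_topic["content"])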

nemoguardrails/rails/llm/llmrails.py

Lines changed: 3 additions & 2 deletions
@@ -1277,6 +1277,7 @@ def _update_explain_info():
         )

         async for chunk_list, chunk_str_rep in buffer_strategy(streaming_handler):
+            print(f"chunk_list: {chunk_list}")
             chunk_str = " ".join(chunk_list)

             if stream_first:
@@ -1299,6 +1300,7 @@ def _update_explain_info():
         )

         # Execute the action. (Your execute_action returns only the result.)
+        print(f"applying action: {action_name}")
         result = await self.runtime.action_dispatcher.execute_action(
             action_name, params
         )
@@ -1336,8 +1338,7 @@ def _get_action_details_from_flow_id(

    supported_prefixes = [
        "content safety check output",
-        "content safety check",
-        "topic safety check",
+        "topic safety check output",
    ]
    if prefixes:
        supported_prefixes.extend(prefixes)
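
This last hunk is the fix the commit title refers to: _get_action_details_from_flow_id resolves a guardrail flow id to its action by prefix matching, and the old list carried the broad prefixes "content safety check" and "topic safety check", which presumably also matched input flows such as "topic safety check input $model=topic_control". Narrowing the list to the two output-specific prefixes keeps output-flow handling from firing on input flows. An illustrative sketch of this kind of prefix matching (hypothetical helper, not the library's actual code):

    def get_action_for_flow(flow_id: str, supported_prefixes: list[str]) -> str | None:
        # Return the first supported prefix the flow id starts with, else None.
        for prefix in supported_prefixes:
            if flow_id.startswith(prefix):
                return prefix
        return None

    # The broad old prefix "content safety check" also matched input flows:
    prefixes_old = ["content safety check output", "content safety check", "topic safety check"]
    prefixes_new = ["content safety check output", "topic safety check output"]

    input_flow = "content safety check input $model=content_safety"
    assert get_action_for_flow(input_flow, prefixes_old) is not None  # matched as output
    assert get_action_for_flow(input_flow, prefixes_new) is None      # correctly skipped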

0 commit comments
