Commit 7b778a4

geoo89 and istride authored
Create surveys feature (#147)
RapidPro contact fields have a name and a key. In the UI, users enter a name, and it is converted to a key (which is all lowercase and has spaces converted to underscores). The key is used everywhere when we want to reference a variable, e.g. via `@fields.key`. The name is used for display, but otherwise is pretty useless. Yet, RapidPro enforces that the name may not contain underscores, and thus it must differ from the key if we want to have underscores in the key. This is cumbersome. Therefore, in flow definitions, we now allow specifying a `save_name` for `save_value` rows which may contain underscores. Thus the exact same string can be used both in this column and when referencing the field via `@fields.key`. A name is autogenerated by replacing the underscores with spaces (but for a flow author, the name is irrelevant anyway).

---------

Co-authored-by: Ian Stride <[email protected]>
1 parent 13687d9 commit 7b778a4

31 files changed, +1464 -365 lines changed

docs/sheets.md

Lines changed: 1 addition & 1 deletion
@@ -223,7 +223,7 @@ def sheet_to_list_of_nested_dict(sheet, user_model):
     rather than List[RowModel]).
     '''
     row_parser = RowParser(user_model, CellParser())
-    sheet_parser = SheetParser(row_parser, sheet.table)
+    sheet_parser = SheetParser(sheet.table, row_parser)
     data_rows = sheet_parser.parse_all()  # list of row model
     return [row.dict() for row in data_rows]
     # Below is what the content index parser does:

docs/surveys.md

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
# Surveys

To create a survey, define a data sheet of questions, index it in the content index, and add a `survey` row to the content index.

A basic usage example can be found in `TestSurveyParser.test_basic_survey` in `tests/test_surveyparser.py`.


## The question data sheet

Each survey consists of questions. Questions have an underlying data model, `SurveyQuestionRowModel`, which consists of the fields defined in `SurveyQuestionModel` in `src/rpft/parsers/creation/surveymodels.py` plus an additional `ID` field.

Each question consists of the question text, an associated variable that the user input is stored in, and a variety of other fields.

### Basic question fields

These are the basic fields of a question definition (can be used as column headers for question data sheets).

- `ID`: Identifier, used for flow and variable name generation.
- `type`: Question type. Pre-defined types include `text`, `mcq`, ..., but custom ones can be used if the specific templates are defined by the user.
- `messages`: The question text. This is a list of multiple messages, each message having a `text` and optional `image`/`audio`/`video` attachment fields, as well as a list `attachments` of generic attachments.
- `question`: Shorthand for `messages.1.text`; you may use this instead of `messages` if none of your questions send more than 1 message.
- `attachment`: Shorthand for `messages.1.attachment`; you may use this instead of `messages` if none of your questions send more than 1 message.
- Note that these shorthands can NOT be used within template definitions.
- `variable`: Variable to store the user input in. If blank, generated from the question ID as `sq_{survey_id}_{question_id}`. The survey_id/question_id is the survey's name/question ID, **in all lowercase with non-alphanumeric characters removed**.
- `completion_variable`: Variable indicating whether the question has been completed. If blank, generated from the variable as `{variable}_complete`.
- `choices`: For multiple choice questions: a list of choices.
- `expiration.message`: Message that gets sent when the user doesn't respond for a long time.
- `expiration.time`: [not implemented]
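
The default naming rules for `variable` and `completion_variable` can be sketched as follows. This is a minimal illustration of the rules stated in the list, not the library's implementation: the helper names `to_id`, `default_variable` and `default_completion_variable` are hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re


def to_id(name: str) -> str:
    """Lowercase a name and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def default_variable(survey_name: str, question_id: str) -> str:
    """Build the fallback variable name sq_{survey_id}_{question_id}."""
    return f"sq_{to_id(survey_name)}_{to_id(question_id)}"


def default_completion_variable(variable: str) -> str:
    """Build the fallback completion variable name {variable}_complete."""
    return f"{variable}_complete"


variable = default_variable("Baseline Survey", "Age-1")
print(variable)  # sq_baselinesurvey_age1
print(default_completion_variable(variable))  # sq_baselinesurvey_age1_complete
```

Because both IDs are normalised before being joined, the same question reused in two surveys ends up with two distinct variable names.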

It is possible to reuse questions across multiple surveys (see `tags` below). In that case, each copy of a question needs a unique name for its variables. Auto-generating the variable names solves the uniqueness problem. However, we also need a way to refer to these variables independently of the `surveyid` that is built into their names.

Therefore we have the following shorthands, which can be used within any field of a question:

- `@answer` is short for `@fields.{variable}`. This is useful even without reusing questions, e.g. within confirmation/validation/stop conditions (see below).
- `@answerid` is short for `{variable}`. This can be used when defining new variables (in postprocessing steps) whose names should depend on the variable in the question.
- `@prefix` is short for `@fields.sq_{surveyid}`. This is useful when referencing variables from previous questions of the survey, e.g. via `@prefix_{questionid}`.
- `@prefixid` is short for `sq_{surveyid}`. Similar to the above.
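
To make the four shorthands concrete, here is a small sketch that expands them by plain string substitution. The function `expand_shorthands` is hypothetical and only illustrates the mapping listed above; how the toolkit performs the substitution internally is not shown in this document.

```python
def expand_shorthands(text: str, variable: str, survey_id: str) -> str:
    """Expand @answer, @answerid, @prefix and @prefixid in a question field."""
    # Longer shorthands come first so that "@answerid" is not expanded as
    # "@answer" followed by a stray "id" (same for "@prefixid").
    replacements = [
        ("@answerid", variable),
        ("@answer", f"@fields.{variable}"),
        ("@prefixid", f"sq_{survey_id}"),
        ("@prefix", f"@fields.sq_{survey_id}"),
    ]
    for shorthand, expansion in replacements:
        text = text.replace(shorthand, expansion)
    return text


print(expand_shorthands("@answer < 18", "sq_sid_age", "sid"))
print(expand_shorthands("@answerid_bucket", "sq_sid_age", "sid"))
print(expand_shorthands("@prefix_name", "sq_sid_age", "sid"))
```

With the variable `sq_sid_age` in survey `sid`, the three calls yield `@fields.sq_sid_age < 18`, `sq_sid_age_bucket` and `@fields.sq_sid_name` respectively.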


### Special question fields

These are the more complex fields of a question definition (can be used as column headers for question data sheets).

#### `tags`: Tags for filtering

Data sheets can be created by filtering an existing data sheet by a condition (e.g. `'my_tag' in tags`), so that only rows fulfilling the condition are included. This way, the same pool of questions can be used for multiple surveys, by selecting questions via a survey-specific tag.
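
The effect of such a tag filter can be pictured with plain Python. The rows and the `filter_by_tag` helper below are made up for illustration; in practice the filtering is configured in the content index rather than written by hand.

```python
# Hypothetical pool of questions shared by two surveys.
questions = [
    {"ID": "age", "tags": ["baseline", "followup"]},
    {"ID": "name", "tags": ["baseline"]},
    {"ID": "mood", "tags": ["followup"]},
]


def filter_by_tag(rows, tag):
    """Keep only the rows for which the condition `tag in tags` holds."""
    return [row for row in rows if tag in row["tags"]]


baseline_ids = [row["ID"] for row in filter_by_tag(questions, "baseline")]
followup_ids = [row["ID"] for row in filter_by_tag(questions, "followup")]
print(baseline_ids)  # ['age', 'name']
print(followup_ids)  # ['age', 'mood']
```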

#### `relevant`: Omit a question based on previous answers

If any of the given conditions does not hold, skip the question. These conditions will commonly depend on previous answers.

#### `confirmation`: Conditional answer confirmation

If one of the conditions holds, print the confirmation message associated with that condition, with options Yes/No. If the user enters No, repeat the question.

Example:

- Do you confirm that you're under 18? [if @answer < 18]
- Please confirm your input @answer [Unconditional confirmation can be realized by specifying a condition that is always true]

#### `stop`: Conditional premature end of survey (later: forward skip?)

If one of the conditions holds, send the message associated with the condition and end the survey.

Examples:

- user's age is less than 18
- user is not a parent
- user does not live in the target region

#### `validation`: Validation / conditional repetition of question

If one of the conditions holds, send the message associated with the condition and repeat the question.

Example:

- Your name is too short. Please enter again.
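
The "if one of the conditions holds, send its message" pattern shared by `validation`, `stop` and `confirmation` can be sketched as below. This is an illustration only: `first_triggered` is a hypothetical helper, conditions are modelled as Python callables rather than RapidPro expressions, and picking the first matching condition is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# Each rule pairs a condition on the answer with the message to send.
Rule = Tuple[Callable[[str], bool], str]


def first_triggered(answer: str, rules: List[Rule]) -> Optional[str]:
    """Return the message of the first rule whose condition holds, else None."""
    for condition, message in rules:
        if condition(answer):
            return message
    return None


validation_rules: List[Rule] = [
    (lambda a: len(a.strip()) < 2, "Your name is too short. Please enter again."),
]

print(first_triggered("A", validation_rules))      # message, so the question repeats
print(first_triggered("Alice", validation_rules))  # None, so the answer is accepted
```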

#### `postprocessing`: Variable postprocessing

Postprocessing to do after a user's answer is successfully stored. This could be an assignment (of the same or another variable), or a flow that is triggered.

Examples:

- take the user's entered name and capitalize it (stored in the same variable)
- create a new age_bucket variable based on the user's age input. If the age variable is called `sq_sid_age`, specifying the new variable in the assignment to be `@answerid_bucket` will create a variable `sq_sid_age_bucket`

#### `skipoption`: Optional questions

A way for the user to skip the question by typing in a specific phrase.

## Content index rows

After creating a data sheet with questions, in the content index, you can create a row of type `data_sheet` and specify the `data_model` as `SurveyQuestionRowModel`. This is a global model that does not need to be defined by the user in a custom module.

Then, create a row of type `survey`. For this, the following columns are relevant:

- `data_sheet`: A data sheet with questions
- `new_name`: Name of the survey. If not provided, the name of the `data_sheet` is used.
- `config`: A SurveyConfig object, see `src/rpft/parsers/creation/surveymodels.py`
    - `variable_prefix`: Prefix to apply to all RapidPro variables that are created by the survey. For each `SurveyQuestion`, this is the `variable`, `completion_variable` and `postprocessing.assignments.*.variable`. Ideally, avoid this feature in favor of using auto-generated variable names, `@answer`, `@answerid` and `@prefix`.
    - `expiration_message`: Message to send when a question flow expires. If a question does not specify an expiration message, this message is used by default.
- `template_arguments`: Template arguments to be passed down to the survey template

This will create one flow for each question, named `survey - {survey name} - question - {question ID}`, as well as a survey flow `survey - {survey name}` that invokes each question via `start_new_flow`. This is achieved via templating. The templates can be customized if needed.
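
As a quick illustration of the naming scheme, the following sketch lists the flow names a survey would produce. The helper functions are hypothetical and simply apply the two name patterns quoted above.

```python
def survey_flow_name(survey_name: str) -> str:
    """Name of the flow that runs the whole survey."""
    return f"survey - {survey_name}"


def question_flow_name(survey_name: str, question_id: str) -> str:
    """Name of the flow generated for a single question."""
    return f"survey - {survey_name} - question - {question_id}"


def all_flow_names(survey_name: str, question_ids: list) -> list:
    """Survey flow first, then one flow per question in sheet order."""
    names = [survey_flow_name(survey_name)]
    names += [question_flow_name(survey_name, qid) for qid in question_ids]
    return names


for name in all_flow_names("Baseline", ["age", "name"]):
    print(name)
```

For a survey named `Baseline` with questions `age` and `name`, this prints `survey - Baseline`, `survey - Baseline - question - age` and `survey - Baseline - question - name`.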


## Survey templates

We define global templates that are used by surveys. These templates can be found in `src/rpft/parsers/creation/survey_templates/`. They are as follows:

- `template_survey_wrapper`: Flow rendering all the questions.
    - Receives the following context variables that can be used in the template:
        - `questions`: a list of `SurveyQuestionRowModel`
        - `survey_name`: Name of the survey
        - `survey_id`: ID of the survey (generated from name)
    - In the content index, a `survey` row can have `template_arguments`. If present, these are passed to the `template_survey_wrapper` template when creating a survey.
- `template_survey_question_wrapper`: Question functionality that is common to all input types. Invoked by the survey via `start_new_flow`.
    - Receives the fields of the `SurveyQuestionRowModel` as its context variables.
    - Currently, it is not possible to pass template arguments to this template.
- `template_survey_question_block_{type}`: For each question input type `{type}`, there is a template to read the user data. These are included into the `template_survey_question_wrapper` via `insert_as_block`.
    - Because this template is inserted as a block, any context that is available in `template_survey_question_wrapper` (in particular, `question`) is also available in this template.

The user can override these by defining a template of the same name in the content index, thereby using their own custom templates. There is no constraint on what `{type}` can be, so the user can also create their own question types.

src/rpft/parsers/common/rowparser.py

Lines changed: 4 additions & 2 deletions
@@ -5,6 +5,8 @@

 from pydantic.v1 import BaseModel

+from rpft.parsers.common.cellparser import CellParser
+


 class RowParserError(Exception):
     pass
@@ -122,10 +124,10 @@ class RowParser:
     TYPE_ANNOTATION_SEPARATOR = ":"
     DEFAULT_VALUE_SEPARATOR = "="

-    def __init__(self, model, cell_parser):
+    def __init__(self, model, cell_parser=None):
         self.model = model
         self.output = None  # Gets reinitialized with each call to parse_row
-        self.cell_parser = cell_parser
+        self.cell_parser = cell_parser or CellParser()

     def try_assign_as_kwarg(self, field, key, value, model):
         # If value can be interpreted as a (field, field_value) pair for a field of

src/rpft/parsers/common/sheetparser.py

Lines changed: 22 additions & 3 deletions
@@ -1,20 +1,39 @@
 import copy
 from rpft.parsers.common.rowdatasheet import RowDataSheet
+from rpft.parsers.common.rowparser import RowParser
 from rpft.logger.logger import get_logger, logging_context

 LOGGER = get_logger()


 class SheetParser:
-    def __init__(self, row_parser, table, context={}):
+    def parse_sheet(table, row_model):
         """
         Args:
+            table: Tablib Dataset representing the table to be parsed.
+            row_model: Data model to convert rows of the sheet into.
+
+        Returns:
+            RowDataSheet instance containing a list of row_model instances
+        """
+
+        sheet_parser = SheetParser(table, row_model)
+        return sheet_parser.get_row_data_sheet()
+
+    def __init__(self, table, row_model=None, row_parser=None, context={}):
+        """
+        Either a row_parser or a row_model need to be provided.
+
+        Args:
+            table: Tablib Dataset representing the table to be parsed.
+            row_model: Data model to convert rows of the sheet into.
             row_parser: parser to convert flat dicts to RowModel instances.
             context: context used for template parsing
-            table: Tablib Dataset representing the table to be parsed.
         """

-        self.row_parser = row_parser
+        if not (row_parser or row_model):
+            raise ValueError("SheetParser: needs either row_parser or row_model")
+        self.row_parser = row_parser or RowParser(row_model)
         self.bookmarks = {}
         self.input_rows = []
         for row_idx, row in enumerate(table):
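
The new constructor accepts either a row model or a ready-made row parser. The sketch below restates only that argument handling with stand-in classes (`FakeRowParser`, `FakeSheetParser` and `MyRow` are hypothetical, not the rpft classes) to show how each call style resolves. Note that with this signature a second positional argument binds to `row_model`, so an existing parser would need to be passed by keyword.

```python
class FakeRowParser:
    """Stand-in for RowParser; it only remembers the model it was given."""

    def __init__(self, model):
        self.model = model


class FakeSheetParser:
    """Stand-in that mirrors the row_model/row_parser handling from the diff."""

    def __init__(self, table, row_model=None, row_parser=None):
        if not (row_parser or row_model):
            raise ValueError("SheetParser: needs either row_parser or row_model")
        # An explicit parser wins; otherwise one is built from the model.
        self.row_parser = row_parser or FakeRowParser(row_model)


class MyRow:
    """Hypothetical row model."""


table = []  # placeholder for a tablib Dataset

# Style 1: pass only the model and let the sheet parser build the row parser.
from_model = FakeSheetParser(table, MyRow)

# Style 2: pass a pre-built row parser by keyword.
existing = FakeRowParser(MyRow)
from_parser = FakeSheetParser(table, row_parser=existing)

# Passing neither is rejected.
try:
    FakeSheetParser(table)
    rejected = False
except ValueError:
    rejected = True

print(from_model.row_parser.model is MyRow, from_parser.row_parser is existing, rejected)
```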

src/rpft/parsers/creation/__init__.py

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
from rpft.logger.logger import get_logger
from rpft.parsers.creation.models import TemplateSheet

LOGGER = get_logger()


def map_template_arguments(template: TemplateSheet, args, context, data_sheets) -> dict:
    """
    Map template arguments, which are positional, to the arguments from the template
    definition, and add the values of the arguments to the context with the appropriate
    variable name (from the definition).
    """
    arg_defs = template.argument_definitions

    if arg_defs and len(args) > len(arg_defs):
        # Once the row parser is cleaned up to eliminate trailing '' entries, this
        # won't be necessary
        extra_args = args[len(arg_defs) :]
        non_empty_extra_args = [ea for ea in extra_args if ea]

        if non_empty_extra_args:
            LOGGER.warning(
                "Too many template arguments provided, "
                + str(
                    {
                        "template": template.name,
                        "extra": non_empty_extra_args,
                        "definition": arg_defs,
                        "arguments": args,
                    }
                )
            )

        args = args[: len(arg_defs)]

    args_padding = [""] * (len(arg_defs) - len(args))

    for arg_def, arg in zip(arg_defs, args + args_padding):
        value = arg if arg != "" else arg_def.default_value

        if value == "":
            LOGGER.critical(f'Required template argument "{arg_def.name}" not provided')

        value = data_sheets[value].rows if arg_def.type == "sheet" else value

        if arg_def.name in context and value != context[arg_def.name]:
            LOGGER.warn(
                "Template argument reassigned, "
                + str(
                    {
                        "template": template.name,
                        "name": arg_def.name,
                        "before": context[arg_def.name],
                        "after": value,
                    }
                )
            )

        context[arg_def.name] = value

    return context
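
To see the mapping in action without the rest of the package, the sketch below restates its core steps (truncate, pad, apply defaults, resolve sheets) in a simplified form. `ArgDef` and `map_args` are hypothetical stand-ins written for this example; logging is left out, and `SimpleNamespace` substitutes for a data sheet object with a `rows` attribute.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class ArgDef:
    """Stand-in for a template argument definition."""

    name: str
    default_value: str = ""
    type: str = ""


def map_args(arg_defs, args, context, data_sheets):
    """Simplified restatement of the positional-to-named mapping."""
    args = list(args[: len(arg_defs)])          # drop surplus arguments
    args += [""] * (len(arg_defs) - len(args))  # pad missing arguments
    for arg_def, arg in zip(arg_defs, args):
        # An empty argument falls back to the default from the definition.
        value = arg if arg != "" else arg_def.default_value
        # Arguments of type "sheet" are replaced by the rows of that sheet.
        if arg_def.type == "sheet":
            value = data_sheets[value].rows
        context[arg_def.name] = value
    return context


definitions = [
    ArgDef("greeting"),
    ArgDef("language", default_value="en"),
    ArgDef("items", type="sheet"),
]
sheets = {"shopping": SimpleNamespace(rows=["milk", "bread"])}

context = map_args(definitions, ["hello", "", "shopping"], {}, sheets)
print(context)
```

Here `greeting` takes the first positional value, `language` falls back to its default because the second argument is empty, and `items` is resolved to the rows of the `shopping` sheet.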