Skip to content

Commit

Permalink
MDL-83541 question/quiz: restoring questions with identical stamps
Browse files Browse the repository at this point in the history
Historically it was possible, through a series of question restores,
moves and edits, to end up with multiple questions in the same
category with the same stamp, but differences in other question
or answer fields. This, combined with changes in versions, led
to errors when restoring or duplicating quizzes using these questions.

While recent changes have made it impossible to create this situation
in current Moodle versions, as any edits will create a new question
version with a new stamp, this situation may exist on long-standing
Moodle sites which have been upgraded since pre-4.0.

This change performs a much wider-ranging comparison of restored
existing questions, generating a hash of all the data for a question in
a backup file, and a corresponding hash for each question in the target
category, to decide if a restored question matches a question already in
the database.
  • Loading branch information
marxjohnson committed Feb 14, 2025
1 parent f91e3d4 commit 7b9f56a
Show file tree
Hide file tree
Showing 6 changed files with 938 additions and 15 deletions.
30 changes: 30 additions & 0 deletions .upgradenotes/MDL-83541-2025020615341872.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
issueNumber: MDL-83541
notes:
core_question:
- message: >
Duplication or multiple restores of questions has been modified to avoid
errors where a question with the same stamp already exists in the target
category.
To achieve this all data for the question is hashed, excluding any ID
fields.
If a qtype plugin calls any `$this->add_question_*()` methods in its
`restore_qtype_*_plugin::define_question_plugin_structure()` method, the
ID fields used in these records will be excluded automatically.
If a qtype plugin defines its own tables with ID fields, it must define
`restore_qtype_*_plugin::define_excluded_identity_hash_fields()` to return
an array of paths to these fields within the question data. This should be
all that is required for the majority of plugins.
See the PHPDoc of `restore_qtype_plugin::define_excluded_identity_hash_fields()`
for a full explanation of how these paths should be defined, and
`restore_qtype_truefalse_plugin` for an example.
If the data structure for a qtype returned by calling
`get_question_options()` contains data other than ID fields that are not
contained in the backup structure or vice-versa, it will need to
override `restore_qtype_*_plugin::remove_excluded_question_data()`
to remove the inconsistent data. See `restore_qtype_multianswer_plugin` as
an example.
type: fixed
209 changes: 209 additions & 0 deletions backup/moodle2/restore_qtype_plugin.class.php
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,11 @@ abstract class restore_qtype_plugin extends restore_plugin {
*/
private $questionanswercacheid = null;

/**
* @var array List of fields to exclude form hashing during restore.
*/
protected array $excludedhashfields = [];

/**
* Add to $paths the restore_path_elements needed
* to handle question_answers for a given question
Expand All @@ -62,6 +67,10 @@ protected function add_question_question_answers(&$paths) {
$elename = 'question_answer';
$elepath = $this->get_pathfor('/answers/answer'); // we used get_recommended_name() so this works
$paths[] = new restore_path_element($elename, $elepath);
$this->exclude_identity_hash_fields([
'/options/answers/id',
'/options/answers/question',
]);
}

/**
Expand All @@ -78,6 +87,10 @@ protected function add_question_numerical_units(&$paths) {
$elename = 'question_numerical_unit';
$elepath = $this->get_pathfor('/numerical_units/numerical_unit'); // we used get_recommended_name() so this works
$paths[] = new restore_path_element($elename, $elepath);
$this->exclude_identity_hash_fields([
'/options/units/id',
'/options/units/question',
]);
}

/**
Expand All @@ -94,6 +107,7 @@ protected function add_question_numerical_options(&$paths) {
$elename = 'question_numerical_option';
$elepath = $this->get_pathfor('/numerical_options/numerical_option'); // we used get_recommended_name() so this works
$paths[] = new restore_path_element($elename, $elepath);
$this->exclude_identity_hash_fields(['/options/question']);
}

/**
Expand All @@ -114,6 +128,21 @@ protected function add_question_datasets(&$paths) {
$elename = 'question_dataset_item';
$elepath = $this->get_pathfor('/dataset_definitions/dataset_definition/dataset_items/dataset_item');
$paths[] = new restore_path_element($elename, $elepath);
$this->exclude_identity_hash_fields([
'/options/datasets/id',
'/options/datasets/question',
'/options/datasets/category',
'/options/datasets/type',
'/options/datasets/items/id',
// The following fields aren't included in the backup or DB structure, but are parsed from the options field.
'/options/datasets/status',
'/options/datasets/distribution',
'/options/datasets/minimum',
'/options/datasets/maximum',
'/options/datasets/decimals',
// This field is set dynamically from the count of items in the dataset, it is not backed up.
'/options/datasets/number_of_items',
]);
}

/**
Expand Down Expand Up @@ -395,4 +424,184 @@ public static function define_plugin_decode_contents() {

return $contents;
}

/**
* Add fields to the list of fields excluded from hashing.
*
* This allows common methods to add fields to the exclusion list.
*
* @param array $fields
* @return void
*/
private function exclude_identity_hash_fields(array $fields): void {
$this->excludedhashfields = array_merge($this->excludedhashfields, $fields);
}

/**
* Return fields to be excluded from hashing during restores.
*
* @return array
*/
final public function get_excluded_identity_hash_fields(): array {
return array_unique(array_merge(
$this->excludedhashfields,
$this->define_excluded_identity_hash_fields(),
));
}

/**
* Return a list of paths to fields to be removed from questiondata before creating an identity hash.
*
* Fields that should be excluded from common elements such as answers or numerical units that are used by the plugin will
* be excluded automatically. This method just needs to define any specific to this plugin, such as foreign keys used in the
* plugin's tables.
*
* The returned array should be a list of slash-delimited paths to locate the fields to be removed from the questiondata object.
* For example, if you want to remove the field `$questiondata->options->questionid`, the path would be '/options/questionid'.
* If a field in the path is an array, the rest of the path will be applied to each object in the array. So if you have
* `$questiondata->options->answers[]`, the path '/options/answers/id' will remove the 'id' field from each element of the
* 'answers' array.
*
* @return array
*/
protected function define_excluded_identity_hash_fields(): array {
return [];
}

/**
* Convert the backup structure of this question type into a structure matching its question data
*
* This should take the hierarchical array of tags from the question's backup structure, and return a structure that matches
* that returned when calling {@see get_question_options()} for this question type.
* See https://docs.moodle.org/dev/Question_data_structures#Representation_1:_%24questiondata for an explanation of this
* structure.
*
* This data will then be used to produce an identity hash for comparison with questions in the database.
*
* This base implementation deals with all common backup elements created by the add_question_*_options() methods in this class,
* plus elements added by ::define_question_plugin_structure() named for the qtype. The question type will need to extend
* this function if ::define_question_plugin_structure() adds any other elements to the backup.
*
* @param array $tags The array of tags from the backup.
* @return \stdClass The questiondata object.
*/
public static function convert_backup_to_questiondata(array $tags): \stdClass {
$questiondata = (object) array_filter($tags, fn($tag) => !is_array($tag)); // Create an object from the top-level fields.
$qtype = $questiondata->qtype;
$questiondata->options = new stdClass();
if (isset($tags["plugin_qtype_{$qtype}_question"][$qtype])) {
$questiondata->options = (object) $tags["plugin_qtype_{$qtype}_question"][$qtype][0];
}
if (isset($tags["plugin_qtype_{$qtype}_question"]['answers'])) {
$questiondata->options->answers = array_map(
fn($answer) => (object) $answer,
$tags["plugin_qtype_{$qtype}_question"]['answers']['answer'],
);
}
if (isset($tags["plugin_qtype_{$qtype}_question"]['numerical_options'])) {
$questiondata->options = (object) array_merge(
(array) $questiondata->options,
$tags["plugin_qtype_{$qtype}_question"]['numerical_options']['numerical_option'][0],
);
}
if (isset($tags["plugin_qtype_{$qtype}_question"]['numerical_units'])) {
$questiondata->options->units = array_map(
fn($unit) => (object) $unit,
$tags["plugin_qtype_{$qtype}_question"]['numerical_units']['numerical_unit'],
);
}
if (isset($tags["plugin_qtype_{$qtype}_question"]['dataset_definitions'])) {
$questiondata->options->datasets = array_map(
fn($dataset) => (object) $dataset,
$tags["plugin_qtype_{$qtype}_question"]['dataset_definitions']['dataset_definition'],
);
}
if (isset($questiondata->options->datasets)) {
foreach ($questiondata->options->datasets as $dataset) {
if (isset($dataset->dataset_items)) {
$dataset->items = array_map(
fn($item) => (object) $item,
$dataset->dataset_items['dataset_item'],
);
unset($dataset->dataset_items);
}
}
}
if (isset($tags['question_hints'])) {
$questiondata->hints = array_map(
fn($hint) => (object) $hint,
$tags['question_hints']['question_hint'],
);
}

return $questiondata;
}

/**
* Remove excluded fields from the questiondata structure.
*
* This removes fields that will not match or not be present in the question data structure produced by
* {@see self::convert_backup_to_questiondata()} and {@see get_question_options()} (such as IDs), so that the remaining data can
* be used to produce an identity hash for comparing the two.
*
* For plugins, it should be sufficient to override {@see self::define_excluded_identity_hash_fields()} with a list of paths
* specific to the plugin type. Overriding this method is only necessary if the plugin's
* {@see question_type::get_question_options()} method adds additional data to the question that is not included in the backup.
*
* @param stdClass $questiondata
* @param array $excludefields Paths to the fields to exclude.
* @return stdClass The $questiondata with excluded fields removed.
*/
public static function remove_excluded_question_data(stdClass $questiondata, array $excludefields = []): stdClass {
// All questions will need to exclude 'id' (used by question and other tables), 'questionid' (used by hints and options),
// 'createdby' and 'modifiedby' (since they won't map between sites).
$defaultexcludes = [
'/id',
'/createdby',
'/modifiedby',
'/hints/id',
'/hints/questionid',
'/options/id',
'/options/questionid',
];
$excludefields = array_unique(array_merge($excludefields, $defaultexcludes));

foreach ($excludefields as $excludefield) {
$pathparts = explode('/', ltrim($excludefield, '/'));
$data = $questiondata;
self::unset_excluded_fields($data, $pathparts);
}

return $questiondata;
}

/**
* Iterate through the elements of path to an excluded field, and unset the final element.
*
* If any of the elements in the path is an array, this is called recursively on each element in the array to unset fields
* in each child of the array.
*
* @param stdClass|array $data The questiondata object, or a subsection of it.
* @param array $pathparts The remaining elements in the path to the excluded field.
* @return void
*/
private static function unset_excluded_fields(stdClass|array $data, array $pathparts): void {
while (!empty($pathparts)) {
$element = array_shift($pathparts);
if (!isset($data->{$element})) {
// This element is not present in the data structure, nothing to unset.
return;
}
if (is_object($data->{$element})) {
$data = $data->{$element};
} else if (is_array($data->{$element})) {
foreach ($data->{$element} as $item) {
self::unset_excluded_fields($item, $pathparts);
}
} else if (empty($pathparts)) {
// This is the last element of the path and it's a scalar value, unset it.
unset($data->{$element});
}
}
}
}
3 changes: 3 additions & 0 deletions backup/tests/quiz_restore_decode_links_test.php
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@ public function test_restore_quiz_decode_links(): void {

$questiondata = \question_bank::load_question_data($question->id);

$DB->set_field('question', 'questiontext', $CFG->wwwroot . '/mod/quiz/view.php?id=' . $quiz->cmid, ['id' => $question->id]);

$firstanswer = array_shift($questiondata->options->answers);
$DB->set_field('question_answers', 'answer', $CFG->wwwroot . '/course/view.php?id=' . $course->id,
['id' => $firstanswer->id]);
Expand All @@ -87,6 +89,7 @@ public function test_restore_quiz_decode_links(): void {
$questionids = [];
foreach ($quizquestions as $quizquestion) {
if ($quizquestion->questionid) {
$this->assertEquals($CFG->wwwroot . '/mod/quiz/view.php?id=' . $quiz->cmid, $quizquestion->questiontext);
$questionids[] = $quizquestion->questionid;
}
}
Expand Down
56 changes: 44 additions & 12 deletions backup/util/dbops/restore_dbops.class.php
Original file line number Diff line number Diff line change
Expand Up @@ -567,7 +567,7 @@ public static function precheck_categories_and_questions($restoreid, $courseid,
* @return array A separate list of all error and warnings detected
*/
public static function prechek_precheck_qbanks_by_level($restoreid, $courseid, $userid, $samesite, $contextlevel) {
global $DB;
global $DB, $CFG;

// To return any errors and warnings found
$errors = array();
Expand Down Expand Up @@ -669,21 +669,37 @@ public static function prechek_precheck_qbanks_by_level($restoreid, $courseid, $
} else {
self::set_backup_ids_record($restoreid, 'question_category', $category->id, $matchcat->id, $targetcontext->id);
$questions = self::restore_get_questions($restoreid, $category->id);
$transformer = self::get_backup_xml_transformer($courseid);

// Collect all the questions for this category into memory so we only talk to the DB once.
$questioncache = $DB->get_records_sql_menu('SELECT q.stamp, q.id
FROM {question} q
JOIN {question_versions} qv
ON qv.questionid = q.id
JOIN {question_bank_entries} qbe
ON qbe.id = qv.questionbankentryid
JOIN {question_categories} qc
ON qc.id = qbe.questioncategoryid
WHERE qc.id = ?', array($matchcat->id));
$recordset = $DB->get_recordset_sql(
"SELECT q.*
FROM {question} q
JOIN {question_versions} qv ON qv.questionid = q.id
JOIN {question_bank_entries} qbe ON qbe.id = qv.questionbankentryid
JOIN {question_categories} qc ON qc.id = qbe.questioncategoryid
WHERE qc.id = ?",
[$matchcat->id],
);

// Compute a hash of question and answer fields to differentiate between identical stamp-version questions.
$questioncache = [];
foreach ($recordset as $question) {
$question->export_process = true; // Include all question options required for export.
get_question_options($question);
unset($question->export_process);
// Remove some additional properties from get_question_options() that isn't included in backups
// before we produce the identity hash.
unset($question->categoryobject);
unset($question->questioncategoryid);
$cachekey = restore_questions_parser_processor::generate_question_identity_hash($question, $transformer);
$questioncache[$cachekey] = $question->id;
}
$recordset->close();

foreach ($questions as $question) {
if (isset($questioncache[$question->stamp])) {
$matchqid = $questioncache[$question->stamp];
if (isset($questioncache[$question->questionhash])) {
$matchqid = $questioncache[$question->questionhash];
} else {
$matchqid = false;
}
Expand Down Expand Up @@ -1918,6 +1934,22 @@ public static function delete_course_content($courseid, ?array $options = null)
private static function password_should_be_discarded(#[\SensitiveParameter] string $password): bool {
return (bool) preg_match('/^[0-9a-f]{32}$/', $password);
}

/**
* Load required classes and return a backup XML transformer for the specified course.
*
* These classes may not have been loaded if we're only doing a restore in the current process,
* so make sure we have them here.
*
* @param int $courseid
* @return backup_xml_transformer
*/
protected static function get_backup_xml_transformer(int $courseid): backup_xml_transformer {
global $CFG;
require_once($CFG->dirroot . '/backup/util/includes/backup_includes.php');
require_once($CFG->dirroot . '/backup/moodle2/backup_plan_builder.class.php');
return new backup_xml_transformer($courseid);
}
}

/*
Expand Down
Loading

0 comments on commit 7b9f56a

Please sign in to comment.