This repository implements Bayesian parameter inference for the SWIFT eye-tracking model using real reading data. It combines fixation sequences and word-level corpus features into an enhanced dataset (`swift_model_enhanced.csv`) and applies BayesFlow (v1/v2) or Approximate Bayesian Computation (ABC) for inference. The project includes:

- Data preprocessing & integration (fixation + corpus merge, imputation, normalization)
- A SWIFT-inspired simulator for fixation durations and saccades
- A Bayesian inference pipeline with posterior estimation
- Diagnostic visualizations (posterior plots, PPC checks, ECDFs)
- An enhanced dataset with 30+ well-documented features for reading research
## Goal

The SWIFT model is a dynamic generative model of eye-movement control during reading. It simulates how a reader's gaze shifts across a sentence as they process its content. The model incorporates:

- **Fixation duration** – how long the eye stays on a word
- **Saccades** – eye movements to the next word

Because the full SWIFT model is computationally intensive and has an intractable likelihood, this project uses a simplified SWIFT model with BayesFlow for Bayesian parameter inference.
## Project Tasks

- Implement the simplified SWIFT model in BayesFlow
- Use real eye-tracking data from a controlled reading experiment
- Estimate parameters related to gaze control and reading dynamics
- Investigate how well the model captures observed fixation durations, saccade patterns, and regressions
## Features

- **Automatic data preprocessing**
  - Schema normalization for fixation data
  - Corpus merging for word properties (frequency, predictability, word length)
  - Missing value imputation (fixation durations, regressions, jumps)
- **Flexible inference methods**
  - BayesFlow (v1/v2 APIs)
  - Approximate Bayesian Computation (ABC) fallback
- **Simulation & diagnostics**
  - SWIFT-inspired fixation simulator
  - Posterior distribution plots
  - Posterior predictive checks (PPC)
  - ECDF comparisons between observed and simulated fixations
  - Data quality plots (histograms, scatterplots)
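The preprocessing steps above can be sketched in pandas. The column names follow the dataset schema documented below; the `preprocess` helper itself and the median-imputation choice are illustrative assumptions, not the repository's exact implementation.

```python
import numpy as np
import pandas as pd

def preprocess(fix_df: pd.DataFrame, corpus_df: pd.DataFrame) -> pd.DataFrame:
    """Merge fixations with word-level corpus features, impute, and normalize."""
    # Merge word properties (length, frequency) onto the fixation records
    df = fix_df.merge(corpus_df, on="word_id", how="left")

    # Impute missing fixation durations with the dataset median (assumed strategy)
    df["fix_dur_ms"] = df["fix_dur_ms"].fillna(df["fix_dur_ms"].median())

    # Derive saccade features: negative word jumps are regressions
    df["is_regression"] = (df["saccade_word_jump"] < 0).astype(int)

    # Log-transform frequency and z-score the main predictors
    df["log10_word_frequency"] = np.log10(df["word_frequency"].clip(lower=1e-9))
    for col in ["fix_dur_ms", "word_length", "log10_word_frequency"]:
        df[f"{col}_z"] = (df[col] - df[col].mean()) / df[col].std()
    return df
```

The resulting frame can then be written out with `df.to_csv("swift_model_enhanced.csv", index=False)` to produce the enhanced dataset.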
## Repository Structure

    .
    ├── sbi_version_5.ipynb   # Main Jupyter notebook
    ├── sbi_version_5.py      # Colab-exported script
    ├── README.md             # Project documentation
    └── swift_outputs/        # Auto-generated figures and posterior plots
## Data Requirements

### Source Data

This project uses eye-tracking data from natural reading experiments, combining multiple sources:

- **CopCo – Copenhagen Corpus of Natural Reading (Danish)**
  - Eye-tracking recordings from natural reading of Danish texts
  - 1,832 sentences (34,897 tokens)
  - Gaze data from 22 participants
  - Contains fixation durations, saccades, and gaze behavior in natural text reading
  - Well suited for reading research and statistical modeling of eye-movement control
  - 🔗 CopCo Corpus (ArXiv reference); size: < 300 MB
- **Controlled Reading Experiment Dataset**
  - Fixation sequences for an individual participant
  - OSF dataset – fixation sequences
- **Corpus File (Word Properties)**
  - Word-level features: length, frequency, predictability
  - OSF corpus file

### Export

The final dataset is saved as `swift_model_enhanced.csv`, which ensures compatibility with the BayesFlow SWIFT inference pipeline.
### Suitability

The final dataset contains:

- Observed fixation durations
- Saccade behavior (forward jumps, regressions)
- Word-level predictors (length, frequency, predictability)

This makes it well suited for Bayesian parameter inference of the SWIFT model, enabling the estimation of parameters related to gaze control and reading dynamics.
The fixation dataset (`swift_model_enhanced.csv`) should be a CSV file with, at minimum, the following columns.
### Final Column Names

| Column Name | Type | Description |
|---|---|---|
| `sentence_id` | Integer | Unique identifier for each sentence (groups words and fixations). |
| `word_id` | Integer | Identifier for the word within the text/corpus. |
| `fix_onset_ms` | Integer | Onset time of fixation in milliseconds (relative to trial start). |
| `fix_dur_ms` | Float (ms) | Duration of the fixation in milliseconds. |
| `saccade_word_jump` | Integer | Number of words jumped during the saccade (positive = forward, negative = back). |
| `word_length` | Integer | Number of characters in the word. |
| `word_frequency` | Float | Raw frequency of the word in the corpus. |
| `fixation_idx_in_sentence` | Integer | Index of the fixation within the sentence sequence. |
| `is_first_fixation_on_word` | Boolean | Whether this fixation is the first on the word (`True`/`False`). |
| `first_fix_dur_ms` | Float (ms) | Duration of the first fixation on the word. |
| `gaze_total_ms` | Float (ms) | Total gaze duration on the word (sum of all fixations). |
| `is_regression` | Binary | Indicator if the saccade is a regression (1 = regression, 0 = forward). |
| `forward_jump_size` | Integer | Forward saccade distance in word units (auto-computed). |
| `regression_size` | Integer | Size of the regression movement (negative word jump). |
| `prev_fix_dur_ms` | Float (ms) | Duration of the previous fixation (imputed if missing). |
| `log10_word_frequency` | Float | Log10-transformed word frequency. |
| `fix_dur_ms_z` | Float | Z-score of fixation duration (normalized within the dataset). |
| `first_fix_dur_ms_z` | Float | Z-score of first fixation duration on the word. |
| `gaze_total_ms_z` | Float | Z-score of total gaze duration on the word. |
| `word_length_z` | Float | Z-score of word length. |
| `log10_word_frequency_z` | Float | Z-score of log10 word frequency. |
| `sent_id` | Integer | Alternate sentence ID (kept for compatibility). |
| `word_index` | Integer | Position of the word in the sentence (starting at 0). |
| `log10_freq` | Float | Duplicate/alternate log10 word frequency (for merging with corpus). |
| `predictability` | Float (0–1) | Probability of predicting the word given context (cloze or proxy). |
| `predictability_method` | String | Method used to assign predictability (e.g., `proxy_from_freq_len_position`). |
| `z_word_length` | Float | Z-score normalized word length. |
| `z_log10_freq` | Float | Z-score normalized log10 word frequency. |
| `z_predictability` | Float | Z-score normalized predictability. |
| `sent_pos_frac` | Float | Relative position of the word in the sentence (0–1). |
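A loader that validates this schema before inference might look like the following sketch; the `load_enhanced` helper and the choice of which columns count as mandatory are assumptions for illustration, not part of the repository.

```python
import pandas as pd

# Core columns assumed mandatory for the inference pipeline
REQUIRED_COLUMNS = [
    "sentence_id", "word_id", "fix_onset_ms", "fix_dur_ms",
    "saccade_word_jump", "word_length", "word_frequency",
]

def load_enhanced(path: str) -> pd.DataFrame:
    """Load the enhanced dataset and fail fast if core columns are missing."""
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"swift_model_enhanced.csv is missing columns: {missing}")
    return df
```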
## Usage (Google Colab)

1. Open Google Colab and create a new notebook
2. Upload this repository's notebook/script
3. Upload the fixation CSV from within the notebook:

       from google.colab import files
       uploaded = files.upload()  # upload fixation CSV (swift_model.csv)

4. Run all cells (`Runtime > Run all`)
5. Results (plots, enhanced CSV, posterior samples) are saved in `/content/swift_outputs/`
## Outputs

- Enhanced fixation dataset (`swift_model_enhanced.csv`)
- Histograms of fixation durations
- Scatterplots (duration vs. word length, frequency, predictability)
- Posterior distributions of 10 SWIFT parameters
- Posterior predictive check (PPC) plots
- ECDF plots (observed vs. simulated fixations)
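The ECDF comparison between observed and simulated fixations takes only a few lines of Matplotlib. Note that `ecdf` and `plot_ecdf_comparison` are hypothetical helper names for this sketch, not functions from this repository.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt

def ecdf(x):
    """Return sorted values and their empirical CDF heights."""
    xs = np.sort(np.asarray(x))
    ys = np.arange(1, len(xs) + 1) / len(xs)
    return xs, ys

def plot_ecdf_comparison(observed, simulated, out_path):
    """Overlay observed vs. simulated fixation-duration ECDFs and save a figure."""
    fig, ax = plt.subplots()
    for data, label in [(observed, "observed"), (simulated, "simulated")]:
        xs, ys = ecdf(data)
        ax.step(xs, ys, where="post", label=label)
    ax.set_xlabel("fixation duration (ms)")
    ax.set_ylabel("ECDF")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```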
## SWIFT Model Parameters

The Bayesian inference estimates the following 10 parameters:

- `base_logdur` – baseline log fixation duration
- `beta_freq` – frequency effect
- `beta_wlen` – word length effect
- `beta_pred` – predictability effect
- `extra_sd` – noise in fixation durations
- `sac_base` – baseline saccade size
- `sac_wlen` – effect of word length on saccade size
- `sac_sd` – noise in saccades
- `p_reg` – base regression probability
- `reg_scale` – regression probability scaling
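A minimal sketch of how these 10 parameters could drive a SWIFT-inspired simulator. The functional forms (log-normal durations, Gaussian saccade amplitudes, a linear regression-probability term) and the field names are illustrative assumptions, not the repository's exact model.

```python
import numpy as np

def simulate_fixations(theta: dict, words: dict, rng=None) -> dict:
    """Draw fixation durations and saccade word-jumps for a sequence of words.

    `words` maps z-scored predictor names (z_log10_freq, z_word_length,
    z_predictability) to NumPy arrays of equal length.
    """
    rng = np.random.default_rng(rng)
    n = len(words["z_word_length"])

    # Log-normal fixation durations: baseline plus linear covariate effects
    mu = (theta["base_logdur"]
          + theta["beta_freq"] * words["z_log10_freq"]
          + theta["beta_wlen"] * words["z_word_length"]
          + theta["beta_pred"] * words["z_predictability"])
    durations = np.exp(rng.normal(mu, theta["extra_sd"]))

    # Gaussian saccade amplitudes modulated by word length, rounded to word units
    jump_mean = theta["sac_base"] + theta["sac_wlen"] * words["z_word_length"]
    jumps = np.round(rng.normal(jump_mean, theta["sac_sd"])).astype(int)

    # Regressions: flip the jump direction with a covariate-dependent probability
    p = np.clip(theta["p_reg"] + theta["reg_scale"] * words["z_word_length"], 0.0, 1.0)
    is_reg = rng.random(n) < p
    jumps = np.where(is_reg, -np.maximum(np.abs(jumps), 1), np.maximum(np.abs(jumps), 1))

    return {"fix_dur_ms": durations, "saccade_word_jump": jumps}
```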
## Installation

Dependencies (automatically installed in Colab):

- Python ≥ 3.8
- NumPy, Pandas, Matplotlib
- BayesFlow (`pip install bayesflow`)
- Torch (for BayesFlow v2)

Install manually if running locally:

    pip install -r requirements.txt
## Citation
If you use this code in your research, please cite the original SWIFT model and BayesFlow:
- Engbert, R., Nuthmann, A., Richter, E. M., & Kliegl, R. (2005). SWIFT: A dynamical model of saccade generation during reading. Psychological Review, 112(4), 777.
- Radev, S. T., Mertens, U. K., Voss, A., Ardizzone, L., & Köthe, U. (2020). BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems.
## License

MIT License – free to use, modify, and distribute.