class: title-slide, middle

<div style = "position:fixed; visibility: hidden">
`$$\require{color}\definecolor{red}{rgb}{0.698039215686274, 0.133333333333333, 0.133333333333333}$$`
`$$\require{color}\definecolor{green}{rgb}{0.125490196078431, 0.698039215686274, 0.666666666666667}$$`
`$$\require{color}\definecolor{blue}{rgb}{0.274509803921569, 0.509803921568627, 0.705882352941177}$$`
`$$\require{color}\definecolor{yellow}{rgb}{0.823529411764706, 0.411764705882353, 0.117647058823529}$$`
`$$\require{color}\definecolor{purple}{rgb}{0.866666666666667, 0.627450980392157, 0.866666666666667}$$`
</div>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  TeX: {
    Macros: {
      red: ["{\\color{red}{#1}}", 1],
      green: ["{\\color{green}{#1}}", 1],
      blue: ["{\\color{blue}{#1}}", 1],
      yellow: ["{\\color{yellow}{#1}}", 1],
      purple: ["{\\color{purple}{#1}}", 1]
    },
    loader: {load: ['[tex]/color']},
    tex: {packages: {'[+]': ['color']}}
  }
});
</script>
<style>
.red {color: #B22222;}
.green {color: #20B2AA;}
.blue {color: #4682B4;}
.yellow {color: #D2691E;}
.purple {color: #DDA0DD;}
</style>

### Statistical Modeling in Experimental Psychology

# W14 Drift Diffusion Models

## Decision Dynamics With RT

#### Han Hao @ Tarleton State University

---

## From SDT to DDM

- In many cognitive tasks, we observe both:
  - A response choice
  - And **.red[a response time]**
- In those paradigms, RTs carry information that accuracy alone does not, so accuracy does not tell the whole story
  - How we process information and make decisions
  - **.red[Why is a person slower]**? A hard task/trial, a cautious person, or someone who is simply slow to process information and respond?
- Mean RT still does not tell the whole story, and because RT distributions are typically right-skewed, it is **.red[risky to assume normality]**
- We want a model that explains both together

---

# A familiar task: Stroop

.pull-left[
In a Stroop task, a participant may see a word like:

- **.red[RED]** shown in red ink
- **.blue[RED]** shown in blue ink

The task is to name the **ink color** of the word, not to read the word.
]

.pull-right[
#### Congruent

- **.blue[BLUE]**
- **.purple[PURPLE]**

#### Incongruent

- **.green[RED]**
- **.yellow[BLACK]**
]

---

## Basic idea of DDM

.pull-left[
For each trial, we usually record:

- whether the response was correct (Accuracy)
- how long the response took (RT)

The Drift Diffusion Model assumes that:

- Noisy evidence builds up over time
- The evidence "drifts" up or down
- A response is made when the evidence hits a boundary
]

.pull-right[
]

---

## What makes up the observed RT?

Four main components make up the observed RT in a trial:

- **Non-Decision Time (NDT)**: time spent on processes outside the decision itself
- **Drift rate**: how strong or clear the accumulating evidence is (plays a role similar to d' in SDT)
- **Boundaries**: how much evidence is needed before responding
- **Starting point**: any bias toward one response at the start

---

## Non-decision time

.pull-left[
Non-Decision Time (NDT, `\(t_0\)`) reflects time that is **not** part of the decision itself.

Examples:

- General lags in "mental preparation"
- Sensing and perceiving the stimulus
- Pressing the response key
]

.pull-right[
]

---

.pull-left[

## Drift rate

Drift rate (`\(\nu\)`) reflects the **.red[quality of the information]** going into the decision ("the speed of information accumulation").
- Think about why I say this plays a role in DDM similar to d' in SDT

Higher drift rate usually means:

- clearer evidence
- faster decisions
- more accurate decisions
]

.pull-right[
]

---

.pull-left[

## Boundaries

Boundary separation (`\(\alpha\)`) reflects **the amount of evidence needed** between the two possible choices ("the total distance it takes", a measure of caution?)

Larger boundary separation means:

- The person needs more evidence to make the choice
- Potentially slower responses
- But maybe higher accuracy
]

.pull-right[

## Bias

Bias (written `\(\beta\)`, `\(\omega\)`, or `\(z\)` depending on the source) reflects the **preference or tendency** before the evidence begins to accumulate ("the start point")

A shifted starting point means:

- One option is favored early
- Less evidence is needed for that side
- It can reflect condition differences and/or individual differences
]

---

## A plain-language expectation for the Stroop task

.pull-left[

### Congruent trials

We would expect:

- easier evidence
- faster responses
- higher accuracy
]

.pull-right[

### Incongruent trials

We would expect:

- weaker or noisier evidence
- slower responses
- lower accuracy
]

---

class: inverse

# A small demo in R

## Simulated Stroop-style data

---

## The demo dataset

For the demo, I created a small simulated Stroop-style dataset.
- multiple participants
- congruent and incongruent trials
- trial-level RTs
- trial-level responses

In the hidden setup code, the dataset is also saved as: `stroop_ddm_demo.csv`

``` r
# Recode accuracy for fitting: incorrect -> 1, correct -> 2
stroop_dat <- stroop_dat |>
  mutate(response = case_when(correct == 0 ~ 1, correct == 1 ~ 2))
```

---

## Look at one subject

``` r
onep <- stroop_dat |>
  filter(id == "P01")
```

|           | vars| mean|   sd|  skew| kurtosis|   se|
|:----------|----:|----:|----:|-----:|--------:|----:|
|rt         |    1| 0.69| 0.32|  1.69|     2.80| 0.02|
|condition* |    2| 1.50| 0.50|  0.00|    -2.01| 0.04|
|correct    |    3| 0.88| 0.32| -2.40|     3.76| 0.02|
|response   |    4| 1.89| 0.32| -2.40|     3.76| 0.02|

<!-- -->

---

## Subject 1 SDT Info

``` r
# Cross-tabulate condition by accuracy for Subject 1
knitr::kable(table(
  Condition = onep$condition,
  Correct = factor(onep$correct, levels = c(0, 1), labels = c("Incorrect", "Correct"))
))
```

|            | Incorrect| Correct|
|:-----------|---------:|-------:|
|Congruent   |         0|     100|
|Incongruent |        23|      77|

---

## Subject 1 Accuracy and RT

<!-- -->

???

This helps students see that the model is trying to explain a familiar pattern, not replace it. The DDM is a way of explaining why these two bars differ, not a substitute for understanding the data.

---

# A simple DDM fit with the "fddm" package

``` r
library(fddm)
# Drift rate and boundary separation are both allowed to vary by condition
model1 <- ddm(rt + response ~ 0 + condition, boundary = ~ condition, data = onep)
summary(model1)
```

```
## 
## Call:
## ddm(drift = rt + response ~ 0 + condition, boundary = ~condition, data = onep)
## 
## DDM fit with 3 estimated and 2 fixed distributional parameters.
## Fixed: bias = 0.5, sv = 0
## 
## drift coefficients (identity link):
##                      Estimate Std. Error z value Pr(>|z|)    
## conditionCongruent     3.0317     0.1976  15.340  < 2e-16 ***
## conditionIncongruent   0.7898     0.1384   5.705 1.16e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## boundary coefficients (identity link):
##                      Estimate Std. Error
## (Intercept)           1.55236      0.069
## conditionIncongruent -0.02614      0.092
## 
## ndt coefficients (identity link):
##             Estimate Std. Error
## (Intercept)   0.2995      0.006
```

``` r
logLik(model1)
```

```
## 'log Lik.' -13.53722 (df=5)
```

---

# What this model is doing

The model uses both RT and response (accuracy recoded for fitting)

- The predictor is **condition**: in this simple example, we are mainly asking whether evidence quality (`\(\nu\)`) and boundary separation (`\(\alpha\)`) differ across congruent and incongruent trials
  - Bias and the inter-trial variability of the drift rate are held fixed (bias = 0.5, sv = 0)
- This package uses C++ approximations to fit the classical Diffusion Decision Model (5 parameters) using **.red[maximum likelihood estimation]**.
- FYI, Subject 1's responses were simulated from the following set of parameters:
  - `\(t_0\)` = 0.305
  - `\(\alpha\)` = 1.57, `\(\omega\)` (relative bias) = 0.54
  - `\(\nu_c\)` = 2.55, `\(\nu_i\)` = 1.00

---

## Let's just run it across all subjects

**This will not mean much in real-world practice**, because for DDMs we usually want individual-level models and parameters. **.red[Here I am only doing it for simple demo purposes (and it is a simulated dataset with pre-specified parameters anyway).]**

FYI, in the simulation, the average values for the key parameters are:

- `\(\overline{t_0}\)` = 0.35
- `\(\overline{\alpha}\)` = 1.69, `\(\overline{\omega}\)` = 0.50
- `\(\overline{\nu_c}\)` = 2.32, `\(\overline{\nu_i}\)` = 1.10

Let's see how well this model performs.

``` r
model_all <- ddm(rt + response ~ 0 + condition, boundary = ~ condition, data = stroop_dat)
summary(model_all)
```

```
## 
## Call:
## ddm(drift = rt + response ~ 0 + condition, boundary = ~condition, data = stroop_dat)
## 
## DDM fit with 3 estimated and 2 fixed distributional parameters.
## Fixed: bias = 0.5, sv = 0
## 
## drift coefficients (identity link):
##                      Estimate Std. Error z value Pr(>|z|)    
## conditionCongruent    2.14192    0.03238   66.15   <2e-16 ***
## conditionIncongruent  1.02174    0.02604   39.24   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## boundary coefficients (identity link):
##                      Estimate Std. Error
## (Intercept)           1.79353      0.017
## conditionIncongruent -0.01005      0.023
## 
## ndt coefficients (identity link):
##             Estimate Std. Error
## (Intercept)   0.3112      0.002
```

``` r
logLik(model_all)
```

```
## 'log Lik.' -1677.184 (df=5)
```

---

## Takeaways

- Mean RT and accuracy are useful, but limited
  - They tell us what happened on average and whether one condition looks easier
- DDM adds a process explanation
  - Was the evidence weaker?
  - Was the person more cautious?
  - Was there extra time outside the decision?
  - ...
- The estimated parameters from DDM tell different cognitive stories and can be used for further correlational and/or computational modeling

---

## Final Notes

- Next week
  - Final project (Presentation and Write-up)
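
---

## Appendix: a hand-rolled DDM trial

The accumulation story from "Basic idea of DDM" (noisy evidence drifts until it hits a boundary, then non-decision time is added) can be sketched in a few lines of base R. This is only an illustrative sketch, not the hidden setup code that generated `stroop_dat`: the function name `simulate_ddm_trial`, the step size `dt`, the `max_t` cutoff, and the unit diffusion noise are all assumptions made for this example. The parameter values are the generating values reported for Subject 1.

``` r
# Illustrative sketch only: simulate_ddm_trial(), dt, and max_t are assumed
# names/settings, not part of fddm and not the hidden setup code for stroop_dat.
simulate_ddm_trial <- function(v, a, w = 0.5, t0 = 0.3, dt = 0.001, max_t = 5) {
  x <- w * a   # start point: relative bias w times boundary separation a
  t <- 0       # decision time elapsed so far (seconds)
  while (x > 0 && x < a && t < max_t) {
    # Euler-Maruyama step: drift v * dt plus Gaussian noise with sd sqrt(dt)
    x <- x + v * dt + sqrt(dt) * rnorm(1)
    t <- t + dt
  }
  # 2 = upper boundary (correct), 1 = lower boundary (error), NA = no decision by max_t
  response <- if (x >= a) 2 else if (x <= 0) 1 else NA_real_
  c(rt = t0 + t, response = response)
}

set.seed(2024)
# Generating values reported for Subject 1; only the drift rate differs by condition
congruent   <- t(replicate(200, simulate_ddm_trial(v = 2.55, a = 1.57, w = 0.54, t0 = 0.305)))
incongruent <- t(replicate(200, simulate_ddm_trial(v = 1.00, a = 1.57, w = 0.54, t0 = 0.305)))

# Mean RT and proportion of upper-boundary (correct) responses per condition
rbind(
  congruent   = c(mean_rt  = mean(congruent[, "rt"]),
                  accuracy = mean(congruent[, "response"] == 2, na.rm = TRUE)),
  incongruent = c(mean_rt  = mean(incongruent[, "rt"]),
                  accuracy = mean(incongruent[, "response"] == 2, na.rm = TRUE))
)
```

Because only the drift rate changes between the two calls, any difference in the simulated mean RT and accuracy is attributable to evidence quality alone, which is the comparison the Stroop expectation slide describes.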