The effect of practice on inhibition in task switching: Controlling for episodic retrieval

Previous work has shown that extended practice leads to a reduction in a key measure of cognitive inhibition during task switching: The n–2 task repetition cost. However, it has been demonstrated that this n–2 task repetition cost is increased by a non-inhibitory process—namely episodic retrieval—raising the question of whether the observed reduction of the cost with practice is driven by a reduction in inhibition, episodic retrieval effects, or a combination of both. The current study addresses this question by utilising a practice protocol using a task switching paradigm capable of controlling for episodic retrieval. The results showed a reduction in the n–2 task repetition cost with extended practice. The results also showed a clear increase of the n–2 task repetition cost due to episodic retrieval effects. The reduction of the cost with practice was driven by a reduction in inhibition and episodic retrieval contributions to the cost with practice, although there was a larger reduction in the episodic retrieval contribution with practice. The results are discussed with reference to current theoretical models of inhibition in task switching, which need to accommodate episodic retrieval and practice effects.


Introduction
Cognitive control refers to a set of abilities that allow us to control our ongoing behaviour to ensure goal-directed action. Without efficient cognitive control, our behaviour would be driven by bottom-up stimulus-evoked actions. Cognitive control is essential as many stimuli in our environment afford more than one action. For example, consider a typical modern mobile ("cell") phone; gone are the days when this stimulus would just afford the single action of "making a phone call"; now the user is faced with hundreds of tasks that could be performed. Therefore, some form of task selection is required upon this single stimulus. Once selected, this task must dominate the user's attention so that task-irrelevant intrusions do not occur. However, it must not become so dominant that, once the goal of the user changes, this task cannot be switched away from. Therefore, a tension exists between the stability and the flexibility of mental task representations that guide behaviour: how can a stable representation also maintain flexibility?
This stability-flexibility dilemma (Goschke, 2000) is studied using the task switching paradigm (Grange & Houghton, 2014b; Kiesel et al., 2010; Vandierendonck, Liefooghe, & Verbruggen, 2010). Within this paradigm, participants are required to rapidly switch between simple cognitive tasks requiring fast and accurate responses. For example, in the current study, participants are presented with a circular stimulus appearing in one of the four corners of a square frame (see Figure 1). Participants are required to mentally move the stimulus according to one of three spatial-transformation rules ("diagonal", "horizontal", "vertical"), and make a spatially-congruent response as to the corner the stimulus would move to according to the currently-relevant rule (Grange, Kowalczyk, & O'Loughlin, 2017; Mayr, 2002). This paradigm thus requires stability (e.g., select the "horizontal" rule and maintain it) and flexibility (e.g., switch to the "vertical" rule when cued to do so).

Inhibition in Task Switching
One mechanism thought to aid task switching is the active inhibition of recently-performed tasks when they are no longer relevant (Koch, Gade, Schuch, & Philipp, 2010; Mayr & Keele, 2000). Evidence for such a role of inhibition in task switching comes from the n-2 task repetition cost: the observed poorer performance (longer response times [RTs] and lower accuracy) to ABA sequences compared to CBA sequences (where A, B, & C are arbitrary labels for different tasks). This cost is thought to reflect the carry-over of inhibition of task A across the ABA sequence, which hinders reactivation attempts on the final trial of the sequence.

Figure 1. Schematic of the experimental paradigm. The arrows represent the spatial transformation required on each trial; these were not shown to participants. Time runs from the top to bottom of figure. Note that the image is not drawn to scale. Figure available at https://www.flickr.com/photos/150716232@N04/shares/5413G0 under CC licence https://creativecommons.org/licenses/by/2.0/

One empirical observation, predicted by a computational model of inhibition in task switching (Grange, Juvina, & Houghton, 2013), is that the n-2 task repetition cost reduces with extended practice (Grange & Juvina, 2015; Scheil, 2016). This effect was explained by Grange and Juvina (2015) with reference to two different processes: (1) gradual automatisation of cue-task translation and (2) a gradual increase in the activation of the memory representations required for successful task performance, which outweighs the negative effects of inhibition. Cue-task translation refers to the process by which the presented cue (which informs the participant of the currently-relevant task) leads to retrieval of the correct task representations from memory for successful performance. Following Logan (1988), it was assumed that at early stages of practice, cue-task translation is a slow process due to a lack of association between the cue and the relevant memory representations; but as practice proceeds, cue-task associations are stored in memory and are automatically retrieved upon cue presentation, leading to faster cue-task translation. In the model, memory representations have an activation value which increases with practice (the "base-level activation"); to model inhibition, each time a memory representation is used it becomes inhibited by adding a short-term negative activation value to the base-level activation. Over time, the increasing base-level activation value negates the (constant) short-term inhibition input.
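As a rough illustration of how a growing base-level activation can outweigh a constant inhibitory input, the sketch below assumes the standard ACT-R retrieval-latency equation, T = F * exp(-A), and treats inhibition as a constant subtracted from the base-level activation on ABA trials. This is not the published model: the latency factor, the inhibition constant, the base-level values, and the function names are all hypothetical, chosen only to show the direction of the effect.

```python
import math

LATENCY_FACTOR = 1.0  # F in T = F * exp(-A); hypothetical value
INHIBITION = 0.5      # constant short-term inhibition; hypothetical value


def retrieval_time(base_level: float, inhibited: bool) -> float:
    """Retrieval latency under the ACT-R latency equation T = F * exp(-A)."""
    activation = base_level - (INHIBITION if inhibited else 0.0)
    return LATENCY_FACTOR * math.exp(-activation)


def n2_repetition_cost(base_level: float) -> float:
    """Simulated cost: ABA (inhibited) minus CBA (not inhibited) retrieval time."""
    return retrieval_time(base_level, True) - retrieval_time(base_level, False)


# Hypothetical base-level activations that grow across five sessions.
base_levels = [0.0, 0.5, 1.0, 1.5, 2.0]
costs = [n2_repetition_cost(b) for b in base_levels]

for session, cost in enumerate(costs, start=1):
    print(f"session {session}: simulated n-2 repetition cost = {cost:.3f}")
```

Under these assumptions the simulated cost equals F * exp(-B) * (exp(I) - 1), so it shrinks as the base-level activation B grows even though the inhibition term I never changes.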
Although the model of Grange and colleagues (Grange & Juvina, 2015; Grange et al., 2013) predicted and explained the observation of reduced n-2 task repetition costs with practice, there are several issues with the model which likely make it a poor candidate to explain task switching behaviour beyond the narrow focus of the n-2 task repetition cost. For example, the model does not predict the very robust finding of switch costs: slower and less accurate performance for task switch trials (e.g., A-B) compared to immediate task repetitions (e.g., B-B). A more successful model, which can explain both n-2 task repetition costs and switch costs, was proposed by Sexton and Cooper (2017). However, this model does not predict the observed reduction in n-2 task repetition cost with practice. Thus, the practice effect observed in the n-2 task repetition cost remains an interesting challenge for theoretical accounts of task switching.

Episodic Retrieval & The N-2 Task Repetition Cost
Until recently, the n-2 task repetition cost was a candidate for a robust marker of cognitive inhibition, as several non-inhibitory accounts of the cost failed to predict the observed behaviour (Mayr, 2002, 2007). However, one non-inhibitory account, namely episodic retrieval, was recently shown by Grange et al. (2017) to contribute significantly to the n-2 task repetition cost. Extending the work introduced by Mayr (2002), Grange et al. (2017) found evidence across three experiments that episodic retrieval can explain much of the n-2 task repetition cost.
In a task switching context, episodic retrieval refers to the assumption that once a task is performed, all perceptual (e.g., cue characteristics, stimulus identity, etc.) and action representations (e.g., the response that was made to the task) of that task are bound into a single memory representation and stored in episodic memory (Hommel, 1998, 2004; Logan, 1988). When this task is cued again, this episodic memory trace is automatically retrieved. This retrieval can benefit performance if the presented elements on the current trial match those of the episodic trace (e.g., if the stimulus identity is the same, and the response required is the same). However, if the presented elements mismatch those of the episodic trace (e.g., if a different response is required), a mismatch cost occurs. From this perspective, the n-2 task repetition cost across an ABA sequence can be explained by a mismatch cost between the elements of the retrieved episodic trace from trial n-2 and the presented elements on the current trial n. Mayr (2002) introduced a paradigm capable of estimating the contributions of episodic retrieval to the n-2 task repetition cost; a variation of this paradigm is presented in Figure 1. Due to the simplicity of the task and stimulus display, it is straightforward to control whether task elements match or mismatch across an ABA sequence (see right side of Figure 1). By the episodic retrieval account, n-2 task repetition costs should only occur for n-2 response switches, because it is in this scenario that the current task elements (e.g., hexagon cue, top-right stimulus location, bottom-right response required) mismatch what would be retrieved from episodic memory from two trials ago (hexagon cue, bottom-left stimulus location, top-left response executed). For n-2 response repetitions, a perfect match occurs on ABA sequences between trial n and n-2, leading to facilitated performance.
Note that from an inhibition perspective, equivalent n-2 task repetition costs should emerge from n-2 response repetitions and n-2 response switches, because in both cases an inhibited task is being performed. Mayr (2002) tested this prediction and found significant overall n-2 task repetition costs, but no statistically-significant difference in n-2 task repetition cost between n-2 response repetitions and n-2 response switches. This presented evidence against the episodic retrieval account of the n-2 task repetition cost. However, across three experiments, Grange et al. (2017) found clear evidence for a reduction in the n-2 task repetition cost for n-2 response repetitions (episodic matches) compared to n-2 response switches (episodic mismatches), suggesting episodic retrieval can greatly modulate estimates of the n-2 task repetition cost (see also Grange, in press). However, some "residual" cost remained even under conditions of episodic match. The authors concluded that the n-2 task repetition cost comprises episodic mismatch costs (reflected by the cost found in the n-2 response switch condition) and some degree of inhibition (reflected by the cost found in the n-2 response repetition condition).

The Current Study
The purpose of the current study was to revisit the findings of Grange and Juvina (2015; see also Scheil, 2016) and ask whether practice reduces the n-2 task repetition cost due to a reduction in episodic retrieval effects, a reduction of inhibition effects, or a reduction in both. The study used a similar protocol to that of Grange and Juvina (2015) (i.e., extended practice across five separate experimental sessions), but some important modifications were made. First, we used the paradigm of Mayr (2002; with some modifications introduced by Grange et al., 2017), allowing separation of episodic and inhibition contributions to the n-2 task repetition cost. Second, the participant sample size was increased substantially. Although several thousand experimental trials were used (which can increase power significantly), one of the shortcomings of the study by Grange and Juvina (2015) was that only 9 participants' data were analysed. The current study addressed this shortcoming by quadrupling the sample size.

Method
Participants
Thirty-eight participants were recruited from Keele University in exchange for £50 (£10 per session). All participants were aged between 18 and 30. One participant was removed as they failed to attend the final session. One additional participant was removed for having less than 90% experiment-wise accuracy.

Apparatus & Stimuli
Experimental stimuli were presented via PsychoPy (Peirce, 2007) on a PC with a 17in. LCD monitor. The code for the program can be downloaded from https://bit.ly/2JwIq29. Responses were made on a 1ms-precise USB keyboard. Stimuli were presented on a light-grey background within a black square frame (width & height of 250 pixels). The cues were a hexagon, a triangle, and a square, all with a radius of 50 pixels. The stimulus was a filled black circle with a radius of 25 pixels.

Procedure
Participants were tested individually across five separate sessions, with each session separated by at least 4 hours (and no more than 2 sessions per day). Each session comprised 10 blocks of 120 trials, preceded by a small practice block of 16 trials.
The task required participants to mentally make a spatial transformation of the position of the circular stimulus according to which spatial rule was indicated by the currently-presented cue, and to make a spatially-congruent response to the transformed position. At the beginning of each session, participants were instructed which cue was paired with which spatial rule. The cue-rule pairing was consistent for each participant for the whole experiment, but the cue-rule pairings were fully counterbalanced across participants. After instruction, participants engaged with the practice block. This was kept deliberately short so as to capture as much of the practice effect as possible in the main experimental blocks; however, to keep errors to an acceptable level, if the participant made more than 20% errors in the practice block it was repeated just once. After this, the 10 experimental blocks were presented with a self-paced rest screen after each block.
A trial began with the presentation of a cue in the centre of the screen for 150ms, after which time the stimulus was presented in one corner of the square frame. Stimulus position was randomised on each trial. The probability of an n-2 response repetition was thus 0.25. Participants made a spatially-congruent response as to where the stimulus would move according to the current rule using the numeric component of the keyboard, and the keys "1" (bottom-left), "2" (bottom-right), "4" (top-left), and "5" (top-right). For example, if the current rule was "horizontal" and the stimulus was in the top-right position, participants would need to make a top-left response ("4"). Participants were instructed to respond as quickly and as accurately as possible after stimulus onset using the index finger of their right hand. Participants were instructed to move their index finger back to the centre of the four response keys after each response.
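The three spatial rules and the stated 0.25 probability of an n-2 response repetition can be checked with a short sketch. It assumes that stimulus positions on trials n-2 and n are independent and equally likely across the four corners, and that the "diagonal" rule moves the stimulus to the diagonally opposite corner. The corner labels and function names are illustrative rather than taken from the experiment code.

```python
from itertools import product

CORNERS = ["top-left", "top-right", "bottom-left", "bottom-right"]
RULES = ["horizontal", "vertical", "diagonal"]


def transform(rule: str, corner: str) -> str:
    """Return the corner the stimulus moves to under the given rule."""
    vertical_part, horizontal_part = corner.split("-")
    if rule in ("horizontal", "diagonal"):  # flip left <-> right
        horizontal_part = "left" if horizontal_part == "right" else "right"
    if rule in ("vertical", "diagonal"):    # flip top <-> bottom
        vertical_part = "top" if vertical_part == "bottom" else "bottom"
    return f"{vertical_part}-{horizontal_part}"


def n2_response_repetition_probability(rule_n2: str, rule_n: str) -> float:
    """Proportion of equally likely position pairs that require the same response."""
    pairs = list(product(CORNERS, repeat=2))
    same = sum(transform(rule_n2, p2) == transform(rule_n, p) for p2, p in pairs)
    return same / len(pairs)


print(transform("horizontal", "top-right"))
print(n2_response_repetition_probability("diagonal", "diagonal"))  # ABA-type pair
print(n2_response_repetition_probability("vertical", "diagonal"))  # CBA-type pair
```

Because each rule maps the four corners one-to-one onto the four corners, the proportion is the same whether the rule on trial n-2 matches the current rule (ABA) or not (CBA).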
Once a response was made, the square frame went blank for 150ms before the next trial began. If an error was made, the word "Error!" appeared in red font in the centre of the screen for 1000ms. The cue for the next trial was selected randomly with the constraint that no immediate rule-repetitions could occur, thus maximising the number of ABA and CBA trials (see also Philipp & Koch, 2006).

Design
Three independent variables were manipulated in this fully repeated-measures design: Task Sequence (ABA vs. CBA), Response Repetition (n-2 response repetition vs. n-2 response switch) and Session (sessions 1-5). The dependent variables were response times (RT, measured in milliseconds, ms) and percentage error.
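The two trial-level factors can be derived from the trial sequence alone. The following is a minimal sketch of that classification, assuming (as in the procedure) that immediate task repetitions never occur; the `Trial` structure and the `classify` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Trial:
    task: str      # cued rule, e.g. "horizontal"
    response: str  # corner responded to, e.g. "top-left"


def classify(trials: List[Trial]) -> List[Optional[Tuple[str, str]]]:
    """Label each trial with (task sequence, n-2 response relation).

    The first two trials of a block have no n-2 trial and are labelled None.
    """
    labels: List[Optional[Tuple[str, str]]] = []
    for i, trial in enumerate(trials):
        if i < 2:
            labels.append(None)
            continue
        n2 = trials[i - 2]
        sequence = "ABA" if trial.task == n2.task else "CBA"
        relation = "repetition" if trial.response == n2.response else "switch"
        labels.append((sequence, relation))
    return labels


block = [
    Trial("vertical", "top-left"),
    Trial("horizontal", "top-right"),
    Trial("vertical", "top-left"),      # ABA, same response as trial n-2
    Trial("diagonal", "bottom-left"),   # CBA, different response from trial n-2
    Trial("vertical", "bottom-right"),  # ABA, different response from trial n-2
]
for label in classify(block):
    print(label)
```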

Data preparation
The statistical programming language R was used together with various packages for data preparation, analysis, and visualisation. Specifically, we used R (Grange, 2018). The data were prepared in the following way. For the RT and accuracy analysis, the first two trials from each block were removed as they cannot be classified as ABA or CBA sequences. For the RT analysis, trials on which an error was made and the two trials following that error were removed (6.86% of trials removed). For the accuracy analysis, just the two trials following an error were removed. RTs were trimmed by removing RTs shorter than 150ms (assumed to be anticipatory guesses) and RTs longer than 2.5 standard deviations above the mean for each participant for each cell of the experimental design (this removed a further 3.1% of trials). After trimming, response times were log-transformed to mitigate the expected large reduction in overall response time with practice which might artificially reduce the n-2 task repetition cost (Wagenmakers, Krypotos, Criss, & Iverson, 2012).

Figure 2. Mean log-transformed response times (in milliseconds, ms) for ABA and CBA sequences as a function of n-2 response repetition (repetition vs. switch) and practice session (1-5). Error bars denote +/-1 standard error around the mean.
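The RT preparation steps described in the data preparation paragraph can be sketched as follows for a single participant. This is an illustration in Python rather than the R code used for the reported analysis. It assumes that post-error exclusions do not carry across blocks, that the 2.5 SD cutoff is computed after the 150ms cutoff has been applied, and that the log transform is the natural log; none of these details is stated in the text. The dictionary keys and the function name are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List


def prepare_rts(trials: List[Dict], min_rt: float = 150.0,
                sd_cutoff: float = 2.5) -> List[Dict]:
    """Apply the RT exclusions to one participant's trials (in presentation order)."""
    kept = []
    last_error = {}  # block -> trial number of the most recent error
    for t in trials:
        block, number = t["block"], t["trial"]
        after_error = block in last_error and number - last_error[block] <= 2
        if not t["correct"]:
            last_error[block] = number
        # Drop first two trials of a block, errors, post-error trials, and fast RTs.
        if number <= 2 or not t["correct"] or after_error or t["rt"] < min_rt:
            continue
        kept.append(t)

    # Upper cutoff per design cell: mean + 2.5 SD of the remaining RTs.
    by_cell = defaultdict(list)
    for t in kept:
        by_cell[t["cell"]].append(t["rt"])
    upper = {}
    for cell, rts in by_cell.items():
        spread = stdev(rts) if len(rts) > 1 else 0.0
        upper[cell] = mean(rts) + sd_cutoff * spread

    # Keep trials under the cutoff and attach the log-transformed RT.
    return [dict(t, log_rt=math.log(t["rt"])) for t in kept if t["rt"] <= upper[t["cell"]]]


# Made-up RTs for illustration: trial 3 is an error, trial 6 is anticipatory,
# and the final trial is a slow outlier.
rts = [480, 510, 600, 520, 530, 100] + [500] * 10 + [3000]
demo = [
    {"block": 1, "trial": i + 1, "cell": ("ABA", "switch"),
     "rt": float(rt), "correct": i != 2}
    for i, rt in enumerate(rts)
]
cleaned = prepare_rts(demo)
print(len(cleaned), "trials retained")
```

The accuracy analysis would use a separate pass that keeps error trials and removes only the two trials following each error.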

Response Times
The mean log-transformed response times for the full design can be seen in Figure 2. Response times improved with practice. The n-2 repetition cost was present, and appears larger for n-2 response switches than for n-2 response repetitions (especially for the first sessions of practice). The n-2 repetition cost appears to decrease with practice, but this reduction appears larger for n-2 response switches than for n-2 response repetitions. All of these observations are confirmed by standard frequentist ANOVA (see Table 1).
The primary analysis focused on analysis at the n-2 task repetition cost level. Specifically, Bayesian multilevel regression was conducted predicting the n-2 task repetition cost from the fixed factors Response Repetition and Session. Random intercepts and slopes for Response Repetition and Session per participant were included in the models. Plots of the n-2 task repetition cost for n-2 response repetitions and n-2 response switches at the individual participant level are shown in Figure 9.
Four models were constructed, each varying in their inclusion of fixed factors plus their interaction. (All models had the same random effects structure.) Model 1 predicted the n-2 task repetition cost from just a main effect of Session; Model 2 predicted the n-2 [...] Each model was fit to the data using the brms package in R; each fit ran 4 chains of "no-U-turn" sampling (NUTS) of the posterior distribution for each parameter, with each chain having 10,000 iterations, 5,000 of which were treated as burn-in. Visual inspection of the chains showed good convergence, and all R-hat values were close to 1.
Model comparison consisted of calculating the widely applicable information criterion (WAIC). WAIC provides an estimate of the out-of-sample deviance of model predictions, dealing with the trade-off between the goodness of fit of the model and the model complexity (because models with more parameters generally provide superior goodness of fit); smaller WAIC values among a set of models indicate relatively lower out-of-sample deviance, and such models are hence preferred (McElreath, 2016). The WAIC values of each of the four models are shown in Table 2. This table also shows the Akaike weight of each model, which is an estimate of the probability that the model will provide superior predictions to new data compared to the other models involved in the comparison (McElreath, 2016). Values closer to one, therefore, indicate superior models.
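For readers unfamiliar with Akaike weights, the sketch below shows the usual conversion from information-criterion values to weights: each model's difference from the smallest WAIC is passed through exp(-0.5 * difference) and the results are normalised to sum to one. The WAIC values and model labels are hypothetical placeholders, not the values reported in Table 2.

```python
import math
from typing import Dict


def akaike_weights(waic: Dict[str, float]) -> Dict[str, float]:
    """Convert WAIC values (deviance scale) to Akaike weights that sum to 1."""
    best = min(waic.values())
    relative = {name: math.exp(-0.5 * (value - best)) for name, value in waic.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}


# Hypothetical WAIC values for illustration only; NOT the values in Table 2.
example_waic = {"model_1": 110.0, "model_2": 108.0, "model_3": 104.0, "model_4": 100.0}
weights = akaike_weights(example_waic)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Only differences between WAIC values matter, so adding a constant to every model's WAIC leaves the weights unchanged.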
As can be seen, the model involving the two main effects plus their interaction was the best-fitting model of the data (with an Akaike weight of 0.93). Figure 3 shows density functions of each population-level (i.e., fixed-effect) parameter from this best-fitting model, together with the 95% highest density interval estimates. From these plots it is clear that the n-2 task repetition cost decreases as a function of Session, replicating the previous findings of Grange and Juvina (2015). The results also show that n-2 response switches increase the n-2 task repetition cost, replicating the findings of Grange et al. (2017). The negative parameter for the interaction suggests that the reduction of the n-2 task repetition cost with practice is larger for n-2 response switches than it is for n-2 response repetitions. This interaction is best understood by examining Figure 4, which shows the sample data from the current experiment together with 300 draws from the posterior prediction of the best-fitting model.

Note. ges = Generalised eta squared. Greenhouse-Geisser corrections to degrees of freedom were applied for violations of sphericity for effects involving Session.

Accuracy
The mean error rates for the full design can be seen in Figure 5, from which a few observations can be made. First, accuracy generally improved with practice. Second, Response Repetition appears to have an effect on the n-2 task repetition cost: for n-2 response switches, there appears a large n-2 task repetition cost, but for n-2 response repetitions there appears to be no cost; if anything, there appears a very small n-2 task repetition benefit. In terms of how these n-2 task repetition costs change with practice, the cost for the n-2 response switches appears to reduce slightly with practice; the small n-2 task repetition benefit for n-2 response repetitions appears to also reduce slightly with practice. These observations are generally supported by the standard frequentist ANOVA (see Table 3). See Figure 10 for a plot of individual differences in these observations.
As with the RTs, the primary analysis focussed on Bayesian multilevel regression on the n-2 task repetition cost. Plots of the n-2 task repetition cost for n-2 response repetitions and n-2 response switches at the individual participant level are shown in Appendix B (Figure 10). Four models were again constructed, each varying on the inclusion of different predictors. The random effect structure was the same as for the RT analysis, and the model fit routine and model comparison process were the same as for the RT analysis. WAIC values for all models can be found in Table 4.
As can be seen, the model involving the two main effects plus their interaction was the best-fitting model, but the evidence for this being superior to the Response model is rather weak: The Akaike weight for the interaction model is 0.45, and for the model with just the main effect of Response is 0.36, suggesting that the interaction model would perform only slightly better than the Response model in predicting new data. Figure 6 shows density functions of each population-level (i.e., fixed-effect) parameter from this best-fitting model, together with the 95% HDIs. These posterior distributions confirm a large effect of Response Repetition, but little overall main effect of Session. As indicated by the slightly superior fit of this model, the posterior for the interaction parameter suggests that the change in n-2 task repetition cost with Session is different for n-2 response repetitions and switches.

Figure: N-2 repetition costs for error rates as a function of Session of practice and n-2 response repetition (repetition vs. switch). Points represent the sample data. The coloured lines represent 300 draws from the posterior distribution of the best-fitting Bayesian regression model.

This interaction is best understood by examining Figure 7, which shows the error n-2 task repetition cost for each level of Response Repetition as a function of Session, together with posterior predictions of the best-fitting interaction model. This figure demonstrates a slight reduction in the n-2 task repetition cost for n-2 response switches, and a slight reduction in the (very small) n-2 task repetition benefit for n-2 response repetitions.

General Discussion
The aim of the present study was to revisit the findings of Grange and Juvina (2015) and Scheil (2016) of a reduction in the n-2 task repetition cost with extended practice; importantly, the current study controlled for episodic retrieval effects, which have been shown to influence measures of inhibition. The current study therefore addressed the question of whether the observed reduction in cost in Grange and Juvina (2015) was due to a reduction of inhibition, episodic retrieval effects, or both.
The results show a clear influence of episodic retrieval on measures of the n-2 task repetition cost, replicating Grange et al. (2017; see also Grange, in press), with larger n-2 task repetition costs for n-2 response switches (reflecting episodic mismatches) compared to n-2 response repetitions (reflecting episodic matches). This was true for both response times and error rates. For n-2 response switches, there was a large n-2 task repetition cost in both the RT and error data. However, for n-2 response repetitions, there was an n-2 task repetition cost for the RT data, but an (albeit rather small) n-2 task repetition benefit for the accuracy data (see left panel of Figure 5 and Figure 7). We return to discuss this pattern later in the Discussion.
The results also show a reduction in the n-2 task repetition cost with increasing practice for response times, replicating Grange and Juvina (2015) and Scheil (2016). Importantly for the current paper's aims, this reduction was present for both n-2 response switches and n-2 response repetitions in the RT data, suggesting that episodic and inhibition contributions to the n-2 task repetition cost reduce with practice. However, this reduction was larger for n-2 response switches than for n-2 response repetitions (see Figure 4), suggesting that a larger proportion of the reduction in the overall n-2 task repetition cost for RTs observed in the current study (and perhaps, by extension, in the cost observed in the studies of Grange and Juvina, 2015, and Scheil, 2016) is due to a reduction in the contribution of episodic retrieval to the n-2 task repetition cost with practice. In the error data, there was a reduction of the n-2 task repetition cost for n-2 response switches with practice, and a reduction of the n-2 task repetition benefit for n-2 response repetitions.
Together, these findings suggest that the reduction of n-2 task repetition costs observed in the present study is due to a reduction in both inhibition and episodic retrieval effects, but that the latter contributes more to this reduction.

Theoretical Models
The results reported in the current paper are not easily accommodated by current versions of theoretical models of the n-2 task repetition cost (Grange & Houghton, 2014a; Grange & Juvina, 2015; Grange et al., 2013; Sexton & Cooper, 2017). In this section we discuss the implication of the current results on these models.
Grange & colleagues' model. The model of Grange et al. (2013) and Grange and Juvina (2015) might be able to accommodate these findings with an extension taking into account episodic retrieval effects. This might be a relatively straightforward extension, as the ACT-R architecture which was used for the modelling is primarily a theory of memory retrieval (Anderson, 2007), with rich accounts of associative and episodic retrieval in a task switching context already successfully modelled (Altmann & Gray, 2008). But, given that the Grange et al. (2013) model does not predict task switching behaviour beyond the narrow focus of the n-2 task repetition cost, it would perhaps be more fruitful to look to the model of Sexton and Cooper (2017) to accommodate these findings.

Sexton & Cooper's model. The model of Sexton and Cooper (2017) is a connectionist model, extending the model of Gilbert and Shallice (2002) to accommodate switching between three tasks. Output (i.e., response) units in the model are biased toward the correct response via task-demand units (one for each task) that are activated via the relevant task cue. Each task-demand unit has excitatory connections to the output units associated with their task, and inhibitory connections to the other tasks. Thus, when a task-demand unit for Task A is activated (for example), output units associated with Task A become active, and output units associated with Tasks B and C become inhibited.
Activity of the task-demand units is also influenced by the activity of units in the conflict monitoring layer. These units continuously monitor for interference between task demand units (as measured by the degree of simultaneous activity in the task-demand units; see Appendix B for more details). Active but irrelevant task-demand units receive an inhibitory input from the conflict monitoring units. Inhibition in the model is thus a consequence of interference between activity in the competing task-demand units when the task switches, leading to inhibition of the active but irrelevant demand units. This inhibition of the task-demand unit persists over time, and hinders reactivation of the demand unit soon after, leading to n-2 task repetition costs.
As it stands, the model does not accommodate the observed reduction of the n-2 task repetition cost with practice. However, one can imagine how this might be accommodated by implementing some form of mechanism by which interference between competing task-demand units decreases as practice increases. One possibility could be to assume that the cue-based activation of the relevant task-demand unit increases with practice, a similar idea to that implemented in the model of Grange and Juvina (2015), who assumed that cue-task association strength increases with practice. Our intuition was that such an increase could lead to a reduction in inhibition in the model for the following reason: As the cue-based activation of task-demand units increases, the relevant demand unit would reach peak activity more quickly, leaving less time for interference between demand units to build, leading to less deployment of inhibition.
As this proposal is theoretically similar to that used by Grange and Juvina (2015) to successfully model their practice data (and also because model behaviour can often be unintuitive; Lewandowsky & Farrell, 2010) we took this opportunity to explore this account by implementing a practice-based version of Sexton and Cooper's (2017) model (see Appendix B for full implementation details). In the simulation, we assumed that cue-based activation of task-demand units increases linearly with practice; no other parameters in the model were allowed to change with practice session. This thus provides the cleanest implementation of the assumption of Grange and Juvina (2015). The simulated response time (measured in number of model cycles per trial) and error rates for five sessions of practice are shown in Figure 8. As can be seen, the model's response time gets shorter with practice, and the accuracy generally improves (but note the accuracy is very high even at Session 1). However, the n-2 task repetition cost for response time is constant across all levels of practice. Although there is some reduction in the n-2 task repetition cost in the error data, accuracy is generally very high and the variable nature of this reduction likely reflects simulation noise. Based on these data, a gradual increase in cue-task association strength, as assumed by Grange and Juvina (2015), does not lead to the observed reduction in n-2 task repetition cost with practice within the architecture of Sexton and Cooper (2017). It remains for future simulations to explore practice effects within this framework.

Modelling episodic retrieval with Sexton & Cooper's model.
The model of Sexton and Cooper (2017) currently has no mechanism by which episodic retrieval effects can be accommodated. But this model could be combined with the Parallel Episodic Processing (PEP) model of Schmidt, De Houwer, and Rothermund (2016). This model shares a similar connectionist architecture to that of Sexton and Cooper (2017), but additionally has a formal mechanism by which episodic traces are stored, each consisting of a bound representation of the presented stimulus features and the executed response. This episodic trace is retrieved when the stimulus is presented again, leading to either facilitation or a cost, depending on whether the retrieved trace matches the current task demands (cf. the current study). Thus, the PEP model provides a possible way to extend the Sexton and Cooper (2017) model to accommodate episodic retrieval effects in task switching.

On the Residual N-2 Task Repetition Cost
The data in the current paper demonstrated an n-2 task repetition cost even for n-2 response repetitions, which constitute episodic matches. Such a cost is not predicted by a pure episodic retrieval account because the elements of the retrieved episodic trace (from trial n-2) match the elements presented on the current trial, which should lead to facilitated performance. Such a residual n-2 task repetition cost in studies controlling for episodic retrieval (Mayr, 2002) has been interpreted as potential evidence of residual inhibition. However, this conclusion is complicated by the observation in the current study of an n-2 task repetition benefit for n-2 response repetitions in the error data, which is fully congruent with an episodic retrieval account. Thus, there appears to be a speed-accuracy trade-off for n-2 response repetition data, which complicates conclusions regarding whether the residual cost reflects inhibition: The inhibition account predicts the pattern observed in the RT data, but not the pattern observed in the error data (and vice versa for the episodic retrieval account).
Although this trade-off was small in the current study (because the n-2 task repetition benefit in the error data was small), we have observed this trade-off (with larger benefits in the error data) in other data sets (see e.g., Grange, in press; Grange & Kowalczyk, under review), suggesting it is reproducible. One intriguing possibility is that the residual n-2 task repetition cost captures some form of strategic change of response thresholds during episodic matches, leading to prolonged RTs and improved accuracy. In models of choice response time (such as the Ratcliff diffusion model; Voss, Nagler, & Lerche, 2013), speed-accuracy trade-offs can be modelled by assuming an increase to the response threshold (the amount of evidence required by the cognitive system before a response is selected). Such an increase in the threshold means that it takes longer for the evidence to reach that threshold (leading to a longer response time), but it also increases the probability that the evidence reaches the correct threshold (leading to greater accuracy). Future work should fit a formal version of the diffusion model to practice data to ascertain whether in fact an increase in response threshold is occurring for n-2 response repetitions. Note that this is difficult to do with the current data due to some participants obtaining 100% accuracy (diffusion modelling requires errors to model the error RT distribution).
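The threshold argument can be illustrated with a minimal sketch based on the standard closed-form predictions for an unbiased two-boundary Wiener diffusion process. This is an illustration only, not a fit to the current data: the function name, the drift rate, and the two threshold values are hypothetical, the starting point is assumed to lie midway between the boundaries, and non-decision time is omitted.

```python
import math

def diffusion_predictions(drift, threshold, noise_sd=1.0):
    """Closed-form accuracy and mean decision time for an unbiased
    two-boundary Wiener diffusion process (start point = threshold / 2).

    All parameter values passed in below are illustrative assumptions.
    """
    k = drift * threshold / noise_sd ** 2
    # Probability that the evidence reaches the correct (upper) boundary.
    p_correct = 1.0 / (1.0 + math.exp(-k))
    # Mean time to reach either boundary (non-decision time excluded).
    mean_decision_time = (threshold / (2.0 * drift)) * math.tanh(k / 2.0)
    return p_correct, mean_decision_time

# Same drift rate, two hypothetical response thresholds.
low_acc, low_time = diffusion_predictions(drift=1.0, threshold=1.0)
high_acc, high_time = diffusion_predictions(drift=1.0, threshold=1.5)

print(f"lower threshold:  accuracy={low_acc:.3f}, decision time={low_time:.3f}")
print(f"higher threshold: accuracy={high_acc:.3f}, decision time={high_time:.3f}")
```

With the drift rate held constant, the higher threshold yields both a longer mean decision time and a higher probability of a correct response, which is the qualitative pattern described above.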
If such an increase were found, it would not necessarily be incongruent with an inhibition account of the residual n-2 task repetition cost. Indeed, if a task representation is under the influence of inhibition when it is cued, the cognitive system might register difficulty with response selection and adjust the response threshold accordingly. But it could also point to other explanations of an increase in response threshold, which could be tested. Perhaps this future work might establish that even the residual n-2 task repetition cost can be explained by non-inhibitory accounts (cf., Mayr, 2007).
Cognitive training. The finding of a reduction of the residual n-2 task repetition cost, which is potentially a purer measure of cognitive inhibition, fits within a wider literature on the effects of cognitive training, and in particular the finding that cognitive inhibition measured with different tasks reduces with practice (for a review, see Spierer, Chavan, & Manuel, 2013). For example, Manuel, Bernasconi, and Spierer (2013) utilised a practice protocol together with the stop-signal task, a paradigm which measures the stop-signal response time (SSRT), an estimate of the latency of motoric response inhibition (for reviews, see Verbruggen & Logan, 2008, 2009). Participants performed 10 blocks of 102 trials, with 33% of trials requiring the inhibition of a pre-potent response; the remaining trials (so-called "go" trials) required a rapid response. Even across such a short practice interval (in comparison to the current study, for example), Manuel et al. (2013) found a clear reduction of SSRT across the 10 blocks; in models of response inhibition, a reduction of SSRT reflects more efficient response inhibition. The behavioural data were complemented by electrophysiological recording, upon which source localisation was used to assess the change in neural response to go trials with practice. These results showed a clear reduction of neural activity with practice within the right inferior frontal gyrus, the pre-supplementary motor area, the primary motor area, and the basal ganglia. Interestingly, the pre-supplementary motor area and the basal ganglia have been associated with inhibitory control during task switching (Whitmer & Banich, 2012), suggesting that a similar cortical network may be involved in both forms of inhibition.
Thus, given the potential that the residual n-2 task repetition cost reflects a "purer" measure of inhibition in task switching, researchers may find some utility in assessing this cost within cognitive training protocols.

Appendix B: Modelling Practice Effects with Sexton & Cooper's (2017) Model
The model of Sexton & Cooper (2017) is a connectionist model which extends the model of Gilbert and Shallice (2002) to allow switching between three tasks. In the full model, there are 5 layers of units representing (1) top-down control input units; (2) task demand units; (3) conflict monitoring units; (4) output (i.e., response) units; and (5) stimulus input units. Note that, as the current simulation was interested in the dynamics of top-down control input via the cue, we did not model stimulus input, and the output layer was simplified to just one unit per task. (The full model of Sexton & Cooper had a separate output unit for each response category [i.e., 2 per task].) The dynamics of the model are interactive, such that the activity of units propagates through the network and influences the activity of units in other layers. The architecture of the model we implemented is shown in Figure 11.

Dynamics in the Model
Below we describe the dynamics in each layer of the model.

Top-Down Control Layer
This layer represents the cue that is presented to the model on the current trial. There is one unit per task (A, B, & C). When a task is cued, the unit representing that task takes on a value. This is the parameter which changes with practice in the reported simulation, taking on a value of 1.0, 1.2, 1.4, 1.6, or 1.8 (depending on the current practice session). Each top-down control unit is connected to just one unit in the task-demand layer via a connection weight representing the strength of the top-down control, $S_{tdc}$.

Task-Demand Layer
There is one task-demand unit per task in the model. Each demand unit receives activation from the top-down control layer, the output layer, and, importantly, from the conflict monitoring layer. Demand units send activation (positive or negative) to the conflict monitoring layer and to the output layer. The total input to each task-demand unit, $I_{td}$, is:
$$I_{td} = S_{tdc} + \sum_{c} \alpha_c \omega_c + \sum_{o} \alpha_o \omega_o + \beta_{td} \tag{1}$$
where $S_{tdc}$ is the strength of the top-down control connection, $\alpha_c$ and $\alpha_o$ are the current activations of the sending units at the conflict and output layers, $\omega_c$ and $\omega_o$ are the weights of these connections, and $\beta_{td}$ is a constant negative bias. Note that if a connected conflict unit's activity is below zero, it does not send any activation to the demand layer units.
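The input to a task-demand unit can be sketched as follows, assuming that the listed terms combine additively (a standard connectionist net-input rule, assumed here rather than taken from the original implementation). The function name and all numerical weights, activations, and bias values in the usage lines are hypothetical and chosen only for illustration.

```python
def task_demand_input(s_tdc, conflict_acts, conflict_weights,
                      output_acts, output_weights, bias_td):
    """Net input to one task-demand unit under an assumed additive rule.

    s_tdc            -- top-down control input (0.0 if the task is not cued)
    conflict_acts    -- activations of the connected conflict units
    conflict_weights -- weights of those connections (inhibitory, so negative)
    output_acts      -- activations of the output units
    output_weights   -- weights of those connections
    bias_td          -- constant negative bias
    """
    # Conflict units below zero send no activation to the demand layer.
    conflict_term = sum(max(act, 0.0) * w
                        for act, w in zip(conflict_acts, conflict_weights))
    output_term = sum(act * w for act, w in zip(output_acts, output_weights))
    return s_tdc + conflict_term + output_term + bias_td

# Hypothetical values: one conflict unit is inactive (below zero), one active.
net = task_demand_input(
    s_tdc=1.0,
    conflict_acts=[-0.5, 0.2],
    conflict_weights=[-1.0, -1.0],
    output_acts=[0.3, 0.0, 0.0],
    output_weights=[1.0, -1.0, -1.0],
    bias_td=-0.5,
)
print(round(net, 3))
```

Only the active conflict unit contributes inhibition in this example; the conflict unit with negative activation is gated out, as described in the text.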

Conflict Monitoring Layer
There are three units in the conflict layer. Each unit receives activation from two task-demand units, and sends inhibitory activation back to these units (but only if the activation of the conflict unit is above zero). Each conflict unit can thus be conceptualised as monitoring the co-activation of two units in the demand layer. When both units in the demand layer are active, the conflict unit will become active and send inhibitory signals to both demand units. Note that units in the conflict layer only interact with demand units.
The input activation to each conflict unit is calculated from $\alpha_1$ and $\alpha_2$, the activation levels of the two demand units being monitored, together with a constant negative bias, $\beta_c$, and a gain parameter, $\gamma_c$.

Output Layer
There is one output unit per task. Units in this layer have lateral inhibitory connections to other output units. In addition, each output unit receives excitatory activation from the demand unit associated with that task, and inhibitory input from the demand units unassociated with that task. Units in the output layer send excitatory activation to the unit in the demand layer associated with the task, and inhibition to the units in the demand layer unassociated with the task.
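The input activation to each output unit, $I_o$, is calculated as:
$$I_o = \sum_{td} \alpha_{td} \omega_{td} + \sum_{o} \alpha_o \omega_o + \beta_o$$
where $\alpha_{td}$ and $\alpha_o$ are the activation levels of the sending units in the task-demand and output layers, and $\omega_{td}$ and $\omega_o$ are the connection weights. $\beta_o$ is a constant negative bias.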

Activation Updating
Each unit's activation is updated iteratively over a number of cycles. On each cycle, the total input to each unit, $I$, is calculated. The change in activation level for the current cycle $i$, $\Delta\alpha_i$, is given by
$$\Delta\alpha_i = \begin{cases} \sigma I (\alpha_{max} - \alpha_i) & \text{if } I \geq 0 \\ \sigma I (\alpha_i - \alpha_{min}) & \text{if } I < 0 \end{cases}$$
where $\alpha_i$ is the unit's current activation, and $\alpha_{max}$ and $\alpha_{min}$ are the unit's maximum and minimum activation values (clipped to 1 and -1, respectively). $\sigma$ is a step-size parameter (fixed at 0.0015). On each cycle, noise is added to the unit's activation. The noise is a simulated draw from a normal distribution with the standard deviation controlled by a noise parameter (fixed at 0.006).
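The updating step described above can be sketched as follows. The sketch assumes the interactive-activation rule used by Gilbert and Shallice (2002), in which positive input drives the unit towards its maximum and negative input towards its minimum; this form, the function name, and the choice to clip after adding noise are assumptions. The step size, noise standard deviation, and activation bounds are the values stated in the text.

```python
import random

SIGMA = 0.0015      # step-size parameter (from the text)
NOISE_SD = 0.006    # noise standard deviation (from the text)
ALPHA_MAX = 1.0     # maximum activation (from the text)
ALPHA_MIN = -1.0    # minimum activation (from the text)

def update_activation(alpha, net_input, rng, step=SIGMA, noise_sd=NOISE_SD):
    """Return a unit's activation after one cycle (assumed update rule)."""
    if net_input >= 0:
        # Positive input: move towards the maximum activation.
        delta = step * net_input * (ALPHA_MAX - alpha)
    else:
        # Negative input: move towards the minimum activation.
        delta = step * net_input * (alpha - ALPHA_MIN)
    noisy = alpha + delta + rng.gauss(0.0, noise_sd)
    # Keep the activation within its bounds (assumed to apply after noise).
    return max(ALPHA_MIN, min(ALPHA_MAX, noisy))

rng = random.Random(2017)
alpha = 0.0
for _ in range(100):
    alpha = update_activation(alpha, net_input=1.0, rng=rng)
print(ALPHA_MIN <= alpha <= ALPHA_MAX)
```

Because the change in activation is scaled by the distance to the relevant bound, activation approaches its limits asymptotically rather than overshooting them.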

Trial Dynamics
At the beginning of each trial, a task is randomly selected with the constraint that no immediate task repetitions can occur. The relevant top-down control unit is then activated, taking on a value which changes with practice (increasing in equal steps from 1.0 to 1.8 across five sessions of practice). Activation then propagates through the model on each cycle, given the updating dynamics stated above. The model is considered to have selected a response when the activity of a unit in the output layer exceeds the activation of the next-highest-active unit by a given response threshold (fixed at 0.15). The number of cycles taken to pass this response threshold is the model's simulated response time, and the accuracy of the model is dictated by whether the selected output unit matches the cued task.
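Two pieces of the trial logic described above can be sketched directly: selecting the next task without immediate repetitions, and checking the response criterion. The function names and the example activation values are hypothetical; the response threshold of 0.15 is taken from the text.

```python
import random

TASKS = ("A", "B", "C")
RESPONSE_THRESHOLD = 0.15  # from the text

def next_task(previous, rng):
    """Randomly select a task that differs from the previous trial's task."""
    return rng.choice([task for task in TASKS if task != previous])

def selected_response(output_acts, threshold=RESPONSE_THRESHOLD):
    """Return the task whose output unit exceeds the next-highest unit by
    more than the threshold, or None if no response has been selected yet."""
    ranked = sorted(output_acts.items(), key=lambda item: item[1], reverse=True)
    (best_task, best_act), (_, runner_up_act) = ranked[0], ranked[1]
    return best_task if best_act - runner_up_act > threshold else None

rng = random.Random(42)
sequence = [rng.choice(TASKS)]
for _ in range(999):
    sequence.append(next_task(sequence[-1], rng))

# Hypothetical output-layer states on two different cycles.
print(selected_response({"A": 0.30, "B": 0.20, "C": 0.10}))  # lead too small
print(selected_response({"A": 0.40, "B": 0.20, "C": 0.10}))  # criterion met
```

In a full simulation the criterion would be checked on every cycle, with the cycle count at the first non-None result taken as the simulated response time.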
At the end of each trial, activation in the top-down control units and the output units is reset to zero. The activity of the units in the conflict and task-demand layers is squashed, and this residual activation (positive or negative) then persists into the next trial. Units in the conflict layer are squashed by 50%, and units in the demand layer are squashed by 80%.
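The between-trial carry-over can be sketched as follows. The sketch reads "squashed by X%" literally, as removing X% of a unit's activation; if the intended meaning is instead that X% is retained, the proportions passed to `carry_over` would need to change accordingly. The function names and the state layout (a dictionary of layers) are hypothetical.

```python
def carry_over(activations, proportion_removed):
    """Scale activations down, removing the given proportion (assumed reading)."""
    return {unit: act * (1.0 - proportion_removed)
            for unit, act in activations.items()}

def end_of_trial(state):
    """Return the network state that persists into the next trial."""
    return {
        # Top-down control and output units are reset to zero.
        "top_down": {unit: 0.0 for unit in state["top_down"]},
        "output": {unit: 0.0 for unit in state["output"]},
        # Conflict and demand units keep a residue (positive or negative).
        "conflict": carry_over(state["conflict"], 0.50),
        "demand": carry_over(state["demand"], 0.80),
    }

# Hypothetical end-of-trial state after performing task A.
state = {
    "top_down": {"A": 1.0, "B": 0.0, "C": 0.0},
    "output": {"A": 0.6, "B": -0.1, "C": -0.1},
    "conflict": {"AB": 0.4, "AC": -0.2, "BC": -0.4},
    "demand": {"A": 0.5, "B": -0.5, "C": -0.3},
}
next_state = end_of_trial(state)
print(next_state["demand"])
```

The sign of each residue is preserved, so a demand unit that ended the trial suppressed starts the next trial below zero.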

Simulation Details
We simulated five sessions of "practice". The model was the same in each session with the exception of the input to the top-down control units, which increased linearly with practice. For each practice session, we simulated 100 participants, each performing 1,000 trials. Below we report all parameters and their values used in the simulation. All parameters were the same as in Simulation 1 of Sexton and Cooper (2017), with the exception of the input to the top-down control units, which increased with practice session.
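As an illustration of the simulation design, the sketch below generates the top-down control input for each session and computes an n-2 task repetition cost from a sequence of trials (mean response time on ABA sequences minus mean response time on CBA sequences). The function names and the short example data are hypothetical; the session values follow the linear schedule stated above.

```python
from statistics import mean

def top_down_input(session, start=1.0, step=0.2):
    """Top-down control input for practice sessions 1 to 5 (linear schedule)."""
    return round(start + step * (session - 1), 1)

def n2_repetition_cost(tasks, rts):
    """Mean RT for ABA sequences minus mean RT for CBA sequences."""
    aba, cba = [], []
    for n in range(2, len(tasks)):
        # Ignore triplets containing an immediate task repetition.
        if tasks[n] == tasks[n - 1] or tasks[n - 1] == tasks[n - 2]:
            continue
        if tasks[n] == tasks[n - 2]:
            aba.append(rts[n])
        else:
            cba.append(rts[n])
    return mean(aba) - mean(cba)

schedule = [top_down_input(session) for session in range(1, 6)]
print(schedule)

# Hypothetical trial sequence with response times in model cycles.
tasks = ["A", "B", "A", "B", "C", "A"]
rts = [0, 0, 120, 110, 100, 90]
print(n2_repetition_cost(tasks, rts))
```

In the example, the two ABA trials have a mean of 115 cycles and the two CBA trials a mean of 95 cycles, giving a cost of 20 cycles.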