A well-known effect in multisensory perception is that congruent information received by different senses usually leads to faster and more accurate responses. Less well understood are trial-by-trial interactions, whereby the multisensory composition of stimuli experienced during previous trials shapes performance during a subsequent trial. We here exploit the analogy of multisensory paradigms with classical flanker tasks to investigate the neural correlates underlying trial-by-trial interactions of multisensory congruency. Studying an audio-visual motion task, we demonstrate that congruency benefits for accuracy and reaction times are reduced following an audio-visual incongruent compared to a congruent preceding trial. Using single trial analysis of motion-sensitive EEG components we then localize current-trial and serial interaction effects within distinct brain regions: while the multisensory congruency experienced during the current trial influences the encoding of task-relevant information in sensory-specific brain regions, the serial interaction arises from task-relevant processes within the inferior frontal lobe. These results highlight parallels between multisensory paradigms and classical flanker tasks and demonstrate a role of amodal association cortices in shaping perception based on the history of multisensory congruency.