Within the Bayesian brain framework, temporality is usually taken for granted: causes are assumed to precede their effects, and the brain is supposed to predict forward in time, tracking the arrow of time from past to future. Yet nothing in the core mathematics of predictive processing strictly requires this asymmetry. Generative models specify probabilistic dependencies among variables, and Bayesian inversion recovers the most likely hidden causes given observed data. These dependencies can be arranged along a temporal dimension, but the resulting inference can in principle be performed in either direction: from past to future, from future to past, or across entire temporal windows treated as single objects of neural inference.
To see this, consider a simple temporal generative model in which hidden states at time t cause observations at time t and also influence hidden states at t+1. Standard formulations let the model evolve forward: states propagate, predictions are generated, and prediction errors update beliefs as new data arrive. However, once the full sequence of data over an interval is available, Bayesian inversion can be applied retrospectively to the entire time series, revising beliefs about states at earlier times in light of later observations. The math does not distinguish whether inference is carried out "online" in a forward direction or "offline" in a backward direction; it only prescribes how to combine likelihoods and priors to obtain posteriors. This opens the conceptual space for reversing temporality in Bayesian brain models without introducing physical retrocausality.
In practice, this means that neural systems encoded as Bayesian message-passing networks can be interpreted as flowing information both forward and backward in time. Forward messages convey predictions based on priors and currently inferred states, while backward messages carry information about how later evidence should change beliefs about earlier states. Computationally, this looks like smoothing rather than filtering: instead of merely predicting the next moment, the system revises the entire trajectory of inferred causes over a temporal window. When instantiated neurally, such smoothing can manifest as late-arriving signals that reshape the representation of events that are already "in the past" from the vantage of ongoing processing.
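The filtering-versus-smoothing contrast can be made concrete with a toy hidden Markov model. This is a minimal sketch with illustrative numbers, not a model of any specific neural circuit: forward messages alone yield filtered beliefs p(x_t | y_1..t), while combining them with backward messages yields smoothed beliefs p(x_t | y_1..T), in which later observations revise the belief about the very first state.

```python
# Minimal sketch: filtering vs. smoothing in a two-state hidden Markov model.
# All parameter values here are illustrative assumptions.
import numpy as np

A = np.array([[0.9, 0.1],   # transition model p(x_{t+1} | x_t)
              [0.1, 0.9]])
B = np.array([[0.8, 0.2],   # likelihood p(y_t | x_t): rows = states, cols = observations
              [0.2, 0.8]])
prior = np.array([0.5, 0.5])

def filter_and_smooth(obs):
    """Return filtered p(x_t | y_1..t) and smoothed p(x_t | y_1..T)."""
    T = len(obs)
    alpha = np.zeros((T, 2))            # normalized forward messages
    f = prior * B[:, obs[0]]
    alpha[0] = f / f.sum()
    for t in range(1, T):
        f = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] = f / f.sum()
    beta = np.ones((T, 2))              # backward messages p(y_{t+1:T} | x_t)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta                # smoothed posterior, up to per-t normalization
    gamma = gamma / gamma.sum(axis=1, keepdims=True)
    return alpha, gamma

# Ambiguous early evidence, consistent later evidence:
filtered, smoothed = filter_and_smooth([0, 1, 1, 1])
# The filtered belief at t=0 favors state 0, because only y_0 is available.
# The smoothed belief at t=0 flips toward state 1 once later data are folded in.
print(filtered[0], smoothed[0])
```

The backward pass adds nothing physically retrocausal: it is an ordinary computation over stored evidence, yet its output is a revised estimate of the past.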
Reversing temporality in this sense does not require positing that future events literally cause past neural states. Rather, it suggests that the brain's generative model treats past and future observations as jointly constraining a coherent latent history. Forward and backward passes through this model operate over an abstract temporal scaffold that can be traversed in either direction during inference. The observed arrow of time then reflects constraints from physiology, resource limitations, and environmental regularities, rather than a fundamental restriction in the formalism itself. Under ideal conditions with sufficient time and data, the same model that predicts forward could equally well revise interpretations backward, effectively dissolving any strict computational boundary between past and future.
One way to formalize this reversal is to treat entire sequences of sensory input as single probabilistic objects, with latent trajectories serving as their hidden causes. Hierarchical generative models can then assign priors over these trajectories that do not privilege a particular temporal direction. For example, a trajectory prior might encode that certain patterns of change are likely (smooth motion, rhythmic cycles, or consistent causal chains) without specifying whether inference must proceed from earlier to later timepoints. When the brain inverts such a model, it effectively searches over possible histories that could have produced the observed sequence, adjusting both early and late parts of the trajectory in response to any new evidence anywhere in the sequence.
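A direction-neutral trajectory prior can be sketched with a quadratic smoothness penalty on second differences, which is invariant under time reversal. The precisions (`obs_prec`, `smooth_prec`) below are arbitrary illustrative values; the point is only that changing one late observation shifts the MAP estimate of the earliest latent state.

```python
# Sketch: a temporally symmetric smoothness prior over a whole latent
# trajectory. The second-difference penalty is unchanged if the sequence
# is reversed, so the prior does not privilege a temporal direction.
import numpy as np

def map_trajectory(y, obs_prec=1.0, smooth_prec=10.0):
    """MAP trajectory under y_t = x_t + noise plus a smoothness prior on x."""
    y = np.asarray(y, float)
    T = len(y)
    D = np.zeros((T - 2, T))            # second-difference operator
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    # Posterior precision = observation precision + trajectory-prior precision.
    P = obs_prec * np.eye(T) + smooth_prec * (D.T @ D)
    return np.linalg.solve(P, obs_prec * y)

y_base = [0.0, 0.1, 0.0, 0.1, 0.0]
y_late = [0.0, 0.1, 0.0, 0.1, 2.0]      # only the final observation differs
x_base = map_trajectory(y_base)
x_late = map_trajectory(y_late)
# Because the prior couples the entire trajectory, the surprising final
# observation also changes the inferred state at t=0:
print(x_base[0], x_late[0])
```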
Empirically, perception often appears to be anchored to the immediate present, but many phenomena hint at temporally extended inference that retroactively restructures what has just been experienced. Postdictive perceptual effects, where later stimuli alter the perceived timing or identity of earlier stimuli, are naturally expressed in a Bayesian framework that allows inference to run backward over short time windows. The brain can initially form a provisional interpretation of early input and subsequently refine it when later evidence arrives, effectively performing local reversals of temporality at the level of inference, even though physical events themselves remain ordered in time.
Neurally, reversing temporality can be mapped onto recurrent and feedback architectures that allow information to circulate across multiple timescales. Fast feedforward sweeps may implement initial, forward-looking predictions, while slower feedback and lateral interactions perform revisions that reach back into the neural representations of moments that have just passed. Within a predictive processing scheme, this corresponds to iteratively relaxing prediction errors not only for the present moment but also for preceding states encoded in working memory or short-term temporal buffers. The result is that what counts as the "state of the past" in the brain is not fixed; it is updated in light of new data, as though inference were flowing upstream in time.
This perspective allows the arrow of time, at the level of mental events, to emerge from the interplay between reversible Bayesian computations and irreversible biophysical dynamics. While the underlying probabilistic relations among causes and effects may be temporally symmetric, neural implementation is constrained by metabolic costs, noise, and limited capacity. These constraints mean that full backward inference over long intervals is rare or impossible; instead, the system settles for partial temporal reversal over short windows, prioritizing segments of the recent past that are most behaviorally relevant. Still, the important conceptual shift is that temporality in Bayesian brain models becomes a variable to be explained and constrained, rather than a primitive assumption built into the formalism.
From this vantage, reversing temporality is not an exotic add-on but an intrinsic possibility of any sufficiently rich generative model over time. When sensory data are treated as evidence about structured sequences rather than isolated snapshots, future observations necessarily bear on beliefs about earlier states. Inference that respects this structure will, to some extent, run backward, retrofitting the brain's internal narrative to accommodate new information. The apparent fixity of the past in experience may thus reflect the stabilization of these backward revisions rather than the absence of temporally reversed computation.
Bidirectional inference in hierarchical generative architectures
If hierarchical generative models are to accommodate reversals of temporality in neural inference, they must support information flow both up and down the hierarchy and both "forward" and "backward" across implicit temporal dimensions. In predictive processing, this is already partly the case: ascending signals carry prediction errors, and descending signals convey predictions shaped by higher-level priors. What is less often emphasized is that these same architectures can be interpreted as implementing bidirectional inference over time, where higher levels encode temporally extended patterns and their feedback reshapes the inferred trajectory of lower-level states, including those representing the immediate past.
In such architectures, each layer encodes hidden causes that unfold over a characteristic timescale. Lower levels track rapidly changing sensory details; higher levels capture slower regularities, such as object identity, action plans, or contextual structure. Crucially, these higher levels are not locked to a single moment in time. Instead, they summarize patterns across windows of input, effectively representing abstract "temporal chunks." When inference updates these higher-level states, their revised predictions propagate downward and alter the inferred sequence of states at lower levels, as if the model were rewriting its own short-term history to better fit a newly discovered pattern.
A concrete way to understand this is to consider that each hierarchical level effectively runs its own miniature temporal model. A level might encode transitions among its own hidden states (for example, phonemes evolving into syllables, or postures evolving into goal-directed actions) and use these to predict the expected pattern of lower-level activity. When incoming data violate these expectations, prediction errors travel upward, revising beliefs about the sequence of higher-level states. Once revised, these states send new predictions downward that retroactively alter which lower-level configurations are now deemed most probable at earlier moments within the same temporal segment. In this way, temporal "smoothing" emerges naturally from hierarchical message passing.
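The phoneme example can be sketched as a tiny two-level model. The "words" ("pat", "bad"), their phoneme templates, and the likelihood numbers are all invented for illustration: an ascending pass scores whole-sequence hypotheses, and a descending pass mixes their predictions to revise the belief about the ambiguous first segment.

```python
# Sketch: hierarchical temporal inference. A high-level cause (a "word")
# predicts a whole sequence of low-level states (its "phonemes").
# Templates and probabilities are hypothetical toy values.
from math import prod

templates = {"pat": ["p", "a", "t"], "bad": ["b", "a", "d"]}

def likelihood(obs_probs, template):
    """p(observed acoustics | word) = product over time of p(y_t | phoneme_t)."""
    return prod(obs_probs[t][ph] for t, ph in enumerate(template))

def infer(obs_probs):
    # Ascending pass: posterior over high-level, temporally extended hypotheses
    # (uniform prior over words assumed).
    post = {w: likelihood(obs_probs, tpl) for w, tpl in templates.items()}
    z = sum(post.values())
    post = {w: p / z for w, p in post.items()}
    # Descending pass: revised belief about the *first* segment, mixing each
    # template's prediction by the high-level posterior.
    first = {}
    for w, p in post.items():
        ph = templates[w][0]
        first[ph] = first.get(ph, 0.0) + p
    return post, first

obs_probs = [
    {"p": 0.5, "b": 0.5, "a": 0.0, "t": 0.0, "d": 0.0},   # ambiguous onset
    {"p": 0.0, "b": 0.0, "a": 0.9, "t": 0.05, "d": 0.05},
    {"p": 0.0, "b": 0.0, "a": 0.0, "t": 0.9, "d": 0.1},   # clear final "t"
]
post, first = infer(obs_probs)
# Bottom-up, the onset was 50/50 between "p" and "b"; the clear final segment
# favors "pat", and the descending pass retroactively resolves the onset.
print(post, first)
```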
This smoothing is not just a post hoc adjustment but an integral part of real-time operation, implemented through iterative cycles of inference. Initial feedforward sweeps approximate a forward-only pass, providing a quick, coarse estimate of the most likely unfolding sequence. Subsequent recurrent and feedback exchanges perform more refined, bidirectional inference, altering beliefs about both near-future expectations and the just-past trajectory. At any given moment, the system's representation of what has happened and what will happen is jointly determined, with neither direction granted absolute priority in the underlying probabilistic computations.
When the arrow of time is treated as an emergent property rather than a fixed constraint, hierarchical architectures gain additional expressive power. Higher levels can encode priors over whole patterns that are temporally symmetric or weakly directional (for instance, "a rhythmic alternation" or "a reversible transformation") and let the actual sequence direction be inferred from context. Lower levels will then be recruited to disambiguate directionality based on fine-grained cues, such as motion energy, causal consistency, or learned regularities of the environment. This division of labor allows the system to treat direction as another hidden variable to be inferred, rather than a rigid backdrop.
Bidirectional inference is especially natural when generative models represent events in relational rather than strictly sequential terms. A high-level cause might encode that "object A occludes object B," or that "a contact event occurs between two bodies," without constraining whether the model is presently inferring antecedent conditions or consequent outcomes. From the standpoint of Bayesian inference, both directions correspond to the same joint distribution over causes and effects. The hierarchical structure then decomposes this joint distribution into layers: contextual priors at the top, event templates in the middle, and sensory details at the bottom. Belief updates propagate throughout this structure, allowing late-arriving evidence about an effect to reshape beliefs about its putative cause.
Recurrent connectivity plays a central role in physically realizing this bidirectional architecture. Local recurrent loops within a level allow that level to explore alternative temporal segmentations or orderings of its states, while long-range feedback connections transmit revised beliefs to lower levels. Because these loops can run for multiple cycles within the timeframe of a single perceptual event, the architecture can effectively simulate multiple passes over the same "mental timeline," testing different alignments of causes and observations until prediction errors are minimized. This yields a form of dynamic "retrodictive coding," in which inferences about earlier states are continuously refined in parallel with forward-looking predictions.
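The iterative relaxation described here can be sketched as gradient descent on the total prediction error over a stored temporal window. Precisions, learning rate, and step counts below are arbitrary assumptions; the point is that each recurrent cycle updates past state estimates alongside the present one.

```python
# Sketch: recurrent relaxation of prediction errors over a short temporal
# buffer, in the spirit of predictive coding. Each iteration reduces the sum
# of squared sensory and transition errors across the whole window, so a
# surprising late observation drags earlier state estimates with it.
import numpy as np

def relax(y, pi_obs=1.0, pi_trans=4.0, steps=500, lr=0.05):
    """Gradient descent on E = pi_obs*sum((x-y)^2)/2 + pi_trans*sum(diff(x)^2)/2."""
    y = np.asarray(y, float)
    x = y.copy()                          # initialize latents at the observations
    for _ in range(steps):
        e_obs = pi_obs * (x - y)          # sensory prediction errors
        e_trans = pi_trans * np.diff(x)   # transition prediction errors
        grad = e_obs
        grad[:-1] -= e_trans              # each error adjusts the state before it...
        grad[1:] += e_trans               # ...and the state after it
        x = x - lr * grad
    return x

x = relax([0.0, 0.0, 0.0, 3.0])           # surprising final observation
# After relaxation, the estimates of the *earlier* states have been pulled
# upward: a backward revision accomplished by ordinary forward-running dynamics.
print(x)
```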
Importantly, bidirectional inference in hierarchical generative models does not entail retrocausality at the physical level. Neurons still operate under standard biophysics; spikes propagate forward in real time. The apparent "backward influence" arises because higher-level populations encode summary statistics over temporally extended input and send predictions that redraw the boundaries and contents of lower-level temporal representations. What was previously categorized as one event may be split into two; what was treated as noise may be reinterpreted as a leading indicator. From the perspective of the model, these revisions amount to altering its reconstructed past trajectory to improve coherence with the whole pattern of observed data.
Different levels of the hierarchy can thus be seen as specializing in different forms of temporal inference. Sensory levels approximate near-online filtering, maintaining a rapidly updating estimate of the current state based on recent inputs. Intermediate levels emphasize local smoothing, revising short sequences in light of subsequent context. Higher levels perform global reconstruction, integrating evidence over much longer horizons: minutes, hours, or even learned statistical regularities across a lifetime. Together, these levels jointly implement a multi-scale, bidirectional inference engine that constantly rebalances its depiction of past, present, and anticipated future.
This multi-scale arrangement has direct implications for how prediction errors are handled. At lower levels, errors primarily signal moment-to-moment discrepancies between predicted and received sensory data. At higher levels, errors reflect mismatches between expected temporal patterns and the evolving interpretation of the situation. Because prediction errors are recursively minimized across the hierarchy, a shift in high-level interpretation can cascade downward, reducing many locally puzzling errors at once by reassigning them to a different global pattern. This is functionally equivalent to reinterpreting an entire segment of the temporal stream under a new narrative constraint.
From the standpoint of predictive processing, then, bidirectional inference is not an optional add-on but a natural outcome of hierarchical design. The brain can use early, fast passes through the hierarchy to provide a tentative, forward-tracking interpretation of sensory input, while slower, recurrent passes integrate additional evidence and reconfigure both earlier and later parts of the inferred sequence. The arrow of time in conscious experience may track the stabilized result of this ongoing negotiation between levels, rather than the raw ordering of sensory hits at the periphery. The architecture thereby supports a flexible, context-sensitive temporal organization of perception and cognition, grounded in the same generative principles that underwrite Bayesian brain models in general.
Temporal symmetry and precision-weighted prediction errors
Temporal symmetry in generative models becomes concrete in predictive processing once precision enters the story. Prediction errors are not all treated equally; they are weighted by their estimated reliability. This precision-weighting determines how strongly each error can drive updates to beliefs about hidden states. When temporal relations are symmetric at the level of the model, so that past and future data jointly constrain latent trajectories, precision becomes the chief mechanism by which an effective arrow of time is imposed on otherwise reversible neural inference.
Within a standard formulation, precision is the inverse variance of expected noise associated with a particular signal. At any moment, the system maintains beliefs not only about what is likely to be the case but also about how much trust to place in bottom-up sensory evidence and top-down predictions. When sequence data are considered, each point in time can in principle contribute prediction errors that update states across the whole window. Yet which direction these updates preferentially run (whether later evidence strongly revises the inferred past, or earlier evidence dominantly shapes expectations of the future) depends on the relative precision assigned to error signals anchored at different temporal positions and at different hierarchical levels.
Consider a temporal generative model that produces a trajectory of hidden states and corresponding observations. When inferring this trajectory, prediction errors at every time step could be used to refine beliefs about the entire path of latent states. However, if the model assigns higher precision to errors closer to the present and lower precision to those associated with more temporally distant observations, then updates will be biased toward explanations that prioritize recently encountered data. The same mathematical structure that would otherwise support temporally symmetric smoothing now yields an effectively forward-directed process, simply because near-present errors are granted more inferential authority than those arising from the more remote past or future.
This selectivity can be described as a temporal precision profile: a distribution of expected reliability over time. When this profile is symmetric around a central point in a temporal window, the model will treat early and late data as roughly equally informative. When it is skewed, inference becomes directionally biased. For instance, in a brief sensory integration window, the system might initially treat later input as more reliable than earlier input (favoring postdiction and retroactive revision of just-past events) because later evidence resolves ambiguities that were invisible at the start of the sequence. Over longer scales, the same system may treat early contextual cues as highly precise and subsequent fluctuations as mostly noise, effectively anchoring interpretation in the distant past and restricting the degree of post hoc revision.
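A temporal precision profile can be illustrated with the simplest possible case: estimating a latent quantity assumed constant over a short window, where each observation's prediction error is weighted by a precision that depends on its position in the window. The specific weight values are illustrative assumptions.

```python
# Sketch: how a temporal precision profile biases inference. Under Gaussian
# assumptions, the posterior mean over a constant latent state is the
# precision-weighted average of the observations in the window.
import numpy as np

def precision_weighted_estimate(y, profile):
    """Posterior mean of a constant latent given observations y and precisions."""
    w = np.asarray(profile, float)
    y = np.asarray(y, float)
    return float((w * y).sum() / w.sum())

y = [0.0, 0.0, 0.0, 1.0, 1.0]            # evidence shifts late in the window
flat = [1.0, 1.0, 1.0, 1.0, 1.0]         # temporally symmetric profile
recent = [0.1, 0.2, 0.4, 0.8, 1.6]       # precision peaked at the present

est_flat = precision_weighted_estimate(y, flat)      # early and late count alike
est_recent = precision_weighted_estimate(y, recent)  # dominated by recent data
print(est_flat, est_recent)
```

The same observations yield different beliefs under the two profiles: the symmetric profile averages across the whole window, while the skewed profile lets the late evidence dominate, which is the directional bias described above.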
Precision is also distributed across the hierarchy. Lower-level sensory prediction errors often reflect relatively stable noise characteristics of receptors and early sensory pathways; their precision profiles may focus tightly on the near-present and decay quickly over time. Higher-level errors, in contrast, gauge the fit between extended temporal patterns and overarching contextual expectations. Their precision can remain elevated over longer intervals, supporting inferences that span seconds, minutes, or more. As a result, high-level errors arising from later parts of a sequence can still exercise substantial influence over beliefs about earlier parts, especially when the model expects that an entire pattern should cohere according to strong, high-precision priors.
When such high-precision, long-horizon error signals are strong, the system effectively allows later evidence to reshape the inferred past. This is a form of temporally extended smoothing implemented through precision modulation: post hoc re-evaluation of earlier states is permitted because the model regards pattern-level discrepancies as more trustworthy than local, time-locked sensory deviations. In contrast, when higher-level precisions are weak or diffuse, the hierarchy will lean more heavily on fast, local errors. In that regime, inference looks more like online filtering, with minimal backward revision and a relatively rigid arrow of time.
The dynamics of precision-weighting provide a way to reconcile reversible generative models with the irreversibility of neural implementation. Biophysical processes unfold in one temporal direction; spikes cannot be sent backward in time, and synaptic changes accumulate in a path-dependent way. Yet within these constraints, recurrent networks can re-enter and re-weight representations of earlier moments stored in working memory or short-term synaptic traces. Precision signals, often modeled as gain or modulatory factors on neuronal populations encoding prediction errors, decide how forcefully these re-engaged representations can be altered. In this sense, the apparent malleability of the immediate past in experience reflects the precision structure of feedback-driven revisions rather than any genuine retrocausality.
A useful analogy is to think of precision as setting a temporal "viscosity" of inference. When precision is high and evenly distributed across a short temporal window, the trajectory of inferred states is fluid within that window: new evidence can easily reconfigure the entire segment, irrespective of whether it comes earlier or later in the physical sequence. When precision is sharply peaked at the current moment and decays rapidly, the past becomes viscous or even rigid. Updates will mostly adjust the present state estimate and propagate forward into expectations about the near future, leaving prior states largely untouched. Both regimes are compatible with the same underlying generative model; it is precision that determines how freely inference can move backward as well as forward in subjective time.
This framing sheds new light on classic predictive processing notions of attention and uncertainty. Attention has been proposed to correspond to the flexible allocation of precision to certain prediction errors, enhancing their impact on belief updating. If attention can be deployed not only across space and content but also across time, then the system can selectively amplify errors stemming from particular segments of a temporal stream. Temporally targeted attention, such as focusing on the onset of an event or on its expected outcome, amounts to sculpting the temporal precision profile, thereby determining where along the sequence inference is anchored and how much backward revision is possible.
For example, in tasks where the outcome of a sequence is behaviorally crucial, attention may increase the precision of prediction errors associated with later timepoints, making them powerful drivers of inference. This will favor postdictive phenomena: the brain defers commitment to an interpretation until outcome-relevant information arrives, then uses that information to re-interpret earlier, ambiguous sensory fragments. In contrast, during rapid interactions that demand immediate action, attention may amplify early prediction errors, effectively front-loading precision. In such cases, the system locks onto an interpretation quickly, tolerating later mismatches rather than engaging in extensive retroactive correction.
Temporal asymmetries in precision also interact with learning. Across development and through repeated exposure, the system learns the typical reliability structure of its environment. Some domains may furnish early, highly predictive cues (for example, preparatory gestures before an action), whereas others are only resolvable by late-arriving information (for example, the final configuration of an ambiguous figure). The brain can internalize these regularities as priors over the expected precision of prediction errors at different temporal stages. Over time, this yields domain-specific temporal signatures of inference: some kinds of events are processed in a strongly feedforward, anticipatory fashion, while others are routinely left open until later evidence arrives and the past is reconfigured.
These considerations extend to the neural encoding of causality itself. In a strictly forward, filtering-centric view, causal relations are inferred by extrapolating from early causes to later effects, and mismatches are corrected primarily by adjusting beliefs about present and future states. When precision allows for robust backward updates, however, causal inference acquires a more holistic character. Beliefs about both causes and effects are jointly tuned to minimize prediction error across entire temporal segments. From the standpoint of the Bayesian brain, the causal arrow of time is thus softly enforced by the emergent precision structure overlaying a fundamentally symmetric probabilistic substrate.
Even at the level of thermodynamic constraints, precision offers a useful lens. Information processing is tied to entropy production and metabolic cost. Full temporal smoothing, where every data point can strongly update every other, would be energetically expensive, as it requires extensive recurrent activity and widespread reconfiguration of neural states. Precision-weighting acts as an economical compromise: only those errors deemed sufficiently reliable and behaviorally relevant are given enough gain to justify the metabolic cost of reworking prior states. In effect, the brain leverages precision to limit the scope of backward inference in a way that keeps energetic expenditure within viable bounds, aligning the practical arrow of time in neural computation with physical and metabolic irreversibility.
In contexts where consciousness appears to integrate information over time, precision-weighted prediction errors may help determine which temporal segments are admitted into the coherent, globally broadcast contents of experience. If conscious access involves a form of global workspace or high-level integration process, then only those representations that survive and drive high-precision error minimization across the hierarchy are likely to be stabilized. Late-arriving evidence that produces large, precise prediction errors can retroactively alter which earlier states attain conscious visibility, as in postdictive illusions where the perceived timing or identity of an event is revised. Here again, temporal symmetry at the level of generative structure is filtered through a precision-controlled, asymmetric gate that determines the felt ordering and solidity of events.
In sum, precision-weighted prediction errors provide the primary currency through which temporally symmetric generative models are translated into temporally structured experience. By modulating which errors count, how strongly they count, and when they count, precision sculpts the effective directionality of inference. The same probabilistic machinery can implement anything from near-reversible smoothing over short windows to hard-edged, forward-only tracking, depending on how precision is distributed over time and across the predictive processing hierarchy. This makes the arrow of time in cognition an emergent, controllable parameter of neural inference, rather than a rigid, pre-specified boundary condition.
Empirical signatures of time-reversed processing in cognition
Empirical evidence for time-reversed processing in cognition comes mainly from situations where later events clearly influence how earlier events are experienced, without any change in the physical stimulus sequence. These phenomena provide behavioral and neural signatures of retroactive updating over short temporal windows, in ways that predictive processing can naturally accommodate. Rather than indicating literal retrocausality, they suggest that neural inference operates over temporally extended segments, allowing the brain to revise its reconstruction of the recent past once more information becomes available.
A paradigmatic case is the class of postdictive perceptual illusions, where a stimulus presented after a target changes the perceived attributes or even the detectability of that target. In the classic color phi phenomenon, two differently colored dots are flashed successively at distinct locations. Observers report a single dot moving between the positions whose color changes in mid-trajectory, even though there was never a moving stimulus. Crucially, the perceived change in color is experienced at an intermediate position that is not actually stimulated, and its timing appears to precede the second flash. A forward-only model of perception struggles here; a model in which the brain waits for both flashes before constructing a coherent motion trajectory, and then retrofits the intervening segment, makes the effect more intelligible. Bayesian smoothing over the temporal sequence naturally yields the impression that the "past" motion already contained the color transition implied by the later evidence.
Similar logic underwrites phenomena like the flash-lag and flash-drag effects. When a moving object and a briefly flashed static object are presented simultaneously, the moving object is perceived as ahead of the flash. One influential interpretation is that the visual system extrapolates the motion forward to compensate for processing delays. However, postdictive accounts show that perception can also be shifted by events that occur after the flash, suggesting that the brain uses a short integration window, within which later motion information helps reconstruct where the moving object "must have been" when the flash appeared. Empirically, when that window is manipulated (for example, by varying the timing and predictability of subsequent motion), the magnitude and direction of the flash-lag illusion change accordingly. This is precisely what one would expect if the system is performing time-symmetric smoothing constrained by a limited, precision-weighted temporal buffer.
Backward masking provides another robust example. A faint target presented briefly and followed within tens of milliseconds by a stronger "mask" stimulus often fails to reach awareness or is misperceived. Critically, the same target can be consciously seen when the mask arrives slightly later, even though all early sensory processing of the target is complete by that time. From a predictive processing perspective, the mask delivers high-precision evidence that reshapes the generative model's interpretation of what happened in the preceding instant. Neural signatures support this: early sensory components evoked by the target (for instance, in occipital cortex) are present regardless of masking, but later components associated with higher-level integration and conscious access are strongly modulated by the presence and timing of the mask. The mask thus retroactively determines whether the prior target-related activity is stabilized as part of the inferred scene or overwritten as noise.
Neurophysiological recordings during masking and related paradigms reveal a characteristic temporal pattern: an initial, relatively stereotyped feedforward sweep, followed by a more variable, context-sensitive feedback phase. The initial sweep carries information about the physical input, whereas the later phase encodes the modelās selected interpretation, shaped by subsequent stimuli. Changes in late feedback activity correlate with whether participants report seeing the target, even when early responses are nearly identical across conditions. This dissociation indicates that what becomes part of conscious perception depends on later, higher-level neural inference that reinterprets the immediate past in light of new evidence.
Temporal order judgments and apparent motion displays offer more subtle signatures of time-reversed processing. In certain stimulus regimes, participants misreport the order in which two stimuli occurred, or experience continuous motion between briefly flashed positions that are actually presented in the "wrong" order for such motion. Behavioral data show that perceived order is highly sensitive to contextual cues that arrive after both stimuli, such as an additional frame indicating a plausible motion path. When such cues support a particular narrative of what "must have happened," people's judgments of temporal order shift toward that narrative, even at the cost of contradicting the actual physical sequence. This is naturally explained if the brain treats temporal order as a latent variable to be inferred, jointly with other features of the scene, rather than as a fixed given. Late-arriving evidence with high precision can then prompt a reordering of events in the inferred timeline.
Neural measures during temporal order and apparent motion tasks reveal corresponding dynamics in distributed networks. Early event-related responses tend to encode local features with correct physical timing. Later components, especially those associated with fronto-parietal and higher visual areas, show patterns more aligned with the subject's reported perception than with the external sequence. Multivariate decoding often finds that, after a certain latency, neural activity reflects the perceived order or trajectory, even in trials where this perception is illusory. This dissociation between veridical early encoding and post hoc, perception-aligned later encoding suggests that a second stage of processing retrofits the "story" of what occurred, smoothing or reshaping the earlier representations stored in short-term buffers.
Another source of evidence comes from the integration of multisensory information. In audio-visual speech perception, for example, visual mouth movements often precede the corresponding sounds, yet observers experience a unified, temporally aligned speech event. Experiments manipulating asynchrony show that within a certain window, the brain flexibly shifts the perceived onset and duration of one modality to align with the other. Critically, auditory perception can be biased by visual information that arrives only slightly later than the sound, indicating that auditory representations of the recent past remain open to revision. Neuroimaging studies reveal that higher-order multisensory regions exhibit activity patterns suggesting that they encode a fused estimate of timing, which then feeds back to modality-specific cortices to adjust their temporal representations. This pattern again fits a model in which bidirectional neural inference reconfigures earlier modality-specific states to maintain a coherent cross-modal scene.
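The revisable alignment of auditory timing can be sketched as standard precision-weighted cue fusion under an assumed Gaussian model. The onset times and variances below are illustrative assumptions, not measured values.

```python
# Hedged sketch: fuse an auditory and a visual onset estimate (in ms) by
# weighting each with its precision (inverse variance). The fused estimate,
# formed only after both cues have arrived, revises the represented time
# of the earlier sound.

def fuse_onsets(mu_a, var_a, mu_v, var_v):
    """Precision-weighted average of two Gaussian onset estimates."""
    prec_a, prec_v = 1.0 / var_a, 1.0 / var_v
    mu = (prec_a * mu_a + prec_v * mu_v) / (prec_a + prec_v)
    return mu, prec_a + prec_v

# Sound represented at t = 0 ms (broad), lip movement at t = 40 ms (sharp).
fused_mu, fused_prec = fuse_onsets(mu_a=0.0, var_a=400.0, mu_v=40.0, var_v=100.0)

print(round(fused_mu, 1))  # 32.0: the sound's onset is pulled toward the later visual cue
```

Running the same update with the variances swapped pulls vision toward audition instead; the only general point is that the lower-variance cue dominates the fused timing estimate.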
Studies of the subjective "specious present" also point toward temporally extended inference. When participants judge the simultaneity of stimuli or the duration of brief intervals, their reports are systematically biased by events that occur just after the judged interval. For instance, the perceived duration of a target interval can be stretched or compressed by an unexpected event immediately following it. These aftereffects are hard to model if the judged interval is sealed off at its physical endpoint but become expected if the brain estimates durations over windows within which subsequent context still influences how the earlier interval is segmented and labeled. Electroencephalography and magnetoencephalography data frequently show that the neural correlates of such duration judgments peak well after the interval has elapsed, implying that the estimate is consolidated only after later evidence is incorporated.
Working memory tasks reveal related dynamics of retrospective updating. In "retro-cueing" paradigms, participants are asked to remember multiple items, and a cue presented after stimulus offset indicates which item will be probed. Behavioral performance improves markedly for cued items, often beyond what can be explained by mere protection from interference. Neural markers, such as alpha-band lateralization or item-specific patterns in sensory cortices, show that the retro-cue selectively enhances and sharpens the representation of the cued item, sometimes reactivating sensory-specific codes that had previously decayed. This suggests that the system reweights and refines memory traces of past stimuli based on later relevance signals, effectively re-editing the content of the recent past to match current goals and expectations.
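One way to picture the retro-cue effect is as a late precision boost applied to a decaying trace. This is a caricature: the (mean, precision) encoding, the decay rate, and the boost factor are all invented for illustration, not claims about the actual neural code.

```python
import math

# Hypothetical sketch: two memorized items decay at the same rate; a
# retro-cue arriving after stimulus offset multiplies the precision of
# the cued trace, so its later readout is sharper than the uncued one.

def decayed_precision(prec0, dt, rate=0.1):
    """Exponentially decaying trace precision after dt time units."""
    return prec0 * math.exp(-rate * dt)

precisions = {"left": 4.0, "right": 4.0}   # trace precisions at encoding
precisions = {k: decayed_precision(p, dt=5.0) for k, p in precisions.items()}

precisions["left"] *= 3.0   # retro-cue: "left will be probed"

print(precisions["left"] > precisions["right"])  # True: cued trace is sharper
```

The key structural feature is that the boost is applied after encoding, so the advantage of the cued item is produced by a signal arriving later than the stimulus it concerns.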
At a larger scale, narrative and event-segmentation studies provide a window into how the brain retrospectively structures extended experiences. When people watch movies or read stories, they spontaneously parse continuous streams into discrete events. The perceived boundaries between events are highly sensitive to information that occurs after candidate boundary points, such as the resolution of a sub-plot or a shift in character goals. Functional neuroimaging shows that high-level regions in the default mode and fronto-parietal networks reorganize their activity patterns at these narrative boundaries, and that representations of earlier scenes are recoded once their significance becomes clear. For instance, the hippocampus and medial prefrontal cortex show pattern similarity changes indicating that previously unrelated scenes are "bound together" once a connecting plot twist is revealed. This pattern is consistent with an ongoing re-interpretation of the narrative past in light of future developments, realized through slow, recurrent updating across distributed cortical networks.
Empirical work on expectation and surprise offers more direct ties to the underlying mechanics of predictive processing. When a later event violates a strong expectation established by an earlier cue, neural signatures of surprise, such as mismatch negativity or fronto-central prediction error responses, are often accompanied by changes in how the preceding cue is represented. Decoding analyses show that, following a strong violation, neural patterns associated with the original cue can shift toward representations consistent with a revised interpretation: what was previously encoded as, say, a reliable predictor might be recoded as ambiguous or misleading. This is compatible with the idea that high-level prediction errors, especially when endowed with high precision, drive retroactive revaluation of earlier states, effectively rewriting their role in the causal structure that the Bayesian brain infers from the sequence.
Empirical signatures of time-reversed processing are also evident in the temporal dynamics of consciousness itself. In binocular rivalry and other forms of perceptual multistability, the dominant percept flips between alternative interpretations despite unchanging input. Just before a perceptual switch, subtle changes in higher-level activity predict which interpretation will become dominant. Intriguingly, shortly after the switch, neural patterns corresponding to the newly dominant percept can be detected not only in current sensory responses but also in patterns indexing the immediate past, as if the brain is re-encoding the prior interval in a way that matches the new interpretation. Subjectively, people often report that the new percept seems to have "been there all along" just before they became aware of it, aligning experiential time with the outcome of ongoing neural inference rather than with a snapshot of raw sensory data.
Across these domains, a recurring theme emerges: early neural responses track the physical sequence of events, while later, more distributed activity reflects a constructed sequence that best fits the totality of the evidence and the organism's current priors and goals. Behavior follows the constructed sequence. The difference between physical time and experiential time, between the objective ordering of stimuli and the felt arrow of time in perception, is therefore an empirical marker of temporally extended, bidirectional inference. Where these diverge, and where later events systematically reshape earlier representations and reports, we see the functional footprint of a brain that does not merely read the world forward, but continually reweaves a coherent temporal tapestry from both past and future constraints within its limited integration windows.
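The contrast running through these findings, between early responses that track the physical sequence and later activity that reflects a constructed one, is exactly the filtering-versus-smoothing distinction. A toy two-state hidden Markov model makes it concrete; the transition and emission probabilities below are illustrative.

```python
# Toy two-state HMM, pure Python: the filtered (forward-only) belief about
# the first time step differs from the smoothed belief once backward
# messages carry later observations into the past.

T = [[0.9, 0.1], [0.1, 0.9]]     # state-transition probabilities
E = [[0.5, 0.5], [0.1, 0.9]]     # state 0 is ambiguous; state 1 favors obs 1
obs = [1, 1, 1]                  # later evidence accumulates for state 1

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

# Forward pass (filtering): the "online" belief at each step.
alphas = [normalize([0.5 * E[i][obs[0]] for i in range(2)])]
for o in obs[1:]:
    pred = [sum(alphas[-1][i] * T[i][j] for i in range(2)) for j in range(2)]
    alphas.append(normalize([pred[j] * E[j][o] for j in range(2)]))

# Backward pass: accumulate the message from later observations into t = 0.
b = [1.0, 1.0]
for o in reversed(obs[1:]):
    b = [sum(T[i][j] * E[j][o] * b[j] for j in range(2)) for i in range(2)]

# Smoothed belief at t = 0 combines both directions.
g0 = normalize([alphas[0][i] * b[i] for i in range(2)])

print(round(alphas[0][1], 3), round(g0[1], 3))  # 0.643 0.803
```

The filtered belief at the first step is what was available online; the smoothed belief is what the same model concludes about that step once the whole window is in hand, with no physical retrocausality involved.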
Implications for agency, memory, and consciousness
Allowing that neural inference can revise the recent past has immediate consequences for how agency is understood. In a purely forward, stimulus-response picture, an agent first forms an intention, then acts, and finally monitors the consequences. Within a temporally extended predictive processing framework, by contrast, intentions, actions, and outcomes are inferred jointly over a short temporal window. The system constructs the most coherent trajectory in which bodily movements, environmental changes, and internal states hang together as a self-generated action. Later sensory feedback and internal signals can therefore retroactively determine whether some motor event is treated as "my doing" or as an externally caused perturbation.
This is particularly clear in models of active inference, where actions are selected to minimize expected prediction error relative to prior preferences. Here, a motor command is not just a forward instruction to muscles; it is also a hypothesis about a future bodily trajectory that the system attempts to realize. When subsequent sensory data fit this hypothesis within an integration window, the brain retrospectively labels the trajectory as self-caused and attributes agency to it. When the match is poor, the same physical movement may be reclassified as unintended, accidental, or even externally imposed. Thus, the experience of willing an action can be seen as the late-arriving verdict of a temporally extended inference, not a simple readout of an antecedent mental cause.
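A deliberately simplified way to render this retrospective labeling is as post hoc model comparison over a closed window: the motor command doubles as a predicted trajectory, and the window is attributed to the self only if that prediction beats an external-cause alternative. The trajectories, the external model, and the squared-error score are all invented for illustration.

```python
# Hedged sketch: retrospective agency attribution as model comparison.
# After the integration window closes, the sensed trajectory is explained
# either by the self-generated prediction or by a "no self-movement"
# external-cause model; the better fit wins the agency label.

def sq_error(pred, sensed):
    return sum((p - s) ** 2 for p, s in zip(pred, sensed))

predicted_self = [0.0, 1.0, 2.0, 3.0]   # trajectory the motor command forecasts
external_model = [0.0, 0.0, 0.0, 0.0]   # perturbation / no intended movement

def attribute(sensed):
    self_fit = sq_error(predicted_self, sensed)
    ext_fit = sq_error(external_model, sensed)
    return "self-caused" if self_fit < ext_fit else "external"

print(attribute([0.1, 0.9, 2.1, 2.8]))   # feedback fits the forecast
print(attribute([0.0, 0.1, -0.2, 0.1]))  # the same machinery relabels a poor fit
```

The point of the sketch is only that the label is assigned after the fact, by whichever hypothesis best explains the completed trajectory.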
Empirical phenomena such as the "intentional binding" effect fit naturally within this picture. People tend to perceive their voluntary actions and their outcomes as closer together in time than they really are. One interpretation is that the brain retrospectively adjusts the timing of both the action and its effect to fit a prior that intentional actions reliably produce outcomes after some characteristic delay. The arrow of time as consciously experienced is slightly warped to honor this prior. Predictive processing reframes this as smoothing over the action-outcome sequence: a later, outcome-driven prediction error leads to a subtle re-timing of the earlier motor event in the inferred trajectory, knitting them into a unified episode of agency.
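The re-timing story can be written down as MAP estimation under quadratic (Gaussian) costs: sensed action and outcome times are adjusted against a prior that outcomes follow actions after a characteristic delay d, and setting the gradient to zero gives the closed form used below. The times, delay, and precisions are illustrative assumptions.

```python
# Sketch: minimize pa*(ta - sa)^2 + po*(to - so)^2 + pd*(to - ta - d)^2
# over the inferred action time ta and outcome time to (times in ms).
# The stationarity conditions give residual e = (so - sa - d) / (1 + pd/pa + pd/po),
# with the action pulled later and the outcome pulled earlier: temporal binding.

def bind(sa, so, d, pa=1.0, po=1.0, pd=1.0):
    e = (so - sa - d) / (1.0 + pd / pa + pd / po)
    ta = sa + (pd / pa) * e     # action shifted toward the outcome
    to = so - (pd / po) * e     # outcome shifted toward the action
    return ta, to

ta, to = bind(sa=0.0, so=400.0, d=250.0)
print(ta, to, to - ta)  # 50.0 350.0 300.0: a 400 ms gap perceived as 300 ms
```

Raising the prior precision pd strengthens the compression, while raising the sensory precisions pa and po weakens it, which matches the qualitative claim that the warp honors the prior only to the extent that sensory timing is uncertain.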
The same logic scales to more complex decisions. When people retrospectively justify a choice, explaining why they "really wanted" what they picked even under experimental manipulations that secretly swapped the chosen option, later cues about commitment and outcome can reshape the reconstructed pre-decisional state. Neural signatures suggest that post-choice feedback influences how earlier deliberative activity is stored and reactivated, supporting a reconstruction in which the chosen option appears consistent with prior preferences. Autobiographical narratives of agency thus reflect not only what actually preceded the decision but also what came after, filtered through a generative model that prefers coherent, goal-directed trajectories.
Memory is perhaps the domain where time-reversed processing is most striking. In standard views, encoding lays down a trace that is later retrieved, potentially degraded but essentially fixed. In a temporally symmetric, hierarchical generative architecture, memories are not static records but hypotheses about past causes of present evidence, including present cues, current goals, and other memories reactivated in context. Retrieval is another round of inference, and subsequent experiences can reshape the inferred past each time a memory is revisited.
This fits well with reconsolidation research, where reactivated memories become labile and can be updated or even transformed before being stored again. On a predictive processing account, reactivation exposes a previously inferred trajectory to new prediction errors arising from current context. If these new errors are granted sufficient precision, the generative model revises its best explanation of the prior event, and this revised explanation becomes the new memory. From the subjective point of view, the earlier episode now "always was" as currently remembered, illustrating how the Bayesian brain can redefine the past without any hint of physical retrocausality.
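Reconsolidation-as-inference can be caricatured as repeated precision-weighted updating of a Gaussian trace: each reactivation fuses the stored estimate with current-context evidence, and the fused result is what gets re-stored. The one-dimensional "interpretation" scale and all numbers are invented for the example.

```python
# Hedged sketch: a memory trace as a (mean, precision) pair on an invented
# 1-D interpretation scale (0 = neutral, 2 = threatening). Each reactivation
# performs a conjugate Gaussian update against current context, and the
# updated trace is what is re-stored.

def reactivate(mem_mu, mem_prec, ctx_mu, ctx_prec):
    new_prec = mem_prec + ctx_prec
    new_mu = (mem_prec * mem_mu + ctx_prec * ctx_mu) / new_prec
    return new_mu, new_prec

mu, prec = 0.0, 1.0                 # originally encoded as neutral
for _ in range(3):                  # three reactivations in a new context
    mu, prec = reactivate(mu, prec, ctx_mu=2.0, ctx_prec=1.0)

print(round(mu, 3), prec)  # 1.5 4.0: the trace has drifted toward the new reading
```

If the contextual errors were given low precision instead (small ctx_prec), the stored mean would barely move, which is the sketch's analogue of a reactivation that fails to update the memory.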
Narrative memory and life stories extend this process to much longer timescales. High-level priors about personal identity, social roles, and moral coherence provide strong constraints on how past events are interpreted. When such priors change (through psychotherapy, religious conversion, political realignment, or major life transitions), people often report "seeing their past differently." Specific episodes acquire new meanings: an event once encoded as a failure may be reclassified as a necessary learning step, or a relationship formerly remembered as harmonious may be reinterpreted as subtly coercive. Hierarchically, this can be modeled as a high-level shift that sends new predictions downward, reorganizing which lower-level details are retrieved, how they are segmented into events, and how they are temporally ordered within the broader life narrative.
The constructive nature of episodic recollection is strongly compatible with temporally extended neural inference. Hippocampal-cortical interactions during recall show that the initial cue evokes a coarse pattern, which is then refined over hundreds of milliseconds through recurrent exchanges. Details that fit the current high-level narrative are more likely to be reinstated and strengthened, while incompatible details are suppressed or left unreactivated. Memory thus becomes a site where future-oriented concerns (current goals, expectations about what will matter) reach back to reshape representations of the past, within the constraints of what the system's priors and stored traces allow.
This bidirectional interplay between past and future is central to prospective cognition and mental time travel. Imagining future scenarios uses much of the same neural circuitry as recalling past experiences. In generative terms, both activities involve sampling trajectories from the same model under different boundary conditions: in one case, the "data" are long-term priors and a present cue; in the other, they include specific sensory traces. Because the model encodes structured regularities across time without a built-in direction, inferences about the past and simulations of the future are deeply entangled. What one anticipates can shape what one comes to remember, and what one remembers constrains what one expects, both through the same inferential machinery.
These dynamics have important consequences for understanding consciousness. If conscious experience reflects, at least in part, the globally broadcast or integrated result of predictive processing, then it is inherently shaped by temporally extended inference rather than by instantaneous sensory states. The contents of consciousness at any moment encode a best-guess trajectory that spans a short interval of time, with both slightly earlier and slightly later events contributing to what is experienced as the "now." Postdictive illusions and masking already show that what becomes conscious can depend on events that physically occur after the stimulus in question. This suggests that the window of conscious access is delayed and smoothed, allowing time for backward revisions before the system commits to a stable scene.
On this view, the subjective present is a moving inference window whose boundaries are determined by precision-weighted prediction errors and resource constraints. Within this window, the brain can update its representation of events both forward and backward, minimizing overall surprise under its generative model. Consciousness then tracks the stabilized outcome of these revisions: once prediction error has been mostly quenched for a given segment, its contents can be treated as settled and less open to further re-interpretation. The felt fixity of the immediate past is thus a late achievement, not an inherent property of the processing stream.
Temporal integration theories of consciousness, such as those positing a "specious present," align naturally with this idea. Rather than being a single instant, the experienced now covers a brief interval during which sensory and internal signals are combined. A temporally symmetric generative model suggests that this interval is not just a passive buffer but an active workspace for constructing a coherent temporal narrative from partially overlapping constraints. Future-directed predictions and past-directed revisions are both in play, with precision determining which direction dominates at any moment. Conscious time, in this sense, is the phenomenological shadow cast by an underlying, reversible computational process constrained by irreversible biophysics.
Agency and responsibility look different through this lens. If the conscious experience of deciding and acting is itself a product of post hoc temporal smoothing, then the intuitive idea that awareness of intention must always precede action is undermined. In many cases, motor preparation and early movement may unfold under subpersonal predictions, with awareness of intention arising only when higher-level systems retrospectively recognize a coherent pattern and label it as "my choice." This does not eliminate agency; instead, it locates agency in the integrity of the whole inferential loop that links long-term priors, present goals, bodily dynamics, and environmental feedback, rather than in a single instantaneous mental cause.
Disorders of agency can be modeled as disruptions in this temporally extended inference. In schizophrenia, for example, self-generated thoughts or actions may not be properly predicted by high-level models of the self, leading to prediction errors that are misattributed to external agents. Within a time-reversed framework, a failure to retrospectively integrate motor signals and their consequences into a coherent self-caused trajectory could yield experiences of passivity and thought insertion. The brain still carries out movements and generates thoughts, but later inference fails to weave them into the "I did this" narrative, leaving them to be interpreted as alien intrusions in the experiential timeline.
Similarly, in certain movement disorders and functional neurological symptoms, patients may perform complex behaviors while denying that they intended them. One way to capture this is to posit a mismatch between low-level motor predictions and higher-level narratives of agency, exacerbated by abnormal precision allocation. If late-arriving proprioceptive and visual feedback is not granted sufficient precision to revise high-level beliefs about what was intended, the movement may fail to be retrospectively enveloped by a sense of ownership, even though it was generated by the patient's own nervous system. The arrow of time in agency experience fractures: the body moves forward in physical time, but the inferred self does not catch up.
Memory pathologies reveal parallel distortions. In confabulation, patients confidently report detailed events that never occurred, often to fill gaps created by frontal or limbic damage. A temporally extended generative model suggests that when high-level priors about narrative coherence and self-consistency retain high precision while access to episodic traces is compromised, the system will generate plausible past trajectories that fit its expectations. Later questions, cues, and social feedback further sculpt these constructed memories, which can stabilize into fixed but inaccurate stories. The past is inferred under strong top-down constraint with limited bottom-up correction, leading to a chronically misaligned experiential timeline.
Amnesia offers a complementary pattern: new experiences fail to be woven into a coherent temporal tapestry. Damage to hippocampal-cortical loops impairs the ability to bind events across time, so that the generative model lacks rich, episodic trajectories to support future inference. As a result, the subjective sense of continuity, of a self persisting through a structured past, degrades. Here the problem is not that the brain overwrites the past too aggressively but that it cannot maintain the kinds of multiscale temporal hypotheses that would allow both backward and forward inference over extended intervals. Consciousness narrows to an ever-shifting present with little anchor in a reconstructible history.
These considerations also bear on philosophical debates about free will and temporal metaphysics. If the brain implements a fundamentally probabilistic, temporally symmetric generative model, and if the arrow of time in experience emerges from constraints on precision, energy, and physiology, then conscious volition is neither a simple illusion nor a primitive causal power. It is an emergent property of a system that must continuously predict and retrodict its own behavior in a noisy world. Willing an action becomes equivalent to endorsing a particular trajectory as best explaining both prior intentions and subsequent outcomes, within the limits imposed by thermodynamic and computational resources such as entropy production and metabolic cost.
Under this view, responsibility is not grounded in the existence of a metaphysically special decision point in absolute time, but in the integrity and flexibility of the generative model across time. An agent is responsible when their longer-term priors, current states, and expected futures interact in a relatively stable, self-consistent manner, such that backward revisions produce narratives that can be socially and practically negotiated. Pathologies of agency and memory can thus be seen as breakdowns in the mechanisms by which the brain usually aligns its constructed temporal stories with both its own bodily trajectory and the shared temporal framework of others.
Consciousness itself may be recharacterized as a mode of operation in which the generative model not only predicts and retrodicts but also monitors its own temporal inferences. Meta-cognitive processes track the reliability of these inferences over time, adjusting precision, reallocating attention, and sometimes marking certain episodes as uncertain, contested, or in need of revision. Experiences of doubt, déjà vu, and "not feeling like oneself" can be interpreted as moments where higher-order monitoring detects instability in the temporal narrative that supports agency and memory. In such moments, the normally invisible work of time-reversed processing becomes phenomenologically salient, as the mind grapples with its own capacity to redraw the boundaries between past, present, and future within the unfolding stream of experience.
