{"id":3208,"date":"2026-01-04T15:47:53","date_gmt":"2026-01-04T15:47:53","guid":{"rendered":"https:\/\/beyondtheimpact.net\/?p=3208"},"modified":"2026-01-04T15:47:53","modified_gmt":"2026-01-04T15:47:53","slug":"neural-priors-under-two-time-thermodynamics","status":"publish","type":"post","link":"https:\/\/beyondtheimpact.net\/?p=3208","title":{"rendered":"Neural priors under two-time thermodynamics"},"content":{"rendered":"<p><a name=\"neural-priors-in-non-equilibrium-learning-dynamics\"><\/a><\/p>\n<p>Neural priors in non-equilibrium learning dynamics can be understood as structured constraints that bias how neural populations explore and exploit their high-dimensional state space while remaining persistently out of thermodynamic equilibrium. Rather than treating priors as static, abstract probability distributions, they can be modeled as dynamic fields encoded in synaptic strengths, intrinsic excitability, and network topology. These fields shape trajectories of neural activity over time by favoring certain patterns of activation and suppressing others, effectively implementing a form of \u201csoft architecture\u201d that continuously steers learning and inference under changing environmental and metabolic conditions.<\/p>\n<p>In non-equilibrium settings, the brain\u2019s learning dynamics are driven by continual energy throughput and dissipative processes, so that priors emerge as stable yet adaptable attractors within this flow. Instead of convergence to a fixed equilibrium distribution, cortical circuits operate under constantly shifting input statistics, neuromodulatory signals, and internal goals. The resulting priors are metastable, reconfigurable structures that can be rapidly reweighted in response to context. This perspective reframes priors as operational constraints on trajectories through state space rather than as timeless summaries of past data.<\/p>\n<p>From the perspective of the bayesian brain hypothesis, non-equilibrium dynamics are not an inconvenience but a resource: they allow the brain to implement approximate Bayesian inference through ongoing prediction-error minimization. Priors bias which hypotheses are considered during this process, but under persistent driving they themselves remain plastic. Activity-dependent plasticity rules, such as Hebbian and spike-timing\u2013dependent plasticity, implement local updates to synaptic parameters that encode prior expectations over latent causes of sensory data. These synaptic updates do not simply relax toward a maximum-likelihood or maximum a posteriori solution; instead, they evolve under continuous perturbations, noise, and resource constraints, reflecting the non-equilibrium nature of biological learning.<\/p>\n<p>Non-equilibrium neural dynamics can be formalized using stochastic differential equations that describe how membrane potentials, firing rates, and synaptic efficacies change over time under noisy input and internal feedback. In such formulations, priors appear as effective potentials that bias drift terms in these equations. The presence of ongoing noise and non-conservative forces, such as neuromodulatory influences that break detailed balance, prevents the system from settling into a static Boltzmann-like distribution. 
When learning is framed in terms of prediction and free energy minimization, non-equilibrium priors gain an explicit role as regularizers of inference trajectories. Predictive coding formulations describe how hierarchical circuits attempt to suppress prediction errors by adjusting both internal states (representations) and parameters (priors). Under non-equilibrium conditions, prediction errors are never fully eliminated; instead, the system continually reconfigures to maintain low average free energy while coping with fluctuating inputs. Priors influence which prediction errors are taken seriously and which are treated as noise, effectively sculpting the manifold of representational states that are visited during ongoing activity.

Metabolically, maintaining informative priors under non-equilibrium dynamics requires continuous energy expenditure, as synaptic weights and receptor distributions must be actively maintained and updated. This introduces a link between the complexity of priors and metabolic cost. Priors that are overly sharp or excessively detailed may demand disproportionate energy to sustain, while priors that are too diffuse fail to support accurate prediction. The resulting compromise is a set of priors that preserve task-relevant structure while discarding unneeded detail, consistent with principles of efficient coding under resource constraints.

Non-equilibrium neural dynamics also imply asymmetries in how priors are learned from different temporal scales of experience. Fast-changing sensory contingencies shape short-term priors encoded in transient synaptic states or modulatory gains, while slower, more stable regularities are consolidated into long-term structural priors in dendritic morphology and recurrent connectivity. The coexistence of these time scales means that priors are layered: rapidly reconfigurable surface priors ride on top of deeper structural priors that evolve much more slowly. This layered organization allows neural systems to remain responsive to immediate demands without sacrificing the stability imparted by long-range experience.

Noise plays a dual role in non-equilibrium learning. On one hand, stochastic fluctuations can destabilize fragile priors and degrade performance. On the other, controlled noise can help escape poor local minima and facilitate exploration of alternative hypotheses. In many models of neural dynamics, noise sources such as synaptic release variability or channel noise are not merely tolerated; they are leveraged to explore the hypothesis space around existing priors. The resulting stochastic sampling behavior approximates Bayesian posterior sampling, with priors defining the baseline tendencies toward specific interpretations of ambiguous input.

Non-equilibrium conditions are particularly salient during early development and in periods of rapid learning, where the statistics of sensory input and internal goals can change abruptly. During these phases, priors are especially plastic, and the system may be far from any quasi-stationary regime. Synaptic overproduction followed by activity-dependent pruning, for instance, can be seen as a non-equilibrium mechanism for discovering and stabilizing useful priors about the environment and the body.
The system temporarily increases its capacity, widely exploring possible configurations, then collapses onto a more economical set of priors that continue to be refined throughout life.

At the circuit level, recurrent connectivity patterns embody priors about temporal and spatial structure in the environment. In non-equilibrium regimes, feedback loops within and between cortical areas continually reshape these patterns through processes such as synaptic consolidation, heterosynaptic plasticity, and homeostatic mechanisms. As the network processes streams of input, these internal loops reinforce frequently co-occurring features and suppress rarely useful combinations, effectively encoding priors over the manifold of likely sensory trajectories. Because input statistics and behavioral demands change over time, these recurrent priors remain in flux, never completely settling into a fixed optimal architecture.

Neuromodulators provide an additional layer of non-equilibrium control over neural priors by dynamically adjusting learning rates, gain, and the relative weighting of prior-driven versus data-driven signals. Changes in neuromodulatory tone can transiently relax established priors to permit rapid re-learning, or conversely, strengthen them to stabilize behavior in predictable contexts. This modulation can be viewed as a mechanism for rapidly shifting the system between exploration-dominated and exploitation-dominated regimes without rewriting deep structural priors at the synaptic level, thereby reducing the metabolic and computational cost of adaptation.

Behaviorally, non-equilibrium priors manifest as context-sensitive biases in perception, decision-making, and motor control. For example, after adaptation to a new environment or task, observers show shifts in their default expectations about stimuli or outcomes, even when explicit feedback is removed. These shifts reflect updated priors that were formed under strongly driven, non-equilibrium learning conditions and then partially stabilized. Yet, as situations continue to evolve, these priors remain subject to further adjustment, revealing that what appears as a stable predisposition at one moment is actually a snapshot of an ongoing dynamic process.

Pathological conditions can often be interpreted as disruptions in the non-equilibrium maintenance of neural priors. Excessive rigidity of priors may lead to hallucinations, delusions, or perseverative behaviors, where existing expectations dominate over incoming evidence. Conversely, overly labile priors can produce chronic uncertainty, distractibility, or hypersensitivity to noise. In both cases, the underlying issue is not simply the content of priors, but the failure of the dynamical mechanisms that update and stabilize them under continuous energetic and informational flux. Modeling such conditions through the lens of non-equilibrium dynamics offers a principled way to connect microscopic synaptic processes with macroscopic behavioral symptoms.

Computationally, leveraging non-equilibrium principles in artificial neural networks involves moving beyond static regularizers and fixed weight decay to dynamic, state-dependent priors that evolve during training and inference. Techniques such as stochastic gradient Langevin dynamics, continual learning with synaptic consolidation, and meta-learning of initialization states can be seen as engineering analogs of biological non-equilibrium priors. These methods keep networks in a regime where parameters are constantly adjusted under structured noise and task demands, enabling systems to retain useful prior structure while remaining adaptable to new data streams.

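Among those engineering analogs, stochastic gradient Langevin dynamics is the easiest to write down. The sketch below applies it to a one-parameter toy model; the Gaussian prior, the likelihood, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sgld_step(theta, grad_log_post, eps):
    # Stochastic gradient Langevin step: a gradient step on the log posterior
    # plus Gaussian noise scaled so the iterates sample rather than optimize.
    noise = rng.standard_normal(theta.shape)
    return theta + 0.5 * eps * grad_log_post(theta) + np.sqrt(eps) * noise

# Toy setup: a scalar weight with prior N(0, 1) and a Gaussian likelihood
# with known noise sd 0.5 (all values illustrative).
data = rng.normal(2.0, 0.5, size=100)

def grad_log_post(theta):
    return -theta + np.sum(data - theta) / 0.5**2

theta = np.zeros(1)
samples = []
for _ in range(5_000):
    theta = sgld_step(theta, grad_log_post, eps=1e-4)
    samples.append(theta[0])

# The injected noise keeps the chain exploring the posterior instead of
# collapsing onto the MAP point: parameters adjusted under structured noise.
print("posterior mean estimate:", np.mean(samples[1_000:]))
```
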
Incorporating two-time or multi-time-scale learning rules into such models further aligns them with biological non-equilibrium dynamics. Fast updates capture transient contingencies, while slow updates refine more enduring priors over task structure and environment. The interaction between these time scales, coupled with ongoing stochasticity, leads to emergent priors that are neither wholly hand-designed nor purely data-driven. Instead, they arise from the interplay between the learning algorithm, the flow of data, and the constraints imposed by computational and energetic resources, in close analogy to how neural priors are sculpted and sustained in the living brain.

### Two-time thermodynamics and effective temperature in neural systems

Two-time thermodynamics provides a way to describe neural dynamics when fast microscopic fluctuations and slower macroscopic adaptations coexist and interact. Instead of assuming that neural activity and synaptic configurations relax toward a single equilibrium characterized by a unique temperature, two-time descriptions distinguish between rapid, local degrees of freedom and slow, collective modes. Fast neural variables, such as membrane potentials, spiking events, and short-lived synaptic states, evolve on millisecond-to-second scales, while slower variables, such as structural connectivity, receptor expression, and long-term synaptic strengths, change on scales of minutes, hours, or longer. Thermodynamically, this separation leads to partially equilibrated subsystems that can be assigned different effective temperatures, reflecting their distinct levels of fluctuation and responsiveness to perturbations.

In such a framework, effective temperature is not a literal physical temperature of tissue but a measure of how strongly a subsystem explores its state space under stochastic driving. High effective temperature corresponds to broad, noisy exploration, while low effective temperature corresponds to concentrated, stable activity around attractor states. Neural populations under high neuromodulatory drive, elevated synaptic noise, or intense sensory bombardment may exhibit high effective temperatures, rapidly sampling multiple hypotheses about the environment. Conversely, deeply ingrained behaviors, such as overlearned motor routines or rigid beliefs, correspond to low effective temperature regimes in which activity is confined to narrow basins of attraction and deviations are quickly suppressed.

Two-time thermodynamics captures how these regimes coexist and interact in neural systems. Fast neural activity may operate at a relatively high effective temperature, enabling rapid inference, exploration of representations, and probabilistic sampling over possible interpretations of input. Simultaneously, slower synaptic and structural variables remain at a lower effective temperature, reflecting the accumulated influence of long-term experience that stabilizes certain priors and circuit motifs. The brain thereby maintains a structured backbone of low-temperature constraints (embodied priors) against which high-temperature fluctuations in activity can be evaluated and harnessed for flexible computation.

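A minimal sketch of this arrangement, assuming a single "hot" activity variable coupled to a single "cool" structural variable, with hand-picked temperatures and rates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Fast activity x relaxes in a landscape whose minimum is set by the slow
# "structural" variable w; each runs at its own effective temperature.
T_fast, T_slow = 1.0, 0.01      # effective temperatures (illustrative)
dt_fast, dt_slow = 1e-2, 1e-4   # time-scale separation
x, w = 0.0, 0.0
target = 1.5                    # environmental regularity w should absorb

for _ in range(200_000):
    # Hot, fast dynamics: noisy relaxation toward the current prior w.
    x += -(x - w) * dt_fast + np.sqrt(2 * T_fast * dt_fast) * rng.standard_normal()
    # Cool, slow dynamics: drift driven by the discrepancy between activity
    # and environment, with far less noise, so w tracks long-run statistics.
    w += (target - x) * dt_slow + np.sqrt(2 * T_slow * dt_slow) * rng.standard_normal()

print("slow prior w ~", round(w, 2), "(tracks the regularity, averaging out fast noise)")
```
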
This separation of time scales is tightly linked to prediction and free energy minimization principles. In predictive coding and related formulations of the Bayesian brain, neural circuits are seen as continually updating internal states to reduce prediction errors between expected and actual sensory signals. Fast variables encode moment-to-moment predictions and residual errors, while slower variables encode priors and generative models that shape those predictions. Two-time thermodynamics reframes this process as an ongoing negotiation between fast, noisy sampling at one effective temperature and slow, regularizing adaptation at another. The free energy landscape that governs these dynamics is reshaped at different rates along different coordinates: steep and relatively static along structural dimensions, shallower and rapidly deforming along activity dimensions.

Within this perspective, effective temperature determines how sharply or loosely the system responds to prediction errors at each time scale. A higher effective temperature in fast activity channels allows transient prediction errors to generate rich exploratory dynamics rather than immediate, rigid corrections. This promotes sampling of alternative explanations for ambiguous input and supports robust inference under uncertainty. A lower effective temperature in slower synaptic and structural channels restricts the pace at which long-term priors are modified, protecting accumulated knowledge from being overwritten by brief fluctuations in sensory statistics. The resulting hierarchy of effective temperatures prevents catastrophic forgetting while still enabling flexible adaptation.

Two-time thermodynamics also sheds light on how neural systems manage energy-entropy trade-offs in maintaining and updating priors. Fast, high-temperature dynamics are energetically costly but entropy-enhancing: they generate diverse patterns of activity that explore possible configurations of the network. Slow, low-temperature dynamics are more conservative: they consume energy to stabilize synaptic structures and suppress unnecessary exploration, thereby reducing entropy in the long run. By tuning effective temperatures separately at fast and slow levels, neural systems can strike a balance between the metabolic cost of continual exploration and the informational value of maintaining stable, low-entropy priors that encapsulate reliable regularities in the environment.

In practice, the effective temperature of neural subsystems is modulated through biophysical mechanisms such as neuromodulation, homeostatic plasticity, and network-level feedback. Neuromodulators like dopamine, norepinephrine, acetylcholine, and serotonin alter noise levels, gain, and adaptation rates, effectively raising or lowering the temperature of particular circuits. For example, heightened norepinephrine can increase the variability and sensitivity of cortical responses, corresponding to a temporary elevation of effective temperature that supports exploration and reconfiguration of ongoing computations. Conversely, neuromodulatory states associated with focused exploitation or habitual behavior lower effective temperature, confining neural trajectories to well-established pathways that implement entrenched priors.

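The gain-like effect of neuromodulatory tone can be caricatured as a temperature parameter in a softmax over competing hypotheses; the evidence values and temperatures below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_interpretation(log_evidence, effective_T):
    # Softmax with an effective temperature standing in for neuromodulatory
    # gain: low T sharpens the distribution, high T flattens it.
    logits = np.asarray(log_evidence) / effective_T
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(p), p=p), p

evidence = [2.0, 1.5, 0.2]   # illustrative support for three hypotheses
for T in (0.1, 1.0, 5.0):
    _, p = sample_interpretation(evidence, T)
    print(f"T={T}: p={np.round(p, 2)}")
# Low T (exploitative, habitual regime): mass concentrates on the dominant
# hypothesis. High T (exploratory regime): sampling broadens across all three.
```
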
Homeostatic mechanisms further regulate effective temperature by adjusting excitability and synaptic strengths to maintain firing rates within functional bounds. When recurrent activity becomes too quiescent, homeostatic upscaling can introduce additional variability and responsiveness, effectively increasing local temperature and promoting exploration of underutilized configurations. When activity becomes excessively variable and energetically expensive, downscaling or synaptic depression can cool the system, stabilizing activity patterns that align with existing structural priors. These local temperature adjustments ensure that different regions of the brain can operate at distinct points along an exploration-exploitation spectrum without losing global coordination.

Two-time thermodynamics provides a natural language for describing critical phenomena and phase transitions in neural systems. As effective temperatures at different time scales are tuned, networks may approach critical points where small perturbations lead to large changes in activity or connectivity. Near such points, correlation lengths and times diverge, reflecting long-range dependencies and temporal memory in neural dynamics. Critical regimes emerge when fast activity is sufficiently hot to propagate fluctuations across the network, while slow structure remains cool enough to prevent runaway destabilization. This delicate interplay enables rich, scale-free dynamics often associated with optimal information processing, sensitivity to stimuli, and maximal dynamic range.

Pathological states can be interpreted as misalignments or dysregulations of effective temperatures across time scales. If fast activity becomes too cold relative to slow structure, whether through excessive inhibition, rigid connectivity, or overly strong priors, networks may become stuck in narrow attractors, reducing flexibility and impairing the capacity to update expectations. Clinically, this might manifest as perseveration, obsessive rumination, or hallucinations driven by overly dominant internal models. If fast activity becomes too hot without sufficient stabilizing structure, through disinhibition, heightened noise, or weakened priors, networks may display chaotic, unstructured firing, leading to distractibility, hyper-reactivity to noise, or difficulties forming stable beliefs. In both scenarios, two-time thermodynamics emphasizes the importance of relative, not absolute, temperature tuning.

From a computational standpoint, two-time approaches inspire neural network models in which fast inference and slow learning are governed by coupled, but distinct, stochastic processes. Fast inference dynamics, such as noisy recurrent updates or sampling-based hidden state updates, can be implemented at a higher effective temperature to approximate posterior sampling over latent variables. Slow learning dynamics, such as parameter updates in a generative model, operate at lower effective temperature, reflecting gradual consolidation of priors. Techniques like stochastic gradient Langevin dynamics naturally instantiate this separation: fast sampling in weight or state space runs with relatively high noise, while slower annealing schedules cool selected parameters, locking in structural features discovered during exploration.

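A sketch of that separation, assuming toy quadratic losses in place of real free energy gradients and an arbitrary hyperbolic cooling schedule for the slow parameter group:

```python
import numpy as np

rng = np.random.default_rng(4)

def langevin_update(params, grad, eps, temperature):
    # Langevin step whose noise scale is set by an effective temperature.
    noise = np.sqrt(2 * eps * temperature) * rng.standard_normal(params.shape)
    return params - eps * grad + noise

fast = rng.standard_normal(8)   # inference-state variables: kept hot
slow = rng.standard_normal(8)   # generative-model weights: annealed

T_fast = 1.0
for step in range(10_000):
    T_slow = 1.0 / (1.0 + 0.01 * step)   # illustrative cooling schedule
    # Gradients of toy quadratic losses, standing in for free energy terms.
    fast = langevin_update(fast, fast, eps=1e-2, temperature=T_fast)
    slow = langevin_update(slow, slow, eps=1e-2, temperature=T_slow)

print("fast spread:", fast.std().round(2), "| slow spread:", slow.std().round(3))
# The hot subset keeps sampling broadly around its minimum; the annealed
# subset locks in near it, mirroring consolidation of structural priors.
```
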
In these models, two-time thermodynamics clarifies how learning schedules, noise injection, and annealing protocols should be coordinated to mimic biological efficiency. Fast degrees of freedom must have sufficient time and temperature to thoroughly explore the vicinity of current priors and data, discovering alternative solutions or representational modes. Only once this exploration has produced consistent patterns should slow degrees of freedom cool and consolidate, effectively embedding new priors in the network. Premature cooling of slow degrees can trap the model in suboptimal configurations, while perpetual heating prevents the stabilization of useful structure. Thus, a principled temperature schedule across time scales becomes an essential design ingredient in thermodynamically inspired learning algorithms.

Two-time thermodynamics also connects with information geometry by highlighting how effective temperatures shape trajectories in parameter and activity manifolds. A hotter fast process moves more freely across nearby configurations in the neural state space, probing curvature and discovering directions of low free energy. A cooler slow process integrates these local explorations into a gradual drift along geodesic-like paths that correspond to statistically efficient updates of priors. The division of labor between hot, exploratory dynamics and cool, consolidating adaptation allows the system to approximate natural gradient flows without incurring the full computational cost of explicit second-order optimization, effectively using thermodynamic fluctuations as a probe of underlying informational structure.

Two-time thermodynamics underscores that neural systems do not inhabit a single, static thermodynamic regime; instead, they orchestrate a dynamic repertoire of effective temperatures across circuits and time scales. Attention, arousal, sleep, and learning phases can all be viewed as global reconfigurations of this thermodynamic landscape. During waking exploration, certain sensory and associative areas may be "heated" to support rapid formation and testing of hypotheses, while deeper priors in subcortical or long-range circuits remain comparatively cool. During sleep or offline consolidation, effective temperatures may be redistributed, with replay and synaptic renormalization adjusting the balance between fast and slow processes. Through these ongoing thermodynamic reallocations, the brain sustains a flexible yet stable architecture of neural priors under continuously changing environmental and internal conditions.

### Information geometry of neural priors under temporal constraints

Information geometry offers a natural language for describing how neural priors are organized and evolve when learning and inference occur under explicit temporal constraints. Instead of viewing priors as undifferentiated probability densities over latent variables, one can represent them as points on a curved statistical manifold whose coordinates are the synaptic and circuit parameters that define a generative model. Distances on this manifold quantify how distinguishable two priors are in terms of their implied sensory consequences.
When neural dynamics are constrained by two-time thermodynamics, trajectories on this manifold become time-asymmetric, with fast paths corresponding to transient reconfigurations of effective priors and slow paths capturing the gradual reshaping of deep structural expectations.

On this manifold, the Fisher information metric provides a canonical measure of local curvature, capturing how sensitive the likelihood of sensory inputs is to small changes in neural parameters that encode priors. Regions of high curvature correspond to parameter settings where small synaptic modifications produce large changes in predicted sensory statistics, while flat regions indicate directions in parameter space that barely affect predictions. Under temporal constraints, neural systems do not explore this manifold isotropically. Instead, fast dynamics preferentially move along directions of low curvature, where exploratory changes are relatively cheap in terms of both energy and prediction error, while slow consolidation aligns with directions of high curvature that carry long-term informational significance.

In the Bayesian brain framework, neural circuits are tasked with performing prediction and free energy minimization in a high-dimensional space of possible generative models. Information geometry formalizes this task as gradient flows on the manifold of priors and posteriors. The natural gradient, defined with respect to the Fisher metric, indicates the most statistically efficient direction for changing a prior given observed prediction errors. However, implementing exact natural gradient descent is both computationally and biophysically costly. Temporal constraints and non-equilibrium conditions effectively approximate this process: fast stochastic activity samples nearby configurations, probing the local geometry, while slower plasticity integrates these samples into a coarse-grained drift that approximates a natural gradient flow without explicitly computing second-order structure.

Temporal constraints impose anisotropy not only in how the manifold is traversed but also in how it is shaped. Because slow synaptic changes are metabolically expensive, the brain tends to accumulate curvature, i.e., high Fisher information, along a limited number of directions that consistently reduce free energy across many contexts. These directions correspond to compressed, low-dimensional priors that capture stable environmental regularities such as object permanence, causal structure, or body kinematics. Fast, context-dependent priors, encoded in transient activity or short-term plasticity, live in a higher-dimensional, lower-curvature subspace that can be rapidly reconfigured without destabilizing the deeper manifold structure. The resulting stratification of the manifold reflects a geometric layering of priors according to their temporal stability.

Two-time thermodynamics adds an explicit thermodynamic interpretation to these geometric structures. Effective temperatures associated with fast and slow processes determine how widely each subsystem samples within its local neighborhood on the manifold. A high effective temperature in the fast subsystem corresponds to broad, noisy excursions around the current prior, probing curvature and discovering nearby low free energy basins. A lower temperature in the slow subsystem constrains motion along directions with large curvature, preventing abrupt shifts in deeply entrenched priors. Information geometry thereby provides the coordinate system in which these thermodynamic explorations and consolidations are expressed as structured paths rather than random wanderings.

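To make the metric concrete, the snippet below uses the one family where it has a familiar closed form, a Gaussian N(mu, sigma^2) in (mu, sigma) coordinates; the example priors are arbitrary. It shows the curvature asymmetry described above: the same parameter displacement is far more distinguishable when the prior is sharp than when it is diffuse.

```python
import numpy as np

def fisher_metric_gaussian(mu, sigma):
    # Fisher information metric of N(mu, sigma^2) in (mu, sigma) coordinates:
    # g = diag(1/sigma^2, 2/sigma^2).
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def local_distance(theta_a, theta_b):
    # First-order distinguishability of two nearby priors: sqrt(d^T g d).
    d = np.asarray(theta_b) - np.asarray(theta_a)
    g = fisher_metric_gaussian(*theta_a)
    return float(np.sqrt(d @ g @ d))

print(local_distance((0.0, 0.1), (0.05, 0.1)))  # sharp prior: distance 0.5
print(local_distance((0.0, 2.0), (0.05, 2.0)))  # diffuse prior: distance 0.025
```
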
Temporal constraints can be formalized as friction-like terms on the manifold that bias trajectories toward or away from particular directions. When adaptation is slow, effective friction is high along directions with large Fisher information, so that changes in those parameters require persistent, congruent prediction errors over extended time windows. By contrast, low-friction directions align with parameters that can fluctuate rapidly in response to short-lived contingencies. In geometric terms, the brain implements time-dependent Riemannian flows: the metric and friction tensor jointly define how priors are allowed to move, given the balance between current errors, historical evidence, and resource limitations.

This perspective reveals that different brain areas may occupy distinct regions of the manifold with characteristic curvature profiles and temporal response properties. Sensory cortices, which must rapidly adapt to changing input statistics while retaining stable feature detectors, may inhabit manifolds with sharp curvature along dimensions encoding core feature priors and relatively flat directions capturing contextual modulations. Associative and prefrontal regions, engaged in long-horizon planning and abstraction, may be situated in regions where curvature accumulates along more abstract, temporally extended dimensions, making them resistant to rapid reconfiguration but highly influential when slow changes do occur.

Information geometry also clarifies the role of hierarchical organization in neural priors. In hierarchical generative models, higher levels encode slowly changing beliefs about context and latent causes, while lower levels encode fast-changing hypotheses about immediate sensory details. Geometrically, this hierarchy corresponds to a fiber bundle structure: each high-level prior defines a base point in a coarse manifold, while the space of compatible lower-level priors forms a conditional submanifold attached to that point. Temporal constraints ensure that high-level points move slowly across the base manifold, while lower-level submanifolds are rapidly traversed as sensory evidence arrives. Two-time thermodynamics regulates the relative effective temperature on base and fiber, fostering stable high-level expectations that nonetheless permit flexible low-level inference.

When prediction errors are integrated over time, they define not just instantaneous gradients but entire paths on the manifold of priors. Short-lived deviations may only cause small local excursions that are later retracted as noise, whereas persistent, structured prediction errors carve out new valleys in the free energy landscape, effectively reshaping curvature. From an information geometric perspective, this corresponds to a process of metric learning: the neural system gradually reweights different directions in parameter space according to their historical contribution to reducing free energy. Temporal constraints dictate the time window over which such metric updates are permitted, balancing responsiveness against stability.

Noise plays a constructive geometric role under temporal constraints. Stochastic fluctuations in activity allow the system to sample around the current prior, estimating local curvature and identifying directions of low effective resistance to change. This sampling-based probing can be interpreted as an online, thermodynamically grounded approximation to computing the Fisher information matrix. Over longer timescales, synaptic plasticity averages over these stochastic excursions, consolidating only those changes that consistently reduce prediction error. The result is a geometry that is sculpted by the joint statistics of inputs and internal fluctuations, rather than being fixed a priori.

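That sampling-based reading of curvature can be mimicked directly: for the Gaussian family used earlier, averaging outer products of score vectors over model-generated fluctuations recovers the Fisher matrix. The sample count and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def fisher_mc(mu, sigma, n=200_000):
    # Draw "spontaneous fluctuations" from the current model, then average the
    # outer product of score vectors d/d(mu, sigma) log N(x; mu, sigma^2).
    x = rng.normal(mu, sigma, size=n)
    scores = np.stack([(x - mu) / sigma**2,
                       (x - mu)**2 / sigma**3 - 1.0 / sigma])
    return scores @ scores.T / n

print("Monte Carlo estimate:\n", fisher_mc(0.0, 1.0).round(2))
print("analytic Fisher: diag(1/sigma^2, 2/sigma^2) = diag(1.0, 2.0)")
```
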
Retrocausality-like interpretations sometimes arise when considering that slow structural priors appear to anticipate future regularities they have never directly observed in their current configuration. Information geometry dissolves much of this apparent paradox. Because the manifold is shaped by long histories of interaction with the environment, the local curvature at any point encodes decades of past adaptation. When a new input sequence arrives, the system's geodesic response, its preferred update direction, reflects constraints accumulated over this entire history. To an observer focused only on the current episode, it may appear as though the network is biased toward representations appropriate for future stimuli, but this is simply the manifestation of previously learned geometric structure guiding new trajectories.

Under strong temporal constraints, optimally efficient adaptation corresponds to following approximate geodesics on the manifold of priors. Deviating from geodesic paths requires additional work: more synaptic changes for the same improvement in predictive performance, or greater metabolic cost for the same reduction in free energy. Biological learning rules that depend on local error signals and eligibility traces can approximate such geodesic flows by incrementally aligning plasticity with the natural gradient implied by the geometry. Two-time thermodynamics supports this process by allowing fast exploration to map out nearby geodesic directions, while slow consolidation reduces variance and locks in the most economical paths.

From a computational standpoint, information geometry under temporal constraints suggests design principles for artificial learning systems that emulate neural priors. One principle is to separate fast and slow parameter subsets and equip them with different effective metrics and learning rates, so that the fast subset explores a relatively flat, high-temperature submanifold while the slow subset adjusts along high-curvature, low-temperature directions that summarize long-term structure. Another principle is to employ approximate natural gradient methods that are temporally smoothed: curvature estimates are accumulated over long windows to avoid overreacting to transient fluctuations, mirroring how biological systems accumulate evidence before altering deep priors.

This framework also sheds light on capacity allocation in large networks. Temporal constraints imply that not all directions in parameter space can be equally well explored or maintained. Information geometry makes explicit which combinations of synaptic parameters correspond to highly distinguishable priors and which are redundant or degenerate. By aligning slow plasticity with directions that provide the greatest marginal reduction in free energy per unit of metabolic cost, the system implicitly performs geometric sparsification of its priors. Parameters lying in flat, poorly informed directions remain free to fluctuate at fast timescales, serving as a flexible substrate for context-specific adaptations that do not overwrite core structure.

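The temporally smoothed natural-gradient idea sketched above can be written as a running diagonal curvature estimate used to precondition updates, loosely in the spirit of RMSProp-style approximations; the toy gradient, damping constant, and rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

theta = rng.standard_normal(4)
curvature = np.ones_like(theta)      # running diagonal Fisher-like estimate
rho, lr, damping = 0.999, 0.01, 0.1  # long smoothing window, slow learning

for _ in range(5_000):
    grad = theta + 0.1 * rng.standard_normal(4)   # noisy toy gradient
    # Accumulate squared gradients over a long window instead of reacting
    # to any single, transient fluctuation.
    curvature = rho * curvature + (1 - rho) * grad**2
    # Preconditioned (approximately natural-gradient) step; damping keeps
    # flat, poorly informed directions from producing huge moves.
    theta -= lr * grad / (curvature + damping)

print("theta after smoothed, preconditioned descent:", theta.round(3))
```
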
Pathological disruptions of priors can be interpreted as distortions of the underlying information geometry under altered temporal constraints. Excessively rigid priors, as in certain forms of psychosis or obsessive-compulsive disorder, correspond to regions of artificially inflated curvature where geodesic paths are narrowly confined and effective temperature is too low to allow exploration. Conversely, conditions characterized by unstable expectations and hypersensitivity to noise may reflect flattened or ill-conditioned manifolds where curvature fails to provide clear guidance, and fast dynamics wander broadly without convergence. Two-time thermodynamics helps explain how such distortions can arise from misaligned temperature schedules or breakdowns in the separation of time scales that ordinarily maintain a healthy geometric structure.

Information geometry under temporal constraints clarifies how sleep and offline replay can reorganize priors without constant exposure to external input. During these phases, effective temperatures and plasticity rules are reweighted, allowing the system to traverse parts of the manifold that are energetically or computationally inaccessible during active behavior. Replay sequences generate synthetic prediction errors that refine curvature along behaviorally important directions, while global renormalization of synaptic strengths adjusts the metric to prevent overfitting to particular episodes. In this way, the geometry of priors is continually reshaped by both online and offline dynamics, always under the governance of temporal constraints that regulate how far and how fast the system can move through its own space of possible expectations.

### Energy-entropy trade-offs in synaptic adaptation

Synaptic adaptation unfolds under tight constraints that couple energetic expenditure to informational gains, forcing neural systems to negotiate an ongoing trade-off between energy and entropy. Each synaptic modification requires biochemical work: vesicle cycling, receptor trafficking, protein synthesis, and cytoskeletal remodeling all draw on finite metabolic resources. At the same time, plasticity reshapes the entropy of neural states by altering which activity patterns are likely or unlikely, effectively sharpening or broadening the priors encoded in network connectivity. Energy-entropy trade-offs therefore determine how selectively synapses can encode predictive structure about the environment while remaining within sustainable metabolic budgets.

Within prediction and free energy formulations, synaptic changes are driven by the objective of reducing expected prediction error over future inputs while discouraging overly complex or sharply tuned models. The synaptic matrix can be seen as a thermodynamic medium that sets the landscape over which neural activity evolves. Strengthening a synapse deepens an attractor basin in this landscape, making certain joint firing patterns more probable and thereby lowering entropy locally. Weakening or pruning synapses flattens regions of the landscape, increasing entropy by allowing a larger repertoire of configurations. The free energy functional couples this structural entropy to metabolic cost: deep basins are metabolically expensive to build and maintain but provide reliable, low-entropy predictions, whereas shallow landscapes are cheap but yield noisy, high-entropy behavior.

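The entropy side of this picture can be made concrete with a toy Boltzmann distribution over a handful of activity patterns: deepening one basin, standing in for strengthened synapses, concentrates probability and lowers entropy. The pattern count, depths, and temperature are illustrative.

```python
import numpy as np

def boltzmann_entropy(energies, T=1.0):
    # Entropy of p ~ exp(-E/T) over a discrete set of activity patterns.
    p = np.exp(-np.asarray(energies) / T)
    p /= p.sum()
    return -np.sum(p * np.log(p + 1e-12))

# Eight candidate activity patterns; deepening one basin concentrates
# probability mass there and lowers entropy, at the structural cost of
# building and maintaining the basin.
flat = np.zeros(8)
for depth in (0.0, 1.0, 3.0, 6.0):
    E = flat.copy()
    E[0] = -depth
    print(f"basin depth {depth}: entropy = {boltzmann_entropy(E):.2f} nats")
```
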
Two-time thermodynamics clarifies how this trade-off is distributed across fast and slow components of synaptic dynamics. On short timescales, transient forms of plasticity, such as facilitation, depression, and phosphorylation-based changes, can be modulated at relatively low energetic cost, allowing rapid tuning of synaptic gain in response to current context. These fast changes adjust the entropy of activity patterns without committing to long-term structural revisions, effectively implementing a high-temperature, exploratory regime for synaptic states. On longer timescales, mechanisms like spine growth, receptor insertion, and gene-expression-mediated consolidation operate at lower effective temperature but with higher energetic investment. They reduce entropy more durably by committing resources to stable patterns of connectivity that embody robust priors.

From a thermodynamic standpoint, energy invested in long-term synaptic changes is justified only when it leads to a sustained reduction in free energy, averaged over future encounters with similar environmental conditions. Recurrent prediction errors act as a signal that the current synaptic configuration yields excessive entropy relative to available structure in the inputs. When such errors are consistent over long intervals, it becomes cost-effective to pay the metabolic price for structural plasticity that sharpens priors and reduces uncertainty. Conversely, when errors are transient or idiosyncratic, the system prefers to keep changes at the level of fast, reversible plasticity, tolerating higher entropy rather than incurring high energetic costs for modifications that are unlikely to pay off.

The balance between energy and entropy is further mediated by homeostatic plasticity, which counteracts runaway sharpening of priors that would otherwise concentrate activity into a few narrow attractors. If Hebbian mechanisms alone were at work, frequently co-active neurons would continually strengthen their mutual connections, dramatically lowering entropy by funneling diverse inputs into a limited set of rigid patterns. Homeostatic rules, such as synaptic scaling, intrinsic excitability adjustments, and inhibitory plasticity, inject entropy back into the system by normalizing firing rates and preventing any subset of synapses from monopolizing the network's dynamical range. This regulation ensures that the overall free energy landscape remains sufficiently rugged to support flexible prediction rather than collapsing into a handful of energetically over-favored states.

Energetic constraints shape not only the magnitude but also the selectivity of synaptic adaptations. Synapses that carry signals with high predictive value (strong correlations with future inputs, rewards, or key latent variables) justify repeated investment of metabolic energy. These synapses undergo potentiation, increased structural stability, and protection from pruning, progressively lowering entropy along behaviorally relevant dimensions. Synapses conveying largely unstructured or noisy signals are energetically disfavored: maintaining them yields little reduction in expected free energy and thus offers poor return on metabolic expenditure. Over development and learning, such synapses are weakened or eliminated, effectively reallocating energy away from maintaining high-entropy, uninformative connectivity toward low-entropy, informative priors.

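A deliberately crude caricature of this selectivity, with predictive value and maintenance cost drawn from invented distributions and an arbitrary retention threshold:

```python
import numpy as np

rng = np.random.default_rng(7)

n = 200
predictive_value = rng.gamma(1.0, 1.0, n)   # expected free energy reduction
maintenance_cost = rng.gamma(2.0, 0.5, n)   # metabolic price of upkeep

# Retain only synapses whose informational return per unit of energy clears
# the threshold; the rest stay weak or are pruned.
roi = predictive_value / maintenance_cost
keep = roi > 1.0
print(f"kept {keep.sum()} / {n} synapses",
      f"| mean ROI kept: {roi[keep].mean():.2f}",
      f"| mean ROI pruned: {roi[~keep].mean():.2f}")
```
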
At the microscopic level, synaptic biophysics embodies this trade-off through mechanisms that couple activity history to maintenance costs. Long-term potentiation often requires repeated high-frequency activation and engagement of energetically demanding cascades, including calcium influx, kinase activation, and protein synthesis. These processes can be seen as a thermodynamic filter: only those synapses participating in sufficiently consistent co-activation patterns receive enough energetic input to cross the threshold for consolidation. The high cost of these cascades prevents the indiscriminate formation of low-entropy priors, ensuring that only recurrently useful patterns are locked into structure.

The structural complexity of dendritic arbors and spine distributions illustrates another aspect of the energy-entropy balance. Elaborate dendritic trees and dense spine populations greatly expand the repertoire of possible synaptic configurations, increasing potential informational capacity but also raising metabolic demands for maintenance and signaling. Through pruning and experience-dependent refinement, the nervous system reduces this combinatorial entropy, retaining only those branches and synapses that contribute substantially to lowering predictive uncertainty. The outcome is an architecture that remains richly expressive yet sparse: many potential synapses are never stabilized, and many existing ones remain weak or transient, preserving enough entropy for flexible recombination while concentrating energetic investment on a subset that encodes enduring priors.

Energy-entropy trade-offs also govern synaptic noise. At first glance, reducing noise might appear universally beneficial, as it sharpens responses and improves reliability. However, suppressing noise fully is energetically costly and can lead to overconfident, brittle priors. A moderate level of synaptic release variability and channel noise increases entropy in neural responses, supporting stochastic exploration of representational space. Under a two-time interpretation, fast noise-induced fluctuations allow the system to probe alternative synaptic configurations and activity patterns without committing to structural changes. Only when these exploratory excursions repeatedly converge on similar patterns does it become advantageous to invest energy in reducing synaptic noise locally, stabilizing these patterns as low-entropy, high-confidence components of the prior.

Neuromodulatory systems provide a flexible means of adjusting the energy-entropy balance in synaptic adaptation depending on task demands and global brain state. Elevated neuromodulatory tone can temporarily raise the effective temperature of synaptic and circuit dynamics, boosting plasticity rates and tolerance for entropy increases in activity patterns. During phases of intense learning or exploration, such heating supports a broader search over possible synaptic configurations, enabling the discovery of new regularities at the cost of higher energy usage.
In contrast, neuromodulatory regimes associated with exploitation, habit execution, or sleep can cool synaptic dynamics, lowering plasticity and entropic variability while reallocating energy toward consolidating already identified patterns into more rigid priors.

On the network level, the distribution of synaptic strengths across populations reflects a compromise between maximizing representational diversity and minimizing redundant energetic expenses. Sparse coding schemes, in which only a small subset of neurons is active at any time and many synapses remain weak or silent, exemplify this compromise. Such schemes reduce average firing-related energy consumption while still allowing a combinatorially large space of potential patterns, preserving entropy at the level of possible configurations rather than instantaneous activity. Synaptic adaptation in sparse regimes tends to selectively strengthen a small number of connections per neuron that capture essential predictive features, converting part of this potential entropy into structured, low-energy attractors while leaving the rest of the space available for future learning.

These trade-offs extend to how synaptic rules balance locality and global coherence. Strictly local rules, such as classical Hebbian updates driven only by pre- and postsynaptic activity, are energetically cheap and implementable with minimal infrastructure, but they can generate excess entropy by reinforcing idiosyncratic correlations that do not generalize. Global or modulatory signals, such as dopamine-based reward prediction errors, impose additional energetic costs for broadcasting information, yet they reduce entropy by selectively stabilizing only those Hebbian changes that contribute to long-term predictive success. The coexistence of local and global components allows synaptic adaptation to retain the energy efficiency of local processing while using sparse global signals to prune away spurious low-entropy structures that would otherwise misallocate resources.

The energy-entropy balance is dynamically renegotiated across development and learning stages. Early in development, synaptogenesis and exuberant connectivity massively increase structural entropy, supported by relatively generous metabolic allocation. This high-entropy regime enables the system to sample a wide range of possible circuit motifs and receptive field organizations. As experience accumulates, activity-dependent pruning and stabilization progressively convert this initial entropy into structured priors, reducing both synaptic count and configurational uncertainty. In adulthood, when environmental statistics are more familiar and metabolic margins narrower, synaptic adaptation becomes more conservative: energy is preferentially invested in fine-tuning existing structures rather than in expanding the space of possible ones.

Sleep and offline states offer windows in which the energetic accounting of synaptic adaptation can be recalibrated. During wakefulness, metabolic resources are primarily directed toward fast activity and immediate prediction, with synaptic changes occurring under noisy and often nonstationary conditions. Offline replay and synaptic renormalization redistribute energy toward evaluating which of the recent synaptic modifications truly reduce long-term free energy.
Weakly supported changes may be downscaled or erased, injecting entropy back into the synaptic ensemble and reclaiming energy previously allocated to unhelpful structure. Strongly supported changes are further consolidated, deepening their associated attractors and lowering entropy along dimensions that consistently improve prediction.

From a computational perspective, these biological trade-offs suggest that synaptic adaptation algorithms should explicitly couple a measure of model complexity or synaptic entropy to an energy-like regularizer. Bayesian formulations implement this coupling by penalizing overly precise or intricate priors that do not yield commensurate improvements in predictive accuracy. Thermodynamically inspired learning rules carry this further by tying the strength and persistence of synaptic updates to an energetically weighted free energy objective: synapses change most when prediction improvements per unit energetic cost are high, and remain plastic or noisy when the balance is unfavorable. Such rules produce networks whose connectivity patterns reflect not only the informational structure of data but also the thermodynamic realities of maintaining those structures over time.

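One way such a rule might look is a persistence gate that compares the predicted free-energy improvement of a candidate synaptic update against its energetic price; the functional form and threshold below are assumptions, not an established learning rule.

```python
def consolidation_gate(delta_free_energy, energy_cost, threshold=1.0):
    # Gate the persistence of a candidate synaptic update by its predicted
    # free energy reduction per unit of metabolic cost (illustrative rule).
    roi = delta_free_energy / max(energy_cost, 1e-9)
    # Full persistence above threshold; partial, overwritable persistence below.
    return 1.0 if roi >= threshold else roi / threshold

# A large predictive gain cheaply obtained is fully consolidated; a marginal
# gain at high cost stays labile (low persistence, free to be overwritten).
print(consolidation_gate(delta_free_energy=2.0, energy_cost=0.5))  # -> 1.0
print(consolidation_gate(delta_free_energy=0.3, energy_cost=1.5))  # -> 0.2
```
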
### Emergent representations from two-time-scale inference

Representations that form in systems governed by two-time inference are not directly prescribed but emerge from the coupled evolution of fast activity and slow structure under thermodynamic and informational constraints. Fast inference dynamics explore a rich space of possible states and hypotheses about the environment, while slow synaptic and architectural changes gradually reshape the landscape in which these explorations unfold. Representations arise as recurrently visited regions of this joint state-parameter space: sets of activity patterns and synaptic configurations that co-stabilize because they jointly reduce prediction and free energy across many episodes. Rather than being hand-coded symbols or static features, they are metastable motifs in the flow of dynamics that persist on intermediate timescales.

Under the Bayesian brain perspective, these emergent motifs can be interpreted as implicit latent variables, organized by the prior structure encoded in slow parameters and the likelihood constraints imposed by fast sensory streams. Fast variables (spiking patterns, transient assemblies, short-term synaptic modifications) carry moment-to-moment hypotheses about latent causes. Slow variables (long-term weights, connectivity motifs, intrinsic excitability profiles) encode priors over which latent causes are plausible and how they tend to co-occur. Two-time inference couples them: repeated success of certain fast hypotheses in explaining sensory input biases slow plasticity toward making those hypotheses easier to access and more robust. Over time, the network's representational repertoire condenses into structured manifolds corresponding to high-probability latent configurations under the learned priors.

This emergent organization is shaped critically by the separation of effective temperatures across time scales. High effective temperature in fast dynamics enables broad sampling over candidate representations, allowing the system to explore multiple interpretations of ambiguous inputs. Low effective temperature in slow plasticity prevents rapid overcommitment to any single interpretation, ensuring that only patterns that consistently lower free energy across contexts are woven into the structural fabric of the network. The interplay of hot exploration and cool consolidation yields representations that are both diverse and stable, capturing regularities that transcend specific episodes while remaining sensitive to variability in the data.

On the level of neural populations, two-time inference naturally produces distributed, overlapping codes rather than discrete, localized units. Fast activity patterns tend to occupy low-dimensional manifolds carved out by slow connectivity constraints. These manifolds correspond to emergent representational spaces in which similar environmental causes map to nearby trajectories. Because slow learning operates through incremental adjustments guided by aggregated prediction errors, it tends to smooth and compress these manifolds, aligning them with statistically salient directions in sensory and task space. The result is a form of representation learning in which axes of the internal state space approximate principal or independent components of the underlying generative process, without any explicit dimensionality reduction objective.

Temporal structure in the environment further shapes emergent representations, as two-time inference is inherently sensitive to correlations across different timescales. Fast inference tracks immediate transitions in sensory input, while slow updates accumulate evidence about longer-range dependencies and context. Representations that persist or recur across extended intervals are preferentially stabilized, because they enable predictive compression: a compact description of what is likely to follow given the current state. This leads to emergent coding of temporal abstractions such as routines, trajectories, and contextual frames. For example, a sequence of sensory scenes encountered during navigation can give rise to a slowly evolving latent representation of "place" or "route" that remains stable across many rapid fluctuations in local details.

Hierarchy in representations also emerges from the structured interaction between fast and slow processes. When multiple layers or modules share a common pool of fast dynamics but distinct slow plasticity regimes, some layers come to encode highly abstract, slowly changing summary variables, while others encode more volatile, fine-grained details. High-level representations are those whose stability confers a substantial reduction in future prediction and free energy: beliefs about context, goals, or global factors that condition many lower-level inferences. Low-level representations adjust quickly to accommodate immediate sensory variations. Two-time thermodynamics ensures that these layers self-organize such that more abstract representations sit in cooler, deeper basins of the structural landscape, while concrete representations inhabit shallower, warmer basins that can be reshaped with little energetic cost.

Crucially, the representational content is not predefined in semantic terms but is determined by the network's history of interaction with its environment and its resource constraints.
A given pattern of slow connectivity may encode "shape" in a visual system, "phoneme" in an auditory system, or "policy fragment" in a motor system, depending on which structures yield the greatest reduction in long-term uncertainty per unit of metabolic and plasticity cost. Two-time inference provides the mechanism by which these content-specific codes are discovered: fast trial-and-error exploration proposes candidate partitions of input space, and slow consolidation preserves those partitions that support compact, reliable predictions. Over developmental time, this process yields a repertoire of emergent representations specialized to the ecological niches and behavioral demands the system routinely faces.

Noise and stochasticity, often seen as obstacles, play a constructive role in shaping these representations. At fast timescales, stochastic fluctuations allow the system to sample around current attractors, revealing alternate viable representations that might otherwise remain hidden. At intermediate timescales, this sampling exposes symmetries and redundancies in the mapping between inputs and internal states, guiding slow plasticity to break irrelevant symmetries while preserving those that reflect genuine invariances of the environment. As a result, emergent representations tend to be invariant to nuisance transformations, like small shifts, rotations, or timing jitters, because slow learning preferentially stabilizes those aspects of fast dynamics that remain predictive across such variations.

The geometry of emergent representational manifolds is also sculpted by two-time dynamics. Fast activity trajectories that frequently co-occur and yield similar prediction errors are gradually drawn closer together in state space through correlated plasticity, while those that lead to divergent or unreliable predictions are pushed apart. Over time, this produces clusters and continuous curves in the activity space that correspond to categories, prototypes, and smooth feature axes, respectively. Category-like representations arise when distinct regions of input space demand qualitatively different predictive strategies, leading to separated basins in the slow energy landscape. Continuous feature representations emerge when the underlying generative structure varies smoothly, encouraging the network to arrange internal codes along approximate geodesics that minimize distortion of predictive relationships.

Two-time inference naturally supports compositional and factorized representations, in which complex states are encoded as combinations of simpler building blocks. Because fast processes recombine existing patterns more rapidly than slow learning can create entirely new ones, the system is biased toward reusing and composing existing representational fragments. Slow plasticity reinforces those combinations that repeatedly prove useful, gradually shaping synaptic structure so that frequently co-occurring fragments become easier to co-activate. This incremental consolidation of pattern combinations yields factorized latent spaces, where independent or weakly coupled causes of sensory input correspond to approximately separable submanifolds.
Two-time inference naturally supports compositional and factorized representations, in which complex states are encoded as combinations of simpler building blocks. Because fast processes recombine existing patterns more rapidly than slow learning can create entirely new ones, the system is biased toward reusing and composing existing representational fragments. Slow plasticity reinforces those combinations that repeatedly prove useful, gradually shaping synaptic structure so that frequently co-occurring fragments become easier to co-activate. This incremental consolidation of pattern combinations yields factorized latent spaces, where independent or weakly coupled causes of sensory input correspond to approximately separable submanifolds. Such factorization enhances generalization, because novel inputs can be represented as new mixtures of familiar components without requiring wholesale reconfiguration of the network.

In tasks involving action and control, emergent representations reflect not only the statistical regularities of sensory input but also the causal consequences of the system’s own behavior. Two-time inference intertwines perception and action by allowing fast dynamics to encode counterfactual hypotheses about possible actions and their outcomes, while slow learning updates structural priors based on the long-run success of these hypotheses. Representations that support effective control, such as predictive models of body dynamics or environmental affordances, are selectively stabilized because they yield sustained reductions in prediction error under closed-loop interaction. Over time, this process yields sensorimotor contingencies and forward models that are not explicitly programmed but emerge as useful intermediate variables in the service of free energy minimization.

Attentional mechanisms can be interpreted as dynamic reshaping of emergent representational manifolds on fast timescales, conditioned on task demands and internal goals. By modulating gain, effective temperature, or the weighting of specific pathways, attention temporarily distorts the local landscape, effectively reparameterizing which regions of representation space are accessible or salient to fast dynamics. Two-time inference ensures that only sustained attentional habits become structurally embedded: if certain features or tasks repeatedly draw attention, slow plasticity reorganizes representations so that these features become easier to isolate and manipulate, reducing the need for costly top-down modulation in the future. Thus, attentional history imprints itself on the emergent representational geometry, biasing future inference.

Sleep and offline processing provide a complementary channel through which emergent representations are refined in a two-time framework. During offline states, fast dynamics replay sequences of previously experienced activity patterns under altered neuromodulatory conditions, effectively resampling from the empirical distribution of past episodes. Slow plasticity responds to these internally generated patterns as if they were new data, but with an emphasis on consistency and co-occurrence statistics accumulated over many replays. Representations that were only weakly supported during wakefulness may be pruned or merged with others, while those that capture robust, frequently reencountered structure are consolidated. The net effect is a reorganization of representational manifolds that sharpens the alignment between structural priors and the long-run statistics of both external inputs and internal simulations.
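A hedged illustration of replay-driven consolidation: the fragment below stores wake episodes in a buffer, replays them offline under a Hopfield-style Hebbian rule (a familiar stand-in, not the mechanism this article commits to), and shows that the frequently recurring pattern comes to dominate recall while the weakly supported one fades.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 32

robust = rng.choice([-1, 1], size=n)   # structure encountered many times
fragile = rng.choice([-1, 1], size=n)  # weakly supported episode

# Wake buffer: the robust pattern recurs ten times as often
buffer = [robust] * 20 + [fragile] * 2

W = np.zeros((n, n))
eta = 0.01
for _ in range(500):                       # offline replay epochs
    p = buffer[rng.integers(len(buffer))]  # resample a past episode
    W += eta * np.outer(p, p) / n          # Hebbian consolidation
np.fill_diagonal(W, 0.0)

# Recall from a corrupted cue of the robust pattern
cue = robust.copy()
cue[: n // 4] *= -1                        # flip a quarter of the bits
for _ in range(10):
    cue = np.sign(W @ cue)
print("overlap with robust pattern:", int(cue @ robust), "out of", n)
```

Because each replay adds an outer-product increment, patterns that recur often are embedded roughly in proportion to their replay frequency, which is one simple reading of consistency-weighted consolidation.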
From a computational modeling standpoint, emergent representations under two-time inference highlight the power of coupling fast sampling-based inference with slower, structurally constrained learning in artificial systems. Fast dynamics instantiated through recurrent updates, stochastic hidden units, or attention-like gating can explore a space of potential explanatory patterns, while slow parameter updates integrate the statistics of these explorations to sculpt an internal representational basis. When designed to minimize an energy or free energy objective, such systems automatically discover codes that balance expressivity, robustness, and metabolic or computational efficiency, mirroring the emergent, task-adaptive representations seen in biological neural circuits.
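One way to instantiate this coupling, under clearly labeled assumptions (the linear generative model, the sparsity penalty, and every constant below are choices made for this sketch, not the article's method), is to let fast Langevin sampling infer a latent code h for each observation s while slow updates to a dictionary W descend the same energy E = 0.5*||s - W*h||^2 + alpha*||h||_1.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k = 16, 4  # observation and latent dimensions

W_true = rng.normal(scale=0.5, size=(d, k))  # hidden generative structure
W = rng.normal(scale=0.1, size=(d, k))       # slow structural parameters
eta_h, eta_W, T, alpha = 0.05, 0.002, 0.01, 0.1

for step in range(3000):
    # sample an observation from sparse latent causes
    h_true = rng.normal(size=k) * (rng.random(k) < 0.5)
    s = W_true @ h_true + 0.05 * rng.normal(size=d)

    # fast inference: stochastic (Langevin) search over latent codes h
    h = np.zeros(k)
    for _ in range(30):
        grad = -W.T @ (s - W @ h) + alpha * np.sign(h)
        h -= eta_h * grad
        h += np.sqrt(2 * eta_h * T) * rng.normal(size=k)

    # slow consolidation: descend the same energy with respect to W
    W += eta_W * np.outer(s - W @ h, h)

print("final reconstruction error:",
      round(float(np.linalg.norm(s - W @ h)), 3))
```

The fast inner loop plays the role of stochastic hidden units exploring explanatory patterns; the slow outer-product update plays the role of structurally constrained learning; and both are driven by the single shared energy, in the spirit of the free energy objective described above.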