A new study published in Cell Reports shows that midbrain dopamine neurons are sensitive to previously experienced time intervals, and that this sensitivity is likely to be important for reward processing. Midbrain dopamine neurons are frequently discussed in terms of their roles in reward, motivation, and certain forms of learning. Within the time perception literature, however, dopamine is more commonly associated with modulating the rate of the internal pacemaker. Naturally, these functions of dopamine are not mutually exclusive, and this study makes important progress in integrating them.
Dopamine in reinforcement learning
While early research implicated dopamine as the principal neurotransmitter responsible for the hedonic nature of “liking” something, the contemporary view conceptualises dopaminergic activity as a reinforcement signal that facilitates learning, rather than directly causing pleasure. This is in part due to the classic finding that phasic dopamine activity in the mesolimbic pathway constitutes a reward prediction error (the difference between expected and received reward), commensurate with prescriptive models of reinforcement learning.
During learning, dopamine responses gradually transfer to the earliest predictors of a reward, and after this associative pairing is established, response to the reward itself is reduced or absent. Importantly, this means that these response dynamics are fundamentally sensitive to the expected time of reward delivery.
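The transfer described above can be illustrated with a minimal temporal-difference (TD) learning sketch. This is a textbook-style toy model, not the analysis used in any of the studies discussed here: the function name `simulate_td`, the number of time steps, and the learning rate and discount factor are all assumptions chosen for illustration. The model also assumes that cue onset is unpredictable, so the value before the cue is held at zero.

```python
def simulate_td(n_steps=5, n_trials=500, alpha=0.1, gamma=0.95, reward=1.0):
    """Toy TD(0) model of one cue followed n_steps later by a reward.

    Returns a list of (cue_error, reward_error) pairs, one per trial.
    All parameter values are illustrative assumptions.
    """
    # values[k] is the predicted future reward k steps after cue onset
    values = [0.0] * n_steps
    history = []
    for _ in range(n_trials):
        # Cue onset is assumed unpredictable, so the pre-cue value stays 0.
        # The prediction error at cue onset is the jump in predicted value.
        cue_error = gamma * values[0] - 0.0
        reward_error = 0.0
        for k in range(n_steps):
            is_last = k == n_steps - 1
            r = reward if is_last else 0.0           # reward arrives on the last step
            next_value = 0.0 if is_last else values[k + 1]
            delta = r + gamma * next_value - values[k]  # reward prediction error
            values[k] += alpha * delta
            if is_last:
                reward_error = delta
        history.append((cue_error, reward_error))
    return history


if __name__ == "__main__":
    history = simulate_td()
    print("first trial (cue error, reward error):", history[0])
    print("last trial  (cue error, reward error):", history[-1])
```

In this sketch the error at reward time starts at the full reward size and shrinks towards zero, while the error at cue onset grows towards `gamma ** n_steps`. Because the cue error scales with `gamma ** n_steps`, a longer cue-reward interval yields a smaller learned cue response, which is one way the model is sensitive to reward timing.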
Further to this, if rewards are delivered at different delays, the phasic responses of dopamine neurons to cues signalling these rewards depend on the duration of the delay (as well as reward probability, magnitude and type), with longer delays evoking smaller responses. This decreased response to longer reward delays typifies the economic principle of temporal discounting: rewards are devalued as a function of the delay until their receipt. In reflecting the reduced value of delayed rewards, these neural responses demonstrate sensitivity to timing and appear to encode the intervals between cues and prospective (i.e. future) rewards.
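As a rough illustration of temporal discounting, the sketch below computes the subjective value of a delayed reward under two functional forms that are commonly used in the discounting literature, exponential and hyperbolic. The text above does not commit to either form, and the function names and rate parameters here are assumptions chosen purely for illustration.

```python
import math


def exponential_value(amount, delay_s, rate=0.1):
    """Exponentially discounted value: amount * exp(-rate * delay).

    The rate of 0.1 per second is an illustrative assumption.
    """
    return amount * math.exp(-rate * delay_s)


def hyperbolic_value(amount, delay_s, k=0.1):
    """Hyperbolically discounted value: amount / (1 + k * delay).

    The constant k = 0.1 per second is an illustrative assumption.
    """
    return amount / (1.0 + k * delay_s)


if __name__ == "__main__":
    for delay in (0, 2, 4, 8, 16):
        print(delay, exponential_value(1.0, delay), hyperbolic_value(1.0, delay))
```

Under both forms the value equals the full amount at zero delay and falls monotonically as the delay grows, which is the qualitative pattern the cue-evoked dopamine responses are described as following.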
Dopamine and time perception
In addition to its associations with motivation and reward, dopamine, as a pharmacological target, has been routinely acknowledged to play a significant role in time perception, an idea that some refer to as the ‘dopamine clock hypothesis’. Two sets of evidence in particular highlight this.
Firstly, non-human animal studies have pharmacologically manipulated dopamine during time perception tasks. When given dopamine agonists (e.g. methamphetamine) during a peak-interval procedure, rats’ response rates peak earlier, as if their internal pacemaker had been accelerated. When given dopamine antagonists (e.g. haloperidol), peak responses are later, commensurate with a slowing of the pacemaker.
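The clock-speed interpretation of these shifts can be made concrete with a simple pacemaker-accumulator sketch. It assumes that the animal has learnt a criterion number of pacemaker pulses at the baseline rate and then responds when that count is reached, so a faster pacemaker reaches the criterion sooner. The function name and the example speed ratios are hypothetical and are not taken from any of the studies mentioned.

```python
def expected_peak_time(trained_interval_s, clock_speed_ratio):
    """Predicted time of peak responding in a simple pacemaker-accumulator model.

    trained_interval_s: interval learnt at the baseline pacemaker rate.
    clock_speed_ratio: current pacemaker rate divided by the baseline rate
        (values above 1 mean a faster clock; the values used below are
        illustrative assumptions, not measured drug effects).
    """
    if clock_speed_ratio <= 0:
        raise ValueError("clock_speed_ratio must be positive")
    # The learnt pulse count is reached after trained_interval / ratio seconds.
    return trained_interval_s / clock_speed_ratio


if __name__ == "__main__":
    for label, ratio in (("faster clock", 1.25), ("baseline", 1.0), ("slower clock", 0.8)):
        print(label, expected_peak_time(40.0, ratio))
```

With a 40 s trained interval, a ratio above 1 gives a peak before 40 s and a ratio below 1 gives a peak after 40 s, matching the direction of the agonist and antagonist effects described above.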
Secondly, electrophysiological and optogenetic studies of neurons in the substantia nigra (which produces dopamine and projects to the striatum) have shown that optogenetic activation or suppression of these neurons results in later and earlier timed responses, respectively. These results reflect a slower or faster internal pacemaker, respectively, which is the opposite of the pattern seen in the pharmacological studies.
The present study
From the background above, we can see that dopamine appears to be involved in both time perception and reward processing. However, dopamine neurons have previously only been shown to encode elapsing and future delays. The study by Fonzi et al. asked whether dopamine signals could also convey information about retrospective, past delays. For example, do dopamine responses to a reward cue encode how much time has already been invested in the pursuit of the reward?
The researchers developed a Pavlovian conditioning paradigm with two reward cues that provided identical information about an upcoming reward, but differed in terms of how much time had elapsed since the previous reward. One cue was only presented after a 15–25 s wait time (“short cue”), while the other was only presented after a 65–75 s wait time (“long cue”). The researchers trained rats with this design while simultaneously using fast-scan cyclic voltammetry to record dopamine concentration in the nucleus accumbens core. If the dopamine responses to the short and long cues did not differ, then it would seem that dopamine activity only encodes prospective information. On the other hand, if the dopamine response to the long cue was larger than the response to the short cue, this could be said to reflect the sunk cost of time. Conversely, if the response to the long cue was smaller than the response to the short cue, this could be said to reflect the rate of reward.
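To make the reward-rate prediction concrete, the short calculation below converts the two wait ranges into approximate reward rates. It is back-of-the-envelope arithmetic under a stated simplification: one reward per wait period, ignoring the time taken by the cue and reward delivery. The function name is hypothetical.

```python
def rewards_per_minute(wait_s):
    """Approximate reward rate if one reward follows each wait of wait_s seconds.

    Simplifying assumption: cue and reward delivery time are ignored.
    """
    if wait_s <= 0:
        raise ValueError("wait_s must be positive")
    return 60.0 / wait_s


if __name__ == "__main__":
    for label, (low, high) in (("short cue", (15, 25)), ("long cue", (65, 75))):
        # A longer wait gives a lower rate, so the range endpoints swap order.
        print(label, rewards_per_minute(high), "to", rewards_per_minute(low), "per minute")
```

Under this simplification the short-cue waits correspond to a several-fold higher recent reward rate than the long-cue waits, which is why a smaller response to the long cue is read as tracking reward rate.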
The results showed that within this simple experimental design, dopamine responses to the long cue were decreased relative to the short cue, suggesting that dopamine in the nucleus accumbens encodes reward rate. An alternative possibility was that this differing dopamine response reflected differing expectations about the time of cue delivery: the response to the long cue could be decreased because, as time elapses, it is increasingly likely that the cue will be shown (i.e. a change in hazard rate). However, there was no relationship between the dopamine response and the time elapsed within each cue type. Furthermore, when another cohort of rats was trained with only a single cue for both short and long wait conditions, no differences were seen in the cue-evoked dopamine response for different wait times. Both of these results speak against the possibility that the dopamine response reflected the changing likelihood of cue delivery over time.
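The hazard-rate idea can be sketched as follows. The hazard at time t is the probability density of the cue appearing at t, given that it has not appeared yet. The sketch assumes that wait times are drawn uniformly within a range, which the description above does not state, and the function name is hypothetical.

```python
def uniform_hazard(t, start, end):
    """Hazard of cue onset at time t for a wait drawn uniformly from [start, end).

    The uniform distribution is an illustrative assumption. For a uniform
    wait, density = 1 / (end - start) and survival = (end - t) / (end - start),
    so hazard = density / survival = 1 / (end - t).
    """
    if not start < end:
        raise ValueError("start must be less than end")
    if t < start:
        return 0.0  # the cue cannot appear before the range begins
    if t >= end:
        raise ValueError("the cue must already have appeared by `end`")
    return 1.0 / (end - t)


if __name__ == "__main__":
    for t in (15, 18, 21, 24):
        print(t, uniform_hazard(t, 15, 25))
```

Within a single range the hazard rises steadily as the wait goes on, so a response driven by hazard should vary with elapsed time within each cue type. That is the relationship the analysis above looked for and did not find.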
Notably, the principal finding above relied on a single analysis, and on the relative difference between the short and long cues. The authors of the study thus performed a follow-up analysis to determine whether this retrospective temporal information could be encoded when the animals were not able to directly compare cues. To do this, they trained an independent cohort of rats with short trials and long trials in separate sessions. Even in these scenarios, the short cue evoked a larger dopamine response than the long cue, which suggested that the encoding of retrospective delays was context-independent.
However, once these rats were exposed to both cues in a mixed session, the response to the short cue was increased. While for most of the above experiments there were no differences in behaviour between the two conditions, this increase in dopamine response to the short cue in this intermixed session was also accompanied by an increase in behavioural responding. This implies that (while elapsed wait times can be learnt independently) the dopaminergic encoding of retrospective delays is not entirely context-independent. It also shows that while there are not generally behavioural differences between the short and long cues, there appear to be changes in behaviour when there are also changes in dopamine response.
In a final analysis, the researchers also investigated the effect of the previous trial type, and the tonic dopamine signals over the waiting time. Firstly, for rats recently switched from the separate sessions to an intermixed session, they found that dopamine responses to short cues were significantly increased when the preceding trial was a long cue trial, compared with when the preceding trial was a short cue trial. Similarly, dopamine levels were increased during the waiting period after long cue trials, relative to short cue trials (but only up to 25 s, before the identity of the current trial was known). From around the point that the identity was known (25 s), conditioned responding decreased when the preceding trial was a long cue trial, relative to when it was a short cue trial. One possible implication here is that a decrease in wait-time dopamine could promote increased anticipatory responding. This would be consistent with the electrophysiological and optogenetic evidence that reducing dopamine increases pacemaker rate (see above).
It is important to reiterate that the results in the previous two paragraphs only applied to the experiments where rats were moved from separate training on the short and long cues to an intermixed schedule. These results therefore represent peculiarities in how these animals learnt and adapted to their new context. Overall, the results of the first experiment are the most important here: phasic dopamine responses encode previous durations and appear to constitute a signal of previous reward rate.
This study compellingly demonstrates how even simple experimental designs can lead to novel and valuable findings. The fact that nucleus accumbens dopamine responses encode reward rate suggests a potential mechanism that could normalise value signals for future rewards, and provide contextual information such as the sunk cost of time.
If cue-evoked dopamine responses have to encode durations over a large range of timescales (potentially over 15 orders of magnitude), one interesting avenue for future research would be to describe the mapping between these dopamine responses and the duration of the delays preceding them, in order to understand precisely how durations are represented. More work needs to be done to comprehensively understand the functions of tonic and phasic dopamine and how they relate to perceived and experienced durations, but this study makes substantial progress toward this goal.
Fonzi, K. M., Lefner, M. J., Phillips, P. E. M., & Wanat, M. J. (2017). Dopamine Encodes Retrospective Temporal Information in a Context. Cell Reports 20(8), p. 1774. doi: 10.1016/j.celrep.2017.07.076