Basal Nuclei - Context Processor
Striking points:
(1) The basal nuclei is grouped into an input structure (striatum), intermediary structures (external pallidum, pars compacta, and subthalamic nucleus), and output structures (internal pallidum and pars reticulata).
(2) The basal nuclei exists in basal nuclei (BN) circuits with the neocortex and thalamus; BN circuits display a convergent structure and are arranged in parallel with respect to each other.
(3) BN circuits possess two pathways, the striatonigral (STN) and striatopallidal (STP) pathways, and the relative activities of these two pathways determine the speed and accuracy of neocortical processing by establishing a universal, dynamic matching threshold that is calibrated during rapid eye movement (REM) sleep; increasing the STN pathway lowers the matching threshold whereas increasing the STP pathway raises the matching threshold.
(4) When it is not clear which one out of several competing memory networks should be converted into a movement or cognition, BN circuits convergently parallel process context to modify the matching thresholds for each of the competing memory networks, allowing the most statistically suitable memory networks for a particular situation to be autoassociated and converted into movements or cognitions.
(5) Over time, BN circuits employ reinforcement learning to adjust the relative strengths of the STN and STP pathways, maintaining the matching threshold at an ideal set point.
Opening
The basal nuclei is often written off as an "ancient motor system" that has been replaced by the neocortex. While this may be partially true, it understates the primary and essential role that the basal nuclei currently plays in neocortical data processing.
Basal Nuclei And Projections
The basal nuclei is a collection of subcortical gray matter surrounding the thalamus deep within the brain. Based on their macroscopic appearance (Groenewegen, 2003), the four distinct basal nuclei structures are the striatum, pallidum, substantia nigra, and subthalamic nucleus.
(1) Striatum. The striatum consists of the caudate nucleus, putamen, and nucleus accumbens. The medium spiny neuron represents 90-95% of all striatal neurons (Kemp and Powell, 1971); the rest are interneurons. The dendritic branches of medium spiny neurons are packed with small spines receiving thousands of synaptic inputs from the neocortex, thalamic intralaminar nuclei, amygdala, and hippocampus, with up to 20,000 inputs per medium spiny neuron (Houk, 2007). Based on axonal projections, medium spiny neurons are of two types - striatonigral neurons, projecting directly to the basal nuclei output structures (internal pallidum and pars reticulata) and striatopallidal neurons, projecting to the external pallidum (Kreitzer and Malenka, 2008). The striatum mainly contains the inhibitory neurotransmitter gamma-amino butyric acid (GABA) and it lacks glutamatergic neurons (Kreitzer and Malenka, 2008). Striatonigral neurons express dopamine D1 receptors and striatopallidal neurons express dopamine D2 receptors; both express serotonin (mainly 5HT-2) receptors (Groenewegen, 2003). (2) Pallidum. The pallidum has internal and external sections. Most of the pallidum consists of large aspiny neurons (Groenewegen, 2003). The internal pallidum forms one of two major basal nuclei output structures and projects mainly to the thalamus (intralaminar, ventral anterior, and medial dorsal relay nuclei). It also projects to the brainstem (reticular formation and superior colliculus). Pallidal neurons contain the inhibitory neurotransmitter GABA. (3) Substantia nigra. The substantia nigra (SN) consists of pars reticulata and pars compacta; the pars reticulata contains neurons similar to those of the pallidum whereas the pars compacta consists of neurons blackened by melanin (Francois et al, 1999). The pars reticulata, along with the internal pallidum, is a major basal nuclei output structure and contains GABA. The pars compacta neurons contain the pleasure neurotransmitter dopamine. 
(4) Subthalamic nucleus. The subthalamic nucleus is a compact homogeneous group of neurons with sparsely spiny dendrites (Rafols and Fox, 1976). These neurons contain the excitatory neurotransmitter glutamate.
There are four basal nuclei structures - the striatum (caudate nucleus, putamen, and nucleus accumbens), pallidum (internal, external, and ventral sections), substantia nigra, and subthalamic nucleus.
Coronal basal nuclei view (closer to the face) showing the striatum and pallidum. The striatum (purple) consists of the caudate nucleus (smaller top structures), putamen (larger lateral structures), and nucleus accumbens (not shown). The pallidum (green) consists of internal (GPi) and external (GPe) sections.
Coronal basal nuclei view (closer to the back of the head) showing the SN and subthalamic nucleus. The SN (orange) consists of the pars reticulata and pars compacta which are located close together. The subthalamic nucleus (yellow) lies just above.
Basal Nuclei Circuits
We saw in chapter three that the thalamus and neocortex are intimately bound together through corticothalamocortical circuits. The basal nuclei is also intimately linked to the thalamus and neocortex (Saint-Cyr, 2003) through distinct basal nuclei (BN) circuits (Alexander, 1986). The generic BN circuit consists of an information loop from the neocortex through the basal nuclei to the thalamus and back. BN circuits have two striking structural features - their convergent structure and parallel arrangement.
(1) Convergent structure. Much of the neocortex - as well as the intralaminar nuclei, hippocampus, and amygdala - projects onto the striatal medium spiny neurons, up to 20,000 inputs per neuron (Houk, 2007). The striatum is therefore not only a major input structure, but also a major information integrating structure. After leaving the striatum, information flows through the remaining basal nuclei to the main basal nuclei output structures (internal pallidum and pars reticulata), and then on to the thalamus (intralaminar nuclei, ventral anterior, and medial dorsal relay nuclei). It is then returned via the radiations to a specific area of neocortex. Some of the returning information formed part of the initial BN circuit input, representing a closed portion of the circuit, but some of it originated in other neocortical areas, representing an open portion of the circuit; for this reason, BN circuits are partially closed circuits (Alexander, 1986). A skeleton diagram is useful in showing the generic BN circuit (Alexander, 1986); even with a quick glance, a funnelling effect is clearly evident - information is integrated more and more as it moves from neocortex to striatum to basal nuclei output structures to thalamus. This funnelling effect, or convergence, occurs at every stage. (2) Parallel arrangement. BN circuits can be functionally segregated into discrete loops (Alexander, 1986); the most well-known of these loops is the motor circuit which is, unsurprisingly, thought to be important in controlling movement. The closed portion of this motor circuit loops from and returns to the supplementary motor area (Alexander, 1986). Several other functionally segregated circuits have been described, including the oculomotor circuit, the dorsolateral prefrontal circuit, the lateral orbitofrontal circuit, and the anterior cingulate circuit (Alexander, 1986).
In turn, each of these circuits consists of thousands of smaller, microscopic circuits (Houk, 2007), and each is topographically arranged such that it runs through slightly different areas of thalamus, basal nuclei, and neocortex in accordance with its particular function. BN circuit functional segregation is interesting, but what is far more interesting is that the overall structures of these functionally segregated circuits are all the same - they are all massively repeating preserved units (Leyden and Kleinig, 2008), much like the minicolumns of the neocortex (Mountcastle, 1997). Thus, it is probable that a common BN circuit processing algorithm exists, just as there appears to be a common neocortical processing algorithm. Finally, these structurally identical circuits are all arranged side-by-side with respect to each other, in parallel, a fact that is often overlooked but, as we shall see, is critical to the function of the BN circuits.
Simple diagram showing WCTC, SCTC, and BN circuits. Note that the two corticothalamocortical circuits run directly between the thalamus and neocortex whereas BN circuits involve the basal nuclei.
Skeleton diagram (Alexander, 1986) showing three converging BN circuits (A, B, and C) arranged in parallel. Information runs from multiple areas of neocortex to the striatum to be partially integrated, then to the basal nuclei output structures (internal and ventral pallidum, and pars reticulata) where it is integrated more, and then on to the thalamus by which time information from the different neocortical areas is fully integrated. It is then sent back to the neocortex. Note that some of the information returning to neocortical area A originated there (closed portion of the circuit) whereas some of it did not (open portion); thus, BN circuits are partially closed circuits.
This diagram portrays BN circuits as functionally segregated parallel loops, by showing the motor, oculomotor, and dorsolateral prefrontal circuits (Alexander, 1986). Each circuit runs through a different region of the neocortex, striatum, pallidum, pars reticulata, and thalamus. While the different connections lead to functional differences between each circuit, they are structurally the same and therefore actually process information the same way.
The Matching Threshold
Although there have been attempts to replace it (Gurney et al, 2001), the prevailing model of basal nuclei function for over two decades (Albin et al, 1989) consists of two pathways which have a significant impact on the control of movements. They do this by establishing a universal, dynamic memory network matching threshold that applies not only to movements, but cognitions as well.
(1) Basal nuclei pathways. These two pathways diverge at the striatum and affect movement in opposite ways; their relative tonic activities determine the body's overall state of movement. Dopamine has opposing actions in each pathway as a result of the differing striatal dopamine receptors in each (Groenewegen, 2003). (a) Striatonigral pathway. The striatonigral (STN) pathway arises from striatonigral medium spiny neurons containing GABA, substance P, and dynorphin (Groenewegen, 2003); these neurons express dopamine D1 receptors. Neocortical stimulation of striatonigral neurons results in striatal inhibition of the basal nuclei output structures, in turn resulting in less thalamic inhibition. The disinhibited thalamus stimulates the neocortex, and hence the body, into a state of "more" movement. Relative overactivity of the STN pathway results in a hyperkinetic or dyskinetic state as seen in conditions such as Huntington's disease, which involves degeneration of the striatopallidal medium spiny neurons (Abdo et al, 2010). (b) Striatopallidal pathway. The striatopallidal (STP) pathway arises from striatopallidal medium spiny neurons containing GABA and enkephalin (Gerfen and Wilson, 1996); these neurons express dopamine D2 receptors. This pathway is more complicated with several more basal nuclei structures involved. Neocortical stimulation of striatopallidal neurons results in striatal inhibition of the external pallidum such that the subthalamic nucleus is in turn disinhibited, resulting in stimulation of the basal nuclei output structures and thalamic inhibition. The inhibited thalamus inhibits the neocortex, and hence the body, into a state of "less" movement. Relative overactivity of the STP pathway results in a hypokinetic or bradykinetic state as seen in Parkinson's disease, which involves degeneration of the SN (Abdo et al, 2010). (2) A universal, dynamic matching threshold. 
Since the basal nuclei exist within greater BN circuits, the relative activity of the STN and STP pathways must influence what happens not only in the basal nuclei, but in the thalamus and neocortex as well. Recall from chapter three the concept of matching, which reflects the degree to which neocortical memory networks correctly predict current information about the world carried within the sensory stream; matching can be good, partial, or poor. It has been speculated that a memory network will only produce a movement or cognition if it meets a matching threshold (the percentage of minicolumns in a memory network that must be individually matched before the entire memory network is converted into a movement or cognition) which is determined by the relative tonic activities of the STN and STP pathways (Leyden and Kleinig, 2008). The matching threshold is universal in that it exists across the entire neocortex and applies to both motor and cognitive memory networks. It is also dynamic in that it may be adjusted by modifying the relative tonic activities of the STN and STP pathways. Support for a universal, dynamic matching threshold has been provided in a study that examined the interpretation of sensory data by patients with Parkinson's disease (Leyden and Kleinig, 2008). When interpreting faces in a crowd, patients in the "on" state (STN pathway relatively overactive) recognize faces when they are not there; their matching threshold is set too low so that in addition to inappropriate motor memory networks being matched resulting in hyperkinesia or dyskinesia, inappropriate "face" visual memory networks are matched despite inadequate visual sensory data resulting in false positive recognition. 
In contrast, when the same patients are in the "off" state (STP pathway relatively overactive), they fail to recognize faces when they are there; their matching threshold is set too high so that in addition to the appropriate motor memory networks not being matched resulting in hypokinesia or bradykinesia, the appropriate "face" visual memory networks are not matched despite adequate visual sensory data resulting in false negative recognition. (3) Matching threshold calibration. We have just stated that Parkinson's disease is a degenerative condition involving the basal nuclei characterized by rigidity, bradykinesia, tremor, and a festinating gait during wakefulness; however, we have not mentioned what happens to these patients during rapid eye movement (REM) sleep. Normal subjects do not move during REM sleep as a result of the near-total body paralysis produced by the medullary reticular formation (Kohyama et al, 1998), but patients with Parkinson's disease almost universally experience a REM sleep disorder (Poryazova and Zachariev, 2005), with up to half of them generating vigorous complex movements during REM sleep corresponding to enacted dreams, such as kicking, laughing, punching, or fighting invisible enemies (Comella et al, 1998; De Cock et al, 2007; Gagnon et al, 2002; Schenck et al, 1986). Interestingly, during this REM sleep disorder, motor control is restored in patients with Parkinson's disease; even in the most severely disabled patients and even without levodopa for 12 hours or more, the bradykinesia and tremors vanish, and limb movements, speech, and facial expression improve (De Cock et al, 2007). These observations are best explained by the idea that the basal nuclei are circumvented (taken "off-line") during REM sleep (De Cock et al, 2007). If so, this explains the presence of REM sleep disorders in patients with Parkinson's disease. 
In the normal situation, the basal nuclei use REM sleep as an opportunity to disassociate from neocortical activity so as to calibrate (optimize and reset) the matching threshold, allowing for appropriate movements and cognitions to be produced during wakefulness; since REM atonia is intact, spontaneously matched neocortical memory networks do not get converted into movements during REM sleep. However, in patients with Parkinson's disease, REM atonia is disrupted (Gagnon et al, 2002) such that spontaneously matched neocortical memory networks are converted into movements during REM sleep, and furthermore, with the faulty basal nuclei circumvented, those movements appear normal.
The state of body movements depends on the relative tonic activities of the STN and STP pathways; excitatory glutamate-based pathways are shown by blue arrows and inhibitory GABA-based pathways are shown by brown arrows. The STN pathway arises from striatonigral medium spiny neurons that contain the neurotransmitters GABA, substance P, and dynorphin, and express dopamine D1 receptors. The STP pathway arises from striatopallidal medium spiny neurons that contain the neurotransmitters GABA and enkephalin, and express dopamine D2 receptors, and continues on through several more basal nuclei structures.
To illustrate the matching threshold, consider again the "Clint Eastwood's face" visual memory network as defined by the group of matched minicolumns in IT; a sufficient percentage of minicolumns in the lower levels of the neocortical hierarchy had to be matched before the entire visual memory network could be converted into recognition.
Let's say you now see John Wayne's face for the first time, so there is no neocortical memory network for him; some of the minicolumns in the lower levels of the "Clint Eastwood's face" memory network will be matched as there are a few nonspecific similarities such as the hat, but there won't be enough to reach the matching threshold and convert the memory network into recognition, which is a good thing since this is not Clint Eastwood's face.
During wakefulness, BN circuits establish an optimal matching threshold so that the most statistically suitable memory networks for the sensory information at hand are converted into movements and cognitions appropriate to the situation. In Parkinson's disease these thresholds are set too high such that excessive amounts of sensory information are required to make a match and convert the appropriate memory networks into movements and cognitions.
During REM sleep, BN circuits are circumvented as there is virtually no sensory information to make a match with; the matching thresholds become irrelevant as neocortical memory networks fire away spontaneously. It may be that the basal nuclei take the opportunity to go "off-line" so as to calibrate the wakefulness matching thresholds.
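The matching threshold described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model taken from the cited studies: the `MemoryNetwork` class, the minicolumn counts, and in particular the formula in `matching_threshold` (threshold equals the STP share of total pathway activity) are hypothetical choices made so that more STN activity lowers the threshold and more STP activity raises it.

```python
from dataclasses import dataclass


@dataclass
class MemoryNetwork:
    """A hypothetical memory network: a named group of minicolumns."""
    name: str
    matched: int  # minicolumns currently matched by sensory input
    total: int    # minicolumns in the whole network

    def match_fraction(self) -> float:
        return self.matched / self.total


def matching_threshold(stn: float, stp: float) -> float:
    """Illustrative mapping from relative pathway activity to a threshold.

    Assumption: the threshold is the STP share of total pathway activity,
    so more STN activity lowers it and more STP activity raises it.
    """
    return stp / (stn + stp)


def is_converted(network: MemoryNetwork, threshold: float) -> bool:
    """A network is converted into a movement or cognition once its
    matched fraction reaches the threshold."""
    return network.match_fraction() >= threshold


# Seeing John Wayne only partially matches the "Clint Eastwood" network.
eastwood = MemoryNetwork("Clint Eastwood's face", matched=30, total=100)

balanced = matching_threshold(stn=1.0, stp=1.0)
stn_overactive = matching_threshold(stn=4.0, stp=1.0)
stp_overactive = matching_threshold(stn=1.0, stp=4.0)

for label, threshold in [("balanced", balanced),
                         ("STN overactive", stn_overactive),
                         ("STP overactive", stp_overactive)]:
    print(label, round(threshold, 2), is_converted(eastwood, threshold))
```

With these invented numbers, the partially matched network is rejected at the balanced threshold, is wrongly accepted when the STN pathway dominates (a false positive), and stays rejected when the STP pathway dominates.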
Processing Context
We now speculate on the mechanism by which BN circuits contribute to information processing. With their structure in mind, it is reasonable to conclude that BN circuits convergently parallel process context to alter the matching thresholds for partially matched memory networks, thus allowing the most statistically suitable memory networks for a particular situation to be converted into movements or cognitions.
(1) Neocortical autoassociation. Until now, we have confined much of our discussion about neocortical memory networks to matching, which is the degree to which a memory network correctly predicts current sensory information from the world. When the matching threshold for a memory network is met, all the information contained by that memory network is retrieved in a process called autoassociation (the retrieval of all information contained within a memory network given only partially matched sensory information). Autoassociation is a core feature of neocortical processing. As an example, consider that while on safari you see an elephant head behind a large bush - right away, your neocortex predicts that there will be an entire elephant, even though you do not actually see the entire elephant. From experience, your neocortex knows that elephant heads are almost always accompanied by the rest of the elephant, so despite being given partial sensory information your neocortex uses autoassociation to make a statistically likely prediction that there will be an entire elephant. The world is full of experiences containing limited information; while we mentioned in chapter three that a match can be good, partial, or poor, in reality the majority of matches are partial in that the neocortex usually receives only part of the sensory information needed to produce a movement or cognition; autoassociation allows the neocortex to do this given that limited information. (2) Context. Sometimes it is not clear what the best movement or cognition is for a particular situation. In the case of recognition, if uncertainty about a thing exists it can often be recognized by its situational context (the particular circumstances of a situation). We often define things by their context - using vision again as an example, consider a small round white object in isolation, without context. 
It may be impossible to specifically identify what that thing is - it could be a golf ball, a ping-pong ball, an egg, a billiard ball, a piece of candy, or any other number of things. Even though your neocortex contains the necessary visual memory networks, the matching threshold cannot be reached for any of these things and they all remain equally viable options. However, things are rarely observed in isolation; if the situational context is provided which includes a nest surrounded by leaves, the probability that the white thing is an egg becomes more likely and the other options become less likely. Even though the sensory information about the egg remains the same, the contextual sensory information allows the egg visual memory network to be matched and converted into recognition. The analysis of context is the primary function of BN circuits. (3) The convergent, parallel processing of context. As noted earlier the BN circuits have a convergent structure and are arranged in parallel with respect to each other. Regarding their convergent structure, the greatest degree of convergence occurs as the neocortex and other structures project onto the striatum; consequently, the striatal medium spiny neurons are able to incorporate many different pieces of information from widespread areas of neocortex allowing them to act as "coincident detectors" (Groenewegen, 2003) or more specifically, context detectors. Regarding their parallel arrangement, processing context can be complicated, requiring the simultaneous processing of many different pieces of information at the same time. In the case of uncertainty regarding the recognition of a thing or what the most appropriate movement or cognition is, each individual contextual piece of information affects the probability as to which memory network is the best statistical option for a particular situation and as such, all of these contextual pieces of information must be processed in parallel. 
Thus, BN circuits are convergent, parallel context processors. To illustrate this with an example using motor memory networks, pretend that you are walking along a path and encounter an obstacle, such as a rock, that you need to get around. Given enough sensory data to surpass the matching threshold, your neocortex easily matches and converts the "rock" memory network into recognition. However, in isolation and without context, the recognition of the rock does not provide enough information to match and convert a particular motor memory network into movement - the "kick", "crawl", "step" and "jump" motor memory networks all remain equally viable options. Using saccadic eye movements, your neocortex sequentially analyzes the rest of the scene, adding contextual information over time. You analyze the size of the rock; if it is half a meter high, the "kick" and "crawl" options become less statistically suitable whereas the "step" and "jump" options become more statistically suitable. You analyze the texture of the rock; if it is wet from a recent rainfall, the "crawl" and "jump" options become less suitable whereas the "kick" and "step" options become more suitable. You analyze the morphology of the rock; if it has sharp edges, the "kick", "crawl", and "jump" options become less suitable whereas the "step" option becomes more suitable. As each additional piece of context is matched and converted into recognition, the information converges at the striatum and passes down BN circuits in parallel; in doing so, the statistical impact of each piece of context is weighted and used to modify the matching thresholds for each of the competing motor memory networks. 
Pieces of context that make a particular motor memory network more statistically suitable lower the matching threshold for that memory network (making it easier to match and convert into movement), whereas pieces of context that make a particular motor memory network less statistically suitable raise the matching threshold for that memory network (making it harder to match and convert into movement). At some point, the matching threshold is lowered enough for one of them - in this case, the "step" motor memory network - such that the matching threshold is surpassed and autoassociation occurs, followed by a stepping movement over the rock. By incorporating context into memory network matching, BN circuits allow the statistically best memory network for the job to be converted into a movement or cognition. It is important to remember that it is the neocortex, not the basal nuclei, that contains the memory networks; the basal nuclei helps choose the best memory network for the job but it does not produce movements or cognitions, as evidenced by studies showing that the basal nuclei process information late in the initiation phase of a movement (Minck, 1996).
If you see an elephant head, your neocortex predicts that an entire elephant will be present, even though you did not actually see an entire elephant.
Based on previous experiences, your neocortex knows that elephant heads are almost always attached to the rest of the elephant; given partial sensory information, your neocortex used autoassociation to retrieve the entire "elephant" memory network.
If you see a round white object in isolation, lacking context, numerous visual memory networks remain equally viable options. Is this a golf ball, ping-pong ball, egg, billiard ball, a piece of white candy, or something else?
However, as the saccadic movements of your eyes look over the scene, various pieces of context are processed. Since there is a nest surrounded by leaves, the most statistically suitable option becomes the egg, and so the "egg" visual memory network is matched and converted into recognition. Note that this was done even though the sensory information about the egg itself did not change; it was the context that made it the best option. That round white object still could be a golf ball or something else, but it's far less likely given the context; the neocortex made a match based on probability.
A large rock on a path represents an obstacle. To surmount this obstacle, there are several movements available to you - kicking it out of the way, crawling over it, stepping over it, or jumping over it - and you need to decide which is most appropriate for the situation at hand. The best option depends upon various contextual factors such as the size, texture, and morphology of the rock, amongst others.
Once the "rock" visual memory network has been converted into recognition, the appropriate motor memory network (kicking it, crawling over it, stepping over it, or jumping over it) needs to be matched and converted into movement to get past it. Let us say that the matching threshold is set at 70%, meaning that 70% of the individual minicolumns in any memory network must be matched before it is converted into movement. As various pieces of context are convergently parallel processed by BN circuits, the probabilities favour the "step" motor memory network and so it is matching and racing towards the matching threshold the fastest.
As BN circuits process context, the statistical impact of each piece of context is weighted and used to modify the matching thresholds for each of the competing motor memory networks. Since the rock is half a meter high, wet from a recent rainfall, and has sharp edges, the "step" motor memory network is the most statistically suitable option and so its matching threshold is lowered. To ensure that the incorrect movements are not produced, the matching thresholds for the other motor memory networks are raised, making it even harder for them to be converted into movement. In this way, there is no confusion as a single stepping movement over the rock is made; it would be bad if two movements tried to occur at the same time.
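The rock example can be written out as a small Python sketch. It is a hedged illustration only: the 70% base threshold comes from the example above, but the weights in `CONTEXT_WEIGHTS`, the shared 0.6 match fraction, and the function names are invented here to show the mechanism, and real context weighting would not be a simple sum.

```python
BASE_THRESHOLD = 0.70  # the 70% threshold used in the example above

# Hypothetical context weights: positive values make an option more
# suitable, negative values less suitable. The numbers are invented.
CONTEXT_WEIGHTS = {
    "half a meter high": {"kick": -0.10, "crawl": -0.10, "step": 0.10, "jump": 0.10},
    "wet": {"kick": 0.05, "crawl": -0.05, "step": 0.05, "jump": -0.05},
    "sharp edges": {"kick": -0.10, "crawl": -0.10, "step": 0.10, "jump": -0.10},
}


def adjusted_thresholds(context, base=BASE_THRESHOLD):
    """Lower the threshold for options that context favours and raise it
    for options that context disfavours, clamped to the range [0, 1]."""
    thresholds = {option: base for option in ("kick", "crawl", "step", "jump")}
    for piece in context:
        for option, weight in CONTEXT_WEIGHTS[piece].items():
            thresholds[option] -= weight
    return {k: min(1.0, max(0.0, v)) for k, v in thresholds.items()}


def select(match_fractions, thresholds):
    """Return the options whose matched fraction reaches their threshold."""
    return [o for o, m in match_fractions.items() if m >= thresholds[o]]


# All four motor networks are equally (partially) matched by the rock alone.
match_fractions = {"kick": 0.6, "crawl": 0.6, "step": 0.6, "jump": 0.6}

no_context = adjusted_thresholds([])
with_context = adjusted_thresholds(["half a meter high", "wet", "sharp edges"])

print(select(match_fractions, no_context))    # no option reaches 0.70
print(select(match_fractions, with_context))  # only "step" is converted
```

Without context, no motor memory network reaches the threshold; once the three pieces of context are applied, the threshold for "step" drops below its matched fraction while the thresholds for the other three rise, so a single movement is selected.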
If the STN pathway is overactive as in Huntington's disease, the memory network matching threshold is set too low, resulting in many incorrect motor memory networks being matched and converted into movement.
If the STP pathway is overactive as in Parkinson's disease, the memory network matching threshold is set too high, resulting in movement stopping and an interruption in behaviour until enough time has allowed a good match to be made and converted into movement.
Basal Nuclei Learning
Learning methods may be classified as reinforcement (reward-based trial and error), supervised (error-based teacher), or unsupervised (internal rules). It is time to discuss reinforcement learning.
(1) Reinforcement learning. Reinforcement learning is based on trial and error using an internal reward system (Woergoetter and Porr, 2014). There is positive reinforcement, which involves adding a valued stimulus to increase a behaviour, and negative reinforcement, which involves removing an aversive stimulus to increase a behaviour; either way, the behaviour is increased. The opposite of reinforcement is punishment, which involves adding or removing a stimulus to decrease a behaviour; some think that reinforcement is more effective than punishment in changing long-term behaviour (Skinner, 1948) but others believe them to be equally effective (Domjan, 2003). Experimental evidence from animals suggests that reinforcement learning is an essential aspect of basal nuclei processing, with the pleasure state induced by dopamine release being the positively-reinforcing reward (Schultz and Dickinson, 2000). During behavioural tasks in monkeys, dopaminergic neurons are stimulated by actual or predicted rewards (Schultz, 1998). Dopamine release in rats results in synaptic modification, the main cellular learning mechanism (Wickens et al, 1996) as we saw in chapter three. (2) The actor-critic model. Many models have attempted to demonstrate how the basal nuclei functions as a reinforcement learning system. Among these, the actor-critic model is a recurrent theme (Joel et al, 2002). The actor-critic model is an extension of a conventional feedback control system (Woergoetter and Porr, 2014) in which there is a controller and a controlled system. The controlled system is influenced by disturbances and sends feedback to the controller, which in turn maintains a set point by sending adjusting signals back to the controlled system to compensate for the disturbances. 
In the actor-critic model, there is still a controller (the actor) and a controlled system (the environment), but there is now a critic that creates a dynamic set point by sending positively reinforcing signals to the actor (Woergoetter and Porr, 2014). The critic increases its signals in the presence of an actual or predicted reward, and the actor learns to perform actions on the environment that maximize future rewards (Joel et al., 2002). Applying the actor-critic model to the basal nuclei, the neocortex with its memory network inventory (the environment) is influenced by sensory information and sends feedback to the striatum (the actor) and pars compacta (the critic) after executing each behaviour. Striatal activity is modulated by dopamine levels, and the relative activities of the STN and STP pathways (the set point) adjust the memory network matching threshold (the actions). Over time, the striatum adapts to the dopamine signals it receives (Joel et al., 2002), and so the pars compacta in a sense "trains" the striatum, which has a high degree of synaptic plasticity (Joel et al., 2002), through synaptic modification.
(3) Uncertainty. Interestingly, fully predicted rewards may not result in the same degree of learning as unpredicted rewards; reinforcement learning appears to be at its strongest when there is maximal uncertainty regarding reward probability (Pearce and Hall, 1980). Since uncertainty is greatest when reward probability is 0.5 (Schultz, 2004), reinforcement learning in the basal nuclei may be maximal when the brain encounters situations where the reward probability is 0.5, where the outcome is completely uncertain.
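The claim that uncertainty peaks at a reward probability of 0.5 can be checked with a small calculation. The sketch below assumes that uncertainty is measured either as the variance or as the entropy of a reward that is simply present or absent; these two measures and the function names are illustrative assumptions, not taken from the sources cited above.

```python
import math

def bernoulli_variance(p):
    """Variance of a reward that occurs with probability p (assumed uncertainty measure)."""
    return p * (1.0 - p)

def bernoulli_entropy(p):
    """Entropy in bits of a reward that occurs with probability p (assumed uncertainty measure)."""
    if p in (0.0, 1.0):
        return 0.0  # a fully predicted outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# Scan reward probabilities from 0.0 to 1.0 in steps of 0.1.
probabilities = [i / 10 for i in range(11)]
most_uncertain = max(probabilities, key=bernoulli_variance)

for p in probabilities:
    print(f"p={p:.1f}  variance={bernoulli_variance(p):.2f}  entropy={bernoulli_entropy(p):.2f}")
print("most uncertain reward probability:", most_uncertain)
```

Under both measures the uncertainty is zero when the reward is certain to occur or certain not to occur, and rises to its maximum at 0.5.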
Reinforcement learning is learning by trial and error using an internal reward system. Correct behaviours are increased by the release of pleasure-inducing dopamine. Incorrect actions go unrewarded.
In a conventional feedback control system, the controlled system is influenced by disturbances and sends feedback to the controller. The controller aims for a set point and sends controlling signals to adjust the controlled system and compensate for the disturbances.
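The feedback loop just described can be sketched in a few lines. This is a minimal illustration, assuming a simple proportional controller and a single one-off disturbance; the function name, the gain, and the numbers are hypothetical and chosen only to make the loop visible.

```python
def run_feedback_loop(set_point, disturbances, gain=0.5):
    """Hold a controlled system near a fixed set point despite disturbances."""
    state = set_point
    history = []
    for disturbance in disturbances:
        state += disturbance        # the disturbance pushes the controlled system
        error = set_point - state   # feedback: the controller compares state with set point
        state += gain * error       # adjusting signal sent back to the controlled system
        history.append(state)
    return history

# Five quiet steps, one disturbance of +4, then twenty quiet steps.
disturbances = [0.0] * 5 + [4.0] + [0.0] * 20
history = run_feedback_loop(set_point=10.0, disturbances=disturbances)
print(history[4], history[5], round(history[-1], 3))
```

The disturbance knocks the system away from the set point, and each pass through the loop removes half of the remaining error, so the system settles back towards the set point.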
In the actor-critic model of reinforcement, the environment is influenced by disturbances and sends feedback to the actor. The actor responds to a dynamic set point created by reinforcing signals from a critic, and performs actions in the environment.
The basal nuclei fits the actor-critic model nicely. The neocortex with its memory network inventory (the environment) is influenced by sensory information and sends feedback to the striatum (the actor) and pars compacta (the critic) after executing each behaviour. Striatal activity is modulated by dopamine levels, and the resulting relative activities of the STN and STP pathways (the set point) adjust the matching thresholds for the memory networks of the neocortex (the actions).
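The actor-critic arrangement can be illustrated with a toy simulation. The sketch below is not a model of the basal nuclei itself; it assumes a generic two-choice task in which each action is rewarded with a fixed probability, a critic that keeps a running estimate of expected reward, and a reward prediction error that stands in for the dopamine signal. All names, learning rates, and reward probabilities are hypothetical.

```python
import math
import random

def softmax(preferences):
    """Convert the actor's action preferences into choice probabilities."""
    peak = max(preferences)
    weights = [math.exp(p - peak) for p in preferences]
    total = sum(weights)
    return [w / total for w in weights]

def train_actor_critic(reward_probs, steps=2000, actor_rate=0.1,
                       critic_rate=0.1, seed=0):
    rng = random.Random(seed)
    preferences = [0.0] * len(reward_probs)  # actor: one preference per action
    expected_reward = 0.0                    # critic: running reward estimate
    for _ in range(steps):
        policy = softmax(preferences)
        action = rng.choices(range(len(policy)), weights=policy)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Reinforcing signal: positive when the outcome beats the critic's estimate.
        prediction_error = reward - expected_reward
        expected_reward += critic_rate * prediction_error   # critic learns
        preferences[action] += actor_rate * prediction_error  # actor learns
    return softmax(preferences), expected_reward

# Action 0 is rewarded 80% of the time, action 1 only 20% of the time.
policy, expected_reward = train_actor_critic([0.8, 0.2])
print("choice probabilities:", [round(p, 3) for p in policy])
print("critic's expected reward:", round(expected_reward, 3))
```

In this sketch the critic never selects an action; it only reports how much better or worse each outcome was than expected, and the actor shifts its preferences accordingly until it favours the more frequently rewarded action.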
Closing
The basal nuclei processes context, allowing the neocortex to incorporate probability based on past experiences into its decisions. Without it, like half a wheel trying to roll down a hill, the neocortex just could not do its job.
The cerebellum awaits.
References
Abdo et al. 2010. The clinical approach to movement disorders. Nature Reviews Neurology 6, 29-37.
Albin et al. 1989. The functional anatomy of basal ganglia disorders. Trends in Neurosciences 12, 366-375.
Alexander. 1986. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience 9, 357-381.
Comella et al. 1998. Sleep-related violence, injury, and REM sleep behavior disorder in Parkinson's disease. Neurology 51, 526-529.
De Cock et al. 2007. Restoration of normal motor control in Parkinson's disease during REM sleep. Brain 130, 450-456.
Domjan. 2003. The Principles of Learning and Behaviour. Thompson Learning.
Doya. 1999. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Networks 12(7-8), 961-974.
Francois et al. 1999. Dopaminergic cell group A8 in the monkey: anatomical organization and projections to the striatum. Journal of Comparative Neurology 414(3), 334-347.
Gagnon et al. 2002. REM sleep behavior disorder and REM sleep without atonia in Parkinson's disease. Neurology 59, 585-589.
Gerfen and Wilson. 1996. Handbook of Chemical Neuroanatomy. Elsevier.
Groenewegen. 2002. The basal ganglia and motor control. Neural Plasticity 10(1-2), 107-120.
Gurney et al. 2001. A computational model of action selection in the basal ganglia. Biological Cybernetics 84, 401-410.
Hikosaka and Wurtz. 1983. Visual and oculomotor functions of monkey substantia nigra pars reticulata. Journal of Neurophysiology 49, 1230-1253.
Houk. 2007. Models of basal ganglia. Scholarpedia 2(10), 1633.
Joel et al. 2002. Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Networks 15, 535-547.
Kemp and Powell. 1971. The structure of the caudate nucleus of the cat: light and electron microscopy. Philosophical Transactions of the Royal Society B: Biological Sciences 262, 383-401.
Kohyama et al. 1998. Inactivation of the pons blocks medullary-induced muscle tone suppression in the decerebrate cat. Sleep 21(7), 695-699.
Leyden and Kleinig. 2008. The role of the basal ganglia in data processing. Medical Hypotheses 71, 61-64.
Minck. 1996. The basal ganglia: focused selection and inhibition of competing motor programs. Progress in Neurobiology 50, 381-425.
Mountcastle. 1997. The columnar organization of the neocortex. Brain 120(4), 701-722.
Pearce and Hall. 1980. A model for Pavlovian conditioning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychological Review 87, 532-552.
Poryazova and Zachariev. 2005. REM sleep behavior disorder in patients with Parkinson's disease. Folia Medica 47(1), 5-10.
Rafols and Fox. 1976. The neurons in the primate subthalamic nucleus: a Golgi and electron microscopic study. Journal of Comparative Neurology 168(1), 75-111.
Saint-Cyr. 2003. Frontal-striatal circuit functions: context, sequence, and consequence. Journal of the International Neuropsychological Society 9(1), 103-127.
Schenck et al. 1986. Chronic behavioral disorders of human REM sleep: a new category of parasomnia. Sleep 9, 293-308.
Schultz. 1998. Predictive reward signal of dopamine neurons. Journal of Neurophysiology 80, 1-27.
Schultz. 2004. Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology. Current Opinion in Neurobiology 14, 139-147.
Schultz and Dickinson. 2000. Neuronal coding of prediction errors. Annual Review of Neuroscience 23, 473-500.
Skinner. 1948. Walden Two. The Macmillan Company.
Wickens et al. 1996. Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro. Neuroscience 70, 1-5.
Woergoetter and Porr. 2014. Reinforcement learning. Scholarpedia.