Visual perception related to object recognition in the higher stages (V4, IT) of the ventral pathway

How does your brain perceive objects?
Figure (image unavailable): A pathway of visual object perception.

Visual information from the thalamus is projected to the primary visual area in the occipital lobe, where it is used to interpret the environment around us [2]. This interpretation, called visual perception, depends not only on sensory inputs but also on memories [2]. Two kinds of visual processing are involved: bottom-up sensory information arising from the retina, and top-down neuronal signals associated with imagery that influence the bottom-up processing [2]. In addition, visual perception is affected by regions such as the frontal eye field (FEF) in the prefrontal cortex [3]. Visual information processing in the first stage of the ventral pathway (V1) has been well studied. Many studies have shown that visual inputs from V1 are projected to the inferior temporal (IT) cortex along the ventral pathway [2], also known as the what pathway; however, processing in the higher stages (such as V4 and IT) is not fully elucidated [1]. Recent studies suggest that the last step of the ventral visual pathway occurs in the IT cortex, which plays an important role in visual object recognition: the ability to identify a particular object by labelling and categorizing it regardless of variation in the appearance of the visual target [1]. In addition, recent studies based on computational models have focused on algorithms for uncovering the circuits of visual object recognition, which may provide opportunities to characterize and treat the affected circuits in brain lesions or diseases related to visual perception, such as agnosia and blindness [1].

1.1 Visual circuitry and where pathway

1.1a Role of prefrontal cortex in visual perception

Visual processing stream
Figure (image unavailable): Two-pathway model (dorsal and ventral pathways) based on the hierarchical organization of the visual processing stream.

Traditionally, the frontal eye field (FEF) in the frontal cortex is thought to receive visual perceptual information through two types of responses from visual areas: early, fast, short-latency visual responses are involved in the low stages of visual processing and are tied to the visual stimulus, whereas later, delayed responses affect the perceptual process regardless of the visual stimulus [3],[4]. Based on this visual activity, the FEF controls cognition and behavior, for example by regulating eye movements via oculomotor structures [5]. Early visual areas have been thought to be the only regions that influence conscious visual perception; however, the early short-latency visual responses in the FEF also contribute to visual perception by generating and sending perceptual-state content that is not inherited from early visual areas, because the FEF has shorter response latencies than other visual areas [3]. This role of the FEF may help explain the strong association between higher visual areas (IT) and visual perception because, unlike lower visual areas (V1 or V4), the higher visual areas (IT) receive visual information projected from the FEF [3],[6].

1.1b Where/how pathway (spatial vision)

The where/how pathway, also called the dorsal pathway, is the route that arises from the primary visual cortex (V1) and projects dorsally to the parietal lobe [8],[7]. This pathway mainly processes the spatial content of the external environment in order to plan and establish motor behavior by influencing the motor cortex in the frontal lobe [9]. According to recent studies, the dorsal pathway consists of many processing routes for reaching, grasping and self-determined behavior [10],[11]. Moreover, the pathway from the lateral geniculate nucleus (LGN) to area MT and the temporal processing stream (the when pathway) are considered part of the dorsal pathway [12],[13].

1.2 What pathway (object recognition) with computational algorithms

Core recognition without any attentional pre-cuing for a specific location or object
Figure (image unavailable): Fast and highly precise object recognition is sufficient to generate a decision boundary (black dashed line).

The what pathway, also known as the ventral pathway, runs from the primary visual cortex (V1) to the inferior temporal cortex (IT) and is responsible for recognizing the size and shape of objects or text [1]. Recognition operates seamlessly and effortlessly in less than a second despite substantial visual variation [14]. Object recognition refers to this recognition of specific objects through identification and categorization, that is, by assigning labels [1]. It can rapidly detect objects in the presence of appearance variation without any attentional pre-cuing for a specific location or object [1]. Although artificial vision systems, such as machine and computer vision, are not as complicated as the connectivity among actual brain regions, clues from such artificial systems can be used to postulate the algorithms (hypotheses) by which the primate neocortex contributes to visual processing, by investigating the underlying mechanisms at a suitable level of abstraction [1]. Thus, understanding object recognition is defined as designing an artificial system that can be applied to the biological visual system [15].

1.2a Core recognition behavior

Core recognition is the object recognition behavior that rapidly and precisely identifies one or more objects in the central visual field (scene) [1]. Any investigation of an artificial recognition system must address the invariance problem: every encounter with an object in the real world is essentially unique because of identity-preserving image transformations. Nevertheless, objects whose appearance varies with the environment and the observer (for example in scale, pose and position, or through intraclass variability) are recognized without confusion as having the same label or belonging to the same group, because the visual system establishes equivalent, invariant features in response to different retinal response patterns [1].
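The invariance problem can be illustrated with a toy example. The following is a minimal sketch, not a model taken from the cited studies: the 1-D "retina", the two object patterns and the max-pooled template feature are all assumptions chosen for illustration. It shows that the same object at two positions shares no pixel overlap, while a position-tolerant feature gives both views the same description.

```python
# Toy illustration of the invariance problem (all names and values are hypothetical).

def place(pattern, offset, size):
    """Put `pattern` on a blank 1-D 'retina' of length `size`, starting at `offset`."""
    retina = [0] * size
    for i, value in enumerate(pattern):
        retina[offset + i] = value
    return retina

def overlap(retina_1, retina_2):
    """Pixel-level similarity: dot product of two retinal images."""
    return sum(a * b for a, b in zip(retina_1, retina_2))

def template_response(retina, template):
    """Best template match over all positions (max pooling gives position tolerance)."""
    n = len(template)
    return max(
        sum(r * t for r, t in zip(retina[i:i + n], template))
        for i in range(len(retina) - n + 1)
    )

OBJECT_A = [1, 2, 1]   # assumed pattern for object A
OBJECT_B = [2, 0, 2]   # assumed pattern for object B
SIZE = 12

a_left = place(OBJECT_A, 1, SIZE)    # object A, shifted left
a_right = place(OBJECT_A, 8, SIZE)   # same object, shifted right
b_left = place(OBJECT_B, 1, SIZE)    # different object, same position as a_left

def features(retina):
    """Re-describe a retinal image by its pooled responses to both templates."""
    return (template_response(retina, OBJECT_A), template_response(retina, OBJECT_B))

# In pixel space, A-left looks more like B-left than like A-right.
pixel_overlap_same_object = overlap(a_left, a_right)
pixel_overlap_other_object = overlap(a_left, b_left)

print(pixel_overlap_same_object, pixel_overlap_other_object)
print(features(a_left), features(a_right), features(b_left))
```

In this sketch the pooled features group the two views of object A together and keep object B apart, which is the behavior the invariance problem asks of a recognition system. Real identity-preserving transformations (scale, pose, illumination) are far harder than a 1-D shift.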

1.2b IT population representation

Unfurling object population representation
Figure (image unavailable): Tangled neuronal representations in early visual areas are reformatted and untangled along the ventral pathway to generate a new population re-representation with an explicit decision boundary (black dashed line), a hyperplane, in late visual areas.

When the responses of an entire neuronal population to identity-preserving transformations of an object are considered, each object is represented by a low-dimensional manifold of points in the population vector space. This view suggests that information about object identity is gradually untangled: the highly curved and tangled object identity manifolds of neural populations in early visual regions (V1) are converted, through successive re-representation along the ventral pathway, into a less curved population representation at later stages [1]. These untangled manifolds for different objects are easier to separate, because an explicit decision boundary can be obtained from a simple weighted summation [1].
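The idea that untangling makes a simple weighted summation sufficient can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the analysis used in [1]: the two "object manifolds" are concentric rings of points, the re-representation step (adding the squared radius as a third coordinate) is a hypothetical stand-in for ventral-stream processing, and the linear readout is trained with a plain perceptron rule.

```python
import math

def ring(radius, n=8):
    """Points of one toy 'object manifold': n views arranged on a circle."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]

def linear_readout_exists(samples, labels, max_epochs=500):
    """Train a perceptron (weighted sum plus bias); True if it reaches zero errors."""
    weights = [0.0] * (len(samples[0]) + 1)
    for _ in range(max_epochs):
        errors = 0
        for sample, label in zip(samples, labels):
            inputs = list(sample) + [1.0]          # constant input acts as the bias
            total = sum(w * x for w, x in zip(weights, inputs))
            if label * total <= 0:                 # misclassified: nudge the weights
                weights = [w + label * x for w, x in zip(weights, inputs)]
                errors += 1
        if errors == 0:
            return True
    return False

def untangle(point):
    """Hypothetical re-representation: append the squared radius as a new dimension."""
    x, y = point
    return (x, y, x * x + y * y)

object_a = ring(1.0)                  # inner ring, label -1
object_b = ring(3.0)                  # outer ring, label +1
tangled = object_a + object_b
labels = [-1] * len(object_a) + [1] * len(object_b)
untangled = [untangle(p) for p in tangled]

tangled_separable = linear_readout_exists(tangled, labels)
untangled_separable = linear_readout_exists(untangled, labels)
print(tangled_separable, untangled_separable)
```

In the original coordinates no hyperplane can separate a ring from the ring that encloses it, so the weighted sum never reaches zero errors; after the re-representation the same readout succeeds. The choice of the squared-radius feature is purely for illustration and is not a claim about what IT neurons compute.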

1.2c IT single-unit responses

This IT population representation depends on individual IT neurons, which continuously supply explicit content that strengthens the population representation, because each IT neuron is considered an invariant unit that acts as a member of the whole population rather than as an isolated unit [16]. These characteristics indicate that each IT neuron is able to maintain its object preferences across identity-preserving transformations, a property called tolerance [17].
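Tolerance in this sense can be made concrete with a small sketch. The model below is an assumption made for illustration only: a hypothetical neuron whose response is the product of an object tuning value and a position gain, with made-up numbers. Under that assumption the overall firing level changes with position, but the rank order of preferred objects does not.

```python
# Hypothetical IT-like unit: response = object tuning x position gain (assumed values).
OBJECT_TUNING = {"face": 1.0, "car": 0.6, "chair": 0.2}
POSITION_GAIN = {"center": 1.0, "left": 0.7, "right": 0.5}

def response(obj, position):
    """Toy response to `obj` shown at `position` (arbitrary units)."""
    return OBJECT_TUNING[obj] * POSITION_GAIN[position]

def preference_order(position):
    """Objects sorted from most to least preferred at a given position."""
    return sorted(OBJECT_TUNING, key=lambda obj: response(obj, position), reverse=True)

orders = {position: preference_order(position) for position in POSITION_GAIN}
is_tolerant = len({tuple(order) for order in orders.values()}) == 1

for position, order in orders.items():
    print(position, order)
print("rank order preserved:", is_tolerant)
```

The response to the preferred object drops away from the center, yet the ordering of objects is the same at every position, which is one way to read "sustaining its preferences". Whether real IT neurons are this cleanly separable is an empirical question that this sketch does not address.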

1.3 Architectural solution for object recognition in IT

The proposed global-scale architecture includes two algorithmic frameworks. The first is a serial-chain framework that untangles complicated tasks stepwise by adding more recognition power at each step; the second is organized according to the different stages of the ventral pathway hierarchy in order to show the communication between levels [16],[18]. The best solution is an integration of the two frameworks, giving rise to reversible interactions that depend on the time interval and the visual complexity [1]. In addition, a mesoscale architecture based on abstraction layers is associated with both global-scale algorithms and suggests a specific, localized job description for each subpopulation [16].

1. DiCarlo, J.J., Zoccolan, D., & Rust, N.C. (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415–434.
2. Albright, T.D. (2012). On the perception of probable things: neural substrates of associative memory, imagery, and perception. Neuron, 74(2), 227–245.
3. Libedinsky, C., & Livingstone, M. (2011). Role of prefrontal cortex in conscious visual perception. J Neurosci, 31(1), 64–69.
4. Thompson, K.G., & Schall, J.D. (1999). The detection of visual signals by macaque frontal eye field during masking. Nat Neurosci, 2, 283–288.
5. Schall, J.D. (2002). The neural selection and control of saccades by the frontal eye field. Philos Trans R Soc Lond B Biol Sci, 357, 1073–1082.
6. Stanton, G.B., Bruce, C.J., & Goldberg, M.E. (1995). Topography of projections to posterior cortical areas from macaque frontal eye fields. J Comp Neurol, 353, 291–305.
7. de Haan, E.H.F., & Cowey, A. (2011). On the usefulness of 'what' and 'where' pathways in vision. Trends Cogn Sci, 15(10), 460–466.
8. Ungerleider, L.G., & Mishkin, M. (1982). Two cortical visual systems. In Analysis of Visual Behavior, 549–586.
9. Milner, A.D., & Goodale, M.A. (1995). The visual brain in action.
10. Kravitz, D.J., Saleem, K.S., Baker, C.I., & Mishkin, M. (2011). A new neural framework for visuospatial processing. Nat Rev Neurosci, 12, 217–230.
11. Grol, M.J., Majdandžić, J., Stephan, K.E., Verhagen, L., Dijkerman, H.C., Bekkering, H., Verstraten, A.J., & Toni, I. (2007). Parieto-frontal connectivity during visually guided grasping. J Neurosci, 27, 11877–11887.
12. Sincich, L.C., Park, K.F., Wohlgemuth, M.J., & Horton, J.C. (2004). Bypassing V1: a direct geniculate input to area MT. Nat Neurosci, 7, 1123–1128.
13. Battelli, L., Walsh, V., Pascual-Leone, A., & Cavanagh, P. (2008). The 'when' parietal pathway explored by lesion studies. Curr Opin Neurobiol, 18, 120–126.
14. Logothetis, N.K., & Sheinberg, D.L. (1996). Visual object recognition. Annu Rev Neurosci, 19, 577–621.
15. Rousselet, G.A., Fabre-Thorpe, M., & Thorpe, S.J. (2002). Parallel processing in high-level categorization of natural images. Nat Neurosci, 5, 629–630.
16. DiCarlo, J.J., & Cox, D.D. (2007). Untangling invariant object recognition. Trends Cogn Sci, 11, 333–341.
17. Rust, N.C., & DiCarlo, J.J. (2010). Selectivity and tolerance ("invariance") both increase as visual information propagates from cortical area V4 to IT. J Neurosci, 30, 12978–12995.
18. Roelfsema, P.R., & Houtkamp, R. (2011). Incremental grouping of image elements in vision. Atten Percept Psychophys, 73, 2542–2572.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License