Referee triad inconsistencies and VAR: limits of visual perception ... the VAR is not AI and the referee's position can make a difference on the pitch
Cecilia Scassa
Ophthalmologist. Low-vision rehabilitation, metabolomics, applied neuroscience in sport.
This introduction is essential to clarify some anatomical and physiological concepts of the complex visual system: the eyes are the receptors, but the real decoding takes place in the brain.
Our brain processes visual perceptions by recreating a global memory of the scene inextricably linked to the localized memory of specific targets.
In short, memory guides the search and reduces reaction times, so that a motor response is programmed for every perception.
The speed of modern football also makes an exact evaluation by the referee trio more difficult, a difficulty exacerbated by a series of factors linked to the physiological limits of human visual perception. Hence the request for technological support on the field in the form of the VAR, strongly backed by the President of FIFA, Gianni Infantino. The idea was tested at the last Club World Cup, and it aroused bitter controversy right from the start, before its official entry into force.
Gianni Infantino, however, announces the need for a refinement of the system in view of the 2018 World Cup.
The aim is to understand how an advanced technology such as the VAR can sometimes suggest an incorrect evaluation.
But let us remember that the VAR is not Artificial Intelligence: it is still a technological tool whose images are evaluated by humans, with all the limits of human visual resolution.
It is very important to underline that our brain has greater resolving power for bright points than for colored points, just as a light-dark contrast stimulus is more perceptible than uniform illumination.
Colors belonging to contiguous categories of the chromatic scale are more difficult to tell apart than colors at opposite ends of the visible spectrum. This has repercussions on the choice of the teams' shirts, on the color of the ball, and on an often underestimated factor: the color of the horizontally scrolling pitch-side advertising boards.
But it is in the brain that the image projected by the two retinas through the optic nerves is reworked, minimizing some elements and enhancing the main ones: the edges and angles of the images.
Furthermore, the occipital cerebral cortex is more sensitive to moving objects. In particular, the parietal pathway (area V5) is highly specialized for the movement of objects, responding to the direction of movement and to the exact spatial position of objects.
A fundamental fact is that visual search in depth is more efficient for moving targets (area V5) than for static targets, and more sensitive to the onset of movement than to its speed.
The visual scanning pattern can shorten reaction times (RT) and depends on many factors. For instance, it has been widely documented that a target of a single color is captured faster than targets or distractors of differing color and size.
Visual capture and attention are more sensitive to qualitative changes (beginning or end of action) than quantitative (speed and movement).
A key concept is that the ability to attend to multiple objects simultaneously decreases as the visual angle separating distant objects increases.
A recent study confirms this in a situational context, namely offside calls by soccer referees and assistant referees, by calculating how widely the referees and assistants had to spread their visual attention to achieve good results. Referees and assistants made fewer mistakes when they were further away from the action, thanks to a more advantageous viewing angle on the play (Hüttermann S, Noël B, Memmert D. Evaluating erroneous offside calls in soccer. PLoS One. 2017 Mar 23;12(3)).
The ability to deal with multiple objects simultaneously decreases as the angle of view that separates distant objects increases.
Counterintuitively , the assistant referees made fewer mistakes when they were further away from the action due to an advantageous viewing angle (because the affected players would form a smaller viewing angle). Referees often take the blame for wrong calls, but in some cases they can find themselves at an unfair disadvantage. The perception of spatially disparate but simultaneous events could exceed the limits of their ability to spread attention.
Offsides in football provide an ideal test for the importance of spatial attention in a real-world context.
Assistant referees make mistakes in about 20–26% of offside calls, and many factors contribute to those mistakes. Among others, assistant referees make more mistakes when running to maintain the appropriate position than when walking or standing. In addition, the assistant referee has a distorted view of the positions of the players (receiver and defender) relative to each other as the assistant referee's position moves away from the offside line (see the flash-lag effect and the optical error hypothesis).
In determining whether an attacking player is offside, an assistant referee must carefully pick out the relevant players from the others and simultaneously focus attention on objects and their spatial and temporal relationships in spatially disparate regions of the field.
In an offside decision, the relevant players are the passer, the receiver and the second-last defender at the moment of the pass. That is, an assistant referee must simultaneously identify the moment when the ball is passed forward (attending to the passer) and the relative position of the receiver and the second-last defender (attending to the area and players close to the offside line). As a result, limits on the ability to focus attention selectively on two positions should affect performance.
Most importantly, assistant referees tend to have a conservative bias when making calls: when in doubt, they tend not to flag and to decide in favor of the attacker. Given this conservative bias, we can be fairly certain that the majority of incorrectly flagged calls were due to a lack of visual attention rather than a general tendency to make a call when uncertain. That is, false positives are cases where the assistant referee has misperceived the situation, and by analyzing the flagged calls we can more readily determine whether there is a link between the viewing angle that separates the critical players and the likelihood of error.
However, to check whether the viewing angle is really a deciding factor for bad decisions in offside situations, we considered the two spatial distances that determine the angle: the separation between the passer and the defender on the offside line (horizontal attentional spread) and the separation in depth between the assistant referee and both players (attentional spread in depth). We expected assistant referees to show higher error rates as the viewing angle increased. That is, we expected more errors with greater separation between players (i.e., greater horizontal attentional spread). In addition, we expected higher error rates for smaller separations in depth between the assistant referee and the players (i.e., less attentional spread in depth), because of the larger viewing angles that come with them.
Assistant referees should position themselves on the offside line so that they can better judge the left/right position of the attacking and defending players concerned. Oudejans et al. observed perceptual illusions when the assistant referees were positioned more than 1 m from the imaginary offside line. The average deviation from the offside line was 0.79 m.
To make the correct call, the assistant referee must determine the relative positions of the receiver and the last defender at that moment. A correct call depends on focusing attention on both the passer and the receiver/defender. Consequently, the crucial viewing angle always involves the passer and one of the other two players. Since assistant referees are always required to position themselves level with the last defending player, the maximum visual angle to their left side is between the passer and the defender, regardless of whether the situation is offside or not. In addition to determining the assistant referee's angle of view between the passer and the defender, we determined the distance y in depth between the assistant referee and the passer (or, more precisely, the y separation between the passer and the sideline) as well as between the assistant referee and the defender.
From these coordinates we calculated the assistant referee's viewing angle, together with the distance between the passer and the defender (horizontal attentional spread) and the separation in depth between the assistant referee (or rather the sideline) and both players (attentional spread in depth). Out of 355 coded offside calls, the assistant referees made a total of 49 errors. This error rate (14%) is in line with estimates that up to 20% of offside calls are incorrect. Based on the assistant referee's viewing angle between passer and defender, only 11% of calls (33/302) were incorrect when the viewing angle was less than 40°, but 30% (16/53) were incorrect for viewing angles greater than 40°.
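The percentages quoted above follow directly from the reported counts. A minimal sketch of that arithmetic is given here; the dictionary labels and the helper name `error_rate` are illustrative choices, not part of the study.

```python
# Counts of coded offside calls as reported in the text, split by viewing angle.
calls = {
    "angle < 40 deg": {"errors": 33, "total": 302},
    "angle > 40 deg": {"errors": 16, "total": 53},
}


def error_rate(errors: int, total: int) -> float:
    """Return the share of incorrect calls as a percentage."""
    return 100.0 * errors / total


# Error rate per viewing-angle group.
rates = {label: error_rate(**counts) for label, counts in calls.items()}

# Overall error rate across both groups (49 errors out of 355 calls).
total_errors = sum(counts["errors"] for counts in calls.values())
total_calls = sum(counts["total"] for counts in calls.values())
overall = error_rate(total_errors, total_calls)

for label, rate in rates.items():
    print(f"{label}: {rate:.1f}% incorrect")
print(f"overall: {overall:.1f}% incorrect")
```

Rounded to whole percentages, the two groups and the overall figure reproduce the 11%, 30% and 14% given in the text.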
When considering the spatial separation between the passer and the defender (the horizontal attentional spread required), it becomes clear that the average separation for incorrect and correct calls was roughly comparable (correct calls: M = 6.97 m, 95% CI 6.19, 7.74; wrong calls: M = 7.84 m, 95% CI 5.45, 10.22). This means that the relationship between visual angle and error rates cannot be explained solely by the separation between the players, but perhaps also by the separation in depth between the assistant referee and the players.
Error rates increased as the y-separations between the assistant referee and the players decreased. This indicates that spreading attention in depth does not adversely affect the assistant referee's decision-making; rather, greater separation in depth between the assistant referee and the players appears to be beneficial because of the associated reduction in viewing angle.
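The geometric point, that the same separation between players subtends a smaller angle when seen from further away, can be illustrated with a short sketch. The coordinate layout is an assumption made for illustration only: the assistant referee stands on the sideline at depth 0, level with the defender, and both players are at the same depth. The function name `viewing_angle_deg` and the 7 m player separation (close to the mean separations reported above) are likewise hypothetical choices, not the study's procedure.

```python
import math


def viewing_angle_deg(ar_x, passer, defender):
    """Angle in degrees at the assistant referee between passer and defender.

    Assumed layout: the assistant referee stands on the sideline at
    (ar_x, 0); each player is an (x, y) pair in metres, where y is the
    depth from the sideline into the pitch.
    """
    to_passer = (passer[0] - ar_x, passer[1])
    to_defender = (defender[0] - ar_x, defender[1])
    dot = to_passer[0] * to_defender[0] + to_passer[1] * to_defender[1]
    norm = math.hypot(*to_passer) * math.hypot(*to_defender)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Same 7 m horizontal separation between the players, seen at three depths.
separation = 7.0
for depth in (5.0, 10.0, 20.0):
    angle = viewing_angle_deg(0.0, (separation, depth), (0.0, depth))
    print(f"depth {depth:4.1f} m -> viewing angle {angle:4.1f} deg")
```

Under these assumptions the angle shrinks as the depth grows, which is the mechanism the text proposes for the lower error rates at greater distance.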
Viewing angle is associated with error rates in offside decisions by assistant referees, just as it is for laboratory measurements of attention spread.
1. Assistant referees are more prone to errors when the viewing angle required to perceive the relevant players increases.
2. Smaller separations in depth between the assistant referee and the players negatively affect the assistant referee's decision making, possibly because of the larger associated viewing angle.
Error rates were lower when the relevant players were separated by viewing angles of less than about 40° than in scenes that required greater angles.
It's possible that a third factor, something about how the game takes shape, could contribute to both greater viewing angles and higher error rates.
It was not considered, for example, that decisions often depend on the time of the match (for example, fewer offside calls are usually made in the first half than in the second half). Furthermore, a limiting factor in analyzing offside decisions from frozen images is that the impact of the players' movements at the time of the offside decision cannot be considered.
Counterintuitively, these results suggest that assistant referees could make more accurate calls if they were positioned further away from the action: from a more distant point of view, the relevant players in an offside call would be separated by a smaller viewing angle, allowing the assistant referee to attend to them simultaneously.
This preliminary evidence that greater viewing angles and smaller separations from players in depth are associated with greater errors in offside calls may explain fan complaints .
Indeed, an analysis of the opinions of various experts has highlighted the complex role of football referees: Aragão e Pina et al. (2019) found that the excellence of football referees can be shaped by individual preparation, game preparation and game management.
Williams et al. (1999) state that experienced referees should know how to keep their attention on numerous stimuli and be able to distinguish between essential and less important cues. However, when referees make important decisions with limited time, under pressure and often with limited relevant input, it can be difficult to assess these cues appropriately (Wolfson and Neave, 2007; Plessner et al., 2009). Referees should therefore aim to be positioned in a way that allows them to obtain the information relevant to making correct decisions. Adequate physical fitness is thus required for referees to keep up with the game and have an unobstructed view of potential foul play (Riiser et al., 2019). Joo and Jee (2019) highlighted in their study of elite Korean referees that both the physical fitness and the positioning skills of referees should be emphasized to reduce the number of referee errors during the match.
In particular, Mallo et al. (2012) showed that an appropriate distance (11-15 m) for referees in the central area of the playing field gave the lowest error rate in the referees' decision-making, while the risk of making errors increased when the referees were more distant from foul play situations.
Furthermore, Hossner et al. (2019) analyzed both the distance and angle of match referees' position relative to foul play infractions across all 64 matches of the 2014 FIFA World Cup. They found referee error rates were higher when the distance to the incident was between 10 and 15 m for whistle errors and 0-5 m for non-whistle errors. Referees must use their experience to be in the position that allows them to make a correct decision (International Football Association Board IFAB, 2019).
Regarding distance, the referee was positioned within 10 m of the incident in 12 situations, of which 10 (83%) were correctly assessed. For distances between 10 and 20 meters (N = 22), the referees made a correct decision in 14 situations (64%) and a wrong decision in eight situations (36%). When the match referee was positioned more than 20 meters from the situation (N = 8), four of the eight situations (50%) were correctly assessed.
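The distance breakdown above can be laid out as a small table. The sketch below only re-derives the quoted percentages from the reported counts; the bin labels and variable names are illustrative.

```python
# (distance bin, correct decisions, situations) as reported in the text.
situations = [
    ("under 10 m", 10, 12),
    ("10 to 20 m", 14, 22),
    ("over 20 m", 4, 8),
]

# Share of correct decisions per distance bin, in percent.
accuracy = {label: 100.0 * correct / n for label, correct, n in situations}

print(f"{'distance':<12}{'correct':>8}{'N':>4}{'share':>8}")
for label, correct, n in situations:
    print(f"{label:<12}{correct:>8}{n:>4}{accuracy[label]:>7.0f}%")
```

With samples this small (N between 8 and 22 per bin), the differences between bins should be read as indicative rather than conclusive.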
It is important to emphasize that the football referee is human and makes quick decisions based on a subjective assessment of various game situations ( Poolton et al., 2011). Although the referee can move freely on the playing field to access the best possible distance, angle and vision, the referee does not always have an optimal view of a situation and must decide based on his own intuition and the environmental signals obtained ( Plessner et al. , 2009; Samuel et al., 2020).
However, research has indicated that referees can be influenced by social pressure (Sutter and Kocher, 2004; Erikstad and Johansen, 2020), so errors are not necessarily evenly distributed among teams. Appropriate positioning in penalty situations, as demonstrated in our study, can therefore also help reduce the risk of (unintentionally) biased decisions.
This work has irrefutably demonstrated that the exact position of the observer (in the study in question, the referee triad) and the distance at which the observer is positioned become factors of primary importance.
So where do the doubts come from regarding the frequent inconsistencies between the on-field evaluation and the evaluation of the assistants in charge of analyzing the video, who are still liable to error because of the limits of human vision?
For the opponents of this innovation, the main problem lies precisely here: although supported by slow motion and sophisticated electronic equipment, the VAR is still evaluated by humans. It will still be the human eye that observes and judges, through the multiple cameras, with all the physiological limits of visual perception summarized in the previous post.
Often in the sport of football interpretation counts more than the fact that occurred on the pitch in and of itself.
More properly, seeing the world around us is a process of interpreting what has been captured in a visual scene, starting from the receptor (the eye and its retina, with the optic nerve projecting this information to the occipital cortex), and of drawing meaning from things by integrating what has been seen with information received through the other sense organs and with one's own personal store of experience.
The process of vision is not static but dynamic. It involves continuous movements of both the eyes and the head, allowing exploration of the surrounding space and detailed observation of whatever attracts our attention (the target). In the cortex, the visual information undergoes substantial reworking: some information is minimized, such as the average level of illumination, while other information is highlighted, in particular the contours and angles of figures.
For all non-static activities, that is, in all sports, it is essential to underline that the receptive fields of cortical cells, unlike those of the retina, respond preferentially to moving stimuli. In particular, among the many cortical areas to which it is connected, vision finds its greatest specialization for the perception of moving objects in the parietal projection pathway, and specifically in area V5. Neurons in this area have large receptive fields and are specialized in responding to the direction of movement of the visual stimulus and in perceiving the spatial position of objects.
The visual image is not a faithful representation of the external world , as the photographic or cinematographic image can be, but is the result of a brain processing process , and as such has characteristics that are not necessarily identical to the geometric and physical ones of the external world that generates it (an example derives from optical illusions).
What we see as a high resolution image of the world around us is actually a creation of the brain that has to work from complex and often insufficient information.
These problems are essentially solved thanks to the continuous dynamic ocular scanning , in order to have a mosaic of high resolution images that covers the entire field of view.
To introduce the problems relating to the limits of human visual perception, which also affects new analysis technologies, we start from the difficulty of comparing the resolution of the human eye with the resolution of a digital camera , to which the eye is often compared.
In fact, in a digital camera, “resolution” refers to the total number of pixels on the imaging plate (the CCD) and is measured in megapixels .
While the image that we can see by quickly opening and closing the eye is just over 7 megapixels, the image that the brain is able to give us from this mosaic corresponds to about 576 megapixels.
These observations have important repercussions if we consider that the image may be interpreted (rendered) differently by the players and by the referee trio. Above all, they bear on the difference between the resolution of human sight, expressed in pixels, and that of the new generation of displays (PC, video, eye trackers) increasingly used in match analysis, which raises the doubt of an imperfect match between the resolution of human sight and the resolution of such equipment.
With the advancements of 3D graphics technology we are in the process of producing hardware that matches or exceeds the needs of the human visual system. This element must be taken into consideration when designing 3D graphics hardware.
As part of an effort to design future hardware and visualization systems, a further study (Deering, Michael F., 1998) details the limits of human spatial vision, according to which the highest-resolution perceptible pixels correspond to 28 seconds of arc.
In addition, several physical factors limit the higher spatial frequencies that can be perceived.
This high resolution, however, applies only to the central 2° of foveal vision. Outside this area, cone spacing and measured visual acuity fall off even faster than the optical limits.
In summary, the eventual consumer of all 3D rendering is the human visual system . With display technology and hardware rendering speeds in real time ever faster, we are on the threshold of a generation of machines that will exceed the input capabilities of the visual system .
If we wanted to use megapixels to express the resolution of the human eye, we could say that it is as if we were equipped with a 576-megapixel camera.
Roger N. Clark calculated this number using the concept of "angular resolution". In reality, it is an approximation that assumes the same visual acuity, that is, the same angular resolution, everywhere in the eye, from the fovea to the periphery. In fact, it would seem that most of the time our eye is a 7-megapixel camera, which corresponds to the resolution of the fovea during a single glance, a resolution sufficient to ensure that the individual pixels making up the image in our visual field are imperceptible.
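The angular-resolution reasoning can be sketched numerically. The inputs below are assumptions not stated in this text: Clark's estimate is usually reproduced with a field of view of 120° by 120° sampled at 0.3 arc-minutes per pixel. The second calculation applies the 28 arc-second figure quoted earlier to the central 2° of foveal vision. Function and variable names are illustrative.

```python
ARCMIN_PER_DEGREE = 60
ARCSEC_PER_DEGREE = 3600


def megapixels(field_deg: float, arcmin_per_pixel: float) -> float:
    """Pixel count, in megapixels, of a square field at uniform angular resolution."""
    pixels_per_side = field_deg * ARCMIN_PER_DEGREE / arcmin_per_pixel
    return pixels_per_side ** 2 / 1e6


# Assumed inputs for the 576-megapixel estimate: 120 degree field, 0.3 arcmin per pixel.
eye_estimate = megapixels(field_deg=120, arcmin_per_pixel=0.3)
print(f"uniform-acuity estimate: {eye_estimate:.0f} megapixels")

# Pixels of 28 arc-seconds needed to span the central 2 degrees of foveal vision.
foveal_pixels_across = 2 * ARCSEC_PER_DEGREE / 28
print(f"pixels across the fovea: {foveal_pixels_across:.0f}")
```

The contrast between the two results illustrates why the 576-megapixel figure is an upper-bound approximation: it extends foveal-grade acuity to the whole field, whereas the finest resolution is available only within a couple of degrees.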
The ability to focus and maintain concentration simultaneously on multiple objects, typical of a perfect scan of the visual scene, decreases as the visual angle that separates distant objects increases.
In summary, contrary to what one might expect, excessive proximity to the visual scene, and to the targets and distractors in it, reduces the clarity of the overall view of the scene. For the referee triad too, a reduction in errors is linked to smaller viewing angles, i.e. to a greater distance from the action.
It is interesting to note that head speeds lower than the retinal speed, i.e. that of the eye movements, allow the perception of shorter distances (near ball), while retinal speeds lower than the head speed are directly correlated with the perception of greater distances (distant ball).
At the same time, we must remember that visual search in depth is more efficient for moving targets than for static targets, and more sensitive to the onset of movement than to its speed.
Therefore, capture and visual attention are more sensitive to qualitative alterations (beginning or end of action) than quantitative (speed and movement).