Figure 2: Centralised video content analysis

Figure 3: Distributed content analysis

Intelligent CCTV systems perform a range of computational tasks. Initially, these were limited to capturing an image, compressing it and transferring it to the host computer. More recently, intelligent CCTV systems have come to perform a much wider range of tasks, the most popular of which include real-time face recognition, scene capturing, traffic surveillance and human activity recognition (Kleihost et al., 2004; Ozer & Wolf, 2001; Matsushita et al., 2003; Bramberger et al., 2004; Bojkovic & Samcovic, 2006).

Figure 4: Smart CCTV conceptualisation

Limitations of intelligent CCTV systems

Technological Limitations

Limited opportunity for information fusion

To determine the limitations of intelligent CCTV, Saini et al. (2014) investigated why the technology is rarely used in commercial systems. According to Saini et al. (2014), the most critical limitation of the technology is the limited opportunity for information fusion.
The researchers echoed an earlier study by Atrey et al. (2006), which had established that, during video analysis, information fusion happens at only three levels: the data level, where pixel values are compared directly to reach a conclusion; the feature level, where features are extracted from the image; and the decision level, where a conclusion is reached and acted upon (see Figure 5).

Figure 5: Processing time of individual processes (Saini et al., 2014)

Saini et al. (2014) concluded that the smart cameras in intelligent CCTV systems allow fusion to take place only at the decision level.
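The three fusion levels can be made concrete with a minimal Python sketch. It is not taken from Atrey et al. (2006) or Saini et al. (2014): the frames, the threshold and all function names are hypothetical, and each level is represented by a deliberately simple stand-in (per-pixel averaging for the data level, a mean-brightness feature for the feature level, and a majority vote for the decision level).

```python
from statistics import mean

# Hypothetical 2x3 grey-level frames (0-255) from two cameras with overlapping views.
frame_a = [[10, 200, 220], [12, 210, 230]]
frame_b = [[14, 190, 240], [16, 180, 250]]

THRESHOLD = 128  # assumed brightness above which "an object is present"


def flatten(frame):
    """Return the pixels of a frame as one flat list."""
    return [pixel for row in frame for pixel in row]


def brightness(frame):
    """Feature extraction stand-in: mean grey level of a frame."""
    return mean(flatten(frame))


def decide(feature):
    """Decision stand-in: is the scene brighter than the threshold?"""
    return feature > THRESHOLD


def data_level_fusion(frames):
    """Fuse raw pixels first (per-pixel average), then extract a feature and decide."""
    fused = [mean(pixels) for pixels in zip(*(flatten(f) for f in frames))]
    return decide(mean(fused))


def feature_level_fusion(frames):
    """Extract one feature per camera, fuse the features, then decide."""
    return decide(mean(brightness(f) for f in frames))


def decision_level_fusion(frames):
    """Each camera decides on its own; only the decisions are combined by majority vote."""
    votes = [decide(brightness(f)) for f in frames]
    return sum(votes) > len(votes) / 2


frames = [frame_a, frame_b]
print(data_level_fusion(frames), feature_level_fusion(frames), decision_level_fusion(frames))
```

In this sketch, decision-level fusion needs only one Boolean from each camera, whereas data-level fusion needs every pixel from every camera to be available in one place.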
It is submitted that restricting fusion to the decision level is a limitation because multi-camera systems consist of densely positioned cameras with overlapping views, which need data-level and feature-level fusion. In a related study of the limited opportunity for information fusion, Mitchell (2012) reached a similar conclusion. To overcome such limitations, Yang et al. (2011) and Maker (2009) proposed feature compression techniques, which may, however, still compromise the overall accuracy of the video analysis task.

Tracking is computationally intensive

As Figure 5 above indicates, the processing times of the several steps involved in the video analysis of four videos differ.
Foreground detection depends on the frame resolution, while the processing times of tracking and detection depend on the computational load of each step (Atrey et al., 2006). Hence, tracking can be viewed as computationally intensive.

Sensor synchronisation and coordination

Sensor synchronisation and coordination may also be difficult in intelligent CCTV systems because of random network delays at the transitional nodes. This view was put forward by Farenbook and Clement (2011).
This limitation has the potential to hinder the cameras from tracking effectively, since high-quality video cannot be gathered at one node without incurring extra bandwidth overhead (Tessens et al., 2008). To overcome such limitations and achieve a high level of tracking accuracy, it is reasoned that the centralised system should obtain high-quality video by integrating multiple cameras, which in turn causes a large bandwidth overhead.

High bandwidth requirement

The high bandwidth requirement is an underlying limitation to attaining a full-fledged system.
This limitation was discussed by Bigdeli et al. (2008) when designing smart cameras that enable proactive intelligent CCTV. Bigdeli et al. (2008) investigated the possibility of designing cameras that could overcome the high bandwidth requirement of high-resolution cameras by discarding unused information at the sensor before the image of interest is moved to the central processing unit (CPU), thereby freeing up the CPU time needed for the actual processing. The researchers found that accurate and efficient recognition requires a high-resolution camera, which in turn needs a large bandwidth to transfer the image to the CPU.
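The link between resolution and bandwidth can be illustrated with simple arithmetic. The sketch below is an illustration under stated assumptions, not a reproduction of the design of Bigdeli et al. (2008): it estimates the raw (uncompressed) bit rate of a stream as width × height × bits per pixel × frames per second, and shows how discarding everything outside a region of interest at the sensor reduces the data sent on to the CPU. The resolutions, frame rate, region size and function name are all hypothetical.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bit rate in megabits per second (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1_000_000


# Hypothetical camera settings, chosen only for illustration.
FPS = 25
low_res = raw_bitrate_mbps(640, 480, FPS)      # VGA stream
high_res = raw_bitrate_mbps(1920, 1080, FPS)   # full-HD stream

# Assumed region of interest (for example, a detected face) cropped at the sensor.
roi_only = raw_bitrate_mbps(200, 200, FPS)

print(f"low resolution : {low_res:.1f} Mbit/s")
print(f"high resolution: {high_res:.1f} Mbit/s")
print(f"region only    : {roi_only:.1f} Mbit/s")
```

Under these assumptions the uncompressed full-HD stream needs about 1,244 Mbit/s, 6.75 times the 184 Mbit/s of the VGA stream, while the cropped region needs only 24 Mbit/s. Real systems compress video before transmission, so the absolute figures would be far lower in practice.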