Show simple item record

dc.contributor.advisor  Simmhan, Yogesh
dc.contributor.author  Khanijo, Bharati
dc.date.accessioned  2024-08-29T04:40:22Z
dc.date.available  2024-08-29T04:40:22Z
dc.date.submitted  2024
dc.identifier.uri  https://etd.iisc.ac.in/handle/2005/6612
dc.description.abstract  Video data has historically been known for its unstructured nature, rich semantic content, and the scalability issues it poses for storage. With advances in computer vision and Deep Neural Networks (DNNs), it is now possible to automatically extract rich semantic information from video data. This has led to growing interest in developing applications where stored video data can be used to observe and study the world retrospectively. However, recent research has highlighted the compute-intensive nature of such deep models (e.g., accurate object detection models); the high cost of running them limits their naive use for retrospective video analysis. Moreover, developing and efficiently implementing such applications often requires co-analyzing video data together with its associated geospatial and temporal metadata, which the research community has acknowledged to be difficult due to the cognitive load involved. Drone cameras are increasingly used to capture video owing to their mobility and ease of deployment. These videos are complemented by temporally varying location and orientation metadata, which ease their exploration. Video query systems are needed that allow intuitive querying over the geospatial, temporal and semantic information associated with such videos. In this thesis, we develop a geospatial-temporal video query system that supports semantic queries over drone videos by extending an existing spatial-temporal database and leveraging DNN models. Specifically, we provide query abstractions over the level of detail of the visual information captured in the videos, and propose simple heuristics for better reuse of semantic object detection results across different object detection model configurations.
The optimizations needed for such retrospective analysis motivate our novel DDownscale method, along with an Ingest Pipeline, to efficiently acquire, store and query drone videos in a Video Repository. A key requirement for this video repository is to conserve storage space and compute time for semantic video queries. Reducing the resolution of videos reduces both the video size and the inference time during querying. Existing methods for reducing video resolution often exploit the stationary spatial and temporal characteristics of static-camera videos, which are absent in videos from mobile drone fleets: drones fly at different altitudes, record from different viewpoints, and capture videos with varying levels of detail. Another consideration is that drone videos are typically of short duration, unlike those captured by static cameras. We propose the novel DDownscale method to dynamically select the downscale factor for a video such that the level of detail required for effective object detection is not compromised. We model the relative recall drop caused by downscaling as a function of the object size in the downscaled video and the downscaling factor used, and observe that, for a given object detection DNN model and class of interest, this model generalizes well to the evaluated test datasets. From this modeling, we derive the DDownscale inequality, a relation between the relative recall drop of a video and the hyperparameters of DDownscale; this relation is satisfied by ≈ 98% of the dynamically downscaled videos across different datasets.
For user-specified target reductions in recall ranging from 1% to 30%, the proposed DDownscale algorithm achieves > 25% reduction in total object detection time and > 31% reduction in storage, on average, compared to a baseline that stores and evaluates the videos uniformly at the original resolution, with ≈ 96% of the dynamically downscaled videos keeping their relative recall drop within the user-specified target. Additionally, we explore a simpler specification of the target level of detail; derive a relation between this specification and a statistic of the relative drop in recall for the smallest object of interest when detected by the selected model; and propose a drone video ingest pipeline that preprocesses videos on arrival from the drones, including dynamically downscaling them, before inserting them into the video repository residing on a central server. The pipeline uses scheduling strategies over a cluster of heterogeneous edge accelerators to reduce the time to ingest the drone videos, making them quickly available for analysis. For the evaluated workload and experimental setup, the pipeline reduces the average turnaround time for ingest by ≈ 66% compared to uploading original-resolution videos without downscaling, despite the downscaling overhead.  en_US
dc.language.iso  en_US  en_US
dc.relation.ispartofseries  ;ET00622
dc.rights  I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.  en_US
dc.subject  Aerial Video Semantic Analysis  en_US
dc.subject  computer vision  en_US
dc.subject  Deep Neural Networks  en_US
dc.subject  Drone camera  en_US
dc.subject  video query system  en_US
dc.subject  spatial-temporal database  en_US
dc.subject  Video data  en_US
dc.subject.classification  Research Subject Categories::TECHNOLOGY::Information technology::Computer science  en_US
dc.title  Scalable Video Data Management and Visual Querying System for Autonomous Camera Networks  en_US
dc.type  Thesis  en_US
dc.degree.name  MTech (Res)  en_US
dc.degree.level  Masters  en_US
dc.degree.grantor  Indian Institute of Science  en_US
dc.degree.discipline  Engineering  en_US
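The abstract's core DDownscale step — picking the largest downscale factor whose predicted relative recall drop stays within the user's target — can be sketched as below. The drop model, the 32 px detectability threshold, the candidate factors, and all function names are illustrative assumptions for this sketch, not the thesis's actual fitted model or hyperparameters.

```python
def predicted_recall_drop(object_size_px: float, factor: float) -> float:
    """Toy stand-in for the learned drop model: relative recall drop as a
    function of the object's size after downscaling. Here the drop is zero
    while the downscaled object stays above an assumed 32 px threshold and
    grows linearly as it shrinks below it (illustrative only)."""
    downscaled_size = object_size_px / factor
    threshold = 32.0  # assumed pixel size below which detection degrades
    if downscaled_size >= threshold:
        return 0.0
    return min(1.0, (threshold - downscaled_size) / threshold)


def select_downscale_factor(smallest_object_px: float,
                            target_drop: float,
                            candidate_factors=(1.0, 1.5, 2.0, 3.0, 4.0)) -> float:
    """Return the largest candidate factor whose predicted relative recall
    drop does not exceed the user-specified target; this predicate stands in
    for the DDownscale inequality described in the abstract."""
    best = 1.0  # original resolution is always admissible
    for f in sorted(candidate_factors):
        if predicted_recall_drop(smallest_object_px, f) <= target_drop:
            best = f
    return best
```

For example, with a smallest object of interest of 96 px and a 10% target drop, the sketch selects a factor of 3 (the object stays at the 32 px threshold), while a 4x downscale would shrink it to 24 px and overshoot the target.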

