Bag of Visual Words Model
Abstract
Automatic interpretation of remote sensing images is an important task in several practical fields. Among the approaches to this task, one of the most powerful and effective is to use local features and machine learning techniques to detect and classify objects. In such an approach, the image is first scanned for local features, which are encoded in a mathematically manipulable form; these features are then fed to a classifier that predicts the class of the object containing them. In this thesis, a bag of visual words (BOVW) model for detecting and recognizing objects in high resolution satellite images is constructed and tested using blob local features. The Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) algorithms are used as blob local feature detectors and descriptors. The extracted features are encoded with the bag of visual words algorithm so that each image is represented by a histogram of visual words. Principal Component Analysis (PCA) is used as a dimension reduction technique to eliminate non-relevant and non-distinctive data. Finally, a single-class Support Vector Machine (SVM) classifier labels each object image as a positive or negative match. We extend the typical use of BOVW in two ways. First, instead of a sliding window, an object proposals technique based on clustering keypoint locations extracts the regions to be classified by the SVM. Second, resolution independence is improved by using geospatial information from the image metadata to recover the real dimensions of objects during training and detection. The whole approach is tested experimentally to show that it is capable of detecting a number of geospatial objects, such as airplanes, airports and cars.
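As an illustration of the encoding step described above, the following Python sketch shows how a set of local descriptors could be turned into a histogram of visual words. It is a minimal sketch under stated assumptions, not the implementation used in this thesis: the function name `bovw_histogram`, the toy two-dimensional descriptors and the three-word vocabulary are hypothetical. In practice the descriptors would come from SIFT or SURF, and the vocabulary is typically obtained by clustering descriptors from training images.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Encode local descriptors as a normalized histogram of visual words.

    descriptors: array of shape (n_descriptors, dim)
    vocabulary:  array of shape (n_words, dim), one row per visual word
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Each descriptor votes for its nearest visual word.
    words = np.argmin(dists, axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    # L1-normalize so images with different keypoint counts are comparable.
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy data (assumed for illustration): 3 visual words, 4 descriptors in 2-D.
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.1, 0.2], [9.5, 10.1], [0.3, -0.1], [0.2, 9.7]])
histogram = bovw_histogram(descriptors, vocabulary)
```

In this toy case two descriptors fall nearest the first word and one nearest each of the others, so the histogram is `[0.5, 0.25, 0.25]`. A vector of this kind is what would subsequently be reduced with PCA and passed to the SVM classifier.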
Introduction
Remote sensing images have grown in quantity, in quality and in their range of applications. An image by itself is not useful without analysis, which generates information from the image. One image analysis task is the detection of objects in images, whether man-made or natural. Automating this task is very useful in real-world applications, but it is very challenging, and it can be treated as a computer vision problem. Methods that use local features for object recognition from visual data have been very successful in recent research. The benefits of local features are robustness to occlusion and clutter and, most significantly, the fact that no segmentation step is required before local feature extraction. The availability of diverse feature extraction and description algorithms makes local feature methods efficient. Furthermore, the large number of features generated from images of objects is a crucial advantage of local features. To be useful, however, a feature has to satisfy several requirements, such as invariance to scaling, rotation, illumination, slight changes in viewing direction, noise and clutter.
Motivation
The revolutionary technology used in new generation satellite systems is driving the development of new large-scale data handling approaches in remote sensing applications. Furthermore, the large image archives captured during previous missions are now being used to produce innovative global products. In particular, developing large-scale analytics tools that efficiently extract information and apply the results to scientific questions is a major challenge for the remote sensing research community. One of the most useful analytic tools for remote sensing images is object detection and recognition, for both man-made and natural objects, as shown in Figure 1-1.
Figure 1-1 Object detection as a remote sensing image interpretation analysis
Researchers face many challenges, including but not limited to improving efficiency when processing large data, developing suitable techniques to detect and recognize various object types, and developing the tools and platforms needed to store, analyze, interpret and represent data and results. These challenges have brought together experts in data science, algorithm development and computer science, as well as environmental experts and geoscientists, to present state-of-the-art algorithms, tools and applications for processing and exploiting huge amounts of remotely sensed data. The scope of this research can be summarized as follows:
- Studies describing advanced approaches to process large volumes of multi-temporal optical, SAR (Synthetic Aperture Radar) and radiometric data.
- Studies discussing innovative techniques and associated data processing methods for very large-scale data exploitation.
- Critical analyses of existing and innovative tools, methods and techniques for large-scale analytics to extract and represent information.
- Results of case studies executed at different large spatial and temporal scales, including those using GRID and/or Cloud Computing platforms.
- Results of ongoing national/international initiatives and solutions for managing, processing, and disseminating huge archives of Remote Sensing data and relevant results.
Problem Statement
This thesis addresses the problem of geospatial object detection and recognition in high resolution satellite images. The problem is to decide whether a given aerial or satellite image contains one or more objects belonging to the class of interest, and to locate the position of each predicted object in the image. The term ‘object’ in this thesis refers to any type of object that may appear in remote sensing images, including man-made objects that have sharp edges and are distinct from the background, for example a building, a ship or a vehicle. Our solution must consider the challenges and difficulties of object detection in optical remote sensing images, such as variations in visual appearance caused by occlusion, viewpoint variation, clutter, illumination variation, shadow variation, etc.
A general statement of the problem can be formulated as follows:
“Given a remote sensing image containing different objects, decide whether one or more occurrences of a specific object class exist in the image and, if so, detect the locations of these objects. This needs to succeed under variations in viewpoint, occlusion and background clutter.”
Objectives
Develop a methodology to solve the problem stated above that can do the following:
- Acquire training data for an unlimited number of object classes.
- Read high resolution remote sensing images and analyze their data.
- Detect occurrences of trained object classes in the remote sensing images.
- Present results as a geo-referenced data type.
In this thesis, we present a model that achieves these objectives and assess its results against other state-of-the-art models from recent research.
Thesis Layout
The thesis is composed of five chapters. The first chapter presents an introduction stating the motivation, problem definition and objectives. The second chapter surveys the literature on the problem and related research in the field. The third chapter gives a detailed explanation of the methodology proposed to solve the problem. The fourth chapter contains the experimental results of the model. The fifth chapter discusses the methodology presented in this thesis, draws conclusions, and suggests a few points for future work.