DETECTION OF OUTLIERS IN REAL TIME KINEMATIC (RTK) GLOBAL POSITIONING SYSTEM (GPS) OBSERVATION
DEPARTMENT: SURVEYING AND GEOINFORMATICS
CHAPTER ONE
1.0 INTRODUCTION
1.1 BACKGROUND OF STUDY
In data analysis, it is of utmost importance to identify outlying observations that deviate markedly from the rest of the dataset before data modelling; otherwise, aberrant data may result in model misspecification, biased parameter estimation and incorrect results. In other words, data-based analysis is futile when the data are contaminated with outliers, because outliers can lead to incorrect conclusions.
Outliers are observations that do not follow the statistical distribution of the bulk of the data and consequently may lead to erroneous results in statistical analysis (Liu et al., 2004). According to Hawkins (1980), an outlier is an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism.
Outliers result from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply natural deviations in populations (Hodge and Austin, 2004). Detecting them can reveal system faults and fraud before they escalate with potentially catastrophic consequences. Detection also makes it possible to identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing.
Outliers can alter the results obtained if they are not carefully handled. Identifying and handling outliers adds considerably to the computational effort, but removing outlying observations can enhance the quality of the data used for statistical inference. Eliminating outliers from the observations has a positive effect on the results of data analysis and data mining. Simple statistical estimates, such as the sample mean and standard deviation, can be significantly biased by individual outliers that lie very far from the middle of the distribution (Kaya, 2010).
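The sensitivity of the sample mean and standard deviation to a single outlier can be illustrated with a minimal Python sketch. The height values below are hypothetical and are not taken from the field data of this study; the median is included only as a point of comparison with a more resistant statistic.

```python
import statistics

def summarize(values):
    """Return the sample mean, sample standard deviation and median."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
    }

# Hypothetical repeated height observations in metres (assumed values).
clean = [10.02, 10.01, 9.99, 10.00, 9.98]
# The same series with one gross error appended.
contaminated = clean + [15.00]

clean_stats = summarize(clean)
contaminated_stats = summarize(contaminated)

# How far each statistic moves when the single outlier is added.
mean_shift = contaminated_stats["mean"] - clean_stats["mean"]
median_shift = contaminated_stats["median"] - clean_stats["median"]

print(f"mean shift:   {mean_shift:.4f} m")
print(f"median shift: {median_shift:.4f} m")
print(f"stdev clean / contaminated: "
      f"{clean_stats['stdev']:.4f} / {contaminated_stats['stdev']:.4f} m")
```

In this sketch one outlier moves the mean by several decimetres and inflates the standard deviation, while the median moves by only a few millimetres.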
The overall objective of outlier identification and removal is to discern the odd data, whose behaviour is highly anomalous when compared with the rest of the data set. Assessing the abnormal behaviour of outliers aids in uncovering the valuable knowledge hidden behind them and also assists in decision making for the improvement of service quality. The main purpose of outlier detection is to separate those observations that diverge from the rest of the dataset. Outlier identification and removal is applied in several fields, such as fraud detection, intrusion detection, data cleaning and medical diagnosis. Data mining includes supervised and unsupervised approaches (Nithya and Caroline, 2014).
Surveying networks are used in many geomatics engineering projects to provide positioning information. In a surveying network, geodetic observations (height differences, distances, angles, directions and GPS baseline components) are made and then parameter estimation is realized using the method of least squares (Yetkin, 2013).
1.2 STATEMENT OF THE PROBLEM
The least squares technique is the most commonly used parameter estimation tool in geomatics. It is carried out by minimizing the sum of squares of the weighted residuals. The advantage of the least squares method is that it gives an unbiased, minimum-variance estimate. However, the technique is limited in that it requires observations free from gross errors (blunders) and systematic bias to provide optimal results. Unfortunately, these unwanted errors are often encountered in practice. Therefore, outlier detection and elimination is necessary when conducting spatial data analysis (Yetkin, 2013).
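A minimal Python sketch of this idea is given below. It fits a straight line y = a + b*t by minimizing the weighted sum of squared residuals through the normal equations, first on error-free data and then with one blunder added. The function name weighted_line_fit, the unit weights and all data values are assumptions made for illustration; they are not the adjustment model or data of this study.

```python
def weighted_line_fit(t, y, w):
    """Fit y = a + b*t by minimizing sum(w_i * v_i**2) via the normal equations."""
    sw = sum(w)
    swt = sum(wi * ti for wi, ti in zip(w, t))
    swtt = sum(wi * ti * ti for wi, ti in zip(w, t))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swty = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    det = sw * swtt - swt * swt
    a = (swtt * swy - swt * swty) / det
    b = (sw * swty - swt * swy) / det
    # Residuals v_i = adjusted value minus observed value.
    residuals = [a + b * ti - yi for ti, yi in zip(t, y)]
    # The quantity that least squares minimizes.
    vtpv = sum(wi * vi * vi for wi, vi in zip(w, residuals))
    return a, b, residuals, vtpv

# Hypothetical observations lying exactly on y = 1 + 2t (assumed values).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y_clean = [1.0, 3.0, 5.0, 7.0, 9.0]
w = [1.0] * len(t)  # equal weights, assumed

# Same observations with a blunder of +3 units on the third value.
y_blunder = list(y_clean)
y_blunder[2] += 3.0

a0, b0, v0, vtpv0 = weighted_line_fit(t, y_clean, w)
a1, b1, v1, vtpv1 = weighted_line_fit(t, y_blunder, w)

print(f"clean fit:   a = {a0:.2f}, b = {b0:.2f}")
print(f"blunder fit: a = {a1:.2f}, b = {b1:.2f}")
print("residuals with blunder:", [round(v, 2) for v in v1])
```

In this sketch the single blunder biases the estimated intercept and leaks into the residuals of every observation, so the residual of the faulty observation is smaller in magnitude than the blunder itself. This smearing effect is one reason small outliers are difficult to isolate from least squares residuals alone.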
The classical least squares adjustment method is vulnerable to blunders because the basic assumption on which the theory of least squares estimation is founded is that all gross and systematic errors have been eliminated before the adjustment is performed, so that only random errors affect the data. However, some outliers are very close to random errors in magnitude and can be identified only by applying outlier tests. Local and large disturbances are regarded as gross errors, blunders or outliers, whereas smaller and global deviations are regarded as systematic errors. Although these concepts look familiar, there are cases where a clear distinction between local systematic errors and outliers of small magnitude cannot be made. Both kinds of errors have the same effect on the observations and therefore must be detected and treated accordingly.
1.3 AIM AND OBJECTIVES
1.3.1 AIM
The main aim of this study is to detect and separate measurement noise (outliers) in the GPS coordinate time series obtained from RTK GPS observations.
1.3.2 OBJECTIVES
The specific objectives are to:
i) Carry out field observations using RTK GPS to generate data sets for detection of outliers.
ii) Develop a MATLAB-based software program for detection of outliers in RTK GPS observations.
1.4 SIGNIFICANCE OF THE STUDY
The monitoring of buildings, slide slopes and crustal movement is essential in geodetic engineering, because the stability of the reference points is very important in geodetic methods of determining displacements. In practice, the determination of the displacements of the monitored points is usually preceded by a study of the stability of the reference points. The application of GPS to determine ground deformation depends on the high accuracy of the obtained coordinates. Of these coordinates, the vertical component is characterized by the lowest accuracy. Outliers in the collected measurements may falsify the point coordinates, so that they do not show the actual movement of the points. This may prevent the proper initiation of constructional or geotechnical safety measures. For these reasons, the study of outlier detection and removal is significant for large engineering structures such as buildings, bridges and dams, which are subject to movements that may eventually lead to failure or collapse of the structures (Neumann and Kutterer, 2006; Zienkiewicz and Baryla, 2015).
1.5 STUDY AREA
The study area of this project is the GidanKwano Campus of the Federal University of Technology, Minna (see Figures 1.0 and 1.1). The campus is located approximately between latitude 9.45 N and 9.60 N (9° 32' 17" N and 9° 31' 27" N) and longitude 6.33 E and 6.54 E (6° 27' 21" E and 6° 26' 18" E). The university, located in Northern Nigeria, was established by the Federal Government of Nigeria in February 1983.
Figure 1.0: Map of Study Area
Figure 1.1: Part of GidanKwano Campus, Minna, Niger State
1.6 SCOPE OF THE STUDY
The study was limited to outliers generated from observations made using just one model of differential global positioning system. It was restricted to the local topographic features found in the study area that would provide poor satellite geometry, and the factors contributing to the outliers were expected to arise from these localized features. They included the School of Environmental complex of the Federal University of Technology, Minna, and some of the trees around the complex building. The observations were also restricted to a small number of sample data points that could be readily re-collected and that covered a small geographic area. Finally, of the many methods available for the detection of outliers, this study focused only on the use of Kalman filter techniques for the detection of outliers in real time kinematic GPS observations.
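One common way of using a Kalman filter for outlier detection is to test the innovation (the difference between a new measurement and the filter prediction) against its expected standard deviation. The Python sketch below illustrates that idea for a single coordinate that is assumed to be static. It is not the MATLAB program developed in this study: the function name kalman_outlier_flags, the static state model, the variances, the threshold of 3 and the height values are all assumptions made for illustration, and the first epoch is assumed to be free of outliers.

```python
def kalman_outlier_flags(z, meas_var, process_var, threshold=3.0):
    """Flag measurements whose normalized innovation exceeds the threshold.

    One-dimensional Kalman filter with a static (random-walk) state model.
    The first measurement initializes the state and is assumed to be good.
    """
    x = z[0]        # state estimate
    p = meas_var    # variance of the state estimate
    flags = [False]
    for zk in z[1:]:
        # Prediction: the state is unchanged, its uncertainty grows.
        p_pred = p + process_var
        innovation = zk - x
        s = p_pred + meas_var  # innovation variance
        if abs(innovation) / s ** 0.5 > threshold:
            # Suspected outlier: skip the update, keep the prediction.
            flags.append(True)
            p = p_pred
            continue
        # Measurement update with the Kalman gain.
        k = p_pred / s
        x = x + k * innovation
        p = (1.0 - k) * p_pred
        flags.append(False)
    return flags

# Hypothetical RTK height series in metres with one spike (assumed values).
heights = [100.00, 100.01, 99.99, 100.02, 100.60, 100.00, 99.98]
flags = kalman_outlier_flags(heights, meas_var=0.02 ** 2, process_var=1e-6)

for h, is_outlier in zip(heights, flags):
    print(f"{h:.2f} m  {'OUTLIER' if is_outlier else 'ok'}")
```

Skipping the update for a flagged epoch keeps the spike from contaminating the state estimate, so later measurements are still tested against a prediction built only from accepted data. A kinematic application would require a state model that includes motion, which this sketch deliberately omits.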