Computing batched depth queries and the depth of a set of points

Thumbnail Image
Rajapakse, Sachini Kanchana
Journal Title
Journal ISSN
Volume Title
In Computational Geometry and Statistics, we often encounter problems related to data depth. Data depth describes the centrality of an object relative to a set of objects. Simplicial depth and Tukey depth are two such common depth measures for expressing the depth of a point q relative to a set P of points in R^d. In this thesis, we discuss the problems of how efficiently we can compute the depths of k query points in a batch relative to a set P of n points. We examine three algorithms for computing batched simplicial depth queries and two algorithms to compute batched Tukey depth queries in R^2. Depending on the relative cardinalities of P and Q, one algorithm performs better than others for each depth measure. Then, we introduce new notions to express the depth of a set Q of points relative to a set P of points in R^d as the average depth of points in Q. We discuss how to compute these new notions of depth in R^2 by applying the algorithms above, giving an algorithm for computing the simplicial depth of a set Q relative to set P in O(min{kn log n, n^2+nk, n^4+k log n}) time, and the Tukey depth of a set Q relative to set P in O(min{kn log n, n^2 + k log n}) time, when n = |P| and k = |Q|. Further, we consider different statistical and probabilistic interpretations of these new notions of depth of a set. We show that the simplicial depth of Q relative to P is proportional to the expected number of points of Q contained in a simplex constructed from points selected randomly from P. Properties of depth measures are commonly analyzed to compare and contrast different depth measures, which are initially introduced for the depth measures of a single query point; we present generalizations for five of these properties for the depth measures of sets and evaluate these properties to compare depth measures defined for a set of points.
Data depth, Tukey depth, Simplicial depth, Depth of multiple points, Depth of a query set