Outlier and Trend Detection Using Approximate Median and Median Absolute Deviation
Singh, Gagandeep
Kundu, Suman
In the current era of large-scale technology, users and machines generate vast amounts of data every day. This data arrives as streams that may contain outliers. Detecting those outliers is useful in many ways, for example to identify machine failures caused by overload. Similarly, trends in social media posts are also outliers, and detecting them at different levels has great benefits. This paper proposes an algorithm that approximates the median and the median absolute deviation of a stream of numerical values. The algorithm uses a fixed amount of memory, and its running time is linear in the size of that memory. The median and median absolute deviation are then used to detect outliers and multi-level trends in a way that is robust to noise in the data. Experimental results on CPU usage benchmark data and Twitter post data show the effectiveness of the proposed algorithms.
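To make the idea concrete, the following is a minimal sketch of median/MAD-based outlier detection over a stream. It is not the approximation algorithm proposed in the paper: it simply computes the exact median and median absolute deviation over a bounded sliding window. The function name `mad_outliers`, the parameters `window` and `threshold`, the warm-up length of 5 values, and the scaling constant 1.4826 (which makes the MAD comparable to a standard deviation for normally distributed data) are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import median


def mad_outliers(stream, window=50, threshold=3.0):
    """Yield (value, is_outlier) pairs using a bounded sliding window.

    Illustrative only: exact median and MAD over the last `window` values,
    not the approximate streaming algorithm described in the paper.
    """
    buf = deque(maxlen=window)  # fixed memory: at most `window` values kept
    for x in stream:
        if len(buf) >= 5:  # assumed warm-up before any value is judged
            med = median(buf)
            mad = median(abs(v - med) for v in buf)
            scale = 1.4826 * mad  # assumed normal-consistency scaling
            # With zero spread there is no meaningful scale, so flag nothing.
            is_outlier = scale > 0 and abs(x - med) > threshold * scale
        else:
            is_outlier = False
        buf.append(x)
        yield x, is_outlier


if __name__ == "__main__":
    readings = [10, 11, 10, 12, 11, 10, 95, 11, 10]
    for value, flagged in mad_outliers(readings):
        if flagged:
            print("outlier:", value)
```

Because the median and MAD are barely moved by a single extreme value, the spike stays in the window without distorting the judgment of later readings, which is the robustness to noise that the abstract refers to.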