Multidimensional Multi-Sensor Time-Series Data Analysis Framework

Hello, friends. In this blog post, I will take you through my package “msda”, which is useful for time-series sensor data analysis. A quick introduction to time-series data is also provided. The demo notebook can be found here.

A specific use case of the package, focused on “Unsupervised Feature Selection”, can be found in the blog post here.

What is Time Series Data?

Time series data is a sequence of observations recorded over time, usually at regular intervals. For instance, if a set of sensors is sampled at equal intervals, the readings of each sensor form a time series. If the data is collected without any order in time, or all at once, it is not time series data.

There are two types of time series data:

1- Stock Series (a measure of an attribute at a particular point in time)

2- Flow Series (a measure of activity over a time interval)
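
The difference shows up as soon as you aggregate the data. Here is a minimal sketch, using made-up readings and hypothetical names (`tank_level`, `litres_pumped`), of how the two types are usually summarised: a stock series is reported as its value at a point in time, while a flow series is added up over the interval.

```python
# Hypothetical hourly readings: (hour, tank_level, litres_pumped).
# tank_level is a stock (a level at an instant); litres_pumped is a flow (activity within the hour).
readings = [
    (0, 100.0, 5.0),
    (1, 105.0, 5.0),
    (2, 103.0, 0.0),
    (3, 110.0, 7.0),
]


def aggregate(block):
    """Summarise a block of readings: last value for the stock, sum for the flow."""
    stock_level = block[-1][1]               # stock: the level at the end of the block
    flow_total = sum(r[2] for r in block)    # flow: the activity accumulated over the block
    return stock_level, flow_total


level, pumped = aggregate(readings)
print(level, pumped)
```

Summing the tank level over four hours would be meaningless, and taking only the last value of the pumped volume would throw away most of the activity, which is why the two types need different treatment.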

Components of Time Series Data

To analyze time series data, we need to know the different types of patterns. Together, these patterns make up the set of observations in a time series.

1) Trend: A long-term pattern present in the time series. It represents the low-frequency variation that remains once the medium- and high-frequency fluctuations have been filtered out of the time series.

If there is no increasing or decreasing pattern in the time series data, it is taken as stationary in the mean.

There are two types of trend pattern:

2) Cyclic: The pattern exhibits up and down movements around a given trend. The period is not fixed and usually lasts at least 2 months.

3) Seasonal: A pattern that reflects regular fluctuations. These short-term movements occur due to seasonal factors and people’s customs. The data shows regular and predictable changes that recur at regular calendar intervals. It always has a fixed and known period.

The main sources of seasonality:

Models to create a seasonal component in time series:

4) Irregular: It is an unpredictable component of time series.
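
To make the components concrete, here is a minimal sketch of a classical additive decomposition on a synthetic series. Everything in it is an assumption made for illustration: the series is built as trend + seasonal + irregular, the seasonal period is 4, and the trend is estimated with a centered moving average (this version only works for an even period). It is not the method used inside msda.

```python
period = 4
pattern = [2.0, -1.0, -2.0, 1.0]   # hypothetical seasonal effect, sums to zero over one cycle
n = 24
noise = [0.1 * ((7 * t) % 5 - 2) for t in range(n)]   # small deterministic "irregular" part
observed = [0.5 * t + pattern[t % period] + noise[t] for t in range(n)]


def centered_moving_average(x, period):
    """Centered moving average for an even period; None where the window does not fit."""
    half = period // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half : t + half + 1]
        # period + 1 points, with half weight on the two end points
        total = sum(window) - 0.5 * (window[0] + window[-1])
        out[t] = total / period
    return out


# Trend component: the slow movement left after smoothing over one full cycle.
trend = centered_moving_average(observed, period)

# Seasonal component: average detrended value at each position in the cycle.
buckets = [[] for _ in range(period)]
for t, (x, tr) in enumerate(zip(observed, trend)):
    if tr is not None:
        buckets[t % period].append(x - tr)
seasonal = [sum(b) / len(b) for b in buckets]

# Irregular component: whatever trend and seasonality do not explain.
irregular = [
    None if tr is None else x - tr - seasonal[t % period]
    for t, (x, tr) in enumerate(zip(observed, trend))
]

print([round(s, 2) for s in seasonal])
```

The estimated seasonal values should land close to the `pattern` used to build the series, and the irregular part should stay small, since it only contains the injected noise.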

Time Series Data vs Cross-Section Data

Time series data is a collection of observations of one specific variable at regular intervals of time. Cross-section data, on the other hand, consists of data on multiple variables from different sources at a particular point in time. A company’s stock market data collected at regular intervals over a year is an example of time series data. But when a company’s sales revenue and sales volume are collected for the past 3 months, it is taken as an example of cross-section data. Time series data is mainly used for obtaining results over an extended period of time, whereas cross-section data focuses on the information received from surveys at a particular time.

What is Time Series Analysis?

Analysis is performed in order to understand the structure of the time series and the underlying functions that produce it.

Two approaches are used for analyzing time series data:

Time series analysis is mainly used for -

Need of Time Series Analysis

Understanding the time series is important for successful modeling in machine learning and deep learning. Time series analysis is used to understand the internal structure and the functions that produce the observations. Time series analysis is used for -

Applications of Time Series Analysis

Few Time-Series Application Area Examples

Now that we have covered the basics of time series, let’s delve into the MSDA package and its details.

What is MSDA?

MSDA is an open source, low-code Multi-Sensor Data Analysis library in Python that aims to reduce the hypothesis-to-insights cycle time in time-series multi-sensor data analysis and experiments. It enables users to perform end-to-end proof-of-concept experiments quickly and efficiently. The module identifies events in a multidimensional time series by capturing variation and trend. It uses these to establish relationships between signals and identify correlated features, which helps in feature selection from raw sensor signals.
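
To give a feel for what "identifying correlated features" can mean in practice, here is a minimal sketch of a correlation-based sensor filter. This is my simplification for illustration and not the algorithm implemented in msda: the sensor names, the readings, the `select_sensors` helper and the 0.95 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical readings from three sensors sampled at the same instants.
sensors = {
    "temp_a": [20.0, 21.0, 22.5, 24.0, 25.0, 27.0],
    "temp_b": [19.8, 21.1, 22.4, 24.2, 25.1, 26.8],   # nearly duplicates temp_a
    "vibration": [0.3, 0.9, 0.2, 0.8, 0.1, 0.7],
}


def select_sensors(sensors, threshold=0.95):
    """Greedy filter: keep a sensor only if it is not highly correlated with one already kept."""
    kept = []
    for name, values in sensors.items():
        if all(abs(pearson(values, sensors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


selected = select_sensors(sensors)
print(selected)
```

Here `temp_b` is dropped because it carries almost the same information as `temp_a`, while `vibration` is kept because it is nearly uncorrelated with the temperature signal.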

The package includes:-

a) Features involving trend of values across various aggregation windows: change and rate of change in average, std. deviation across window.

b) Ratio of changes, growth rate with std. deviation.

c) Change over time.

d) Rate of change over time.

e) Growth or decay.

f) Rate of growth or decay.

g) Count of values above or below a threshold value.
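
As a rough illustration of the kinds of features listed above, here is a minimal sketch that computes them over non-overlapping windows of a single sensor signal using only the standard library. The function name `window_features`, the dictionary keys and the sample readings are assumptions made for this example. They are not the msda API, so please refer to the demo notebook for the actual calls.

```python
from statistics import mean, stdev


def window_features(values, window, threshold):
    """Compute simple per-window features for one sensor's readings."""
    # Split the signal into non-overlapping windows; a trailing partial window is dropped.
    windows = [values[i : i + window] for i in range(0, len(values) - window + 1, window)]
    features = []
    prev_mean = None
    for w in windows:
        avg = mean(w)
        change = None if prev_mean is None else avg - prev_mean
        features.append(
            {
                "mean": avg,
                "std": stdev(w),                     # needs at least 2 values per window
                "change": change,                    # change in average vs. previous window
                "rate_of_change": None if change is None else change / window,  # per sample
                "growth_rate": None if not prev_mean else change / prev_mean,   # > 0 growth, < 0 decay
                "count_above": sum(v > threshold for v in w),
            }
        )
        prev_mean = avg
    return features


readings = [10.0, 12.0, 11.0, 13.0, 15.0, 18.0, 17.0, 20.0, 19.0, 16.0, 14.0, 12.0]
rows = window_features(readings, window=4, threshold=15.0)
for row in rows:
    print(row)
```

The first window has no previous window to compare against, so its change-based features are left as `None` rather than filled with a made-up value.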

Overview:-

MSDA is a prototype for feature/sensor selection from multi-dimensional heterogeneous/homogeneous time series multi-sensor data. An intuitive representation of the framework is shown below.

Pictorial representation of multi-dimensional time series data feature selection

Features Include:-

Core Functionalities in MSDA

MSDA Workflow:-

MSDA algorithm workflow

Terminal Installation:-

The easiest way to install msda is using pip.

pip install msda

or

$ git clone https://github.com/ajayarunachalam/msda

$ cd msda

$ python setup.py install

Install in Jupyter Notebook:-

!pip install msda

Follow the rest as demonstrated in the demo example here: https://github.com/ajayarunachalam/msda/tree/master/demo.ipynb

Who should use MSDA?

MSDA is an open source library that anybody can use. In my view, the ideal target audience of MSDA is:

CONTACT

You can reach me at ajay.arunachalam08@gmail.com

Thank you for reading. Happy Learning :)

REFERENCES

AWS Certified ML Specialist; Cloud Solution Architect; Sr. Data Scientist & Researcher (AI) — https://www.linkedin.com/in/ajay-arunachalam-4744581a/