Data mining in non-stationary data streams has recently been gaining
attention, especially in the context of the Internet of Things and Big Data.
It is a highly challenging task, since the fundamentally different types
of drift that may occur undermine classical assumptions such as
data independence or stationary distributions.
Available algorithms either struggle with certain forms of drift or require
a priori knowledge in the form of a task-specific setting.
We propose the Self Adjusting Memory (SAM) model for the k Nearest Neighbor (kNN) algorithm,
since kNN constitutes a proven classifier within the streaming setting.
SAM-kNN can deal with heterogeneous concept drift,
i.e., different drift types and rates, using biologically inspired
memory models and their coordination. It can be easily
applied in practice since an optimization of the meta-parameters is not necessary.
The basic idea is to construct dedicated models for the
current and former concepts and apply them according to
the demands of the given situation.
An extensive evaluation is conducted on various benchmarks, consisting of artificial streams
with known drift characteristics as well as real-world datasets.
In doing so, we explicitly add new benchmarks that enable a precise performance evaluation
on multiple types of drift. The highly competitive results throughout all
experiments underline the robustness of SAM-kNN as well as its capability
to handle heterogeneous concept drift.
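
The basic idea stated above, dedicated models for the current and former concepts that are applied according to the situation, can be illustrated with a minimal Python sketch. It is not the SAM-kNN implementation. The class name `TwoMemoryKNN`, the fixed-size window, the rule of moving evicted samples into the store for former concepts, and the selection by recent accuracy are all assumptions made for illustration.

```python
from collections import Counter, deque
import math


def knn_predict(memory, x, k):
    """Majority vote among the k samples in `memory` closest to x (Euclidean)."""
    if not memory:
        return None
    nearest = sorted(memory, key=lambda s: math.dist(s[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


class TwoMemoryKNN:
    """Hypothetical two-memory kNN; a simplification, not the authors' model."""

    def __init__(self, k=3, window=20, history=10):
        self.k = k
        # Assumption: the current concept is modelled by a fixed-size window.
        self.current = deque(maxlen=window)
        # Assumption: samples leaving the window represent former concepts.
        self.former = []
        # Recent hit/miss record of each memory, used to pick between them.
        self.hits = {"current": deque(maxlen=history),
                     "former": deque(maxlen=history)}

    def _votes(self, x):
        return {"current": knn_predict(self.current, x, self.k),
                "former": knn_predict(self.former, x, self.k)}

    def _accuracy(self, name):
        record = self.hits[name]
        return sum(record) / len(record) if record else 0.0

    def predict(self, x):
        """Answer with the memory that was more accurate recently."""
        votes = self._votes(x)
        best = max(("current", "former"),
                   key=lambda n: (votes[n] is not None, self._accuracy(n)))
        return votes[best]

    def learn(self, x, y):
        """Score both memories on (x, y), then store the sample."""
        for name, vote in self._votes(x).items():
            if vote is not None:
                self.hits[name].append(vote == y)
        if len(self.current) == self.current.maxlen:
            self.former.append(self.current[0])  # keep the evicted sample
        self.current.append((x, y))
```

In a test-then-train loop one would call `predict(x)` first and `learn(x, y)` once the label arrives. After an abrupt drift the window fills with samples of the new concept and is preferred; if an earlier concept returns, the store of former samples can regain the higher recent accuracy and take over.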