Similarity Searching for Multi-attribute Sequences

We investigate the problem of searching for similar multi-attribute time sequences. Such sequences arise naturally in a number of medical, financial, video, weather forecast, and stock market databases, where more than one attribute is of interest at each time instant. We first solve the simple case in which the distance is defined as the Euclidean distance. We then extend the solution to handle shift and scale invariance. We formulate a new symmetric, scale- and shift-invariant notion of distance for such sequences. We also propose a new index structure that transforms the data sequences and clusters them according to their shifts and scalings. This clustering improves efficiency considerably. According to our experiments with real and synthetic datasets, the index structure is 5 to 45 times faster than competing techniques, with the exact speedup depending on other optimizations such as caching and replication.