MSTN: Fast and Efficient Multivariate Time Series Prediction Model

Abstract:

Real-world time series often exhibit strong non-stationarity, complex nonlinear dynamics, and behavior expressed across multiple temporal scales, from rapid local fluctuations to slowly evolving long-range trends. However, many contemporary architectures impose rigid, fixed-scale structural priors (such as patch-based tokenization, predefined receptive fields, or frozen backbone encoders), which can over-regularize temporal dynamics and limit adaptability to abrupt high-magnitude events. To address this, we introduce the Multi-scale Temporal Network (MSTN), a hybrid neural architecture grounded in an Early Temporal Aggregation principle. MSTN integrates three complementary components: (i) a multi-scale convolutional encoder that captures fine-grained local structure; (ii) a sequence modeling module that learns long-range dependencies through either recurrent or attention-based mechanisms; and (iii) a self-gated fusion stage incorporating squeeze–excitation and a single dense layer to dynamically reweight and fuse multi-scale representations. This design enables MSTN to flexibly model temporal patterns spanning milliseconds to extended horizons, while avoiding the computational burden typically associated with long-context models. Importantly, MSTN applies early temporal aggregation immediately after encoding, so that all subsequent refinement and prediction modules operate in constant time, O(1), with respect to sequence length L, while the front-end encoder retains its original complexity (O(L^2) for Transformer, O(L) for BiLSTM). Across extensive benchmarks covering imputation, long-term forecasting, classification, and cross-dataset generalization, MSTN achieves state-of-the-art performance, establishing new best results on 21 of 27 datasets, while remaining lightweight (∼0.40M parameters for MSTN-BiLSTM and ∼1.06M for MSTN-Transformer) and suitable for low-latency inference (<1 s, often milliseconds) and resource-constrained deployment.
Code: https://github.com/SumitPTW/MSTN

Keywords: Multivariate time series, multi-scale temporal modeling, time series forecasting,
classification, imputation, computational efficiency, deep learning architectures.

MSTN architecture:

MSTN multi-scale signal processing pipeline:
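
The stages named in the abstract can be illustrated with a minimal plain-Python sketch. It is not the MSTN implementation: the function names, the kernel sizes (3, 5, 7), the fixed averaging filters, mean pooling as the aggregation operator, the reduction ratio, and the random weights are all assumptions made for illustration, and the sequence-modeling module (BiLSTM or Transformer) is omitted for brevity. The sketch only shows why everything after the aggregation step works on a fixed-size vector, regardless of the input length L.

```python
import math
import random


def conv1d_same(x, kernel):
    """1-D convolution with zero padding; output has len(x) steps (odd kernel sizes)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[t + j] for j in range(k)) for t in range(len(x))]


def multi_scale_encode(series, kernel_sizes=(3, 5, 7)):
    """Filter every channel at several scales; returns C * len(kernel_sizes) maps."""
    maps = []
    for channel in series:
        for k in kernel_sizes:
            kernel = [1.0 / k] * k  # assumption: stand-in for learned filter weights
            maps.append(conv1d_same(channel, kernel))
    return maps


def aggregate_over_time(maps):
    """Early temporal aggregation (assumed here to be mean pooling over time)."""
    return [sum(m) / len(m) for m in maps]


def dense(vec, weights, bias):
    """Single fully connected layer: weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def se_gate(vec, w1, b1, w2, b2):
    """Squeeze-excitation style gate: bottleneck, ReLU, sigmoid, then rescale."""
    hidden = [max(0.0, h) for h in dense(vec, w1, b1)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in dense(hidden, w2, b2)]
    return [g * v for g, v in zip(gates, vec)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(num_features, horizon, rng, reduction=2):
    """Hypothetical random weights for the gate and the dense prediction head."""
    hidden = max(1, num_features // reduction)
    return {
        "se": (random_matrix(hidden, num_features, rng), [0.0] * hidden,
               random_matrix(num_features, hidden, rng), [0.0] * num_features),
        "head": (random_matrix(horizon, num_features, rng), [0.0] * horizon),
    }


def forward(series, params):
    pooled = aggregate_over_time(multi_scale_encode(series))  # cost grows with L
    gated = se_gate(pooled, *params["se"])                    # independent of L
    return dense(gated, *params["head"])                      # independent of L


rng = random.Random(0)
channels = 2
params = init_params(num_features=channels * 3, horizon=4, rng=rng)

# Two inputs with the same channels but different lengths L.
short = [[math.sin(0.1 * t + c) for t in range(96)] for c in range(channels)]
long = [[math.sin(0.1 * t + c) for t in range(720)] for c in range(channels)]

pooled_short = aggregate_over_time(multi_scale_encode(short))
pooled_long = aggregate_over_time(multi_scale_encode(long))
print(len(pooled_short), len(pooled_long))                      # 6 6
print(len(forward(short, params)), len(forward(long, params)))  # 4 4
```

Both inputs are reduced to a vector of 6 features (2 channels × 3 scales) before the gate and the dense head, so the same weights serve L = 96 and L = 720, and the cost of those two stages does not change with L. Only the encoder's cost depends on the sequence length.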