A fundamental question in system modeling concerns the level of difficulty one may encounter when designing a model on the basis of training data. In this study, we argue that this level of difficulty depends inherently on the variability of the available function (data). If a pair of inputs that differ only slightly produces outputs that differ substantially, building a model from such data is more challenging than when the output differences are far more limited. Building on this observation, we introduce a variability index that quantifies the nature of the data in terms of the variability observed in the input and output data, respectively. The proposed index is model-neutral (model-agnostic): it describes and quantifies the modeling challenge implied by the data irrespective of the specific model to be constructed. In the case of functions, we show that the Lipschitz constant plays a role similar to that of the variability index computed for experimental data. We introduce an original way of reducing the values of the variability index through a nonlinear transformation of the original data, realized by a fuzzy rule-based model. It is shown that such a rule-based architecture gives rise to a piecewise linear transformation (multipoint linear approximation) exhibiting the required contraction-dilation characteristics. The transformation is optimized with a Particle Swarm Optimization algorithm. We also demonstrate that the index can be used to quantify the concept of adversarial data. Along this line, we introduce a granular characterization of the adversarial nature of individual data points. A series of experiments on publicly available data illustrates the approach and offers detailed insight into the nature of these data.
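The intuition behind the index, and its link to the Lipschitz constant, can be illustrated with a minimal sketch. The abstract does not state the index's formula, so the pairwise ratio below (output difference divided by input distance) is only an assumed stand-in, and the function name `pairwise_variability` is hypothetical. For samples of a function, the largest such ratio is an empirical lower bound on its Lipschitz constant.

```python
import numpy as np

def pairwise_variability(X, y, eps=1e-12):
    """Ratios |y_i - y_j| / ||x_i - x_j|| over all distinct pairs.

    Assumed illustration only; not the index defined in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat 1-D input as one feature
    i, j = np.triu_indices(len(X), k=1)     # every unordered pair once
    dx = np.linalg.norm(X[i] - X[j], axis=1)
    dy = np.abs(y[i] - y[j])
    keep = dx > eps                         # skip coincident inputs
    return dy[keep] / dx[keep]

x = np.linspace(0.0, 1.0, 50)
smooth = pairwise_variability(x, 2.0 * x)            # slope 2 everywhere
rough = pairwise_variability(x, np.sin(40.0 * x))    # Lipschitz constant 40

# The maximum ratio estimates the Lipschitz constant from below.
print(smooth.max(), rough.max())
```

Under this assumed definition, the linear data yield a ratio of 2 for every pair, while the rapidly oscillating data yield much larger ratios for nearby inputs, which matches the claim that small input differences paired with large output differences make modeling harder.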