TOMASZ MICHALSKI [*]
This paper defines a measure of similarity between two functions. The measure is a quantitative characteristic of the similarity of function graphs and is helpful in comparing the dynamics of economic processes. Construction of the measure
Introduction
The growing interest of central and eastern European countries in integration with the European Union is connected with the implementation of dedicated economic programs, the so-called "access programs." The main focus of these programs is the integration of the economies of the candidate countries with those of the European Union countries [Michalski, 1995, p. 229; Nowak, 1990]. Evaluating the efficiency and effectiveness of the programs enables their verification and possible correction. The evaluation is carried out with the methods of similarity studies. Most of these methods focus on comparing items at a specific moment, so the dynamic character of the problem is lost or downplayed.
Introducing new methods of comparing economic processes, or improving the existing ones, gives greater scope for controlling and influencing those processes, and makes it possible to correct an economic program while it is running. This is crucial at a time when the processes of economic integration are receiving so much attention. We suggest that a method for comparing and evaluating the dynamics of economic processes should be included among the methods of similarity studies used to assess the access programs. In the suggested comparative method, the similarity of the dynamics of economic processes is defined by a measure of the compatibility of functions. The functions considered are mainly the so-called "dynamics functions" (or tendencies or distributions) [Dhrymes, 1971], whose analytical form we treat as a mathematical model of the dynamics of the compared processes.
The Similarity Measure of Functions
Starting from the definition [Dorosiewicz and Michalski, 1998, p. 198], let $f, g: D \to \mathbb{R}$ be given functions that map the set $D \subset \mathbb{R}^n_e$ into $\mathbb{R}$. Let $J$ be a function space, a subset of the set $\mathrm{Map}(D, \mathbb{R})$ of all functions mapping $D$ into $\mathbb{R}$. A similarity measure on $J$ is a map
$$\mu: J \times J \to \mathbb{R}$$
with the following properties. First, $\mu$ is normalized, that is, $-1 \le \mu(f, g) \le 1$ for any $(f, g) \in J \times J$, and
$$\mu(f, f) = 1 \quad \text{for } f \in J.$$
Second, $\mu$ is symmetric: $\mu(f, g) = \mu(g, f)$ for any $(f, g) \in J \times J$. Third, $\mu$ is invariant under vertical shifts: if $(f, g) \in J \times J$ and $(f, g + c) \in J \times J$, where $c$ is a constant, then $\mu(f, g) = \mu(f, g + c)$.
Example 1
Let $f, g$ be continuously differentiable functions on the interval $]a, b[$, where $-\infty \le a < b \le +\infty$. The similarity measure can be constructed in two steps:
1) construction of the measure at a given point $t \in ]a, b[$; and
2) construction of this measure on a given subset of $\mathbb{R}$.
In step 1, the basis of the construction of the measure of $f, g$ at the point $t \in ]a, b[$ is the cosine of the angle between the tangent lines to the graphs of $f$ and $g$ at the points $(t, f(t))$ and $(t, g(t))$, respectively (see Figure 1). Consider the situation in Figure 2.
If $\theta_1 = \pi - \theta_2$, then obviously $|\cos\theta_1| = |\cos\theta_2|$. However, it seems that the function $f$ is more similar to $g$ at $t_1$ than at $t_2$ (in the neighborhood of $t_1$, both $f$ and $g$ increase, but at $t_2$, they do not). The measure of similarity expressed by (1) reflects this fact:
$$\mu(f, g)(t) = \frac{|1 + f'(t)\, g'(t)|}{\sqrt{(1 + (f'(t))^2)(1 + (g'(t))^2)}} \, \mathrm{sgn}(f'(t)\, g'(t)), \tag{1}$$
where $\mu(f, g)(t)$ is the value of the measure at the point $t$, and sgn is the function defined by:
$$\mathrm{sgn}(a) = \begin{cases} 1 & \text{if } a \ge 0, \\ -1 & \text{if } a < 0. \end{cases}$$
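Equation (1) depends on the two functions only through their slopes at t, so it can be sketched as a function of two numbers. The following is a minimal illustration, not the authors' implementation; the names `sgn` and `pointwise_similarity` are hypothetical and were chosen for this sketch.

```python
import math


def sgn(a: float) -> float:
    """Sign convention from the text: +1 for a >= 0, -1 for a < 0."""
    return 1.0 if a >= 0 else -1.0


def pointwise_similarity(df: float, dg: float) -> float:
    """Equation (1), evaluated from the slopes df = f'(t) and dg = g'(t)."""
    numerator = abs(1.0 + df * dg)
    denominator = math.sqrt((1.0 + df * df) * (1.0 + dg * dg))
    return numerator / denominator * sgn(df * dg)


# Equal slopes: the tangent lines are parallel, so the value is 1.
print(pointwise_similarity(2.0, 2.0))
# Both functions increase, but with different steepness: value in (0, 1).
print(pointwise_similarity(1.0, 3.0))
# One function increases and the other decreases: the sign factor
# makes the value negative.
print(pointwise_similarity(1.0, -3.0))
```

The sign factor is what separates the two situations discussed around Figure 2: the fraction alone takes the same value for an angle and its supplement, while the product of the slopes records whether both functions move in the same direction.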
In step 2, using the similarity measure at the point $t$, we can define the value of the measure on $]a, b[$, where $-\infty \le a < b \le +\infty$ (if it exists):
$$\mu_{]a,b[}(f, g) = \lim_{\varepsilon \to b^-,\ \delta \to a^+} \frac{1}{\varepsilon - \delta} \int_{\delta}^{\varepsilon} \mu(f, g)(t)\, dt. \tag{2}$$
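For a finite interval, the average in (2) can be approximated numerically. The sketch below assumes that a and b are finite, that the derivatives f' and g' are supplied as Python callables, and that the limit in (2) exists; it uses a midpoint rule, which never evaluates the integrand at the endpoints. The function names and the default number of subintervals are assumptions of this illustration.

```python
import math
from typing import Callable


def pointwise_similarity(df: float, dg: float) -> float:
    """Equation (1), with sgn(0) = 1 as in the text."""
    sign = 1.0 if df * dg >= 0 else -1.0
    return abs(1.0 + df * dg) / math.sqrt((1.0 + df * df) * (1.0 + dg * dg)) * sign


def interval_similarity(
    df: Callable[[float], float],
    dg: Callable[[float], float],
    a: float,
    b: float,
    n: int = 10_000,
) -> float:
    """Midpoint-rule approximation of (2) on a finite interval ]a, b[."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += pointwise_similarity(df(t), dg(t))
    return total * h / (b - a)


# f(t) = t and g(t) = 2t have constant slopes 1 and 2.
print(interval_similarity(lambda t: 1.0, lambda t: 2.0, 0.0, 1.0))
# f = sin increases and g = cos decreases on ]0, pi/2[.
print(interval_similarity(math.cos, lambda t: -math.sin(t), 0.0, math.pi / 2))
```

In the second call the arguments are the derivatives of sin and cos, so the product of the slopes is negative on the whole interval and the averaged measure is negative as well.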
The value of the measure of similarity on a closed interval is defined in the same way. For sufficiently smooth functions (that is, with continuous derivatives of order $N$ for some $N \ge 1$), one can define the measure of similarity by (2), with $\mu(f, g)(t)$ replaced by:
$$\mu_N(f, g)(t) = \sum_{p=1}^{N} w_p\, \mu(f^{(p)}, g^{(p)})(t),$$
and the weights $w_p$ $(p = 1, \ldots, N)$ given, for example, by:
$$w_p = \frac{1}{p} \left( \sum_{j=1}^{N} \frac{1}{j} \right)^{-1} \quad \text{or} \quad w_p = \frac{1}{p!} \left( \sum_{j=1}^{N} \frac{1}{j!} \right)^{-1},$$
and so on.
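Both weight families are normalized so that the weights sum to one and decrease with the order p of the derivative. A short sketch, with hypothetical function names, makes this explicit:

```python
import math
from typing import List


def harmonic_weights(n: int) -> List[float]:
    """w_p = (1/p) / sum_{j=1..n} (1/j), for p = 1, ..., n."""
    norm = sum(1.0 / j for j in range(1, n + 1))
    return [1.0 / (p * norm) for p in range(1, n + 1)]


def factorial_weights(n: int) -> List[float]:
    """w_p = (1/p!) / sum_{j=1..n} (1/j!), for p = 1, ..., n."""
    norm = sum(1.0 / math.factorial(j) for j in range(1, n + 1))
    return [1.0 / (math.factorial(p) * norm) for p in range(1, n + 1)]


for weights in (harmonic_weights(4), factorial_weights(4)):
    print(weights, sum(weights))
```

The factorial weights decay much faster, so they give the higher-order derivatives far less influence than the harmonic weights do.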
Example 2
Consider the set $T = \{t_0 + kh : k = 1, \ldots, m\}$, where $t_0 \in \mathbb{R}$, $h > 0$, and $m \in \mathbb{N} \cup \{+\infty\}$ are given, and two real functions $f, g$ defined on $T$. In this case, the measure of similarity of $f$ and $g$ is the average value of the measures at the points of $T$:
$$\mu(f, g) = \begin{cases} \liminf\limits_{n \to +\infty} \dfrac{1}{n-1} \sum\limits_{k=1}^{n-1} \mu(f, g)(t_0 + kh), & \text{if } m = +\infty, \\ \dfrac{1}{m-1} \sum\limits_{k=1}^{m-1} \mu(f, g)(t_0 + kh), & \text{if } m < +\infty, \end{cases} \tag{3}$$
where, for $t \in T$:
$$\mu(f, g)(t) = \frac{|1 + \delta f(t)\, \delta g(t)|}{\sqrt{(1 + (\delta f(t))^2)(1 + (\delta g(t))^2)}} \, \mathrm{sgn}(\delta f(t)\, \delta g(t)), \tag{4}$$
and for $\varphi = f, g$:
$$\delta\varphi(t_0 + kh) = \frac{\varphi(t_0 + (k+1)h) - \varphi(t_0 + kh)}{h}.$$
Equations (3) and (4) are the discrete counterparts of (2) and (1), respectively. The difference quotient $\delta\varphi$ corresponds in a natural way to the derivative $\varphi'$.
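For finite m, equations (3) and (4) amount to averaging the pointwise measure over the m - 1 grid points at which a forward difference exists. A minimal sketch, assuming the function values are given as equally long sequences sampled at t_0 + h, ..., t_0 + mh (the name `discrete_similarity` is hypothetical):

```python
import math
from typing import Sequence


def pointwise_similarity(df: float, dg: float) -> float:
    """Equation (4) for given difference quotients, with sgn(0) = 1."""
    sign = 1.0 if df * dg >= 0 else -1.0
    return abs(1.0 + df * dg) / math.sqrt((1.0 + df * df) * (1.0 + dg * dg)) * sign


def discrete_similarity(f_vals: Sequence[float], g_vals: Sequence[float], h: float) -> float:
    """Finite-m case of (3): mean of (4) over k = 1, ..., m - 1."""
    if len(f_vals) != len(g_vals) or len(f_vals) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    m = len(f_vals)
    total = 0.0
    for k in range(m - 1):
        df = (f_vals[k + 1] - f_vals[k]) / h  # forward difference quotient of f
        dg = (g_vals[k + 1] - g_vals[k]) / h  # forward difference quotient of g
        total += pointwise_similarity(df, dg)
    return total / (m - 1)


f = [1.0, 2.0, 3.0, 4.0]
print(discrete_similarity(f, [2.0, 4.0, 6.0, 8.0], 1.0))  # both series rise
print(discrete_similarity(f, [8.0, 6.0, 4.0, 2.0], 1.0))  # g falls while f rises
print(discrete_similarity(f, [7.0, 9.0, 11.0, 13.0], 1.0))  # g shifted by a constant
```

The first and third calls return the same value, because the measure uses only differences and therefore ignores a constant added to one of the series.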
The construction from the last example can be generalized in several ways. Namely, let $X$ be a set, $M$ a given $\sigma$-algebra of subsets of $X$, and $\nu: M \to \mathbb{R}$ a nonnegative and bounded measure such that $\nu(X) > 0$ and the map $x \mapsto \mu(f, g)(x)$ is $\nu$-measurable.
The measure of similarity of the functions $f, g: X \to \mathbb{R}$ is then equal to:
$$\mu(f, g) = \frac{1}{\nu(X)} \int_X \mu(f, g)(x)\, d\nu(x).$$
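When X is a finite set, the integral reduces to a weighted average of the pointwise values, with the mass of each point as its weight. The sketch below covers only this finite case; the function name and the sample numbers are illustrative assumptions.

```python
from typing import Sequence


def weighted_similarity(pointwise: Sequence[float], nu: Sequence[float]) -> float:
    """(1 / nu(X)) * sum over x of mu(f, g)(x) * nu({x}) for a finite set X."""
    if len(pointwise) != len(nu):
        raise ValueError("one weight per point is required")
    if any(w < 0 for w in nu):
        raise ValueError("the measure nu must be nonnegative")
    total_mass = sum(nu)
    if total_mass <= 0:
        raise ValueError("nu(X) must be positive")
    return sum(m * w for m, w in zip(pointwise, nu)) / total_mass


values = [1.0, -0.5, 0.5]  # hypothetical pointwise measures at three points
print(weighted_similarity(values, [1.0, 1.0, 2.0]))  # 0.375
print(weighted_similarity(values, [1.0, 1.0, 1.0]))  # equal weights: plain mean
```

With equal weights the result is the plain arithmetic mean used in the finite case of (3).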
Similarity of Stochastic Processes
The above ideas can be applied to construct a measure of similarity of stochastic processes. The case of continuous-time stochastic processes is much more complex. Equation (1) is not useful in many important cases because it requires the differentiability of the compared processes. Unfortunately, many stochastic processes are not differentiable in any sense. For example, almost all (with probability one) trajectories of a separable Wiener process are continuous but not differentiable. Moreover, the Wiener process is not differentiable in the mean-square sense [Sobczyk, 1996, pp. 116-9]. This makes it necessary to change the definition of the measure from Example 1.
For discrete-time stochastic processes, the measure of similarity can be defined analogously to (3) and (4). More precisely, suppose that $(\xi_t)_{t \in T_1}$ and $(\eta_t)_{t \in T_2}$ are given stochastic processes, where $T_1, T_2 \subset X$ and $X$ is the same as in Example 2. The domains $T_1$ and $T_2$ of these stochastic processes need not be the same, so we suppose that $T_1 \cap T_2$ is finite and nonempty. It is easy to see that for any $t \in T_1 \cap T_2$:
$$\lambda_t(\xi, \eta) = \frac{|1 + \delta\xi_t\, \delta\eta_t|}{\sqrt{(1 + (\delta\xi_t)^2)(1 + (\delta\eta_t)^2)}} \, \mathrm{sgn}(\delta\xi_t\, \delta\eta_t) \tag{5}$$
is a random variable. Moreover, since $|\lambda_t(\xi, \eta)| \le 1$, the Lebesgue dominated convergence theorem implies that the expected value and the other moments of $\lambda_t(\xi, \eta)$ exist and are finite for all $t \in T_1 \cap T_2$.
The value of the measure of similarity of $\xi, \eta$ can be obtained from the formula analogous to (3):
$$\mu(\xi, \eta) = \frac{1}{|T_1 \cap T_2|} \sum_{t \in T_1 \cap T_2} E(\lambda_t(\xi, \eta)). \tag{6}$$
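In practice the expectations in (6) are rarely available in closed form. One possible approach, sketched below, replaces each expectation with an average over independent replications of the two processes. The sketch assumes that every replication is already restricted to the common time grid and that the last grid point is dropped because it has no forward difference; the function names and the sample paths are hypothetical.

```python
import math
from typing import List, Sequence


def lam(dxi: float, deta: float) -> float:
    """Equation (5) for given increments, with sgn(0) = 1."""
    sign = 1.0 if dxi * deta >= 0 else -1.0
    return abs(1.0 + dxi * deta) / math.sqrt((1.0 + dxi**2) * (1.0 + deta**2)) * sign


def forward_differences(path: Sequence[float], h: float) -> List[float]:
    """Difference quotients (path[t + 1] - path[t]) / h."""
    return [(path[t + 1] - path[t]) / h for t in range(len(path) - 1)]


def mc_similarity(
    xi_paths: Sequence[Sequence[float]],
    eta_paths: Sequence[Sequence[float]],
    h: float = 1.0,
) -> float:
    """Estimate (6): average lambda_t over replications, then over t."""
    n_rep = len(xi_paths)
    n_t = len(xi_paths[0]) - 1
    expected = [0.0] * n_t  # sample estimate of E(lambda_t) for each t
    for xi, eta in zip(xi_paths, eta_paths):
        dxi = forward_differences(xi, h)
        deta = forward_differences(eta, h)
        for t in range(n_t):
            expected[t] += lam(dxi[t], deta[t]) / n_rep
    return sum(expected) / n_t


# Two hand-made replications: in the first the paths move together,
# in the second they move in opposite directions.
xi_paths = [[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]]
eta_paths = [[0.0, 1.0, 2.0], [0.0, -2.0, -4.0]]
print(mc_similarity(xi_paths, eta_paths))
```

In this toy input the first replication contributes 1 at each time point and the second contributes -0.6, so the estimate is close to 0.2.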
The random variables $\lambda_t(\xi, \eta)$, $t \in T_1 \cap T_2$, can be used to construct statistics for verifying hypotheses about the (non)stationarity of the process generating a given time series. These statistics may be used together with well-known stationarity tests such as the Dickey-Fuller test (see, for example, [Hamilton, 1994, p. 475]). To illustrate this idea, consider the stochastic processes $(X_t)$, $(Y_t)$, $(1_t)$ defined for $t = 0, 1, \ldots, T$:
$$P(X_0 = a) = P(Y_0 = b) = P(1_t = c) = 1 \quad \text{for some numbers } a, b, c,$$
$$X_t = c_1 + X_{t-1} + \varepsilon_t, \quad t = 1, \ldots, T, \tag{7}$$
and
$$Y_t = c_2 + Y_{t-1} + \delta_t, \quad t = 1, \ldots, T,$$
where $c_1, c_2$ are constants and $(\varepsilon_t)$, $(\delta_t)$ are independent Gaussian white noises with zero means and unit variances.
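The processes in (7) are straightforward to simulate. The sketch below is one way to generate sample paths with the Python standard library; the function names, the seed, the drift value, and the path length are illustrative assumptions, not values from the paper.

```python
import random
from typing import List


def simulate_random_walk(x0: float, drift: float, T: int, rng: random.Random) -> List[float]:
    """Path X_0, ..., X_T with X_t = drift + X_{t-1} + N(0, 1) noise."""
    path = [x0]
    for _ in range(T):
        path.append(drift + path[-1] + rng.gauss(0.0, 1.0))
    return path


def constant_process(c: float, T: int) -> List[float]:
    """The degenerate process equal to c at every t = 0, ..., T."""
    return [c] * (T + 1)


rng = random.Random(0)  # fixed seed so that the run is reproducible
X = simulate_random_walk(0.0, 0.1, 100, rng)
Y = simulate_random_walk(0.0, 0.1, 100, rng)  # further draws from the same generator
one = constant_process(1.0, 100)
print(len(X), len(Y), len(one))
```

Drawing both noise sequences from one generator, one after the other, is a simple way to obtain (pseudo-)independent innovations for the two walks.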
Consider the random variables:
$$\lambda_1 = \mu(\xi, B^{[T/2]}\eta), \tag{8a}$$
$$\lambda_2 = \mu(\xi, \eta), \tag{8b}$$
and
$$\lambda_3 = \mu(\xi, 1), \tag{8c}$$
where $\mu$ is defined by (6), $B$ denotes the time-translation operator, $(B\eta)_t = \eta_{t+1}$, and $[x]$ is the integer part of the number $x$.
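For a single pair of observed series, the three quantities in (8a) through (8c) can be sketched as follows. The sketch makes two assumptions that go beyond the text: the expectation in (6) is replaced by the observed value of each term, and the shifted series is compared with the other series only on the time points they share. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def lam(dxi: float, deta: float) -> float:
    """Equation (5) for given increments, with sgn(0) = 1."""
    sign = 1.0 if dxi * deta >= 0 else -1.0
    return abs(1.0 + dxi * deta) / math.sqrt((1.0 + dxi**2) * (1.0 + deta**2)) * sign


def path_similarity(xi: Sequence[float], eta: Sequence[float], h: float = 1.0) -> float:
    """Sample analogue of (6) over the common part of two paths."""
    n = min(len(xi), len(eta)) - 1  # number of usable forward differences
    if n < 1:
        raise ValueError("paths must share at least two time points")
    total = 0.0
    for t in range(n):
        total += lam((xi[t + 1] - xi[t]) / h, (eta[t + 1] - eta[t]) / h)
    return total / n


def shift(path: Sequence[float], k: int) -> List[float]:
    """Apply the translation operator k times: (B^k eta)_t = eta_{t+k}."""
    return list(path[k:])


def similarity_statistics(xi: Sequence[float], eta: Sequence[float]) -> Tuple[float, float, float]:
    """Sample versions of (8a), (8b), and (8c)."""
    T = len(xi) - 1
    one = [1.0] * len(xi)  # the constant process
    lam1 = path_similarity(xi, shift(eta, T // 2))
    lam2 = path_similarity(xi, eta)
    lam3 = path_similarity(xi, one)
    return lam1, lam2, lam3


print(similarity_statistics([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0, 0.0]))
```

Because the constant process has zero increments, the third statistic depends only on the steepness of the first series.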
All moments of these random variables are finite. If the processes $\xi, \eta$ are random walks with dynamics described by the three conditions in (7), then for any $s = 1, 2, \ldots$:
$$E\lambda_i^s = \int_{\mathbb{R}^2} \frac{|1 + xy|^s}{((1 + x^2)(1 + y^2))^{s/2}} \, (\mathrm{sgn}(xy))^s f(x) f(y)\, dx\, dy \quad \text{if } i = 1, 2,$$
and
$$E\lambda_3^s = \int_{\mathbb{R}} \frac{1}{(1 + x^2)^{s/2}} \, (\mathrm{sgn}(x))^s f(x)\, dx,$$
where $f$ denotes the density function of the normal distribution $N(0, 1)$. It follows from the central limit theorem that the sequence of standardized random variables
$$\frac{\lambda_i - E\lambda_i}{D\lambda_i},$$
where $D\lambda_i = (E\lambda_i^2 - (E\lambda_i)^2)^{1/2}$, converges in distribution to a variable with the standard normal distribution $N(0, 1)$. This observation gives the critical values and critical sets for verifying (7).
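The moments in the double integral can also be approximated by simulation instead of quadrature. The sketch below draws pairs of independent N(0, 1) increments, evaluates the expression in (5) for each pair, and estimates the mean and the standard deviation from the same draws. It illustrates the single-term moments only; the sample size, the seed, and the function names are assumptions of this sketch.

```python
import math
import random
from typing import Tuple


def lam(x: float, y: float) -> float:
    """Equation (5) for increments x and y, with sgn(0) = 1."""
    sign = 1.0 if x * y >= 0 else -1.0
    return abs(1.0 + x * y) / math.sqrt((1.0 + x * x) * (1.0 + y * y)) * sign


def mc_moments(n: int, seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimates of E(lambda) and D(lambda) for N(0, 1) increments."""
    rng = random.Random(seed)
    s1 = 0.0
    s2 = 0.0
    for _ in range(n):
        v = lam(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        s1 += v
        s2 += v * v
    m1 = s1 / n
    m2 = s2 / n
    # D = sqrt(E(lambda^2) - (E(lambda))^2); max() guards against rounding.
    return m1, math.sqrt(max(m2 - m1 * m1, 0.0))


def standardize(value: float, mean: float, std: float) -> float:
    """(lambda - E(lambda)) / D(lambda)."""
    return (value - mean) / std


mean, std = mc_moments(20_000)
print(mean, std)
print(standardize(0.5, mean, std))  # 0.5 stands in for an observed value
```

A standardized value computed this way can then be compared with quantiles of the standard normal distribution, in line with the asymptotic result stated above.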
It is easy to see that the random variables in (8a) through (8c) and their asymptotic distributions are useful in testing the integration and cointegration properties of processes (see Hamilton [1994, pp. 437, 571-4]). These procedures require replacing the differences $\delta\xi_t$, $\delta\eta_t$ in (5) with differences of order $d$, $\delta^d\xi_t$, $\delta^d\eta_t$.
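A difference of order d can be obtained by applying the first-order difference d times. The sketch below assumes that each application divides by the step h, as in the first-order difference quotient used for (4); the function name is hypothetical.

```python
from typing import List, Sequence


def difference(path: Sequence[float], d: int = 1, h: float = 1.0) -> List[float]:
    """Apply the forward difference quotient d times to a path."""
    out = list(path)
    for _ in range(d):
        out = [(out[k + 1] - out[k]) / h for k in range(len(out) - 1)]
    return out


squares = [0.0, 1.0, 4.0, 9.0, 16.0]
print(difference(squares, d=1))  # [1.0, 3.0, 5.0, 7.0]
print(difference(squares, d=2))  # [2.0, 2.0, 2.0]
```

Each application shortens the series by one observation, so a series with T + 1 values yields T + 1 - d differences of order d.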
[*] Warsaw School of Economics, Poland
References
Dhrymes, P. J. Distributed Lags: Problems of Estimation and Formulation, San Francisco, CA: Holden-Day, 1971.
Dorosiewicz, S.; Michalski, T. "Podobienstwo funkcji w badaniach ekonomicznych (metody i przyklady)," Przeglad Statystyczny, XLV, 2, 1998, pp. 187-96.
Hamilton, J. D. Time Series Analysis, Princeton, NJ: Princeton University Press, 1994.
Michalski, T. "Eastern Economies Toward European Union: Statistical Measures of Perspective Evaluation," International Advances in Economic Research, 1, 3, 1995.
Nowak, E. "Metody taksonomiczne w klasyfikacji obiektow spoleczno-gospodarczych," Warsaw, Poland: State Economics Publishing House, 1990.
Sobczyk, K. Stochastic Differential Equations, Warsaw, Poland: Wydawnictwa Naukowo-Techniczne, 1996.