Copula functions: characterizing uncertainty in probabilistic systems

P. Kumar
University of Northern British Columbia, Prince George, BC V2N 4Z9, Canada
Uncertainty is omnipresent, and understanding it is central to the decision-making process. Uncertainty emerges when there is less information than the total information required to describe a system and its environment. Uncertainty and information are so closely associated that the information provided by an experiment, for instance, equals the amount of uncertainty removed. Uncertainty prevails in several forms, and its various kinds may arise from random fluctuations, incomplete information, imprecise perception, vagueness, etc. The framework based on probability theory deals with the uncertainty of random phenomena, while the fuzzy-set concept provides an appropriate mathematical framework for dealing with vagueness. In this paper we describe the concept of copula functions for characterizing the uncertainty associated with probabilistic systems. Copula functions join the uniform marginal distributions of random variables to form their multivariate distribution functions. Copulas are useful because they separate a joint distribution into two contributions: the marginal distribution of each variable, and the copula as a measure of dependence. Several families of copulas with varying shapes are available, providing flexibility in modeling.
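To make this separation concrete, the following is a minimal sketch (not from the original paper) using the Farlie-Gumbel-Morgenstern copula [11-13], C(u, v) = uv(1 + theta(1 - u)(1 - v)); the choice of marginals (exponential and normal) and all function names are our own illustrative assumptions.

```python
from scipy.stats import expon, norm

def fgm_copula(u, v, theta=0.5):
    """FGM copula C(u, v) = u*v*(1 + theta*(1-u)*(1-v)), valid for |theta| <= 1."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def joint_cdf(x, y, theta=0.5):
    """Sklar's theorem: H(x, y) = C(F(x), G(y)) for any marginal CDFs F and G."""
    u = expon.cdf(x)  # illustrative marginal F: exponential with rate 1
    v = norm.cdf(y)   # illustrative marginal G: standard normal
    return fgm_copula(u, v, theta)

# The same copula (dependence structure) can be paired with any other marginals.
print(joint_cdf(1.0, 0.0))  # joint probability P(X <= 1, Y <= 0)
```

Swapping either marginal changes the joint distribution without touching the dependence structure, which is exactly the separation the copula provides.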
Uncertainty plays a vital role in our differing perceptions of the phenomena observed around us. As our perception of the world grows more complex, the number of phenomena about which we are uncertain increases, as does the uncertainty about each phenomenon. To understand and decrease this uncertainty, we collect increasing amounts of information. However, this may increase uncertainty instead of helping us understand it. We may, for instance, refer to the second law of thermodynamics, loosely stated as: uncertainty in the world always tends to increase.

Uncertainty is not a monolithic concept. It may appear in several forms: as a probabilistic phenomenon, for example the occurrence or nonoccurrence of a random event, or as a deterministic phenomenon where we know that the outcome is not governed by chance but are fuzzy about the possibility of a specific outcome. In this paper we deal with probabilistic uncertainty.

Enormous growth has been witnessed in the applications of the information-theoretic framework in the physical, engineering, biological and social sciences, and more so in the fields of information technology, nonlinear systems and molecular biology. Shannon [1] laid the mathematical foundation of information theory in the context of communication theory in his seminal paper "A Mathematical Theory of Communication", where he defined a probabilistic measure of uncertainty referred to as entropy. Earlier contributions in this direction were due to Nyquist [2] and Hartley [3]. The remarkable success of Shannon's entropy measure is primarily due to its ability to quantify and analyze the uncertainty present in probabilistic systems. Since Shannon's work, significant contributions have been made to the area of entropy optimization principles and information measures [4-6]. We now summarize some basic results on entropy for characterizing the stochastic dependence between two discrete random variables; random variables could, however, be either discrete or continuous. Generalizations to multivariate situations are straightforward.
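As a concrete illustration of entropy as a dependence measure, the sketch below computes Shannon entropy and the standard entropy-based dependence measure, mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), for discrete distributions. This is a generic illustration of the standard definitions; the function names and example distributions are ours, not the paper's.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H = -sum p_i log2(p_i), in bits; zero cells are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(pxy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with the joint pmf given as a 2-D array.
    It is zero iff X and Y are independent, so it quantifies stochastic dependence."""
    pxy = np.asarray(pxy, dtype=float)
    return entropy(pxy.sum(axis=1)) + entropy(pxy.sum(axis=0)) - entropy(pxy.ravel())

print(entropy([0.5, 0.5]))                               # 1.0 bit: a fair coin
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # 1.0: Y determined by X
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0: independence
```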
In this research we considered different examples to illustrate applications of the information-theoretic uncertainty measures: one a univariate case and the other a bivariate distribution.
Example 1. Benford's Law [26] is a powerful and relatively simple tool for pointing suspicion at fraud, embezzlement, tax evasion, sloppy accounting and even computer bugs. The income tax agencies of several nations and several states use detection software based on Benford's Law, as do a score of large companies and accounting firms.
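As a rough illustration of how such detection software might work, the sketch below compares observed leading-digit frequencies against Benford's distribution P(d) = log10(1 + 1/d). The transaction amounts are invented toy data, and the helpers are hypothetical, not part of any named tool; a real audit would use thousands of records and a formal goodness-of-fit test.

```python
import math
from collections import Counter

def benford_expected():
    """Benford's law: P(leading digit = d) = log10(1 + 1/d) for d = 1, ..., 9."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}

def leading_digit(x):
    """First significant (nonzero) digit of a nonzero number."""
    s = f"{abs(x):e}"  # scientific notation, e.g. '1.234500e+03'
    return int(s[0])

# Hypothetical transaction amounts (toy data for illustration only).
amounts = [1234.5, 187.2, 99.1, 3021.0, 1450.75, 2212.0, 160.4, 875.3]
counts = Counter(leading_digit(a) for a in amounts)
expected = benford_expected()
for d in range(1, 10):
    observed = counts.get(d, 0) / len(amounts)
    print(d, f"expected={expected[d]:.3f}", f"observed={observed:.3f}")
```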
Example 2. The viscosity (
1. C.E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27, 1948, 379-423, 623-656.
2. H. Nyquist. Certain topics in telegraph transmission theory. Trans. AIEE, 47, 1928, 617-644; reprinted as a classic paper in Proc. IEEE, 90, 2, 2002.
3. R.V.L. Hartley. Transmission of information. Bell System Technical Journal, 7, 1928, 535-563.
4. J.N. Kapur. Maximum Entropy Models in Science and Engineering. Wiley Eastern, Delhi, 1989.
5. J.N. Kapur, H.K. Kesavan. Entropy Optimization Principles with Applications. Academic Press, 1992.
6. Karmeshu, N.R. Pal. Uncertainty, entropy and maximum entropy principles (an overview). In: Karmeshu (Ed), Entropy Measures, Maximum Entropy Principles and Engineering Applications. Springer, 2002.
7. A.M. Mathai, P.N. Rathie. Basic Concepts in Information Theory and Statistics. John Wiley & Sons, 1975.
8. S. Kullback, R.A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22, 1951, 79-86.
9. E. Kovács. On the using of copulas in characterizing of dependence with entropy. Pollack Periodica - An International Journal for Engineering and Information Sciences, 2007.
10. A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8, 1959, 229-231.
11. D.J.G. Farlie. The performance of some correlation coefficients for a general bivariate distribution. Biometrika, 47, 1960, 307-323.
12. E.J. Gumbel. Bivariate exponential distributions. Journal of the American Statistical Association, 55, 1960, 698-707.
13. D. Morgenstern. Einfache Beispiele zweidimensionaler Verteilungen. Mitteilungsblatt für Mathematische Statistik, 8, 1956, 234-235.
14. M. Fréchet. Sur les tableaux de corrélation dont les marges sont données. Ann. Univ. Lyon, Sect. A, 9, 1951, 53-77.
15. W. Hoeffding. Maßstabinvariante Korrelationstheorie. Schriften des Mathematischen Instituts für Angewandte Mathematik der Universität Berlin, 5, 3, 1940, 179-233.
16. W. Hoeffding. Maßstabinvariante Korrelationsmaße für diskontinuierliche Verteilungen. Archiv für mathematische Wirtschafts- und Sozialforschung, 7, 1941, 49-70.
17. R.B. Nelsen. An Introduction to Copulas. Springer, New York, 2006.
18. P. Kumar. Copulas: distribution functions and simulation. In: M. Lovric (Ed), International Encyclopedia of Statistical Science. Springer Science+Business Media, Heidelberg, 2011.
19. B. Schweizer, E. Wolff. On nonparametric measures of dependence for random variables. Annals of Statistics, 9, 1981, 879-885.
20. B. Schweizer, A. Sklar. Probabilistic Metric Spaces. North-Holland, New York, 1983.
21. D. Tjøstheim. Measures of dependence and tests of independence. Statistics, 28, 1996, 249-284.
22. H. Joe. Multivariate Models and Dependence Concepts. Chapman & Hall, New York, 1997.
23. M.E. Johnson. Multivariate Statistical Simulation. Wiley, New York, 1987.
24. A.W. Marshall, I. Olkin. Families of multivariate distributions. Journal of the American Statistical Association, 83, 1988, 834-841.
25. G. Mercier. Mesures de Dépendance entre Images RSO. GET/ENST Bretagne, Tech. Rep. RR-2005003-ITI, 2005; http://perso.enst-bretagne.fr/~mercierg
26. F. Benford. The law of anomalous numbers. Proceedings of the American Philosophical Society, 78, 4, 1938, 551-572.
27. D.C. Montgomery. Design and Analysis of Experiments. John Wiley, 2009.