
Copula functions: characterizing uncertainty in probabilistic systems

P. Kumar

University of Northern British Columbia

Prince George, BC V2N 4Z9, Canada

Uncertainty is omnipresent, and understanding it is central to the decision-making process. Uncertainty emerges when there is less information available than is required to describe a system and its environment. Uncertainty and information are so closely associated that the information provided by an experiment, for instance, is equal to the amount of uncertainty removed. Uncertainty takes several forms and may arise from random fluctuations, incomplete information, imprecise perception, vagueness, and so on. The probability-theoretic framework deals with the uncertainty of random phenomena, while the fuzzy-set concept provides an appropriate mathematical framework for dealing with vagueness. In this paper we describe the concept of copula functions for characterizing the uncertainty associated with probabilistic systems. Copula functions join the uniform marginal distributions of random variables to form their multivariate distribution functions. Copulas are useful because they separate a joint distribution into two contributions: the marginal distribution of each variable and the copula itself as a measure of dependence. Several families of copulas with varying shapes are available, providing flexibility in modeling.
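To make the construction concrete, the minimal sketch below illustrates Sklar's theorem [10]: a joint distribution H(x, y) = C(F(x), G(y)) assembled from a copula C and two marginals F and G. The Farlie-Gumbel-Morgenstern copula and the exponential and normal marginals used here are illustrative choices only, not the specific models of the paper.

import math

def fgm_copula(u, v, theta=0.5):
    # Farlie-Gumbel-Morgenstern copula C(u, v) = uv(1 + theta(1-u)(1-v)), |theta| <= 1
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def exp_cdf(x, lam=1.0):
    # marginal F: exponential with rate lam (illustrative choice)
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def norm_cdf(y, mu=0.0, sigma=1.0):
    # marginal G: normal(mu, sigma) (illustrative choice)
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def joint_cdf(x, y, theta=0.5):
    # Sklar's theorem: H(x, y) = C(F(x), G(y))
    return fgm_copula(exp_cdf(x), norm_cdf(y), theta)

print(joint_cdf(1.0, 0.0))  # P(X <= 1, Y <= 0) under this illustrative model

Changing the copula family or the parameter theta changes only the dependence structure; the marginals F and G are untouched, which is exactly the separation the abstract describes.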

Uncertainty plays a vital role in our differing perceptions of the phenomena observed around us. As our perception of the world becomes more complex, the number of phenomena about which we are uncertain increases, as does the uncertainty about each phenomenon. To understand and reduce this uncertainty, we collect increasing amounts of information. However, this may increase uncertainty rather than help us understand it. We may, for instance, refer to the second law of thermodynamics, which states that uncertainty in the world always tends to increase. Uncertainty is not a monolithic concept. It may appear as a probabilistic phenomenon, for example the occurrence or non-occurrence of a random event, or as a deterministic phenomenon where we know that the outcome is not governed by chance but are fuzzy about the possibility of a specific outcome. In this paper we deal with probabilistic uncertainty.

Enormous growth has been witnessed in applications of the information-theoretic framework in the physical, engineering, biological and social sciences, and even more so in the fields of information technology, nonlinear systems and molecular biology. Shannon [1] laid the mathematical foundation of information theory in the context of communication theory in his seminal paper "A mathematical theory of communication". He defined a probabilistic measure of uncertainty referred to as entropy. Earlier contributions in this direction, however, were due to Nyquist [2] and Hartley [3]. The remarkable success of Shannon's entropy measure stems primarily from its ability to quantify and analyze the uncertainty present in probabilistic systems. Since Shannon's work, significant contributions have been made to the area of entropy optimization principles and information measures [4-6]. We now summarize some basic results about entropy in characterizing the stochastic dependence between two discrete random variables; random variables may, however, be either discrete or continuous. Generalizations to multivariate situations are straightforward.
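As a simple illustration of these quantities (a sketch with a made-up joint probability table, not data from the paper), the snippet below computes Shannon's entropy and the mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), an information-theoretic measure of the stochastic dependence between two discrete random variables.

import math

# illustrative joint pmf p(x, y) for two binary random variables; sums to 1
joint = {(0, 0): 0.30, (0, 1): 0.20,
         (1, 0): 0.10, (1, 1): 0.40}

def entropy(pmf):
    # Shannon entropy H = -sum p log2 p, in bits
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# marginal distributions obtained by summing the joint pmf
px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
    py[y] = py.get(y, 0.0) + p

H_X = entropy(px.values())
H_Y = entropy(py.values())
H_XY = entropy(joint.values())
I_XY = H_X + H_Y - H_XY   # mutual information; zero iff X and Y are independent

print(H_X, H_Y, H_XY, I_XY)

The mutual information is the amount of uncertainty about X removed by observing Y, which is the sense in which information equals uncertainty removed.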

In this research we considered two examples to illustrate applications of the information-theoretic uncertainty measures: one a univariate case and the other a bivariate distribution.

Example 1. Benford's Law [26] is a powerful and relatively simple tool for pointing suspicion at frauds, embezzlers, tax evaders, sloppy accountants and even computer bugs. The income tax agencies of several nations and several states are using detection software based on Benford's Law, as are a score of large companies and accounting businesses.
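For reference, Benford's Law gives the probability that the leading digit is d as P(d) = log10(1 + 1/d), d = 1, ..., 9. The sketch below compares observed leading-digit frequencies with these probabilities; the sample amounts are invented for illustration and are not data from the paper.

import math
from collections import Counter

def benford_prob(d):
    # Benford probability of leading digit d
    return math.log10(1.0 + 1.0 / d)

# hypothetical monetary amounts, for illustration only
amounts = [1230.50, 87.20, 1904.00, 45.10, 233.99, 912.00, 13.75, 160.00]

# extract the first significant digit of each amount
first_digits = [int(str(abs(a)).lstrip("0.")[0]) for a in amounts]
observed = Counter(first_digits)
n = len(first_digits)

for d in range(1, 10):
    print(d, observed.get(d, 0) / n, round(benford_prob(d), 3))

Fraud-detection software of the kind described above flags data sets whose observed leading-digit frequencies deviate strongly from the Benford probabilities.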

Example 2. The viscosity (in centistokes at 100 °C) of a polymer is related to the reaction temperature (in °C) and the catalyst feed rate (in lb/h). An experiment was conducted to model viscosity in terms of reaction temperature and catalyst feed rate, and the corresponding results were obtained [27, page 393].
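A copula-based dependence analysis of such data typically starts from the ranks of the paired observations (pseudo-observations on (0, 1)) rather than the raw values. The sketch below, which uses hypothetical temperature and viscosity values rather than the data of [27], computes the pseudo-observations and Spearman's rank correlation.

def ranks(xs):
    # rank of each observation, 1 = smallest (assumes no ties)
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

temperature = [80, 93, 100, 82, 90, 99, 81, 96]                  # hypothetical values
viscosity = [12.3, 14.8, 15.8, 12.9, 13.7, 14.2, 12.5, 15.1]     # hypothetical values

rx, ry = ranks(temperature), ranks(viscosity)
n = len(rx)

# pseudo-observations u_i = r_i / (n + 1) lie in (0, 1), the domain of a copula
u = [r / (n + 1) for r in rx]
v = [r / (n + 1) for r in ry]

# Spearman's rho from rank differences (no ties assumed)
rho = 1.0 - 6.0 * sum((a - b) ** 2 for a, b in zip(rx, ry)) / (n * (n ** 2 - 1))
print(list(zip(u, v))[:3], rho)

The pseudo-observations (u, v) are what a parametric copula family would be fitted to, separating the dependence structure from the marginal behaviour of viscosity and temperature.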

1. C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27, 1948, 379-423, 623-656.

2. H. Nyquist. Certain topics in telegraph transmission theory. Trans. AIEE, 47, 1928, 617-644; reprinted as a classic paper in Proc. IEEE, 90, 2, 2002.

3. R. V. L. Hartley. Transmission of information. Bell System Technical Journal, 7, 1928, 535-563.

4. J. N. Kapur. Maximum Entropy Models in Science and Engineering. Wiley Eastern, Delhi, 1989.

5. J. N. Kapur, H. K. Kesavan. Entropy Optimization Principles with Applications. Academic Press, 1992.

6. Karmeshu, N. R. Pal. Uncertainty, entropy and maximum entropy principle (an overview). In: Karmeshu (Ed.), Entropy Measures, Maximum Entropy Principle and Emerging Applications. Springer, 2003.

7. A. M. Mathai, P. N. Rathie. Basic Concepts in Information Theory and Statistics. John Wiley & Sons, 1975.

8. S. Kullback, R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22, 1951, 79-86.

9. E. Kovács. On the using of copulas in characterizing of dependence with entropy. Pollack Periodica: An International Journal for Engineering and Information Sciences, 2007.

10. A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8, 1959, 229-231.

11. D. J. G. Farlie. The performance of some correlation coefficients for a general bivariate distribution. Biometrika, 47, 1960, 307-323.

12. E. J. Gumbel. Bivariate exponential distributions. Journal of the American Statistical Association, 55, 1960, 698-707.

13. D. Morgenstern. Einfache Beispiele zweidimensionaler Verteilungen. Mitteilungsblatt für Mathematische Statistik, 8, 1956, 234-235.

14. M. Fréchet. Sur les tableaux de corrélation dont les marges sont données. Ann. Univ. Lyon, Sect. A, 9, 1951, 53-77.

15. W. Hoeffding. Maßstabinvariante Korrelationstheorie. Schriften des Mathematischen Instituts und des Instituts für Angewandte Mathematik der Universität Berlin, 5, 3, 1940, 179-233.

16. W. Hoeffding. Maßstabinvariante Korrelationsmaße für diskontinuierliche Verteilungen. Archiv für mathematische Wirtschafts- und Sozialforschung, 7, 1941, 49-70.

17. R. B. Nelsen. An Introduction to Copulas. Springer, New York, 2006.

18. P. Kumar. Copulas: distribution functions and simulation. In: M. Lovric (Ed.), International Encyclopedia of Statistical Science. Springer Science+Business Media, Heidelberg, 2011.

19. B. Schweizer, E. Wolff. On nonparametric measures of dependence for random variables. Annals of Statistics, 9, 1981, 879-885.

20. B. Schweizer, A. Sklar. Probabilistic Metric Spaces. North-Holland, New York, 1983.

21. D. Tjøstheim. Measures of dependence and tests of independence. Statistics, 28, 1996, 249-284.

22. H. Joe. Multivariate Models and Dependence Concepts. Chapman & Hall, New York, 1997.

23. M. E. Johnson. Multivariate Statistical Simulation. Wiley, New York, 1987.

24. A. W. Marshall, I. Olkin. Families of multivariate distributions. Journal of the American Statistical Association, 83, 1988, 834-841.

25. G. Mercier. Mesures de dépendance entre images RSO. GET/ENST Bretagne, Tech. Rep. RR-2005003-ITI, 2005; http://perso.enst-bretagne.fr/126mercierg

26. F. Benford. The law of anomalous numbers. Proceedings of the American Philosophical Society, 78, 4, 1938, 551-572.

27. D. C. Montgomery. Design and Analysis of Experiments. John Wiley, 2009.
