On Generalized Measures Of Information With Maximum And Minimum Entropy Prescriptions
Abstract
Kullback–Leibler relative entropy, or KL-entropy, of $P$ with respect to $R$, defined as $\int_X \ln \frac{dP}{dR}\, dP$, where $P$ and $R$ are probability measures on a measurable space $(X, \mathcal{M})$, plays a basic role in the definitions of classical information measures. It overcomes a shortcoming of Shannon entropy, whose discrete-case definition cannot be extended to the non-discrete case naturally. Further, entropy and other classical information measures can be expressed in terms of KL-entropy, and hence the properties of their measure-theoretic analogs follow from those of measure-theoretic KL-entropy. An important theorem in this respect is the Gelfand–Yaglom–Perez (GYP) theorem, which equips KL-entropy with a fundamental definition and can be stated as follows: measure-theoretic KL-entropy equals the supremum of KL-entropies over all measurable partitions of $X$. In this thesis we provide the measure-theoretic formulations of 'generalized' information measures, and state and prove the corresponding GYP theorem, the generalizations being in the sense of Rényi and the nonextensive (Tsallis) formalism, both of which are explained below.
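As a numerical illustration (not from the thesis) of the GYP theorem, the sketch below computes partition-based KL-entropies between two specific Gaussians chosen for this example, $P = N(0,1)$ and $R = N(1, 1.5^2)$. Refining a nested partition of the real line drives the discrete KL-entropy monotonically up towards the closed-form Gaussian KL divergence, consistent with the supremum characterization; the bin layout and SciPy usage are assumptions of this sketch.

```python
# Illustrating the Gelfand-Yaglom-Perez theorem numerically: the KL-entropy
# of P w.r.t. R is the supremum of discrete KL-entropies over measurable
# partitions of X.  Nested refinements give a monotone increase towards the
# closed-form Gaussian KL divergence.
import numpy as np
from scipy.stats import norm

P = norm(0.0, 1.0)       # example choice of P
R = norm(1.0, 1.5)       # example choice of R

def partition_kl(n_bins):
    """Discrete KL over the partition {(-inf,-10], n equal bins, [10, inf)}."""
    edges = np.concatenate(([-np.inf],
                            np.linspace(-10.0, 10.0, n_bins + 1),
                            [np.inf]))
    p = np.diff(P.cdf(edges))
    r = np.diff(R.cdf(edges))
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / r[mask])))

# Closed-form KL divergence between the two Gaussians.
mu1, s1, mu2, s2 = 0.0, 1.0, 1.0, 1.5
kl_exact = np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

# 8 -> 64 -> 512 equal-width bins are nested refinements of one another.
kls = [partition_kl(n) for n in (8, 64, 512)]
print(kls, kl_exact)
```

Each partition's KL-entropy is bounded above by the measure-theoretic value, and refining the partition closes the gap, which is exactly the supremum statement of the GYP theorem.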
The Kolmogorov–Nagumo average, or quasilinear mean, of a vector $x = (x_1, \ldots, x_n)$ with respect to a pmf $p = (p_1, \ldots, p_n)$ is defined as $\langle x \rangle_\psi = \psi^{-1}\!\left(\sum_{k=1}^{n} p_k\, \psi(x_k)\right)$, where $\psi$ is an arbitrary continuous and strictly monotone function. Replacing the linear averaging in Shannon entropy with Kolmogorov–Nagumo averages (KN-averages), and further imposing the additivity constraint (a characteristic property of the information associated with a single event, which is logarithmic) leads to the definition of $\alpha$-entropy, or Rényi entropy. This is the first formal, well-known generalization of Shannon entropy. Using this recipe of Rényi's generalization, one can prepare only two information measures: Shannon entropy and Rényi entropy. Indeed, using this formalism Rényi characterized these additive entropies in terms of axioms on KN-averages. On the other hand, if one generalizes the information of a single event in the definition of Shannon entropy by replacing the logarithm with the so-called $q$-logarithm, defined as $\ln_q x = \frac{x^{1-q} - 1}{1-q}$, one gets what is known as Tsallis entropy. Tsallis entropy is also a generalization of Shannon entropy, but it does not satisfy the additivity property. Instead, it satisfies pseudo-additivity of the form $x \oplus_q y = x + y + (1-q)xy$, and hence it is also known as nonextensive entropy. One can apply Rényi's recipe in the nonextensive case by replacing the linear averaging in Tsallis entropy with KN-averages and thereby imposing the constraint of pseudo-additivity. A natural question that arises is: what are the various pseudo-additive information measures that can be prepared with this recipe? We prove that Tsallis entropy is the only one. Here we mention one important characteristic of this generalized entropy: while canonical distributions resulting from maximization of Shannon entropy are exponential in nature, in the Tsallis case maximization results in power-law distributions.
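The additivity and pseudo-additivity properties above can be checked numerically. The sketch below (an illustration, not code from the thesis; the distributions and the parameter value $\alpha = q = 1.7$ are arbitrary choices) verifies that on an independent joint distribution Rényi entropy adds, while Tsallis entropy obeys $S_q(p \otimes r) = S_q(p) + S_q(r) + (1-q)\,S_q(p)\,S_q(r)$.

```python
# Renyi entropy H_a(p) = ln(sum p_k^a) / (1 - a) is additive on products;
# Tsallis entropy S_q(p) = (1 - sum p_k^q) / (q - 1) is pseudo-additive.
import numpy as np

def renyi(p, alpha):
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def tsallis(p, q):
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

p = np.array([0.5, 0.3, 0.2])
r = np.array([0.6, 0.4])
joint = np.outer(p, r).ravel()   # joint pmf of independent marginals

alpha = q = 1.7                  # arbitrary example parameter
renyi_lhs = renyi(joint, alpha)
renyi_rhs = renyi(p, alpha) + renyi(r, alpha)

tsallis_lhs = tsallis(joint, q)
tsallis_rhs = (tsallis(p, q) + tsallis(r, q)
               + (1 - q) * tsallis(p, q) * tsallis(r, q))

print(renyi_lhs, renyi_rhs)      # additivity
print(tsallis_lhs, tsallis_rhs)  # pseudo-additivity
```

Both identities hold exactly because $\sum_{i,j} (p_i r_j)^q = \left(\sum_i p_i^q\right)\left(\sum_j r_j^q\right)$ for independent marginals.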
The concept of maximum entropy (ME), originally from physics, has been promoted to a general principle of inference primarily by the works of Jaynes and, later on, Kullback. It connects information theory and statistical mechanics via the principle that states of thermodynamic equilibrium are states of maximum entropy, and further connects to statistical inference via the prescription: select the probability distribution that maximizes the entropy. The two fundamental principles related to the concept of maximum entropy are Jaynes' maximum entropy principle, which involves maximizing Shannon entropy, and Kullback's minimum entropy principle, which involves minimizing relative entropy, each with respect to appropriate moment constraints.
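A minimal sketch of Jaynes' principle (not from the thesis; the support, the target mean, and the SLSQP solver are assumptions of this example): numerically maximizing Shannon entropy over pmfs on $\{0,1,2,3\}$ subject to a mean constraint should recover the exponential (Gibbs) form $p_k \propto e^{-\theta k}$, i.e. constant successive ratios $p_{k+1}/p_k$.

```python
# Jaynes' maximum entropy principle on a 4-point support: maximize Shannon
# entropy subject to sum(p) = 1 and E[X] = target_mean.  The maximizer is
# the Gibbs distribution p_k ~ exp(-theta * k).
import numpy as np
from scipy.optimize import minimize

x = np.arange(4)              # support {0, 1, 2, 3}
target_mean = 1.2             # example moment constraint

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)    # guard log(0)
    return np.sum(p * np.log(p))  # minimize -H(p)

cons = [{"type": "eq", "fun": lambda p: np.sum(p) - 1.0},
        {"type": "eq", "fun": lambda p: p @ x - target_mean}]
res = minimize(neg_entropy, np.full(4, 0.25), method="SLSQP",
               bounds=[(0.0, 1.0)] * 4, constraints=cons)
p_star = res.x

# Gibbs-form check: ratios p_{k+1}/p_k should be (numerically) constant.
ratios = p_star[1:] / p_star[:-1]
print(p_star, ratios)
```

The constant ratio is $e^{-\theta}$, with $\theta$ implicitly fixed by the mean constraint; this is the discrete analogue of the canonical exponential distributions mentioned above.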
Though relative entropy is not a metric, in cases involving distributions resulting from relative-entropy minimization one can bring forth certain geometric formulations. These are reminiscent of squared Euclidean distance and satisfy an analogue of Pythagoras' theorem. This property is referred to as the Pythagorean theorem of relative-entropy minimization, or the triangle equality, and plays a fundamental role in geometric approaches to statistical estimation theory, such as information geometry. In this thesis we state and prove the equivalent of the Pythagorean theorem in the nonextensive formalism. For this purpose we study relative-entropy minimization in detail and present some results.
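The classical (extensive) triangle equality can be illustrated numerically. In the sketch below (an example, not the thesis' nonextensive result; the reference distribution, support, and moment value are arbitrary choices), $q^*$ minimizes $D(q \| r)$ over the linear family $\{q : E_q[X] = a\}$, and for any $p$ in that family $D(p \| r) = D(p \| q^*) + D(q^* \| r)$ holds exactly, because the minimizer is an exponential tilt of $r$.

```python
# "Pythagorean theorem" of relative-entropy minimization on a 3-point space:
# q* = argmin D(q || r) over {q : E_q[X] = a} is the tilted distribution
# q*(x) = r(x) exp(theta * x) / Z, and the triangle equality follows.
import numpy as np
from scipy.optimize import brentq

x = np.array([0.0, 1.0, 2.0])
r = np.array([0.5, 0.3, 0.2])      # example reference distribution
a = 1.0                            # moment constraint E[X] = a

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def tilted_mean_gap(theta):
    w = r * np.exp(theta * x)
    return (w @ x) / w.sum() - a

theta = brentq(tilted_mean_gap, -20.0, 20.0)   # solve E_{q*}[X] = a
q_star = r * np.exp(theta * x)
q_star /= q_star.sum()

# Any pmf on {0,1,2} with mean 1 has the form (s, 1 - 2s, s); pick s = 0.2.
s = 0.2
p = np.array([s, 1 - 2 * s, s])

print(kl(p, r), kl(p, q_star) + kl(q_star, r))   # the two sides agree
```

The equality is exact here because $\ln(q^*/r)$ is affine in $x$, so $E_p[\ln(q^*/r)] = E_{q^*}[\ln(q^*/r)]$ for every $p$ in the constraint family; the thesis' contribution is the analogous statement in the Tsallis setting.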
Finally, we demonstrate the use of power-law distributions, resulting from ME prescriptions of Tsallis entropy, in evolutionary algorithms. This work is motivated by the recently proposed generalized simulated annealing algorithm based on Tsallis statistics.
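The flavour of Tsallis-statistics annealing can be conveyed with a minimal sketch (not the thesis' algorithm). Generalized simulated annealing also uses a power-law (q-Gaussian) visiting distribution; here, as a simplifying assumption, a plain Gaussian proposal is kept and only the acceptance probability is generalized to the Tsallis form $[1 + (q-1)\,\Delta E/T]^{1/(1-q)}$ for uphill moves, which recovers $e^{-\Delta E/T}$ as $q \to 1$. The test function, cooling schedule $T_0/k$, and parameter values are all choices made for this illustration.

```python
# Simulated annealing with a Tsallis-type acceptance rule (sketch).
import math
import random

def tsallis_accept(dE, T, q):
    """Acceptance probability for an energy change dE at temperature T."""
    if dE <= 0:
        return 1.0                       # always accept downhill moves
    base = 1.0 + (q - 1.0) * dE / T
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def anneal(f, x0, q=1.5, T0=5.0, steps=20000, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for k in range(1, steps + 1):
        T = T0 / k                       # simple cooling schedule (assumption)
        y = x + rng.gauss(0.0, 1.0)      # Gaussian proposal (simplification)
        fy = f(y)
        if rng.random() < tsallis_accept(fy - fx, T, q):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

# A quartic with two wells; the global minimum sits near x = -1.3.
f = lambda x: x**4 - 3 * x**2 + x
best, fbest = anneal(f, x0=4.0)
print(best, fbest)
```

The power-law tail of the Tsallis acceptance rule (for $q > 1$) keeps occasional uphill moves more probable at low temperature than the exponential Boltzmann rule would, which is the mechanism exploited in the evolutionary-algorithm application above.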
To sum up, in light of their well-known axiomatic and operational justifications, this thesis establishes some results pertaining to the mathematical significance of generalized measures of information. We believe that these results represent an important contribution towards the ongoing research on understanding the phenomenon of information.
Related items
Showing items related by title, author, creator and subject.

Minimization Problems Based On A Parametric Family Of Relative Entropies
Ashok Kumar, M (2017-08-21): We study minimization problems with respect to a one-parameter family of generalized relative entropies. These relative entropies, which we call relative entropies (denoted I(P; Q)), arise as redundancies under mismatched ...
Higher Spins, Entanglement Entropy And Holography
Datta, Shouvik (2017-08-21): The idea of holography [1, 2] finds a concrete realization in the form of the AdS/CFT correspondence [3, 4]. This duality relates a field theory with conformal symmetries to quantum gravity living in one higher dimension. In ...
Optimization Algorithms for Deterministic, Stochastic and Reinforcement Learning Settings
Joseph, Ajin George (2018-05-30): Optimization is a very important field with diverse applications in physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, ...