3-5 August 2022
Universität Klagenfurt
Europe/Vienna timezone

Measuring statistical dependency with optimal transport

4 Aug 2022, 15:30
20m
HS 4 (Universität Klagenfurt)

Talk Statistics Session B5 Statistics

Description

In this talk, we introduce a novel framework for measuring statistical dependency between two random variables $X$ and $Y$, the transport dependency $\tau(X, Y) \ge 0$. This coefficient relies on the notion of optimal transport and is applicable to random variables taking values in general Polish spaces. It can be estimated consistently via the corresponding empirical measure, is versatile and adaptable to various scenarios through a proper choice of the cost function, and intrinsically respects metric properties of the ground spaces. Notably, statistical independence is characterized by $\tau(X, Y) = 0$, while large values of $\tau(X, Y)$ indicate highly regular relations between $X$ and $Y$. Indeed, for suitable base costs, $\tau(X, Y)$ is maximized if and only if $Y$ can be expressed as a 1-Lipschitz function of $X$ or vice versa.
We exploit this characterization and define a class of dependency coefficients with values in $[0, 1]$ that can emphasize different functional relations. In particular, for suitable costs, the transport correlation is symmetric and attains the value $1$ if and only if $Y = f(X)$, where $f$ is a multiple of an isometry, which makes it comparable to the distance correlation.
Finally, we illustrate how the transport dependency can be used in practice to explore dependencies between random variables in a gene expression study.
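
To make the empirical estimator concrete, the following is a minimal sketch and not the authors' implementation. It assumes, which the abstract does not spell out, that $\tau(X, Y)$ is the optimal transport cost between the joint law of $(X, Y)$ and the product of its marginals. The function name `transport_dependency` and the additive cost $|x - x'| + |y - y'|$ for real-valued samples are illustrative choices. Since both empirical measures can be written with $n^2$ atoms of mass $1/n^2$, the transport problem reduces to an assignment problem.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def transport_dependency(x, y):
    """Plug-in estimate of the transport dependency for paired scalar samples.

    Assumption (not stated in the abstract): tau is the optimal transport cost
    between the empirical joint measure and the product of the empirical
    marginals, with the additive cost |x - x'| + |y - y'|.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Joint measure: atoms (x_i, y_i), each split into n copies of mass 1/n^2.
    joint_x = np.repeat(x, n)
    joint_y = np.repeat(y, n)

    # Product measure: all n^2 pairs (x_i, y_j), each of mass 1/n^2.
    prod_x = np.repeat(x, n)
    prod_y = np.tile(y, n)

    # Pairwise additive cost between joint atoms and product atoms.
    cost = (np.abs(joint_x[:, None] - prod_x[None, :])
            + np.abs(joint_y[:, None] - prod_y[None, :]))

    # Uniform masses on equally many atoms: an optimal plan is a permutation.
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum() / n**2


# The empirical joint equals the product of its marginals here, so the estimate vanishes.
tau_independent = transport_dependency([0, 0, 1, 1], [0, 1, 0, 1])
# Y = X is a perfectly regular relation, so the estimate is strictly positive.
tau_dependent = transport_dependency([0, 1, 2, 3], [0, 1, 2, 3])
```

The cost matrix has $n^2 \times n^2$ entries, so this sketch is only practical for small samples; larger problems would call for a dedicated optimal transport solver.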

Primary authors

Thomas Giacomo Nies (University of Göttingen)
Mr Thomas Staudt (University of Göttingen)
Prof. Axel Munk (University of Göttingen)
