MINE: Maximal Information-based Nonparametric Exploration

Technical Information


The calculation of MIC and other MINE statistics

MIC and the other MINE statistics are calculated from a matrix of scores generated from a given set of two-variable data. Each entry of this matrix, called the characteristic matrix, corresponds to one grid resolution (a number of columns and a number of rows) and is found by searching over grids of that resolution for the one that maximizes the penalized mutual information of the distribution induced on the grid's cells by the data. Different relationship types give rise to characteristic matrices with different properties. For instance, strong relationships yield characteristic matrices with high peaks, monotonic relationships yield symmetric characteristic matrices, and complex relationships yield characteristic matrices whose peaks are far from the origin.
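To make the idea of a characteristic matrix concrete, here is a minimal Python sketch. It is not the MINE implementation: instead of searching over grids, it tries only one equal-frequency grid per resolution, so its scores are at best lower bounds on the true entries. It also assumes that the penalty is division by log2(min(columns, rows)) and that grid sizes are limited to columns * rows <= n^0.6, following the published paper. The function names and the alpha parameter are hypothetical, chosen for illustration.

```python
import math
from collections import Counter

def equal_frequency_bins(values, k):
    """Assign each value to one of k roughly equal-frequency bins by rank.
    Ties are broken arbitrarily, which a real implementation would avoid."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins

def mutual_information(xb, yb):
    """Mutual information (in bits) of the empirical joint distribution
    of two equally long sequences of bin labels."""
    n = len(xb)
    joint = Counter(zip(xb, yb))
    px, py = Counter(xb), Counter(yb)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def characteristic_matrix(xs, ys, alpha=0.6):
    """Approximate characteristic matrix as a dict keyed by (columns, rows).
    Only one equal-frequency grid is scored per resolution (an assumption
    made for brevity; the actual method searches over grid placements)."""
    budget = len(xs) ** alpha  # assumed bound on columns * rows
    matrix = {}
    for cols in range(2, int(budget // 2) + 1):
        xb = equal_frequency_bins(xs, cols)
        for rows in range(2, int(budget // cols) + 1):
            yb = equal_frequency_bins(ys, rows)
            score = mutual_information(xb, yb) / math.log2(min(cols, rows))
            matrix[(cols, rows)] = score
    return matrix

def approximate_mic(xs, ys):
    """The highest peak of the (approximate) characteristic matrix."""
    return max(characteristic_matrix(xs, ys).values())
```

Under these assumptions, a noiseless linear relationship such as xs = ys = list(range(100)) already reaches a score of 1 on the 2-by-2 grid, which is the "high peak" behaviour described above for strong relationships.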

Statistical significance

As with any data exploration technique, it is important to address multiple testing concerns thoroughly when using MINE statistics. We suggest doing so by controlling the false discovery rate as in the original MINE paper. For instructions on how to compute p-values for MIC and other MINE statistics, see our FAQ.
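As an illustration of false discovery rate control, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of p-values, one per variable pair tested. Benjamini-Hochberg is one standard way to control the false discovery rate; its use here, the function name, and the default level q = 0.05 are assumptions for illustration rather than a prescription from the MINE authors.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the corresponding hypothesis
    is rejected while controlling the false discovery rate at level q."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected
```

The procedure is a step-up one: a p-value that misses its own threshold is still rejected if a larger p-value further down the sorted list meets its threshold, which is why the loop keeps the largest qualifying rank rather than stopping at the first failure.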

Publication

More detailed information is contained in the published paper describing MINE:

[D. Reshef, Y. Reshef], H. Finucane, S. Grossman, G. McVean, P. Turnbaugh, E. Lander, [[M. Mitzenmacher, P. Sabeti]]. Detecting novel associations in large datasets. Science 334, 6062 (2011). [abstract] [full text] [reprint] [accompanying commentary]

[...], [[...]]: The authors within each set of brackets contributed equally to this work and are listed alphabetically.

Follow-up work

The following papers contain follow-up work related to MINE:

Y. Reshef*, D. Reshef*, P. Sabeti**, M. Mitzenmacher**. Equitability, interval estimation, and statistical power. arXiv preprint (2015). [arXiv]

Y. Reshef*, D. Reshef*, H. Finucane, P. Sabeti**, M. Mitzenmacher**. Measuring dependence powerfully and equitably. Journal of Machine Learning Research (2016). [paper]

D. Reshef*, Y. Reshef*, P. Sabeti**, M. Mitzenmacher**. An empirical study of the maximal and total information coefficients and leading measures of dependence. Annals of Applied Statistics (2018). [paper]

* and ** denote co-first and co-last authorship, respectively.