  THIS SOURCE CODE IS SUPPLIED "AS IS" WITHOUT WARRANTY OF ANY KIND, AND
  ITS AUTHOR AND THE JOURNAL OF MACHINE LEARNING RESEARCH (JMLR) AND JMLR'S
  PUBLISHERS AND DISTRIBUTORS, DISCLAIM ANY AND ALL WARRANTIES, INCLUDING 
  BUT NOT LIMITED TO ANY IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 
  FOR A PARTICULAR PURPOSE, AND ANY WARRANTIES OR NON INFRINGEMENT. THE 
  USER ASSUMES ALL LIABILITY AND RESPONSIBILITY FOR USE OF THIS SOURCE 
  CODE, AND NEITHER THE AUTHOR NOR JMLR, NOR JMLR'S PUBLISHERS AND 
  DISTRIBUTORS, WILL BE LIABLE FOR DAMAGES OF ANY KIND RESULTING FROM ITS
  USE. Without limiting the generality of the foregoing, neither the 
  author, nor JMLR, nor JMLR's publishers and distributors, warrant that
  the Source Code will be error-free, will operate without interruption, or
  will meet the needs of the user.

This MATLAB code reproduces figures and tables from the paper 
Austin J. Brockmeier, Tingting Mu, Sophia Ananiadou, and John Y. Goulermas,
"Quantifying the Informativeness of Similarity Measurements",
Journal of Machine Learning Research, vol. 18, July, 2017.

Version 1.0	 July 11, 2017     Austin J. Brockmeier ajbrockmeier@gmail.com


((Instructions))

1. To run individual scripts 
(assuming the working directory is informativeness)
>> cd scripts
>> addpath(genpath('../functions')) % add all functions and subdirectories
>> addpath('../utilities') %add some utility functions

Then run an individual script, for example:
>> run cmd_Figure_1_info_embed_vectors.m

2. To run all of the scripts (may take a while)
>> run cmd_create_all.m  

In either case, the scripts write intermediate results, figures, and tables
to informativeness/results.

((Contents))
functions/bures_metric.m % Computes Bures distance between two PSD matrices
functions/measures/* % Various informativeness measures from the paper
functions/regular_graph_info.m % Compute the informativeness of a regular
% graph with a certain number of edges
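
As an illustration of the quantity that bures_metric.m is described as
computing, here is a minimal sketch of the standard Bures distance between two
PSD matrices, d(A,B)^2 = tr(A) + tr(B) - 2 tr((A^(1/2) B A^(1/2))^(1/2)).
It is not taken from the packaged function, which may differ in normalization,
argument order, or numerical safeguards. The variable names and the toy
matrices are assumptions made for this example.

```matlab
% Two small PSD matrices (illustrative values only).
A = eye(2);
B = 4 * eye(2);

rA = sqrtm(A);                 % principal square root of A
M  = rA * B * rA;              % A^(1/2) * B * A^(1/2)
M  = (M + M') / 2;             % symmetrize against round-off
ev = max(real(eig(M)), 0);     % clip tiny negative eigenvalues to zero
d2 = trace(A) + trace(B) - 2 * sum(sqrt(ev));
d  = sqrt(max(real(d2), 0));   % Bures distance (sqrt(2) for this pair)
fprintf('Bures distance: %.4f\n', d);
```

For this pair the traces are 2 and 8 and the cross term is 2*4, so the squared
distance is 2.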

% Sample denoising by maximizing informativeness via a Gaussian kernel
functions/denoiseSample/info_minfunc.m % Requires the minFunc toolbox
functions/denoiseSample/info_fminunc.m % Uses MATLAB's fminunc instead
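
Since the two variants depend on different optimizers, it can help to check
which one is available before running the denoising scripts. The following is
a small sketch using only MATLAB's built-in exist function. It does not call
either info_* function (their signatures are not documented here), and the
variable names are assumptions made for this example.

```matlab
% exist(name, 'file') returns 2 when the name resolves to a file
% (such as an .m file) on the MATLAB path.
hasMinFunc = (exist('minFunc', 'file') == 2);
hasFminunc = (exist('fminunc', 'file') == 2);

if hasMinFunc
    disp('minFunc found: info_minfunc.m can be used.');
elseif hasFminunc
    disp('minFunc not found: fall back to info_fminunc.m.');
else
    disp('Neither minFunc nor fminunc is on the path.');
end
```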

% Correlation matrix denoising (Algorithm 1 in the paper)
% denoiseInfo.m is the top-level function that repeatedly calls 
% denoiseInfoADMM.m with warm restarts and changing penalties until the
% matrix's rank falls below a given value
functions/denoiseCorrelationMatrix/denoiseInfo.m  
functions/denoiseCorrelationMatrix/denoiseInfoADMM.m
functions/denoiseCorrelationMatrix/infoProx.m  % proximal operator (Eq. 10) 
functions/denoiseCorrelationMatrix/SOSST.m % second-order spectral 
% soft-thresholding operator (Eq. E.1, Appendix E)


% Functions for computing cluster-class correspondence 
utilities/brute_force_accuracy.m % accuracy using best linear assignment 
utilities/compare_clusterings.m % normalized mutual information and 
% variation of information
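
To illustrate what accuracy under the best assignment means, the sketch below
scores a clustering by trying every one-to-one relabeling of the clusters and
keeping the highest agreement with the class labels. It is an illustration of
the idea only, not the code in brute_force_accuracy.m. It assumes both label
vectors use the integers 1..k, and the toy data and variable names are made up
for this example.

```matlab
labels   = [1 1 2 2 3 3];   % reference class labels (toy example)
clusters = [2 2 3 3 1 3];   % cluster indices from some clustering
k = 3;                      % number of classes and clusters

P = perms(1:k);             % every one-to-one relabeling, one per row
best = 0;
for i = 1:size(P, 1)
    relabeled = P(i, clusters);                  % map cluster j to P(i, j)
    best = max(best, mean(relabeled == labels)); % keep the best agreement
end
fprintf('Best-assignment accuracy: %.4f\n', best);
```

Enumerating permutations costs k! evaluations, so this approach is only
practical for a small number of clusters.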

% other functions
utilities/generate_2D_cluster_data_set.m % Generate some toy data sets
utilities/mat2patch.m % Draw a matrix as a bunch of patches
utilities/recursblkdiag.m  % Make a block diagonal matrix from an arbitrary number of blocks
utilities/safe_sqrtm.m   % Matrix square root ignoring negative eigenvals
utilities/getData.m % download data sets and create MATLAB versions in data/
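
As a sketch of what a matrix square root that ignores negative eigenvalues can
look like, the example below takes the eigendecomposition of a symmetric
matrix, sets negative eigenvalues to zero, and rebuilds the root. Treating
"ignoring" as clipping to zero is an assumption. safe_sqrtm.m may handle
negative eigenvalues differently, and the variable names here are illustrative.

```matlab
A = [1 2; 2 1];                    % symmetric, eigenvalues 3 and -1
A = (A + A') / 2;                  % enforce exact symmetry
[V, D] = eig(A);                   % A = V * D * V'
lambda = max(diag(D), 0);          % set negative eigenvalues to zero
S = V * diag(sqrt(lambda)) * V';   % square root of the PSD part of A
disp(S * S);                       % PSD part of A, not A itself
```

Here S*S recovers only the positive semidefinite part of A, since the
eigenvalue -1 is discarded.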


((Requirements))
Tested in MATLAB R2015b (version 8.6) on Mac OS X.

MATLAB - Optimization Toolbox 7.3: fminbnd
MATLAB - Statistics and Machine Learning Toolbox 10.1: kmeans, etc. 

Mark Schmidt's minFunc https://www.cs.ubc.ca/~schmidtm/Software/minFunc.html

Some scripts (Table 6, Figure 13, Figure 14, Table 9) require an internet
connection to download the relevant data sets.


((Notes))
The scripts are meant to reproduce the results as closely as possible. 
Nonetheless, some of the figures were created by manually arranging 
sub-figures in Inkscape. Additionally, images of graphs (Figures 9, 10, and 11)
were created in Graphviz. 