项目作者: eXascaleInfolab

项目描述 :
Extremely fast evaluation of the extrinsic clustering measures: various (mean) F1 measures and Omega Index (Fuzzy Adjusted Rand Index) for the multi-resolution clustering with overlaps/covers, standard NMI, clusters labeling
高级语言: C++
项目地址: git://github.com/eXascaleInfolab/xmeasures.git
创建时间: 2017-02-20T16:56:39Z
项目社区:https://github.com/eXascaleInfolab/xmeasures

开源协议:Apache License 2.0

下载


xmeasures - Extrinsic Clustering Measures

Extremely fast evaluation of accuracy (extrinsic quality) measures for the [overlapping/fuzzy] clusterings (collections of groups of items):
family of [mean] F1 measures (including Average F1-Score) and Omega Index (fuzzy version of the Adjusted Rand Index) for overlapping multi-resolution clusterings with unequal node base (and optional node base synchronization) using various matching policies (micro, macro and combined weighting),
and standard NMI for non-overlapping clustering on a single resolution. xmeasures also provides clusters labeling with the indices of the ground-truth clusters considering 1:n match and evaluating F1, precision and recall of the labeled clusters.

xmeasures evaluates F1 and NMI for collections of hundreds thousands [overlapping] clusters (covers, communities) withing a dozen seconds on an ordinary laptop using a single CPU core. The computational time is O(N)
unlike O(N * C)
of the existing state of the art implementations, where N is the number of nodes in the network and C is the number of clusters. Computational complexity for Omega Index is standard and equals O(N^2 * s/2), where s is the average sharing ratio (membership) of the nodes, typically s -> 1.
xmeasures is one of the utilities designed for the PyCaBeM clustering benchmark to evaluate clusterings of large networks.

The paper: Accuracy Evaluation of Overlapping and Multi-resolution Clustering Algorithms on Large Datasets

  1. @inproceedings{Xms19,
  2. author={Artem Lutov and Mourad Khayati and Philippe Cudr{\'e}-Mauroux},
  3. title={Accuracy Evaluation of Overlapping and Multi-resolution Clustering Algorithms on Large Datasets},
  4. booktitle={6th IEEE International Conference on Big Data and Smart Computing (BigComp 2019)},
  5. year={2019},
  6. keywords={accuracy metrics, overlapping community evaluation, multi-resolution clustering evaluation, Generalized NMI, Omega Index, MF1, similarity of collections of sets}
  7. }

Related papers about the implemented measures:

  • Omega Index (fuzzy version of the Adjusted Rand Index), which equal to ARI when applied for the non-overlapping clusterings;
  • Mean F1 measures: F1a (Average F1-Score), F1p is much more indicative and discriminative than the presented there F1a but the respective paper has not been published yet;
  • NMI measure.

    Standard NMI is implemented considering overlapping and multi-resolution clustering only to demonstrate non-applicability of the standard NMI for such cases, where it yields unfair results. See GenConvNMI for the fair generalized NMI evaluation.

The execution time and the total processing time (relative power consumption) of xmeasures on a single CPU core vs ParallelComMetric on multiple SMP cores evaluated on the SNAP DBLP dataset and shown in the log scale demonstrates that xmeasures evaluates F1 family measures multiple orders of magnitude faster than other state-of-the-art solutions:
Clubmark_Poster-w1024

Author: (c) Artem Lutov artem@exascale.info

Content

Deployment

The target platform is NIX/Posix, the binary is compiled for Linux Ubuntu x64 and also should work on Windows 10+ x64 (see details in this article).

Requirements

There are no any requirements for the execution or compilation except the standard C++ library.

To run the prebuilt executable on Linux Ubuntu 16.04 x64, the standard library can be installed by: $ sudo apt-get install libstdc++6.

Compilation

Application Compilation

  1. $ make release

The following build errors might occur on some platforms and should be resolved as outlined.

  • If your default compiler is g++/gcc < 5.x, then g++-5 or higher should be installed and Makefile might need to be edited replacing g++, gcc with g++-5, gcc-5.
  • -fstack-clash-protection compiler flag is added since xmeasures v4.0.5, which might not be supported by Clang/LLVM and GCC < 8.2. This flag just hardens the application against some stack overflow attacks and should be excluded from the Makefile if not supported on your platform.

To update/extend the input parameters, modify args.ggo and run GenerateArgparser.sh (calls gengetopt) before running make. To install gengetopt, execute: $ sudo apt-get install gengetopt.

Library Compilation

Some core functionality of xmeasures is available as a library with C API, making it possible to link the library from Python and other scripting languages.
The interface is defined in include/interface_c.h.
To build the library, execute:

  1. $ make -f Makefile_lib release

Usage

Execution Options:

  1. $ ../xmeasures -h
  2. xmeasures 4.0.4
  3. Extrinsic measures evaluation: Omega Index (a fuzzy version of the Adjusted
  4. Rand Index, identical to the Fuzzy Rand Index) and [mean] F1-score (prob, harm
  5. and avg) for the overlapping multi-resolution clusterings, and standard NMI for
  6. the non-overlapping clustering on a single resolution. Unequal node base is
  7. allowed in the evaluating clusterings and optionally can be synchronized
  8. removing nodes from the clusters missed in one of the clusterings
  9. (collections).
  10. Usage: xmeasures [OPTIONS] clustering1 clustering2
  11. clustering - input file, collection of the clusters to be evaluated.
  12. Examples:
  13. $ ./xmeasures -fp -kc networks/5K25.cnl tests/5K25_l0.825/5K25_l0.825_796.cnl
  14. $ ./xmeasures -fh -kc -i tests/5K25.cll -ph -l networks/5K25.cnl
  15. tests/5K25_l0.825/5K25_l0.825_796.cnl
  16. $ ./xmeasures -ox tests/clsevalsx/omega_c4.3-1.cnl
  17. tests/clsevalsx/omega_c4.3-2.cnl
  18. Extrinsic measures are evaluated, i.e. two input clusterings (collections of
  19. clusters) are compared to each other. Optionally, a labeling of the evaluating
  20. clusters with the specified ground-truth clusters is performed.
  21. NOTE:
  22. - Multiple evaluating measures can be specified.
  23. - Each cluster should contain unique members, which is ensured only if the
  24. 'unique' option is specified.
  25. - All clusters should be unique to not affect Omega Index evaluation, which
  26. can be ensured by the [resmerge](https://github.com/eXascaleInfolab/resmerge)
  27. utility.
  28. - Non-corrected unequal node base in the clusterings is allowed, it penalizes
  29. the match.Use [OvpNMI](https://github.com/eXascaleInfolab/OvpNMI) or
  30. [GenConvNMI](https://github.com/eXascaleInfolab/GenConvNMI) for NMI evaluation
  31. in the arbitrary collections (still each cluster should contain unique
  32. members).
  33. Evaluating measures are:
  34. - OI - Omega Index (a fuzzy version of the Adjusted Rand Index, identical to
  35. the Fuzzy Rand Index), which yields the same value as Adjusted Rand Index when
  36. applied to the non-overlapping clusterings.
  37. - [M]F1 - various [mean] F1 measures of the Greatest (Max) Match including
  38. the Average F1-Score (suggested by J. Leskovec) with the optional weighting.
  39. NOTE: There are 3 matching policies available for each kind of F1. The most
  40. representative evaluation is performed by the F1p with combined matching
  41. policy (considers both micro and macro weighting).
  42. - NMI - Normalized Mutual Information, normalized by either max or also
  43. sqrt, avg and min information content denominators.
  44. ATTENTION: This is a standard NMI, which should be used ONLY for the HARD
  45. partitioning evaluation (non-overlapping clustering on a single resolution).
  46. It penalizes overlapping and multi-resolution structures.
  47. -h, --help Print help and exit
  48. -V, --version Print version and exit
  49. -O, --ovp evaluate overlapping instead of the
  50. multi-resolution clusters, where max matching
  51. for any shared member between R overlapping
  52. clusters is 1/R (the member is shared)
  53. instead of 1 (the member fully belongs to
  54. each [hierarchical sub]group) for the member
  55. belonging to R distinct clusters on R
  56. resolutions.
  57. NOTE: It has no effect for the Omega Index
  58. evaluation. (default=off)
  59. -q, --unique ensure on loading that all cluster members are
  60. unique by removing all duplicates.
  61. (default=off)
  62. -s, --sync=filename synchronize with the specified node base
  63. omitting the non-matching nodes.
  64. NOTE: The node base can be either a separate,
  65. or an evaluating CNL file, in the latter case
  66. this option should precede the evaluating
  67. filename not repeating it
  68. -m, --membership=FLOAT average expected membership of the nodes in the
  69. clusters, > 0, typically >= 1. Used only to
  70. facilitate estimation of the nodes number on
  71. the containers preallocation if this number
  72. is not specified in the file header.
  73. (default=`1')
  74. -d, --detailed detailed (verbose) results output
  75. (default=off)
  76. Omega Index:
  77. -o, --omega evaluate Omega Index (a fuzzy version of the
  78. Adjusted Rand Index, identical to the Fuzzy
  79. Rand Index and on the non-overlapping
  80. clusterings equals to ARI). (default=off)
  81. -x, --extended evaluate extended (Soft) Omega Index, which
  82. does not excessively penalize distinctly
  83. shared nodes. (default=off)
  84. Mean F1:
  85. -f, --f1[=ENUM] evaluate mean F1 of the [weighted] average of
  86. the greatest (maximal) match by F1 or partial
  87. probability.
  88. NOTE: F1h <= F1a, where:
  89. - p (F1p or Ph) - Harmonic mean (F1) of two
  90. [weighted] averages of the Partial
  91. Probabilities, the most indicative as
  92. satisfies the largest number of the Formal
  93. Constraints (homogeneity, completeness and
  94. size/quantity except the rag bag in some
  95. cases);
  96. - h (F1h) - Harmonic mean (F1) of two
  97. [weighted] averages of all local F1 (harmonic
  98. means of the Precision and Recall of the best
  99. matches of the clusters);
  100. - a (F1a) - Arithmetic mean (average) of
  101. two [weighted] averages of all local F1, the
  102. least discriminative and satisfies the lowest
  103. number of the Formal Constraints.
  104. Precision and recall are evaluated relative
  105. to the FIRST clustering dataset
  106. (ground-truth, gold standard).
  107. (possible values="partprob",
  108. "harmonic", "average" default=`partprob')
  109. -k, --kind[=ENUM] kind of the matching policy:
  110. - w - Weighted by the number of nodes in
  111. each cluster (known as micro weighting,
  112. MF1_micro)
  113. - u - Unweighed, where each cluster is
  114. treated equally (known as macro weighting,
  115. MF1_macro)
  116. - c - Combined(w, u) using geometric mean
  117. (drops the value not so much as harmonic
  118. mean)
  119. (possible values="weighted",
  120. "unweighed", "combined"
  121. default=`weighted')
  122. Clusters Labeling & F1 evaluation with Precision and Recall:
  123. -l, --label=gt_filename label evaluating clusters with the specified
  124. ground-truth (gt) cluster indices and
  125. evaluate F1 (including Precision and Recall)
  126. of the (best) MATCHED labeled clusters only
  127. (without the probable subclusters).
  128. NOTE: If 'sync' option is specified then the
  129. file name of the clusters labels should be
  130. the same as the node base (if specified) and
  131. should be in the .cnl format. The file name
  132. can be either a separate or an evaluating CNL
  133. file, in the latter case this option should
  134. precede the evaluating filename not repeating
  135. it.
  136. Precision and recall are evaluated relative
  137. to the FIRST clustering dataset
  138. (ground-truth, gold standard).
  139. -p, --policy[=ENUM] Labels matching policy:
  140. - p - Partial Probabilities (maximizes
  141. gain)
  142. - h - Harmonic Mean (minimizes loss,
  143. maximizes F1)
  144. (possible values="partprob", "harmonic"
  145. default=`harmonic')
  146. -u, --unweighted Labels weighting policy on F1 evaluation:
  147. weighted by the number of instances in each
  148. label by default (micro weighting, F1_micro)
  149. or unweighed, where each label is treated
  150. equally (i.e. macro weighting, F1_macro)
  151. (default=off)
  152. -i, --identifiers=labels_filename
  153. output labels (identifiers) of the evaluating
  154. clusters as lines of space-separated indices
  155. of the ground-truth clusters (.cll - clusters
  156. labels list)
  157. NOTE: If 'sync' option is specified then the
  158. reduced collection is outputted to the
  159. <labels_filename>.cnl besides the
  160. <labels_filename>
  161. NMI:
  162. -n, --nmi evaluate NMI (Normalized Mutual Information),
  163. applicable only to the non-overlapping
  164. clusters (default=off)
  165. -a, --all evaluate all NMIs using sqrt, avg and min
  166. denominators besides the max one
  167. (default=off)
  168. -e, --ln use ln (exp base) instead of log2 (Shannon
  169. entropy, bits) for the information measuring
  170. (default=off)

Empty lines and comments (lines starting with #) in the input file (cnl format) are omitted.

Examples
Evaluate harmonic mean of the weighted average of the greatest (maximal) match by partial probabilities (the most discriminative F1-measure) using macro weighting (default as the most frequently used, thought combined weighting is the most indicative one):

  1. $ ./xmeasures -f data/3cls5nds.cnl data/4cls6nds.cnl

Evaluate harmonic mean of the weighted average (by the cluster size) of the greatest (maximal) match by F1s and insure than all cluster members are unique (the duplicated members are removed):

  1. $ ./xmeasures -fh -q data/3cls5nds.cnl data/4cls6nds.cnl

Evaluate harmonic mean of the [unweighted] average of the greatest (maximal) match by partial probabilities and synchronize the node base with the first evaluating collection, and considering overlapping clusters instead of multi-resolutions (-O does not matter for the case of non-overlapping single resolution collections):

  1. $ ./xmeasures -sku -fp -O data/3cls5nds.cnl data/4cls6nds.cnl

Evaluate arithmetic mean of the weighted average (by the cluster size) of the greatest (maximal) match by F1s and NMI with all denominators synchronizing node base of the evaluating collections with 1lev4nds2cls.cnl:

  1. $ ./xmeasures -fa -na -s data/1lev4nds2cls.cnl data/3cls5nds.cnl data/4cls6nds.cnl

Evaluate combined weighed and unweighted F1h (harmonic mean of the average F1s), label the clusters with the indices of provided labels, evaluate standard F1, precision and recall of the labeled clusters and output the labels to the clslbs.cll:

  1. $ ./xmeasures -fh -kc -i clslbs.cll -l labels.cnl clusters.cnl

Evaluate extended Omega Index and mean F1h (harmonic mean of the weighted average of the greatest (maximal) match by F1):

  1. $ ./xmeasures -ox -fh omega_c4.3-1.cnl omega_c4.3-2.cnl

Note: Please, star this project if you use it.

Related Projects

  • GenConvNMI - Overlapping NMI evaluation that is compatible with the original NMI and suitable for both overlapping and multi resolution (hierarchical) clustering evaluation.
  • OvpNMI - NMI evaluation for the overlapping clusters (communities) that is not compatible with the standard NMI value unlike GenConvNMI, but it is much faster than GenConvNMI.
  • Clubmark - A parallel isolation framework for benchmarking and profiling clustering (community detection) algorithms considering overlaps (covers).
  • ParallelComMetric - A parallel toolkit implemented with Pthreads (or MPI) to calculate various extrinsic and intrinsic quality metrics (with and without ground truth community structure) for non-overlapping (hard, single membership) clusterings.
  • CluSim - A Python module that evaluates (slowly) various extrinsic quality metrics (accuracy) for non-overlapping (hard, single membership) clusterings.
  • resmerge - Resolution levels clustering merger with filtering. Flattens hierarchy/list of multiple resolutions levels (clusterings) into the single flat clustering with clusters on various resolution levels synchronizing the node base.
  • ExecTime - A lightweight resource consumption profiler.
  • TInfES - Type inference evaluation scripts and accessory apps used for the benchmarking.