drcme.bin.run_post_r_merging

Script for determining how many GMM components to merge.

Gaussian mixture model (GMM) components are merged into a smaller number of clusters using an entropy criterion as described by Baudry et al. (2010). A piecewise linear fit is used to determine the point at which merging should terminate.
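As a rough illustration of the piecewise linear idea, the sketch below fits two line segments to a series of entropy scores and returns the breakpoint that minimizes the total squared error. It is not the drcme implementation: the function names `line_sse` and `two_segment_breakpoint` and the data are hypothetical, and only the two-segment case is covered.

```python
def line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def two_segment_breakpoint(xs, ys):
    """Return the x value where a two-segment linear fit has the lowest total error.

    The candidate breakpoint is shared by both segments, so each segment
    always contains at least two points.
    """
    best_x, best_sse = None, float("inf")
    for i in range(1, len(xs) - 1):
        sse = line_sse(xs[: i + 1], ys[: i + 1]) + line_sse(xs[i:], ys[i:])
        if sse < best_sse:
            best_x, best_sse = xs[i], sse
    return best_x


# Made-up entropy scores for 1 to 8 clusters (illustration only).
n_clusters = [1, 2, 3, 4, 5, 6, 7, 8]
entropy = [0, 2, 4, 6, 30, 54, 78, 102]

elbow = two_segment_breakpoint(n_clusters, entropy)
print(elbow)  # 4 for this made-up data
```

In this toy series the slope changes sharply at four clusters, so the breakpoint search suggests stopping the merging there.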

class drcme.bin.run_post_r_merging.PostGmmMergingParameters(extra=None, only=None, exclude=(), prefix='', strict=None, many=False, context=None, load_only=(), dump_only=(), partial=False)

Parameter schema for merging

This schema is designed to be used as the schema_type of an ArgSchemaParser object.

PostGmmMergingParameters

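============================  ========================================================================  =======  ==========  =========
key                           description                                                               default  field_type  json_type
============================  ========================================================================  =======  ==========  =========
input_json                    file path of input json file                                              NA       InputFile   str
output_json                   file path to output json file                                             NA       OutputFile  str
log_level                     set the logging level of the module                                       ERROR    LogLevel    str
tau_file                      Path to file with cluster membership probabilities                        NA       InputFile   str
labels_file                   Path to file with cluster labels                                          NA       InputFile   str
merge_info_file               Path to JSON file with number of components after entropy-based merging   NA       OutputFile  str
entropy_piecewise_components  Number of components (2 or 3) for piecewise linear fit of entropy scores  3        Integer     int
============================  ========================================================================  =======  ==========  =========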
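The parameters listed above can be collected in a JSON document and passed to the script through the input_json key. The snippet below is a minimal sketch of building such a document. The file names are hypothetical placeholders, and since tau_file and labels_file are InputFile fields, the real files would presumably need to exist before the script is run.

```python
import json

# Hypothetical paths; replace them with the actual files for your analysis.
params = {
    "tau_file": "tau.csv",
    "labels_file": "labels.csv",
    "merge_info_file": "merge_info.json",
    "entropy_piecewise_components": 3,  # schema default; 2 is the other option
    "log_level": "ERROR",               # schema default
}

# Serialize the parameters as the text of an input JSON file.
input_json_text = json.dumps(params, indent=2)
print(input_json_text)
```

Saving this text to a file gives a path that can be supplied as input_json.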

Functions

main(tau_file, labels_file, merge_info_file, …)

Main runner function for script.

Classes

PostGmmMergingParameters([extra, only, …])

Parameter schema for merging