Index

MixtureDensityNetworks.MDN
MixtureDensityNetworks.MDN
MixtureDensityNetworks.MixtureDensityNetwork
MixtureDensityNetworks.MixtureDensityNetwork
MixtureDensityNetworks.MultivariateGMM
MixtureDensityNetworks.MultivariateGMM
MixtureDensityNetworks.UnivariateGMM
MixtureDensityNetworks.UnivariateGMM
MixtureDensityNetworks.fit!
MixtureDensityNetworks.generate_data
MixtureDensityNetworks.likelihood_loss
MixtureDensityNetworks.likelihood_loss
MixtureDensityNetworks.predict_mode

API

MixtureDensityNetworks.MDN — Type

MDN

A model type for constructing a Mixture Density Network, based on MixtureDensityNetworks.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MDN = @load MDN pkg=MixtureDensityNetworks

Do model = MDN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MDN(mixtures=...).

A neural network which parameterizes a Gaussian Mixture Model (GMM) distributed over the target varible y conditioned on the features X.

Training Data

In MLJ or MLJBase, bind an MDN instance model to data with mach = machine(model, X, y) where

X: any table of input features (eg, a DataFrame) whose columns belong to the Continuous scitypes`.
y: the target, which can be any AbstractVector whose element scitype is Continuous.

Hyperparameters

mixtures=5: number of Gaussian mixtures in the predicted distribution
layers=[128,]: hidden layer topology, starting from the first hidden layer
η=1e-3: learning rate used for the optimizer
epochs=1: number of epochs to train the model
batchsize=32: batch size used during training

Operations

predict(mach, Xnew): return the distributions over the target conditioned on the new features Xnew having the same scitype as X above.
predict_mode(mach, Xnew): return the largest modes of the distributions over targets conditioned on the new features Xnew having the same scitype as X above.
predict_mean(mach, Xnew): return the means of the distributions over targets conditioned on the new features Xnew having the same scitype as X above.
predict_median(mach, Xnew): return the medians of the distributions over targets conditioned on the new features Xnew having the same scitype as X above.

Fitted Parameters

The fields of fitted_params(mach) are:

fitresult: the trained mixture density model, compatible with the Flux ecosystem.

Report

learning_curve: the average training loss for each epoch.
best_epoch: the epoch (starting from 1) with the lowest training loss.
best_loss: the best (lowest) loss encountered durind training. Corresponds to the average loss of the best epoch.

Accessor Functions

training_losses(mach) returns the learning curve as a vector of average training losses for each epoch.

Examples

using MLJ
MDN = @load MDN pkg=MixtureDensityNetworks
mdn = MDN(mixtures=12, epochs=100, layers=[512, 256, 128])
X, y = make_regression(100, 1) # synthetic data
mach = machine(mdn, X, y) |> fit!
Xnew, _ = make_regression(3, 1)
ŷ = predict(mach, Xnew) # new predictions
report(mach).best_epoch # best epoch encountered during training 
report(mach).best_loss # best loss encountered during training 
training_losses(mach) # learning curve

source

MixtureDensityNetworks.MDN — Method

MDN(; mixtures=5, layers=[128], η=1e-3, epochs=1, batchsize=32)

Defines an MDN model with the given hyperparameters.

Parameters

mixtures: The number of gaussian mixtures to use in estimating the conditional distribution (default=5).
layers: A vector indicating the number of nodes in each of the hidden layers (default=[128,]).
η: The learning rate to use when training the model (default=1e-3).
epochs: The number of epochs to train the model (default=1).
batchsize: The batchsize to use during training (default=32).

source

MixtureDensityNetworks.MixtureDensityNetwork — Type

struct MixtureDensityNetwork{T}

A Flux model for implementing a standard Mixture Density Network.

Parameters

hidden::Flux.Chain
output::Any

source

MixtureDensityNetworks.MixtureDensityNetwork — Method

MixtureDensityNetwork(
    input::Int64,
    output::Int64,
    layers::Vector{Int64},
    mixtures::Int64
) -> Union{MixtureDensityNetwork{MultivariateGMM}, MixtureDensityNetwork{UnivariateGMM}}

Construct a standard Mixture Density Network.

Parameters

input: The dimension of the input features.
output: The dimension of the output. Setting output = 1 indicates a univariate model, whereas output > 1 indicates a multivariate model.
layers: The topolgy of the hidden layers, starting from the first layer.
mixtures: The number of Gaussian mixtures to use in the predicted distribution.

source

MixtureDensityNetworks.MultivariateGMM — Type

struct MultivariateGMM

A layer which produces a multivariate Gaussian Mixture Model as its output.

Parameters

outputs::Int64
mixtures::Int64
μ::Flux.Dense
Σ::Flux.Dense
w::Flux.Chain

source

MixtureDensityNetworks.MultivariateGMM — Method

MultivariateGMM(
    input::Int64,
    output::Int64,
    mixtures::Int64
) -> MultivariateGMM

Construct a layer which returns a multivariate Gaussian Mixture Model as its output.

Parameters

input: Specifies the length of the feature vectors. The layer expects a matrix with the dimensions input x N as input.
output: Specifies the length of the label vectors. The layer returns a matrix with dimensions output x N as output.
mixtures: The number of mixtures to use in the GMM.

source

MixtureDensityNetworks.UnivariateGMM — Type

struct UnivariateGMM

A layer which produces a univariate Gaussian Mixture Model as its output.

Parameters

μ::Flux.Dense
Σ::Flux.Dense
w::Flux.Chain

source

MixtureDensityNetworks.UnivariateGMM — Method

UnivariateGMM(
    input::Int64,
    mixtures::Int64
) -> UnivariateGMM

Construct a layer which returns a univariate Gaussian Mixture Model as its output.

Parameters

input: Specifies the length of the feature vectors. The layer expects a matrix with the dimensions input x N as input.
mixtures: The number of mixtures to use in the GMM.

source

MixtureDensityNetworks.fit! — Method

fit!(
    m,
    X::Matrix{<:Real},
    Y::Matrix{<:Real};
    opt,
    batchsize,
    epochs,
    verbosity
) -> Tuple{Any, NamedTuple{(:learning_curve, :best_epoch, :best_loss), Tuple{Vector{Float64}, Int64, Float64}}}

Fit the model to the data given by X and Y.

Parameters

m: The model to be trained.
X: A dxn matrix where d is the number of input features and n is the number of samples.
Y: A dxn matrix where d is the dimension of the output and n is the number of samples.
opt: The optimization algorithm to use during training (default = Adam(1e-3)).
batchsize: The batch size for each iteration of gradient descent (default = 32).
epochs: The number of epochs to train for (default = 100).
verbosity: Whether to show a progress bar (default = 1) or not (0).

source

MixtureDensityNetworks.generate_data — Method

generate_data(
    n_samples::Int64
) -> Tuple{Matrix{Float64}, Matrix{Float64}}

Generate some synthetic data for testing purposes.

Parameters

n_samples: The number of samples we want to generate.

Returns

The sythetic features X and labels Y as a tuple (X, Y).

source

MixtureDensityNetworks.likelihood_loss — Method

likelihood_loss(
    distributions::Vector{<:Distributions.MixtureModel{Distributions.Multivariate}},
    y::Matrix{<:Real}
) -> Any

Conpute the negative log-likelihood loss for a set of labels y under a set of multivariate Gaussian Mixture Models.

Parameters

distributions: A vector of multivariate Gaussian Mixture Model distributions.
y: A dxn matrix of labels where d is the dimension of each label and n is the number of samples.

source

MixtureDensityNetworks.likelihood_loss — Method

likelihood_loss(
    distributions::Vector{<:Distributions.MixtureModel{Distributions.Univariate}},
    y::Matrix{<:Real}
) -> Any

Conpute the negative log-likelihood loss for a set of labels y under a set of univariate Gaussian Mixture Models.

Parameters

distributions: A vector of univariate Gaussian Mixture Model distributions.
y: A 1xn matrix of labels where n is the number of samples.

source

MixtureDensityNetworks.predict_mode — Method

predict_mode(
    m::MixtureDensityNetwork,
    X::Matrix{<:Real}
) -> Any

Predict the point associated with the highest probability in the conditional distribution P(Y|X).

Parameters

m: The model with which to generate a prediction.
X: The input to be passed to m. Expected to be a matrix with dimensions dxn where n is the number of observations.

Returns

The mode of each distribution returned by m(X).

source