tacotron

tacotron_decoder

Modified by blisc to enable support for Tacotron models; specifically, it enables the prenet.

class parts.tacotron.tacotron_decoder.BasicDecoderOutput[source]

Bases: parts.tacotron.tacotron_decoder.BasicDecoderOutput

class parts.tacotron.tacotron_decoder.TacotronDecoder(decoder_cell, helper, initial_decoder_state, attention_type, spec_layer, stop_token_layer, prenet=None, dtype=tf.float32, train=True)[source]

Bases: tensorflow.contrib.seq2seq.python.ops.decoder.Decoder

Basic sampling decoder.

__init__(decoder_cell, helper, initial_decoder_state, attention_type, spec_layer, stop_token_layer, prenet=None, dtype=tf.float32, train=True)[source]

Initialize TacotronDecoder.

Parameters:
  • decoder_cell – An RNNCell instance.
  • helper – A Helper instance.
  • initial_decoder_state – A (possibly nested tuple of…) tensors and TensorArrays. The initial state of the RNNCell.
  • attention_type – The type of attention used.
  • stop_token_layer – An instance of tf.layers.Layer, e.g., tf.layers.Dense. Stop token layer applied to the RNN output to predict when to stop the decoder.
  • spec_layer – An instance of tf.layers.Layer, e.g., tf.layers.Dense. Output layer applied to the RNN output to map the result to a spectrogram.
  • prenet – The prenet to apply to inputs.
Raises:

TypeError – if cell, helper or output_layer have an incorrect type.

batch_size

The batch size of input values.

initialize(name=None)[source]

Initialize the decoder.

Parameters:
  • name – Name scope for any created operations.
output_dtype

A (possibly nested tuple of…) dtype[s].

output_size

A (possibly nested tuple of…) integer[s] or TensorShape object[s].

step(time, inputs, state, name=None)[source]

Perform a decoding step.

Parameters:
  • time – scalar int32 tensor.
  • inputs – A (structure of) input tensors.
  • state – A (structure of) state tensors and TensorArrays.
  • name – Name scope for any created operations.
Returns:

(outputs, next_state, next_inputs, finished).
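The step contract above can be illustrated with a minimal pure-Python sketch. It does not use TensorFlow and is not the library's implementation: `ToyDecoder` and `run_decoder` are hypothetical names, the recurrence is arbitrary, and `initialize` is assumed to return `(finished, first_inputs, initial_state)` as in the `tf.contrib.seq2seq` `Decoder` interface. It only shows how a driver loop might consume the `(outputs, next_state, next_inputs, finished)` tuple.

```python
from typing import List, Tuple


class ToyDecoder:
    """Hypothetical stand-in that follows the initialize/step contract."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps

    def initialize(self) -> Tuple[bool, float, float]:
        # Assumed order: (finished, first_inputs, initial_state).
        return False, 0.0, 0.0

    def step(
        self, time: int, inputs: float, state: float
    ) -> Tuple[float, float, float, bool]:
        next_state = state + inputs + 1.0  # toy recurrence, not an RNN cell
        outputs = next_state               # stands in for one spectrogram frame
        next_inputs = outputs              # feed the output back as next input
        finished = time + 1 >= self.max_steps
        return outputs, next_state, next_inputs, finished


def run_decoder(decoder: ToyDecoder) -> List[float]:
    """Call step() repeatedly until the decoder reports finished."""
    finished, inputs, state = decoder.initialize()
    time = 0
    frames: List[float] = []
    while not finished:
        outputs, state, inputs, finished = decoder.step(time, inputs, state)
        frames.append(outputs)
        time += 1
    return frames


frames = run_decoder(ToyDecoder(max_steps=3))
print(frames)
```

In the real decoder each element is a (possibly nested) tensor with a leading batch dimension, and the loop is driven by dynamic_decode rather than a Python while loop.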

tacotron_helper

Modified by blisc to enable support for Tacotron models. Custom Helper class that implements the Tacotron decoder pre and post nets.

class parts.tacotron.tacotron_helper.TacotronHelper(inputs, prenet=None, time_major=False, sample_ids_shape=None, sample_ids_dtype=None, mask_decoder_sequence=None)[source]

Bases: tensorflow.contrib.seq2seq.python.ops.helper.Helper

Helper for use during eval and infer. Does not use teacher forcing.

__init__(inputs, prenet=None, time_major=False, sample_ids_shape=None, sample_ids_dtype=None, mask_decoder_sequence=None)[source]

Initializer.

Parameters:
  • inputs (Tensor) – inputs of shape [batch, time, n_feats]
  • prenet – prenet to use. Currently disabled in the helper; the prenet is applied in the Tacotron decoder instead.
  • sampling_prob (float) – see tacotron 2 decoder
  • anneal_teacher_forcing (float) – see tacotron 2 decoder
  • stop_gradient (float) – see tacotron 2 decoder
  • time_major (bool) – see tacotron 2 decoder
  • mask_decoder_sequence (bool) – whether to report finished to dynamic_decode once the decoder has passed the sequence_length input, or to report unfinished.
batch_size

Batch size of tensor returned by sample.

Returns a scalar int32 tensor.

initialize(name=None)[source]

Returns (initial_finished, initial_inputs).

next_inputs(time, outputs, state, stop_token_predictions, name=None, **unused_kwargs)[source]

Returns (finished, next_inputs, next_state).
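As a rough illustration of how an inference-time helper could use stop_token_predictions, the sketch below feeds each predicted frame back as the next input and marks a batch element finished when the sigmoid of its stop-token logit exceeds a threshold. This is a pure-Python sketch under stated assumptions, not the library's code: the function name `infer_next_inputs`, the 0.5 threshold, and the treatment of the predictions as logits are all assumptions made for illustration.

```python
import math
from typing import Any, List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def infer_next_inputs(
    outputs: List[List[float]],
    state: Any,
    stop_token_logits: List[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> Tuple[List[bool], List[List[float]], Any]:
    """Return (finished, next_inputs, next_state) without teacher forcing."""
    # One finished flag per batch element, derived from the stop token.
    finished = [sigmoid(logit) > threshold for logit in stop_token_logits]
    # No ground truth is available, so the prediction becomes the next input.
    next_inputs = [list(frame) for frame in outputs]
    # The state is passed through unchanged.
    return finished, next_inputs, state


finished, next_inputs, next_state = infer_next_inputs(
    outputs=[[0.1, 0.2], [0.3, 0.4]],
    state="state",
    stop_token_logits=[-2.0, 3.0],
)
print(finished, next_inputs)
```

The return order matches the `(finished, next_inputs, next_state)` tuple documented above; everything else about the stopping rule should be checked against the source.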

sample(time, outputs, state, name=None)[source]

Returns sample_ids.

sample_ids_dtype

DType of tensor returned by sample.

Returns a DType.

sample_ids_shape

Shape of tensor returned by sample, excluding the batch dimension.

Returns a TensorShape.

class parts.tacotron.tacotron_helper.TacotronTrainingHelper(inputs, sequence_length, prenet=None, time_major=False, sample_ids_shape=None, sample_ids_dtype=None, model_dtype=tf.float32, mask_decoder_sequence=None)[source]

Bases: tensorflow.contrib.seq2seq.python.ops.helper.Helper

Helper function for training. Can be used for teacher forcing or scheduled sampling.

__init__(inputs, sequence_length, prenet=None, time_major=False, sample_ids_shape=None, sample_ids_dtype=None, model_dtype=tf.float32, mask_decoder_sequence=None)[source]

Initializer.

Parameters:
  • inputs (Tensor) – inputs of shape [batch, time, n_feats]
  • sequence_length (Tensor) – length of each input. shape [batch]
  • prenet – prenet to use. Currently disabled in the helper; the prenet is applied in the Tacotron decoder instead.
  • sampling_prob (float) – see tacotron 2 decoder
  • time_major (bool) – see tacotron 2 decoder
  • mask_decoder_sequence (bool) – whether to report finished to dynamic_decode once the decoder has passed the sequence_length input, or to report unfinished.
batch_size

Batch size of tensor returned by sample.

Returns a scalar int32 tensor.

initialize(name=None)[source]

Returns (initial_finished, initial_inputs).

next_inputs(time, outputs, state, name=None, **unused_kwargs)[source]

Returns (finished, next_inputs, next_state).
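The interaction between teacher forcing and mask_decoder_sequence can be sketched in plain Python. This is a hedged illustration rather than the library's implementation: `training_next_inputs` is a hypothetical name, inputs are nested lists of shape [batch, time, n_feats] instead of tensors, and it is assumed that the frame fed at the next step is the ground-truth frame for the step just decoded (with the very first decoder input supplied elsewhere).

```python
from typing import Any, List, Tuple


def training_next_inputs(
    time: int,
    inputs: List[List[List[float]]],  # [batch, time, n_feats] ground truth
    sequence_length: List[int],       # [batch]
    state: Any,
    mask_decoder_sequence: bool = True,
) -> Tuple[List[bool], List[List[float]], Any]:
    """Return (finished, next_inputs, next_state) using teacher forcing."""
    next_time = time + 1
    if mask_decoder_sequence:
        # Report finished once an element has passed its own length.
        finished = [next_time >= length for length in sequence_length]
    else:
        # Report unfinished for every element.
        finished = [False] * len(sequence_length)
    # Teacher forcing: ignore the prediction and feed the ground-truth frame.
    next_inputs = [list(sequence[time]) for sequence in inputs]
    return finished, next_inputs, state


targets = [
    [[1.0], [2.0], [3.0]],  # length 3
    [[4.0], [5.0], [0.0]],  # length 2, last frame is padding
]
masked = training_next_inputs(1, targets, [3, 2], state=None)
unmasked = training_next_inputs(
    1, targets, [3, 2], state=None, mask_decoder_sequence=False
)
print(masked[0], unmasked[0], masked[1])
```

With masking enabled, the shorter sequence is reported finished one step before the longer one; with masking disabled, both stay unfinished and decoding continues over the padded frames.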

sample(time, outputs, state, name=None)[source]

Returns sample_ids.

sample_ids_dtype

DType of tensor returned by sample.

Returns a DType.

sample_ids_shape

Shape of tensor returned by sample, excluding the batch dimension.

Returns a TensorShape.