target_extraction.allen.models.target_sentiment package

Submodules

target_extraction.allen.models.target_sentiment.atae module

class target_extraction.allen.models.target_sentiment.atae.ATAEClassifier(vocab, context_field_embedder, context_encoder, target_encoder, feedforward=None, context_attention_activation_function='tanh', target_field_embedder=None, AE=True, AttentionAE=True, inter_target_encoding=None, target_position_weight=None, target_position_embedding=None, initializer=<allennlp.nn.initializers.InitializerApplicator object>, regularizer=None, dropout=0.0, label_name='target-sentiment-labels', loss_weights=None, use_target_sequences=False)[source]

Bases: allennlp.models.model.Model

forward(tokens, targets, target_sentiments=None, target_sequences=None, metadata=None, position_weights=None, position_embeddings=None, **kwargs)[source]

The text and targets are dictionaries because they are text fields, which can be represented in many different ways, e.g. just words, or words and characters; the dictionary therefore maps each representation to its tensor, e.g. {'words': words_tensor_ids, 'chars': char_tensor_ids}. An illustrative sketch of this structure follows this entry.

Return type

Dict[str, Tensor]
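An illustrative sketch of the tokens dictionary (the indexer names 'words' and 'chars' and all shapes are assumptions; the real keys depend on the dataset reader configuration):

    import torch

    # Illustrative only: indexer names and shapes are assumptions that
    # depend on how the dataset reader is configured.
    tokens = {
        'words': torch.tensor([[2, 45, 8, 0]]),      # (batch size, sequence length) word ids
        'chars': torch.randint(0, 260, (1, 4, 16)),  # (batch size, sequence length, word length)
    }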

get_metrics(reset=False)[source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state that should be reset between epochs. This is also compatible with Metric objects. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.

Return type

Dict[str, float]

make_output_human_readable(output_dict)[source]

Adds the predicted label to the output dictionary. It also removes any class probabilities that do not have an associated target; these arise from padding during batch prediction and are removed using the target mask.

Everything in the dictionary will be of length (batch size * number of targets), where the number of targets is the total across the sentences in the batch, e.g. if the batch has two sentences, the first containing 2 targets and the second 3, then (2 + 3) = 5 target sentiments are returned. A minimal sketch of the masking follows this entry.

Return type

Dict[str, Tensor]
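A minimal sketch of the mask-based filtering (the shapes are hypothetical: 3 sentiment classes, a batch of two sentences with 2 and 3 targets padded to 3):

    import torch

    # Hypothetical batch: sentence 1 has 2 targets, sentence 2 has 3,
    # padded to the batch maximum of 3 targets; 3 sentiment classes.
    class_probs = torch.rand(2, 3, 3)        # (batch size, max targets, classes)
    target_mask = torch.tensor([[True, True, False],
                                [True, True, True]])
    valid_probs = class_probs[target_mask]   # shape (5, 3): the 2 + 3 real targets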

reset_parameters()[source]

Initialises the attention vector.

target_extraction.allen.models.target_sentiment.in_context module

class target_extraction.allen.models.target_sentiment.in_context.InContextClassifier(vocab, context_field_embedder, context_encoder, target_encoding_pooling_function='mean', feedforward=None, initializer=<allennlp.nn.initializers.InitializerApplicator object>, regularizer=None, dropout=0.0, label_name='target-sentiment-labels', loss_weights=None)[source]

Bases: allennlp.models.model.Model

forward(tokens, targets, target_sequences, target_sentiments=None, metadata=None, **kwargs)[source]

  • B = Batch

  • NT = Number Targets

  • B_NT = Batch * Number Targets

  • TSL = Target Sequence Length

  • CSL = Context Sequence Length (number of tokens in the text, including padding)

  • D = Dimension of the vector

  • EC_D = Encoded Context Dimension

  • ET_D = Embedded Text Dimension

(A shape sketch using these abbreviations follows this entry.)

The text and targets are dictionaries because they are text fields, which can be represented in many different ways, e.g. just words, or words and characters; the dictionary therefore maps each representation to its tensor, e.g. {'words': words_tensor_ids, 'chars': char_tensor_ids}.

Return type

Dict[str, Tensor]
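As a shape-only sketch (assumed behaviour, not the model's exact code: it illustrates 'mean' pooling of the contextualised target tokens selected by target_sequences, using the abbreviations above):

    import torch

    # Shape sketch with assumed 'mean' pooling over the target tokens.
    B_NT, TSL, CSL, EC_D = 5, 2, 10, 300
    encoded_context = torch.rand(B_NT, CSL, EC_D)
    # One-hot rows marking which context token each target token is.
    target_sequences = torch.zeros(B_NT, TSL, CSL)
    target_sequences[:, 0, 3] = 1.0
    target_sequences[:, 1, 4] = 1.0
    target_vectors = target_sequences @ encoded_context  # (B_NT, TSL, EC_D)
    pooled_targets = target_vectors.mean(dim=1)          # (B_NT, EC_D)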

get_metrics(reset=False)[source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state that should be reset between epochs. This is also compatible with Metric objects. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.

Return type

Dict[str, float]

make_output_human_readable(output_dict)[source]

Adds the predicted label to the output dictionary. It also removes any class probabilities that do not have an associated target; these arise from padding during batch prediction and are removed using the target mask.

Return type

Dict[str, Tensor]

target_extraction.allen.models.target_sentiment.interactive_attention_network module

class target_extraction.allen.models.target_sentiment.interactive_attention_network.InteractivateAttentionNetworkClassifier(vocab, context_field_embedder, context_encoder, target_encoder, feedforward=None, context_attention_activation_function='tanh', target_attention_activation_function='tanh', target_field_embedder=None, inter_target_encoding=None, target_position_weight=None, target_position_embedding=None, initializer=<allennlp.nn.initializers.InitializerApplicator object>, regularizer=None, dropout=0.0, label_name='target-sentiment-labels', loss_weights=None, use_target_sequences=False)[source]

Bases: allennlp.models.model.Model

forward(tokens, targets, target_sentiments=None, target_sequences=None, metadata=None, position_weights=None, position_embeddings=None, **kwargs)[source]

The text and targets are dictionaries because they are text fields, which can be represented in many different ways, e.g. just words, or words and characters; the dictionary therefore maps each representation to its tensor, e.g. {'words': words_tensor_ids, 'chars': char_tensor_ids}.

Return type

Dict[str, Tensor]

get_metrics(reset=False)[source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state that should be reset between epochs. This is also compatible with Metric objects. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.

Return type

Dict[str, float]

make_output_human_readable(output_dict)[source]

Adds the predicted label to the output dictionary. It also removes any class probabilities that do not have an associated target; these arise from padding during batch prediction and are removed using the target mask.

Everything in the dictionary will be of length (batch size * number of targets), where the number of targets is the total across the sentences in the batch, e.g. if the batch has two sentences, the first containing 2 targets and the second 3, then (2 + 3) = 5 target sentiments are returned.

Return type

Dict[str, Tensor]

target_extraction.allen.models.target_sentiment.split_contexts module

class target_extraction.allen.models.target_sentiment.split_contexts.SplitContextsClassifier(vocab, context_field_embedder, left_text_encoder, right_text_encoder, feedforward=None, target_field_embedder=None, target_encoder=None, inter_target_encoding=None, initializer=<allennlp.nn.initializers.InitializerApplicator object>, regularizer=None, dropout=0.0, label_name='target-sentiment-labels', loss_weights=None)[source]

Bases: allennlp.models.model.Model

forward(left_contexts, right_contexts, targets, target_sentiments=None, metadata=None, **kwargs)[source]

The text and targets are dictionaries because they are text fields, which can be represented in many different ways, e.g. just words, or words and characters; the dictionary therefore maps each representation to its tensor, e.g. {'words': words_tensor_ids, 'chars': char_tensor_ids}.

Return type

Dict[str, Tensor]

get_metrics(reset=False)[source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising, as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state that should be reset between epochs. This is also compatible with Metric objects. Metrics should be populated during the call to forward(), with the Metric handling the accumulation of the metric until this method is called.

Return type

Dict[str, float]

make_output_human_readable(output_dict)[source]

Adds the predicted label to the output dictionary. It also removes any class probabilities that do not have an associated target; these arise from padding during batch prediction and are removed using the target mask.

Return type

Dict[str, Tensor]

target_extraction.allen.models.target_sentiment.util module

target_extraction.allen.models.target_sentiment.util.concat_position_embeddings(embedding_context, position_indexes=None, target_position_embedding=None)[source]
Parameters
  • embedding_context (Tensor) – Tensor of shape (batch size * number targets, context sequence length, context dim)

  • position_indexes (Optional[Dict[str, LongTensor]]) – Dictionary of token indexer name to a Tensor of shape (batch size, number targets, text sequence length)

  • target_position_embedding (Optional[TextFieldEmbedder]) – An embedding function for the position indexes, where the dimension of the position embedding is position dim.

Return type

Tensor

Returns

If position_indexes and target_position_embedding are None, then the embedding_context is returned unchanged. Otherwise the relevant position embeddings are concatenated onto the corresponding token embeddings within the embedding_context, creating a Tensor of shape (batch size * number targets, text sequence length, context dim + position dim). A shape sketch follows this entry.

Raises

ValueError – If target_position_embedding is not None when position_indexes is None.
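A shape-only sketch of the concatenation (the dimensions are arbitrary examples, and this is not the library call itself):

    import torch

    # Arbitrary example dimensions; only the shapes matter here.
    batch_times_targets, seq_len, context_dim, position_dim = 8, 20, 300, 30
    embedding_context = torch.rand(batch_times_targets, seq_len, context_dim)
    position_embeddings = torch.rand(batch_times_targets, seq_len, position_dim)
    # Concatenate each token's position embedding onto its token embedding.
    combined = torch.cat([embedding_context, position_embeddings], dim=-1)
    assert combined.shape == (batch_times_targets, seq_len, context_dim + position_dim)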

target_extraction.allen.models.target_sentiment.util.elmo_input_reshape(inputs, batch_size, number_targets, batch_size_num_targets)[source]

NOTE: This does not work for the Hugging Face transformers: when they are processed by the token indexers they produce additional keys beyond the token ids, such as mask ids and segment ids, which also need handling and which we have not had time to handle yet. A more appropriate way around this would be to use target_sequences, as in the InContext model, to generate contextualised targets from the context rather than using the target words as-is, without context.

Parameters
  • inputs (Dict[str, Dict[str, Tensor]]) – The token indexer dictionary where the keys state the token indexer and the values are the Tensors that are of shape (Batch Size, Number Targets, Sequence Length)

  • batch_size (int) – The Batch Size

  • number_targets (int) – The max number of targets in the batch

  • batch_size_num_targets (int) – Batch Size * number of targets

Return type

Dict[str, Dict[str, Tensor]]

Returns

If the inputs contain an 'elmo' or 'token_characters' key, all of that key's values are reshaped into shape (Batch Size * Number Targets, Sequence Length) so that they can be processed by the ELMo or character embedder/encoder. A shape sketch follows this entry.
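A shape-only sketch of the reshape (dimensions are arbitrary examples):

    import torch

    # Flatten the batch and target dimensions so the ELMo or character
    # embedder sees a 2D batch of sequences.
    batch_size, number_targets, seq_len = 4, 3, 15
    token_ids = torch.randint(0, 100, (batch_size, number_targets, seq_len))
    flattened = token_ids.view(batch_size * number_targets, seq_len)
    assert flattened.shape == (12, 15)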

target_extraction.allen.models.target_sentiment.util.elmo_input_reverse(embedded_input, inputs, batch_size, number_targets, batch_size_num_targets)[source]
Parameters
  • embedded_input (Tensor) – The embedding generated after the embedder has been forwarded over the inputs

  • inputs (Dict[str, Dict[str, Tensor]]) – The token indexer dictionary where the keys state the token indexer and the values are the Tensors that are of shape (Batch Size, Number Targets, Sequence Length)

  • batch_size (int) – The Batch Size

  • number_targets (int) – The max number of targets in the batch

  • batch_size_num_targets (int) – Batch Size * number of targets

Return type

Tensor

Returns

If the inputs contain an 'elmo' or 'token_characters' key, the embedded_input is reshaped back into its original shape of (Batch Size, Number Targets, Sequence Length, embedding dim). A shape sketch follows this entry.
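The corresponding shape-only sketch of the reverse operation (dimensions are arbitrary examples):

    import torch

    # Restore the batch and target dimensions after embedding; the
    # embedder has added a trailing embedding dimension.
    batch_size, number_targets, seq_len, embedding_dim = 4, 3, 15, 1024
    embedded_input = torch.rand(batch_size * number_targets, seq_len, embedding_dim)
    restored = embedded_input.view(batch_size, number_targets, seq_len, embedding_dim)
    assert restored.shape == (4, 3, 15, 1024)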

target_extraction.allen.models.target_sentiment.util.loss_weight_order(model, loss_weights, label_name)[source]
Parameters
  • model (Model) – The model that you want to know the loss weights for. Requires a vocab.

  • loss_weights (Optional[List[float]]) – The loss weights to give to the labels. Can be None, in which case None is returned.

  • label_name (str) – The name of the vocab for the label that is to be predicted.

Return type

Union[None, List[float]]

Returns

None if loss_weights is None. Else returns a list of weights to give to each label, where the original loss weights are ordered as ['negative', 'neutral', 'positive'] and the returned weights are ordered by the labels' occurrence in the model's vocab for that label_name.
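For illustration, the reordering amounts to the following (the vocabulary order shown is hypothetical; the real order comes from the model's vocab):

    # Weights are given in ['negative', 'neutral', 'positive'] order.
    loss_weights = [1.2, 0.8, 1.0]
    given_order = ['negative', 'neutral', 'positive']
    # Hypothetical order in which the labels appear in the model's vocab.
    vocab_order = ['positive', 'negative', 'neutral']
    weight_by_label = dict(zip(given_order, loss_weights))
    ordered_weights = [weight_by_label[label] for label in vocab_order]
    # ordered_weights == [1.0, 1.2, 0.8]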

Module contents