target_extraction.allen package¶
Subpackages¶
- target_extraction.allen.dataset_readers package
- target_extraction.allen.models package
  - Subpackages
    - target_extraction.allen.models.target_sentiment package
      - Submodules
        - target_extraction.allen.models.target_sentiment.atae module
        - target_extraction.allen.models.target_sentiment.in_context module
        - target_extraction.allen.models.target_sentiment.interactive_attention_network module
        - target_extraction.allen.models.target_sentiment.split_contexts module
        - target_extraction.allen.models.target_sentiment.util module
      - Module contents
  - Submodules
    - target_extraction.allen.models.target_tagger module
  - Module contents
- target_extraction.allen.modules package
- target_extraction.allen.predictors package
- target_extraction.allen.token_indexers package
Submodules¶
target_extraction.allen.allennlp_model module¶
class target_extraction.allen.allennlp_model.AllenNLPModel(name, model_param_fp, predictor_name, save_dir=None)[source]¶

Bases: object

This is a wrapper around the AllenNLP dataset readers, models, and predictors so that functions can take target_extraction.data_types.TargetTextCollection objects as input and return a metric or metrics as well as predictions within the TargetTextCollection objects. This replaces running everything through multiple bash files calling allennlp train etc.
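A minimal usage sketch of constructing the wrapper; the name, config path, predictor name, and save directory below are hypothetical placeholders, not values shipped with the library:

    from target_extraction.allen.allennlp_model import AllenNLPModel

    # Hypothetical constructor arguments: any AllenNLP model config file and
    # registered predictor name could be used here.
    model = AllenNLPModel(name='example-target-tagger',
                          model_param_fp='configs/target_tagger.jsonnet',
                          predictor_name='sentence-tagger',
                          save_dir='saved_models/target_tagger')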
fit(train_data, val_data, test_data=None)[source]¶

Given the training, validation, and optionally the test data, this trains the model defined in the model params file provided to the constructor of the class. Once trained, the model can be accessed through the model attribute.

NOTE: If the test data is given, the model only uses it to fit the vocabulary that is within the test data; the model NEVER trains on the test data.

- Parameters
  - train_data (TargetTextCollection) – Training data.
  - val_data (TargetTextCollection) – Validation data.
  - test_data (Optional[TargetTextCollection]) – Optional, test data.
- Return type
  None
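A sketch of a typical fit call, assuming train, validation, and test TargetTextCollection objects have already been built (for example from one of the dataset parsers in target_extraction); the empty collections below are placeholders only:

    from target_extraction.data_types import TargetTextCollection

    # Placeholder collections: in practice these would be populated with
    # TargetText objects.
    train_data = TargetTextCollection()
    val_data = TargetTextCollection()
    test_data = TargetTextCollection()

    # Trains the model defined by model_param_fp; the test data is only used
    # to build the vocabulary, never for training.
    model.fit(train_data, val_data, test_data)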
load(cuda_device=-1)[source]¶

Loads the model. This does not require you to train the model if the save_dir attribute is pointing to a folder containing a trained model. This is just a wrapper around the load_archive function.

- Parameters
  - cuda_device (int) – Whether the loaded model should be loaded on to the CPU (-1) or the GPU (0). Default CPU.
- Return type
  Model
- Returns
  The model that was saved at self.save_dir
- Raises
  - AssertionError – If the save_dir argument is None
  - FileNotFoundError – If the save directory does not exist.
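For example, to re-use a model that has already been trained and saved to save_dir (a sketch, continuing from the constructor example above):

    # Load the trained model onto the CPU (-1); pass 0 to place it on the GPU.
    loaded_model = model.load(cuda_device=-1)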
predict_into_collection(collection, key_mapping, batch_size=None, append_if_exists=True)[source]¶

- Parameters
  - collection (TargetTextCollection) – The TargetTextCollection that is to be predicted on and to be the store of the predicted data.
  - key_mapping (Dict[str, str]) – Dictionary mapping the prediction keys that contain the prediction values to the keys that will store those prediction values within the collection that has been predicted on.
  - batch_size (Optional[int]) – Specify the batch size to predict on. If left None, defaults to 64 unless it is specified in the model_param_fp within the constructor, in which case the batch size from the param file is used.
  - append_if_exists (bool) – If False and a TargetText within the collection already has a prediction within the given key based on the key_mapping, then a KeyError is raised.
- Return type
  TargetTextCollection
- Returns
  The collection that was predicted on, with the new predictions stored in keys that are the values of the key_mapping argument. Note that all predictions are stored within Lists within their respective keys in the collection.
- Raises
  - KeyError – If the keys from key_mapping are not within the prediction dictionary.
  - KeyError – If append_if_exists is False and a TargetText within the collection already has a prediction within the given key based on the key_mapping.
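A sketch of storing predictions back into a collection; the key names in key_mapping are hypothetical and must match what the underlying predictor actually outputs:

    # Map the predictor's output key (left) to the key that will store the
    # predictions in the collection (right). Both key names are assumptions.
    key_mapping = {'sentiments': 'predicted_sentiments'}
    val_data = model.predict_into_collection(val_data, key_mapping,
                                             batch_size=32,
                                             append_if_exists=False)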
predict_sequences(data, batch_size=None)[source]¶

Given the data, it will predict the sequence labels and return the confidence scores for those labels as well as the words and text the prediction was made on.

- Parameters
  - data (Union[Iterable[Dict[str, Any]], List[Dict[str, Any]]]) – Iterable or list of dictionaries that contain at least a text key and value. If you do not want the predictor to do the tokenization then provide tokens as well. Some models may also expect pos_tags, which the predictor will provide if only the text key is given.
  - batch_size (Optional[int]) – Specify the batch size to predict on. If left None, defaults to 64 unless it is specified in the model_param_fp within the constructor, in which case the batch size from the param file is used.
- Yields
  A dictionary containing all of the following keys and values:
  1. sequence_labels: A list of predicted sequence labels. This will be a List of Strings.
  2. confidence: The confidence the model had in predicting each sequence label, this comes from the softmax score. This will be a List of floats.
  3. tokens: The tokens that the confidence and sequence labels are associated to.
  4. text: The text that the tokens/words relate to.
- Return type
  Iterable[Dict[str, Any]]
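A sketch of iterating over the yielded predictions for raw text input; only the text key is supplied here, so the predictor performs the tokenization itself (the example sentence is illustrative):

    data = [{'text': 'The laptop case was great and cover was rubbish'}]
    for prediction in model.predict_sequences(data, batch_size=16):
        print(prediction['sequence_labels'])  # e.g. BIO labels, one per token
        print(prediction['confidence'])       # softmax score per predicted label
        print(prediction['tokens'])           # tokens the labels relate to
        print(prediction['text'])             # original text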