
Trouble saving models #147

Closed
mbw314 opened this issue Jul 23, 2019 · 2 comments
mbw314 commented Jul 23, 2019

I'm trying to export a model trained on a dataset with the following features:
#users: 700K
#user features: IDs only
#items: 30K
#item features: IDs only
#interactions: 56M

The model uses the default architecture except for the loss function (WMRB), and was trained for only 5 epochs.

I've tried to export the model in two ways (both attempts are sketched below):

  1. Using TensorRec.save_model. The problem here is that the serialized graph is apparently enormous, since I get this error: InvalidArgumentError: Cannot serialize protocol buffer of type tensorflow.GraphDef as the serialized size (11399300521 bytes) would be larger than the limit (2147483647 bytes)
  2. Pickling the model directly. The cause of this failure is less clear to me, but a Google search turns up many similar issues. The error is TypeError: can't pickle _thread.RLock objects
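
For reference, a minimal sketch of both attempts (the save_model argument here is an assumption about its signature):

```python
import pickle
from tensorrec import TensorRec

model = TensorRec()  # stand-in for the trained model described above

# Attempt 1: fails once the serialized GraphDef exceeds the 2 GB protobuf limit.
model.save_model('/tmp/tensorrec_model')  # directory argument assumed for illustration

# Attempt 2: fails because the model holds unpicklable objects
# (the TensorFlow session/graph machinery contains a _thread.RLock).
with open('/tmp/tensorrec_model.pkl', 'wb') as f:
    pickle.dump(model, f)
```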

Any advice on how to get around these issues would be greatly appreciated!

mbw314 commented Jul 25, 2019

To add a bit more context, finding a way to use TensorRec.save_model would be great, but I'm ultimately interested in obtaining a TensorFlow SavedModel representation of the model to use for scoring (i.e., the equivalent of calling predict).

One option is calling the tf.saved_model.simple_save function. Similar to #142, this requires one to specify the input and output nodes of the graph to be saved. A comment there points to placeholders in TensorRec._build_tf_graph for training hyper-parameters and user/item feature iterators for input, and to the predict function for output (presumably self.tf_prediction).

As mentioned in another comment to #142, the hyper-parameter placeholders seem to be irrelevant inputs for prediction, but I'm not sure how to specify the input nodes for user and item features. E.g., using self.tf_user_feature_iterator for user input fails: AttributeError: 'Iterator' object has no attribute 'dtype'.
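
For concreteness, here is a minimal, self-contained sketch of a simple_save export, with a toy dot-product graph standing in for TensorRec's prediction subgraph (all tensor names are illustrative; for the real model the inputs would be its feature placeholders and the output would be self.tf_prediction):

```python
import tensorflow as tf  # TF 1.x API

# Toy stand-in for the prediction subgraph: user/item representations in,
# dot-product scores out.
user_repr = tf.placeholder(tf.float32, shape=[None, 10], name='user_repr')
item_repr = tf.placeholder(tf.float32, shape=[None, 10], name='item_repr')
predictions = tf.matmul(user_repr, item_repr, transpose_b=True, name='predictions')

with tf.Session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_dir='/tmp/toy_savedmodel',
        inputs={'user_repr': user_repr, 'item_repr': item_repr},
        outputs={'predictions': predictions},
    )
```

The exported directory can then be inspected with saved_model_cli show --dir /tmp/toy_savedmodel --all.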

Does anyone know how to do this or anything similar?

mbw314 commented Aug 6, 2019

A quick follow-up:

The huge model size seems to be a result of how TensorFlow handles numpy data; see https://www.tensorflow.org/guide/datasets#consuming_numpy_arrays

I had been using training data in sparse matrix format, which gets converted to numpy arrays here:

def create_tensorrec_dataset_from_sparse_matrix(sparse_matrix):
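
To illustrate the failure mode (an illustrative sketch, not TensorRec's actual code): building a Dataset directly from a large in-memory array embeds the array as a constant in the GraphDef, so the serialized graph grows with the data:

```python
import numpy as np
import tensorflow as tf  # TF 1.x API

# Dense interaction data (scaled down here; the real dataset had 56M rows).
interactions = np.random.rand(1_000_000, 2).astype(np.float32)

# from_tensor_slices embeds `interactions` as a tf.constant in the graph,
# so the GraphDef itself carries the data -- at full scale this is what
# pushes the serialized size past the 2 GB protobuf limit.
dataset = tf.data.Dataset.from_tensor_slices(interactions)
```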

One way around this is to save the training data in TFRecord format first (sketched below). One caveat: it is easy to run out of memory during training unless batching is enabled, and since the user_batch_size parameter only applies to sparse matrix input, I had to handle batching directly, which was not difficult.
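
A minimal sketch of that workaround, assuming interaction triples of (user ID, item ID, value); the feature names and batch size are illustrative:

```python
import numpy as np
import tensorflow as tf  # TF 1.x API

def write_interactions(path, users, items, values):
    """Serialize (user, item, value) interaction triples to a TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for u, i, v in zip(users, items, values):
            example = tf.train.Example(features=tf.train.Features(feature={
                'user': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(u)])),
                'item': tf.train.Feature(int64_list=tf.train.Int64List(value=[int(i)])),
                'value': tf.train.Feature(float_list=tf.train.FloatList(value=[float(v)])),
            }))
            writer.write(example.SerializeToString())

def parse_interaction(record):
    """Parse one serialized Example back into a feature dict."""
    spec = {
        'user': tf.io.FixedLenFeature([], tf.int64),
        'item': tf.io.FixedLenFeature([], tf.int64),
        'value': tf.io.FixedLenFeature([], tf.float32),
    }
    return tf.io.parse_single_example(record, spec)

write_interactions('interactions.tfrecord',
                   users=np.array([0, 1, 2]),
                   items=np.array([10, 20, 30]),
                   values=np.array([1.0, 0.0, 1.0]))

# Reading from the file keeps the data out of the GraphDef, and batching
# keeps memory bounded during training.
dataset = (tf.data.TFRecordDataset('interactions.tfrecord')
           .map(parse_interaction)
           .batch(10000))
```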

As for obtaining a SavedModel, the method described in a comment on #138 worked fine.

@mbw314 mbw314 closed this as completed Aug 6, 2019