Getting the Current Epoch from the Input Pipeline
I am using tf.train.string_input_producer with an epoch limit to feed data into my model. How can I get the current epoch of this op during training?
I noticed that there are some nodes in the graph associated with this operator, one of which contains the epoch limit, but I cannot find where the actual current value is stored. Surely this can be traced somewhere?
More generally, how can I keep track of the current epoch in the TFRecords pipeline?
I couldn't find this anywhere in TF.
My solution was to track it manually: iterate over batches indefinitely and run my nodes as many times as I wanted. The number of steps in one epoch is determined in advance by counting the elements in the dataset and dividing by the batch size.
This is made easier in a recent TF release using tensorflow.contrib.data.TFRecordDataset:
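from tensorflow.contrib.data import TFRecordDataset
d = TFRecordDataset('some_filename.tfrecords')
d = d.map(function_which_parses_your_protobuf_format)
d = d.repeat()  # repeat indefinitely; epochs are counted manually
d = d.shuffle(buffer_size=10000)  # buffer_size is required; 10000 is only an example value
d = d.batch(batch_size)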
Then you can determine the size of your dataset by counting its records:
record_count = sum(1 for _ in tf.python_io.tf_record_iterator('some_filename.tfrecords'))
It seems like more work, but it is more flexible: you can use caching, for example, so you don't have to preprocess your dataset in advance and can keep the original dataset untouched in a TFRecord file.
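As a rough sketch of the bookkeeping described above, the current epoch can be derived from the number of batches consumed so far. The helper names steps_per_epoch and current_epoch and the numbers below are made up for illustration; they are not part of TensorFlow. Note that when shuffle() comes after repeat(), records from neighbouring epochs can mix near the boundary, so the result is an approximation.

```python
import math


def steps_per_epoch(record_count, batch_size):
    """Number of batches needed to see every record at least once."""
    return math.ceil(record_count / batch_size)


def current_epoch(batches_consumed, record_count, batch_size):
    """Zero-based epoch, derived from how many records have been drawn so far."""
    return (batches_consumed * batch_size) // record_count


# Illustrative values only; in practice record_count comes from counting
# the records in the TFRecord file and step from your training loop.
record_count = 1000
batch_size = 32

for step in (0, 31, 32, 100):
    print("step", step, "epoch", current_epoch(step, record_count, batch_size))
```

In a training loop you would increment a step counter after each batch and call current_epoch(step, record_count, batch_size) whenever you want to log or act on the epoch.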