Creating `input_fn` from an iterator

Most of the tutorials focus on the case where the entire training set fits into memory. However, I have an iterator that acts as an endless stream of (feature, label) pairs, generated cheaply on the fly.

When implementing `input_fn` for TensorFlow's Estimator, can I return the next batch from the iterator, like this:

def input_fn():
    feature_batch, label_batch = next(it)
    return tf.constant(feature_batch), tf.constant(label_batch)

Or should `input_fn` return the same (features, labels) pair on every call?

Also, is this function called multiple times during training? I hope it works like the following pseudocode:

for i in range(max_iter):
    learn_op(input_fn())
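For concreteness, my iterator behaves roughly like this sketch (the data here is synthetic; `batch_iterator` and the shapes are made up for illustration):

```python
# A minimal sketch of an endless (feature_batch, label_batch) iterator.
# The data is synthetic; in practice each batch would be generated
# cheaply on the fly from your own source.
import numpy as np

def batch_iterator(batch_size=4, num_features=3):
    while True:
        features = np.random.rand(batch_size, num_features).astype(np.float32)
        labels = (features.sum(axis=1) > num_features / 2).astype(np.int64)
        yield features, labels

it = batch_iterator()
feature_batch, label_batch = next(it)
```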



2 answers


The `input_fn` argument is used during training, but the function itself is called only once. So creating a sophisticated `input_fn` that goes beyond returning a constant array, as described in the tutorial, is not that easy.

TensorFlow offers two examples of such a non-trivial `input_fn`, for numpy and pandas data, but they start from an in-memory array, so this won't help with your problem.

You can also take a look at their code via the links above to see how they implement an efficient non-trivial `input_fn`, but you may find that it requires more code than you would like.



If you are willing to use the lower-level TensorFlow interface, things are simpler and more flexible. There is a tutorial that covers most needs, and the suggested solutions are simpler to implement.

In particular, if you already have an iterator that returns data as you described in your question, using placeholders (the "Feeding" section of the tutorial linked above) should be straightforward.
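A minimal sketch of that feed approach, assuming a TF 1.x-style graph (written here against `tf.compat.v1` so it also runs under TF 2.x); `batch_iterator` and the toy linear model are stand-ins for your own iterator and model, not part of the original answer:

```python
# Feed approach: pull a batch from the Python iterator each step and
# feed it into placeholders. Assumes the TF 1.x graph API via tf.compat.v1.
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

def batch_iterator(batch_size=4, num_features=3):
    # Stand-in for your own endless (features, labels) iterator.
    while True:
        x = np.random.rand(batch_size, num_features).astype(np.float32)
        y = x.sum(axis=1, keepdims=True).astype(np.float32)
        yield x, y

it = batch_iterator()

features_ph = tf1.placeholder(tf.float32, shape=[None, 3])
labels_ph = tf1.placeholder(tf.float32, shape=[None, 1])

# Toy linear model, purely for illustration.
w = tf.Variable(tf.zeros([3, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(features_ph, w) - labels_ph))
train_op = tf1.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    for _ in range(100):
        x, y = next(it)  # fresh batch from the iterator every step
        _, loss_value = sess.run([train_op, loss],
                                 feed_dict={features_ph: x, labels_ph: y})
```

The key point is that the graph is built once, while the iterator is consumed inside the training loop, one `feed_dict` per step.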



I found a pull request that converts a generator into an `input_fn`: https://github.com/tensorflow/tensorflow/pull/7045/files

The relevant part:



  def _generator_input_fn():
    """generator input function."""
    queue = feeding_functions.enqueue_data(
      x,
      queue_capacity,
      shuffle=shuffle,
      num_threads=num_threads,
      enqueue_size=batch_size,
      num_epochs=num_epochs)

    features = (queue.dequeue_many(batch_size) if num_epochs is None
                else queue.dequeue_up_to(batch_size))
    if not isinstance(features, list):
      features = [features]
    features = dict(zip(input_keys, features))
    if target_key is not None:
      if len(target_key) > 1:
        target = {key: features.pop(key) for key in target_key}
      else:
        target = features.pop(target_key[0])
      return features, target
    return features
  return _generator_input_fn
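To see what the feature/target bookkeeping at the end of that snippet does, here is the same logic in plain Python, with made-up keys and values:

```python
# Mimics the features/target handling in the PR snippet above: dequeued
# columns are zipped with their keys into a dict, then the target
# column(s) are popped out of it. All names and values here are invented.
import numpy as np

input_keys = ["x1", "x2", "label"]
columns = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([0, 1])]

features = dict(zip(input_keys, columns))
target_key = ["label"]
if len(target_key) > 1:
    target = {key: features.pop(key) for key in target_key}
else:
    target = features.pop(target_key[0])
# features now holds only "x1" and "x2"; target is the "label" column
```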








