Apache Storm: Tracking Tuples by Unique ID from Source Spout to Final Bolt

I need a method to uniquely identify tuples across the entire Storm topology so that each tuple can be traced back from Spout to the final Bolt.

As I understand it, this is done by passing a unique message ID when emitting from a spout, like this:

String msgID = UUID.randomUUID().toString();
// emit a line from the user tasks, with msgID as the message ID
outputCollector.emit(new Values(task), msgID);

      

This identifier is handed back to the spout when ack or fail is called on it (could it be made available earlier, so that the passed identifier can be retrieved at any time?). But reading the message ID on a received tuple like this:

inputTuple.getMessageId()

      

returns a new MessageId object generated by Storm for the tuple, not the ID that was passed in the spout. See: https://groups.google.com/forum/#!topic/storm-user/xBEqMDa-RZs

Questions

1) Is there a way to get the value of tuple.getMessageId() at the point where the collector emits the tuple?

2) Alternatively, is it possible to retrieve the message ID that was passed in the spout from a tuple in any spout or bolt in the topology?

End goal: I want to be able to set an ID on a tuple when it is emitted, and then be able to identify that tuple again at any point in the Storm topology.

Or does the unique ID that my system keeps track of have to be passed as a field/value on every emit of every spout and bolt?

Thanks



2 answers


It is not possible to access the system-generated identifiers from the producer side (only from the consumer, via tuple.getMessageId()). To track tuples the way you want, you need to do what you suggested yourself: add the identifier as a regular field of the tuple, and in each bolt copy it into the corresponding output tuple(s).
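A minimal sketch of that pattern, written in plain Java so it runs without Storm on the classpath. TracedTuple, spoutEmit, splitBolt and upperCaseBolt are hypothetical stand-ins, not Storm classes. In a real topology the ID would presumably be one more declared field on the spout's output, read in each bolt with something like tuple.getStringByField("traceId") and put back into the Values that the bolt emits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

class TraceIdSketch {
    // Hypothetical stand-in for a Storm tuple declared with fields ("traceId", "payload").
    static final class TracedTuple {
        final String traceId;
        final String payload;

        TracedTuple(String traceId, String payload) {
            this.traceId = traceId;
            this.payload = payload;
        }
    }

    // Spout side: create the ID once and emit it as an ordinary field value.
    static TracedTuple spoutEmit(String task) {
        return new TracedTuple(UUID.randomUUID().toString(), task);
    }

    // Bolt with several outputs per input: every output carries the input's traceId.
    static List<TracedTuple> splitBolt(TracedTuple input) {
        List<TracedTuple> outputs = new ArrayList<>();
        for (String word : input.payload.split(" ")) {
            outputs.add(new TracedTuple(input.traceId, word));
        }
        return outputs;
    }

    // Final bolt: transforms the payload but still sees the spout's traceId.
    static TracedTuple upperCaseBolt(TracedTuple input) {
        return new TracedTuple(input.traceId, input.payload.toUpperCase());
    }

    public static void main(String[] args) {
        TracedTuple fromSpout = spoutEmit("read user tasks");
        for (TracedTuple split : splitBolt(fromSpout)) {
            TracedTuple last = upperCaseBolt(split);
            System.out.println(last.traceId + " -> " + last.payload);
        }
    }
}
```

The price is that every bolt has to forward the field explicitly; a bolt that forgets to copy it breaks the trace for everything downstream of it.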





There are several parts to this answer. First, as you correctly point out, it is up to you to create a unique ID in your spout for every tuple you emit. Second, if you want to access this ID anywhere in your topology, add it as a field of the tuple the spout emits. Third (just for completeness), if there is anything in the emitted tuple that you need to know when handling an ack or fail in your spout, include that information in the composite value that makes up your message ID.

As an example, I usually use the Tuple itself as the message id when emitting a tuple from a spout:



outputCollector.emit(myTuple, myTuple);

      

It might be overkill, but at least I have access to all the information in the tuple everywhere.
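If the whole tuple feels like too much, a smaller composite message ID works the same way. Here is a rough sketch in plain Java, runnable without Storm. TaskMessageId, ReplayingSpout and MAX_ATTEMPTS are hypothetical names, and the retry policy is only an assumption for illustration. The ack and fail methods mimic the spout callbacks, which receive back the same message ID object that was passed to emit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CompositeMessageIdSketch {
    // Hypothetical composite message ID: the trace ID plus what the spout needs for a replay.
    static final class TaskMessageId {
        final String traceId;
        final String task;
        final int attempt;

        TaskMessageId(String traceId, String task, int attempt) {
            this.traceId = traceId;
            this.task = task;
            this.attempt = attempt;
        }
    }

    // Stand-in for the spout side of ack/fail handling (not a Storm class).
    static final class ReplayingSpout {
        static final int MAX_ATTEMPTS = 3; // assumed retry limit
        final Deque<TaskMessageId> replayQueue = new ArrayDeque<>();
        int acked = 0;

        // Called with the message ID once the tuple tree has been fully processed.
        void ack(Object msgId) {
            acked++;
        }

        // Called with the message ID on failure; the ID alone is enough to re-emit.
        void fail(Object msgId) {
            TaskMessageId id = (TaskMessageId) msgId;
            if (id.attempt < MAX_ATTEMPTS) {
                replayQueue.add(new TaskMessageId(id.traceId, id.task, id.attempt + 1));
            }
        }
    }

    public static void main(String[] args) {
        ReplayingSpout spout = new ReplayingSpout();
        spout.fail(new TaskMessageId("trace-1", "read user tasks", 1));
        TaskMessageId retry = spout.replayQueue.peek();
        System.out.println(retry.traceId + " attempt " + retry.attempt);
    }
}
```

Because the replayed ID keeps the original traceId, a retried tuple can still be matched to its first attempt when the same value is also emitted as a field.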
