Erlang consumer queue

I have a problem where I want to pull discrete chunks of data from disk into a queue and consume them in another process. The data is scattered randomly on disk, so it won't benefit much from sequential reads. There is a lot of data, so I can't load it all at once, and it isn't efficient to pull it out one block at a time either.

I'd like the consumer to run at its own pace, but to keep a healthy queue of data ready for it, so that I'm not constantly waiting on disk reads as I process chunks.

Is there an established way to do this, e.g. with a framework like jobs or safetyvalve? Implementing it myself feels like reinventing the wheel, since a slow consumer working through disk-backed data is a common problem.

Any suggestions on how best to tackle this in Erlang?

+3




1 answer


You can use the {read_ahead, Size} option to file:open/2:

{read_ahead, Size}

This option activates buffering of read data. If calls to read/2 are for significantly less than Size bytes, read operations against the operating system are still performed for blocks of Size bytes. The extra data is buffered and returned in subsequent calls to read/2, giving a performance boost since the number of operating system calls is reduced.

The read_ahead buffer is also highly used by the read_line/1 function in raw mode, which is why this option is recommended (for performance reasons) when accessing raw files using that function.

If calls to read/2 are for sizes not significantly less than, or even greater than, Size bytes, no performance gain can be expected.
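As a concrete illustration, here is a minimal sketch of opening a file with read_ahead and consuming it in small chunks. The chunk size (4 KB) and buffer size (1 MB) are illustrative choices, and process_chunk/1 is a hypothetical consumer callback, not part of the standard library:

```erlang
%% Open the file in raw binary mode with a 1 MB read-ahead buffer,
%% then read it in 4 KB chunks. Most read/2 calls are served from
%% the buffer, so the OS is only hit roughly once per megabyte.
read_chunks(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary, {read_ahead, 1024 * 1024}]),
    loop(Fd).

loop(Fd) ->
    case file:read(Fd, 4096) of
        {ok, Chunk} ->
            process_chunk(Chunk),   %% hypothetical consumer callback
            loop(Fd);
        eof ->
            file:close(Fd)
    end.
```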



You were vague about the sizes involved, but tuning this buffer size should give you a decent start towards implementing what you need.
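If you also want the decoupled producer/consumer shape described in the question, one minimal sketch (assuming nothing beyond the standard file module; start_reader/2, next_chunk/1, and the credit-based flow control are illustrative names, not an established library API) is a reader process that stays at most Depth chunks ahead of the consumer:

```erlang
%% A reader process prefetches chunks and sends them to the consumer,
%% but only keeps up to Depth unacknowledged chunks in flight, so a
%% slow consumer applies back-pressure instead of flooding its mailbox.
start_reader(Path, Depth) ->
    Consumer = self(),
    spawn_link(fun() ->
        {ok, Fd} = file:open(Path, [read, raw, binary, {read_ahead, 1 bsl 20}]),
        reader_loop(Fd, Consumer, Depth)
    end).

reader_loop(Fd, Consumer, Credits) when Credits > 0 ->
    case file:read(Fd, 4096) of
        {ok, Chunk} ->
            Consumer ! {chunk, self(), Chunk},
            reader_loop(Fd, Consumer, Credits - 1);
        eof ->
            Consumer ! {eof, self()},
            file:close(Fd)
    end;
reader_loop(Fd, Consumer, 0) ->
    %% Out of credits: wait until the consumer acknowledges a chunk.
    receive ack -> reader_loop(Fd, Consumer, 1) end.

%% Called by the consumer at its own pace.
next_chunk(Reader) ->
    receive
        {chunk, Reader, Chunk} ->
            Reader ! ack,           %% hand a credit back to the reader
            {ok, Chunk};
        {eof, Reader} ->
            eof
    end.
```

Libraries like jobs or safetyvalve give you more general, configurable versions of this kind of flow control, but for a single file reader a small credit loop like this is often enough.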

+1








