Using pre-downsampled data when plotting large time series in PyQtGraph
I need to plot a large time series in PyQtGraph (millions of points). Plotting it as-is is nearly impossible, and even with the optimization options enabled (downsampling via setDownsampling and clipping via setClipToView) it is still barely usable when zoomed out; it only becomes fast once you zoom in, thanks to clipping.
I have an idea. Since the data is static, I could downsample it beforehand. Then I could use the cached downsampled data when zoomed out and the raw data when zoomed in.
How can I achieve this?
@Three_pineapples' answer describes a really good improvement over the default downsampling in PyQtGraph, but still requires downsampling on the fly, which is problematic in my case.
So I decided to implement a different strategy: pre-downsample the data, then select either the downsampled data or the raw data depending on the "zoom level".
I combine this approach with the default automatic downsampling built into PyQtGraph to get a further speed-up (which could be improved even more with @three_pineapples' suggestions).
This way, PyQtGraph always starts with much smaller data, making zooming and panning instantaneous even with really high sample counts.
My approach is summarized in this code, which monkey-patches the getData method of PlotDataItem.
import types
from pyqtgraph import PlotDataItem

# `data` is an (N, 2) NumPy array of evenly spaced [x, y] samples and
# `plot_data_item` is an existing PlotDataItem; `downsample` is your own helper.
# Downsample the static data once, up front
downsampled_data = downsample(data, 100)

# Replacement for the default getData function
def getData(obj):
    # Calculate the visible range
    view_rect = obj.viewRect()
    if view_rect is not None:
        # Sample spacing along x, used to convert the visible range to sample indices
        dx = float(data[-1, 0] - data[0, 0]) / (data.shape[0] - 1)
        x0 = (view_rect.left() - data[0, 0]) / dx
        x1 = (view_rect.right() - data[0, 0]) / dx
        # Decide whether to use downsampled or original data
        if (x1 - x0) > 20000:
            obj.xData = downsampled_data[:, 0]
            obj.yData = downsampled_data[:, 1]
        else:
            obj.xData = data[:, 0]
            obj.yData = data[:, 1]
    # Run the original getData of PlotDataItem
    return PlotDataItem.getData(obj)

# Replace the original getData with our getData
plot_data_item.getData = types.MethodType(getData, plot_data_item)
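The code above calls a downsample helper that the answer does not show. As a minimal sketch only, assuming data is an (N, 2) NumPy array of [x, y] rows, one possible stand-in averages consecutive blocks of rows. The function body here is an assumption for illustration, not the author's implementation, and block averaging smooths out narrow spikes.

```python
import numpy as np

def downsample(data, factor):
    """Reduce an (N, 2) array of [x, y] rows by averaging blocks of `factor` rows.

    Hypothetical helper: the original answer does not show its implementation.
    """
    n_blocks = data.shape[0] // factor
    if n_blocks == 0:
        # Fewer rows than one block: nothing to reduce
        return data.copy()
    # Drop the incomplete trailing block so the array reshapes cleanly
    trimmed = data[:n_blocks * factor]
    # Shape (n_blocks, factor, 2), then average the rows of each block
    return trimmed.reshape(n_blocks, factor, 2).mean(axis=1)

# Example: one million samples reduced by a factor of 100
x = np.linspace(0.0, 10.0, 1_000_000)
data = np.column_stack((x, np.sin(x)))
downsampled_data = downsample(data, 100)
```

With a factor of 100, the million-row example array shrinks to 10,000 rows, which is the array the patched getData hands to PyQtGraph when zoomed out.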
I did something similar in a project I work on called runviewer. The general idea is to resample the data whenever the x-range of the plot changes. The method we use is roughly:
- Connect a method to the sigXRangeChanged signal of the PlotWidget that sets a boolean flag indicating that the data should be resampled.
- Run a thread that checks the boolean flag every x seconds (we chose 0.5 seconds) to see whether the data needs to be resampled. If so, the data is resampled using an algorithm of your choice (we wrote our own in C). The result is then sent back to the main thread (for example, use a QThread and emit a signal back to the main thread), where pyqtgraph is called to update the data in the graph (note that you can only call pyqtgraph methods from the main thread!).
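The two steps above can be sketched without any GUI code. This is a minimal illustration using only the standard threading module, not runviewer's actual implementation: the class name PollingResampler, its methods, and the resample and on_result callables are all hypothetical. In a Qt application, request would be the slot connected to the range-changed signal, and on_result would emit a signal so that the plot update runs on the main thread.

```python
import threading
import time

class PollingResampler:
    """Hypothetical worker: polls a 'dirty' flag and resamples off the main thread."""

    def __init__(self, resample, on_result, poll_interval=0.5):
        self._resample = resample        # slow function: x_range -> resampled data
        self._on_result = on_result      # called with the result (emit a Qt signal here)
        self._poll_interval = poll_interval
        self._dirty = False
        self._x_range = None
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def request(self, x_range):
        """Cheap handler for range changes: record the range and set the flag."""
        with self._lock:
            self._x_range = x_range
            self._dirty = True

    def _run(self):
        # wait() returns False on timeout, True once stop() has been called
        while not self._stop.wait(self._poll_interval):
            with self._lock:
                if not self._dirty:
                    continue
                # Clear the flag before resampling so that later requests
                # are picked up on the next poll
                self._dirty = False
                x_range = self._x_range
            self._on_result(self._resample(x_range))

# Demo: 100 rapid range changes lead to far fewer than 100 resampling calls
results = []
worker = PollingResampler(resample=lambda r: r, on_result=results.append,
                          poll_interval=0.05)
worker.start()
for i in range(100):
    worker.request((0, i))
time.sleep(0.3)
worker.stop()
```

Because request only stores the latest range, a burst of zoom events collapses into one resampling pass per poll, and the last result always reflects the most recent range.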
We use a boolean flag to decouple x-range change events from resampling. You don't want to resample every time the x-range changes, because the signal fires repeatedly while zooming with the mouse, and you don't want to build up a queue of resampling calls, since resampling is slow even in C.
You also need to make sure that the resampling thread sets the boolean flag back to False as soon as it sees it is True, and only then runs the resampling algorithm. That way, x-range change events that arrive during the current resampling trigger another resampling afterwards.
You could probably also improve this by replacing the flag polling with some sort of threading event or condition.
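As a rough sketch of that idea, the worker below blocks on a threading.Event instead of polling. The class name EventResampler and its callables are hypothetical, and this is an untuned illustration rather than what runviewer does. Note that without the polling delay nothing limits how often resampling runs, other than the time each pass takes.

```python
import threading
import time

class EventResampler:
    """Hypothetical worker that sleeps until a range change wakes it up."""

    def __init__(self, resample, on_result):
        self._resample = resample
        self._on_result = on_result
        self._x_range = None
        self._lock = threading.Lock()
        self._wake = threading.Event()
        self._stopping = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stopping = True
        self._wake.set()      # wake the worker so it can exit
        self._thread.join()

    def request(self, x_range):
        with self._lock:
            self._x_range = x_range
        self._wake.set()

    def _run(self):
        while True:
            self._wake.wait()
            # Clear before resampling so a request arriving meanwhile
            # sets the event again and triggers another pass
            self._wake.clear()
            if self._stopping:
                return
            with self._lock:
                x_range = self._x_range
            self._on_result(self._resample(x_range))

# Demo: a burst of requests, then a short pause before shutting down
results = []
worker = EventResampler(resample=lambda r: r, on_result=results.append)
worker.start()
for i in range(100):
    worker.request((0, i))
time.sleep(0.2)
worker.stop()
```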
Note that resampling in pure Python is very slow, which is why we decided to write the resampling algorithm in C and call it from Python. numpy is mostly written in C, so it would be fast. However, I don't think it has a resampling algorithm that preserves features. Most resampling is just standard downsampling where you take every Nth point, but we wanted to still show features that are smaller than the sample interval after downsampling.
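To illustrate what feature preservation can mean here, one common technique is min/max downsampling: keep the smallest and largest sample of each block, so a narrow spike survives. The sketch below uses NumPy and is not the C algorithm from runviewer, which the answer does not show. The function name minmax_downsample is hypothetical.

```python
import numpy as np

def minmax_downsample(x, y, factor):
    """Keep the min and max sample of every block of `factor` points.

    Hypothetical helper; returns about 2 * (len(y) // factor) points.
    """
    n_blocks = len(y) // factor
    if n_blocks == 0:
        return x.copy(), y.copy()
    # Drop the incomplete trailing block and view the rest as rows of `factor`
    x_blocks = x[:n_blocks * factor].reshape(n_blocks, factor)
    y_blocks = y[:n_blocks * factor].reshape(n_blocks, factor)
    rows = np.arange(n_blocks)
    i_min = y_blocks.argmin(axis=1)
    i_max = y_blocks.argmax(axis=1)
    # Order the two picks by position so x never goes backwards
    first = np.minimum(i_min, i_max)
    second = np.maximum(i_min, i_max)
    x_out = np.column_stack((x_blocks[rows, first], x_blocks[rows, second])).ravel()
    y_out = np.column_stack((y_blocks[rows, first], y_blocks[rows, second])).ravel()
    return x_out, y_out

# A single-sample spike that plain "every Nth point" decimation misses
x = np.arange(1000, dtype=float)
y = np.zeros(1000)
y[537] = 5.0
xs, ys = minmax_downsample(x, y, 100)
strided = y[::100]
```

In this example the strided copy contains only zeros, while the min/max result still contains the spike at x = 537, at the cost of returning two points per block instead of one.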
Additional performance comments
I suspect part of the performance problem with pyqtgraph's built-in method is that the downsampling is done on the main thread, so it has to finish before the graph can respond to user input again. Our method avoids this. Our approach also limits how often downsampling happens: at most once every (time it takes to downsample + polling delay) seconds. So with the delay we use, we only downsample every 0.5-1 seconds, while the main thread (and thus the UI) stays responsive. This means the user may see coarsely sampled data if they zoom in quickly, but that is fixed within at most 2 resampling iterations (so no more than a 1-2 second delay). Also, because the fix takes a short while to arrive, the redraw with the newly sampled data often happens after the user has finished interacting with the UI, so they don't notice any unresponsiveness during the redraw.
Obviously the times I am quoting depend entirely on the resampling speed and the polling delay!