How do I store measurement points for a large area in a database so that they can be queried quickly?

Problem

I'm not sure how to store measurement data for large areas so that it can be quickly retrieved from the database.

More details

Measurement data

Measurement data consists of:

  • Longitude
  • Latitude
  • Signal strength
  • Transmitter ID

Each area has the potential to have multiple transmitters.

The data describing the signal from one transmitter is saved in a file. From one file I create about 2 million lines (points with signal strength). All data must be retained due to changes in signal strength.

For a single transmitter table, selecting the points within ±x meters of a given point (longitude, latitude) in any direction (← ↑ → ↓) takes approximately 0.5 s.

Problem

I need to show the signal at about 65,000 points in one request, so computing it this way would take far too long (65,000 × 0.5 s ≈ 9 hours).

What I've done

I decided to keep each transmitter in a separate table. In the main table I store only the coordinates of the bottom-left and top-right corners of each transmitter's area (this way I can determine which transmitter tables contain points near a given point, and select data only from those tables). However, the problem remains: each transmitter table still has about 2 million rows.

Main table in which all transmitters are stored:

╔════╦═══════════════════════╦═══════════════════════╦════════════════════════╦════════════════════════╗
║ id ║ left_lower_corner_lon ║ left_lower_corner_lat ║ right_upper_corner_lon ║ right_upper_corner_lat ║
╠════╬═══════════════════════╬═══════════════════════╬════════════════════════╬════════════════════════╣
║ 1  ║ 12                    ║ 48                    ║ 13                     ║ 49                     ║
║ 2  ║ 12.5                  ║ 48                    ║ 14                     ║ 50                     ║
╚════╩═══════════════════════╩═══════════════════════╩════════════════════════╩════════════════════════╝
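
A minimal DDL sketch of this main table, with a lookup query for the bounding-box test described above (column names are taken from the table; the types and the `transmitters` table name are my assumptions):

```sql
-- Sketch of the main table; DOUBLE is assumed sufficient
-- for degree coordinates at this precision.
CREATE TABLE transmitters (
    id                     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    left_lower_corner_lon  DOUBLE NOT NULL,
    left_lower_corner_lat  DOUBLE NOT NULL,
    right_upper_corner_lon DOUBLE NOT NULL,
    right_upper_corner_lat DOUBLE NOT NULL
);

-- Transmitters whose bounding box contains the point (@_lon, @_lat):
SELECT id FROM transmitters
WHERE @_lon BETWEEN left_lower_corner_lon AND right_upper_corner_lon
  AND @_lat BETWEEN left_lower_corner_lat AND right_upper_corner_lat;
```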

and now tables for single transmitters (i.e. transmitter_1):

╔═════════╦═════════╦═══════════╗
║ lon     ║ lat     ║ sig       ║
╠═════════╬═════════╬═══════════╣
║ 12.0002 ║ 48.0004 ║ -123.0000 ║
║ 12.0003 ║ 48.0004 ║ -124.0000 ║
╚═════════╩═════════╩═══════════╝

Now, to get the signals from all transmitters at a specific point, I first select the transmitter IDs from the main table, and then look up the nearest point in each transmitter table. But even a single point takes too long (0.5 s).

My request

# To test, I'm using these variables:
SET @_lon = 13.729520117164848;
SET @_lat = 51.126581079972624;

SELECT lat, lon, sig, SQRT(
    POW(69.1 * (lat - @_lat), 2) +
    POW(69.1 * (@_lon - lon) * COS(lat / 57.3), 2)) AS distance
FROM (SELECT * FROM `transmitter_1`
      WHERE lon <= @_lon + 0.00009 AND lon >= @_lon - 0.00009
        AND lat <= @_lat + 0.00009 AND lat >= @_lat - 0.00009) AS nearest_points
HAVING distance < 25
ORDER BY distance
LIMIT 1;


My ideas

My idea is to split each transmitter table further into several sub-tables, again storing the coordinates of the two corners for each part.

  • How do I find a compromise between the number of rows per table and the execution time from PHP?
  • Should I implement the whole chain in MySQL (as stored functions) rather than in PHP? Would that path take less than 0.005 s per point?
    • Select the transmitter table whose bounding box contains the point
    • Then select the sub-table of that transmitter that contains the point
    • Finally, pick the closest point in a MySQL function
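
An alternative worth benchmarking alongside those ideas (a sketch of my own, not part of the question): MySQL (5.7+ on InnoDB) can index `POINT` geometries with a SPATIAL (R-tree) index, which performs the bounding-box pruning internally. The table and column names here are hypothetical:

```sql
-- Hypothetical spatial variant of a transmitter table.
-- A SPATIAL index requires the geometry column to be NOT NULL.
CREATE TABLE transmitter_1_geo (
    pt  POINT NOT NULL,
    sig DOUBLE NOT NULL,
    SPATIAL INDEX (pt)
) ENGINE=InnoDB;

-- Coarse prefilter via the R-tree, then exact spherical distance in meters:
SET @box = ST_GeomFromText(CONCAT('POLYGON((',
    @_lon-0.0003, ' ', @_lat-0.0003, ',', @_lon+0.0003, ' ', @_lat-0.0003, ',',
    @_lon+0.0003, ' ', @_lat+0.0003, ',', @_lon-0.0003, ' ', @_lat+0.0003, ',',
    @_lon-0.0003, ' ', @_lat-0.0003, '))'));

SELECT sig, ST_Distance_Sphere(pt, POINT(@_lon, @_lat)) AS dist_m
FROM transmitter_1_geo
WHERE MBRContains(@box, pt)   -- this predicate uses the SPATIAL index
ORDER BY dist_m
LIMIT 1;
```

This keeps everything in one table per transmitter while letting the index, rather than application logic, do the geometric filtering.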


1 answer


Every time a field appears in a WHERE clause, make sure that field is indexed. You can also experiment with multi-column indexes (if you always filter on lon and lat, put both fields in one index).

That's where the speed comes from: I've seen proper indexing cut queries from hours (literally) to seconds.

With tables of millions of rows you can also experiment with segmenting the data into separate tables or separate databases, and look into your database's partitioning features.
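
As a concrete sketch of that last idea (my assumption, not part of the answer): MySQL's RANGE partitioning needs an integer expression, so one option is to derive an integer latitude band and partition on it. The band width and names here are hypothetical:

```sql
-- Hypothetical: 0.25-degree latitude bands as an integer partition key.
ALTER TABLE transmitter_1 ADD COLUMN lat_band INT NOT NULL DEFAULT 0;
UPDATE transmitter_1 SET lat_band = FLOOR(lat * 4);

ALTER TABLE transmitter_1
PARTITION BY RANGE (lat_band) (
    PARTITION p192 VALUES LESS THAN (193),   -- lat < 48.25
    PARTITION p193 VALUES LESS THAN (194),   -- lat < 48.50
    PARTITION pmax VALUES LESS THAN MAXVALUE
);
```

Queries that also filter on `lat_band` then only touch the matching partitions instead of the whole 2-million-row table.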



But before you go there: Index.

Run your query. Add an index on lon, then a separate index on lat (plain indexes, not unique or primary). Run it again. Then put lon and lat together in a single composite index and run the query once more. Compare the timings; you should see a significant reduction in response time.
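
The steps above, spelled out against the question's `transmitter_1` table (the index names are mine):

```sql
-- Step 1: single-column indexes, one per filtered field.
ALTER TABLE transmitter_1
    ADD INDEX idx_lon (lon),
    ADD INDEX idx_lat (lat);

-- Step 2: a composite index covering both filtered fields.
ALTER TABLE transmitter_1 ADD INDEX idx_lon_lat (lon, lat);

-- Verify which index the optimizer actually picks:
EXPLAIN SELECT * FROM transmitter_1
WHERE lon BETWEEN @_lon - 0.00009 AND @_lon + 0.00009
  AND lat BETWEEN @_lat - 0.00009 AND @_lat + 0.00009;
```

One caveat: with two range predicates, a B-tree composite index can only seek on its leading column, so `(lon, lat)` mostly narrows by lon; that is still usually far faster than a full table scan.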


