Is it really faster to query for primary keys first?

Jeff Atwood once wrote that querying the database for primary keys first, and then fetching all the corresponding fields with an IN clause, ran twice as fast as the equivalent single-query version.
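For readers who have not seen the pattern, here is a minimal sketch of the two variants being compared. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the posts table, its columns, and the function names are hypothetical and not taken from Jeff's post.

```python
import sqlite3

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO posts (score, body) VALUES (?, ?)",
    [(i % 10, f"body {i}") for i in range(100)],
)

def fetch_one_query(conn, min_score):
    """Single statement: filter and fetch the whole rows at once."""
    return conn.execute(
        "SELECT * FROM posts WHERE score >= ? ORDER BY id", (min_score,)
    ).fetchall()

def fetch_two_step(conn, min_score):
    """Step 1: fetch only the primary keys. Step 2: fetch the rows with IN."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM posts WHERE score >= ?", (min_score,)
    )]
    if not ids:
        return []
    # One placeholder per key keeps the second query parameterized.
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT * FROM posts WHERE id IN ({placeholders}) ORDER BY id", ids
    ).fetchall()

# Both variants should return the same rows; only the access pattern differs.
print(fetch_one_query(conn, 8) == fetch_two_step(conn, 8))
```

Whether the second form is faster is exactly what the question asks, so the sketch only shows that the two are equivalent in result.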

I wonder if this applies to all situations, and if not, what are the cases where it still provides significant room for improvement in terms of performance?

Also, how expensive is it to access the db through a scripting language library? I'm mainly talking about the very famous PHP-MySQL combination.

+2




3 answers


Jeff Atwood is talking about SQL Server, not MySQL. SQL optimizations are known to depend on the DBMS, configuration, query, data and cache state. Beyond the fact that selecting only the primary key fields will be at least as fast as selecting the entire row, it is difficult to generalize, and certainly difficult to generalize to an extent that would be useful. You will need to benchmark your specific case.
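A rough sketch of what such a benchmark could look like follows. It again uses sqlite3 in memory as a stand-in, so the numbers it prints say nothing about MySQL or SQL Server; the schema, row count, and helper names are assumptions made for the example. The same shape of harness could be pointed at a real server with real data.

```python
import sqlite3
import time

def build_db(n_rows=20000):
    """Create a throwaway table with a secondary index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER, body TEXT)")
    conn.executemany(
        "INSERT INTO posts (score, body) VALUES (?, ?)",
        ((i % 100, "x" * 200) for i in range(n_rows)),
    )
    conn.execute("CREATE INDEX idx_posts_score ON posts (score)")
    return conn

def one_query(conn):
    return conn.execute("SELECT * FROM posts WHERE score = 42").fetchall()

def two_step(conn):
    ids = [r[0] for r in conn.execute("SELECT id FROM posts WHERE score = 42")]
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    return conn.execute(f"SELECT * FROM posts WHERE id IN ({marks})", ids).fetchall()

def best_time(fn, conn, repeats=5):
    """Return the fastest of several runs to reduce noise from warm-up."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(conn)
        timings.append(time.perf_counter() - start)
    return min(timings)

conn = build_db()
print(f"one query: {best_time(one_query, conn):.6f} s")
print(f"two step:  {best_time(two_step, conn):.6f} s")
```

Run it against production-like data volumes and with a realistic cache state, since those are the factors that tend to change the outcome.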



Based on my experience with MySQL, I would be surprised if fetching the rows in a second step with an IN query was faster than doing SELECT * in the first place. My understanding is that SELECT * is more expensive than SELECT id because MySQL has to look up the index data in both cases, but in the first case an extra step has to be taken to get the data that makes up the rest of the row, which may require additional disk accesses (especially since the table data is less likely to be in the cache than the index). However, with an InnoDB clustered index (which the primary key will be if you are using InnoDB), there is a special case where the row data is stored along with the index entry in the clustered index. In this case, I believe SELECT * will be almost the same speed as SELECT id.
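The difference between reading only the index and going back for the rest of the row can be seen in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a stand-in, because it runs without a server; it is not MySQL output, and the table and index names are hypothetical. On MySQL the rough equivalent is to run EXPLAIN and look for "Using index" in the Extra column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER, body TEXT)")
conn.execute("CREATE INDEX idx_posts_score ON posts (score)")

def plan(conn, sql):
    """Return the human-readable plan text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# The secondary index already holds the primary key, so this query can be
# answered from the index alone (a covering index).
plan_ids = plan(conn, "SELECT id FROM posts WHERE score = 5")

# This one uses the same index to find matches, then has to visit the
# table for the remaining columns.
plan_rows = plan(conn, "SELECT * FROM posts WHERE score = 5")

print(plan_ids)
print(plan_rows)
```

The second plan's extra table visit is the "extra step" described above; how much it costs depends on whether those pages are already cached.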

+2




It depends. Sometimes, as Jeff's blog post clearly indicates, it can provide (significant) performance gains. But it is generally best to let the query optimizer find the best execution plan it can, and then manually optimize only the especially slow queries.

The post itself says that they use the built-in Linq language constructs by default and fall back to hand-tuned SQL where the performance traces tell them they need to. Likewise, you should by default let the query optimizer do its job, and move on to tuning your SQL statements only where performance traces tell you that you need to.
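If you have no tracing in place yet, something as small as the following can point you at the queries worth tuning. This is only a sketch: timed_query, slow_log, and the threshold value are hypothetical names chosen for the example, and sqlite3 stands in for whatever driver you actually use. Most database servers also ship their own slow query logging, which is usually the better source.

```python
import sqlite3
import time

slow_log = []  # hypothetical in-memory trace of (seconds, sql) pairs

def timed_query(conn, sql, params=(), threshold_s=0.05):
    """Run a query and record it if it takes at least threshold_s seconds."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed >= threshold_s:
        slow_log.append((elapsed, sql))
    return rows

conn = sqlite3.connect(":memory:")
# A threshold of 0.0 records every query, which is handy for a first look.
rows = timed_query(conn, "SELECT 1", threshold_s=0.0)
for elapsed, sql in slow_log:
    print(f"{elapsed:.6f} s  {sql}")
```

Only the statements that show up in such a trace need the manual treatment; everything else can stay as the optimizer plans it.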



Connecting to a database engine from a scripting language is usually very fast. Typically, the actual execution of the queries will take much longer than connecting to the database server and transferring the results from the server back to the script.

+3




Retrieving data using a key will always be faster than grabbing it from the table without one. This is how databases work; grabbing indexed data is faster than grabbing non-indexed data. And getting just the key might be faster still, since all the DB engine has to do is copy the data from the index into a result set.

As for your "expensive" question, I am assuming you mean "is it slow". I have not found it to be. One of the most computationally expensive parts of a query is opening a connection, and most (if not all) modern databases use some form of connection caching, so it's not that expensive anymore. As far as the queries themselves are concerned, the only real cost will be network latency, so you should see them take the same amount of time, or not much longer, than if you were querying from a non-scripting language (in other words, milliseconds).

0

