"in" and "not in" counts don't add up - what's wrong?

I have multiple servers. Some of them are assigned ips. I want to find out how many are not. There are clearly more servers than the ips assigned, but my db tells me there are no servers that don't have ips assigned ...

I'm here. Is my DB corrupted in some strange way?

SELECT COUNT(*) 
  FROM server

      

... returns:

+----------+
| count(*) |
+----------+
|    23088 | 
+----------+
1 row in set (0.00 sec)

      

It:

SELECT COUNT(*) 
  FROM server 
 WHERE server_id IN (SELECT DISTINCT(server_id) 
                       FROM ips)

      

... returns:

+----------+
| count(*) |
+----------+
|    13811 | 
+----------+
1 row in set (0.01 sec)

      

It:

SELECT COUNT(*) 
  FROM server 
 WHERE server_id NOT IN (SELECT DISTINCT(server_id) 
                           FROM ips);

      

... returns:

+----------+
| count(*) |
+----------+
|        0 | 
+----------+
1 row in set (0.02 sec)

      

The results have been edited to protect the culprit, but you got the idea.

  • All tables are InnoDB.
  • Check table

    returns ok on both of these tables.

EDIT: thanks for the suggestion to use LEFT JOIN

. This definitely confirms that the problem is with a MySQL bug.

mysql> SELECT count(s.server_id) FROM server s LEFT JOIN ips i on s.server_id = i.server_id WHERE i.server_id IS NULL;
+--------------------+
| count(s.server_id) |
+--------------------+
|               9277 | 
+--------------------+
1 row in set (0.04 sec)

      

Since 9277 + 13811 = 23088, this means all servers without ips + all servers with ips are indeed == all servers.

I have scheduled my mysql server upgrade for early next week. Stay with us.

+2


source to share


8 answers


Assuming the error truppo found is causing this, you can use this workaround:

select count(*)
from server s
left join ips i on i.server_id = s.server_id
where i.server_id is null

      



Above is i.server_id is null

true if left join

no match is found (just as all columns from i

could give null

for this situation).

+2


source


What version of MySQL? There seems to be a bug in <5.0.25 / 5.1.12 which could be the culprit:

Bug # 21282 : NOT IN, more than 1000 returns incorrect results using INDEX:



Using SELECT ... WHERE some_field NOT IN (...)

and then 1000 or more values ​​in NOT IN will partially cause the server to return incorrect results if it has an INDEX / UNIQUE key on the string field. Less than 1000 criteria are working correctly.

+7


source


Do you have Nulls in your columns?

+6


source


server_id not in (ids)

does not match columns NULL

, so you only get non-NULL servers server_id

that are not included ips

. You want to use instead where server_id is null

.

+4


source


Do you have a record that has NULL for server_id? Because that would have been ruled out in both cases.

+1


source


if you have NULLs in your columns, they will evaluate to false in both cases. the result you get is in

+ not in

-nulls

+1


source


select count(*) 
from server 
where server_id not in (select distinct(server_id) from ips)
or server_id is NULL

      

0


source


I would guess that something strange is happening with IN and NOT IN. This may be a bug or a "known limitation".

I would suggest first trying to answer your original question (servers without ip) and then looking at the data. This may give you an indication of what might be going on.

So, here are some alternative ideas to give you what you are looking for:

SELECT server_id
FROM server
MINUS
SELECT server_id
FROM ips

      

or

SELECT server_id
FROM server s LEFT JOIN ips i on s.server_id = i.server_id
WHERE i.server_id is null

      

As stated above, this can give you an idea of ​​why the data is not "caught" by your original statements.

0


source







All Articles