Problems with storing numeric data in text columns - SELECT ... BETWEEN
Several years ago, I was working on a system where a numeric primary key was stored in a varchar column [SQL Server], so I quickly came unstuck when querying with the BETWEEN operator:
SELECT ID FROM MyTable WHERE ID BETWEEN 100 AND 110;
Results:
100 102 103 109 110 11
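To illustrate why a value like 11 slips in, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server or Oracle. The table contents (including 1000) are my own assumption; the point is only that text bounds are compared character by character.

```python
import sqlite3

# Hypothetical table mirroring the question: numeric IDs stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT PRIMARY KEY)")
ids = ["11", "99", "100", "102", "103", "109", "110", "111", "1000"]
conn.executemany("INSERT INTO MyTable (ID) VALUES (?)", [(i,) for i in ids])

# Text column and text bounds: the comparison is character by character,
# so '11' and '1000' both sort between '100' and '110'.
rows = conn.execute(
    "SELECT ID FROM MyTable WHERE ID BETWEEN '100' AND '110' ORDER BY ID"
).fetchall()
text_matches = [r[0] for r in rows]
print(text_matches)
```

Anything that sorts between the two bounds as a string is returned, regardless of its numeric value.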
That was just bad design. However, I am now working on a third-party ERP system which, as you can imagine, has to be versatile and flexible; so we have various tables with alphanumeric fields in which the business only stores numeric values, and similar problems can arise.
I guess this is a fairly common problem; I have a fairly simple solution, but I'm curious how others approach such problems.
My simple solution:
SELECT ID FROM MyTable
WHERE ID BETWEEN iStartValue AND iEndValue
AND (LENGTH(ID) = LENGTH(iStartValue)
OR LENGTH(ID) = LENGTH(iEndValue));
As you can tell, this is an Oracle system, but I usually work on SQL Server - so perhaps database agnostic solutions are needed.
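A rough sketch of the length-check idea, again in Python with sqlite3 as a stand-in for Oracle (the sample IDs and the helper name between_with_length are assumptions for illustration). It also shows one case worth checking before relying on it: when the two bounds have a different number of digits, the text BETWEEN itself can already come back empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO MyTable (ID) VALUES (?)",
    [(i,) for i in ["11", "95", "100", "102", "110", "1000"]],
)

def between_with_length(start, end):
    """Text BETWEEN plus the length check from the question."""
    sql = """
        SELECT ID FROM MyTable
        WHERE ID BETWEEN :s AND :e
          AND (LENGTH(ID) = LENGTH(:s) OR LENGTH(ID) = LENGTH(:e))
        ORDER BY ID
    """
    return [r[0] for r in conn.execute(sql, {"s": start, "e": end})]

# Bounds with the same number of digits: the length check removes 11 and 1000.
same_length = between_with_length("100", "110")

# Bounds with different lengths: '95' sorts after '110' as text,
# so the BETWEEN matches nothing before the length check even runs.
mixed_length = between_with_length("95", "110")

print(same_length, mixed_length)
```

So the approach looks fine as long as both bounds always have the same length; for ranges such as 95 to 110 a numeric comparison is needed.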
Edit 1: Scratch that - I don't see why proprietary solutions shouldn't be welcome as well.
Edit 2: Thanks for all the answers. I'm not sure whether I'm disappointed that there is no obvious, sophisticated solution, but I'm equally glad that I don't seem to have missed anything obvious!
I think I still prefer my own solution; it's simple and it works - is there a reason why I shouldn't use it? I can't believe it is much, if at all, less efficient than the other suggested solutions.
I understand that in an ideal world this problem would not exist; but unfortunately I do not work in an ideal world, and it is often a case of making the best of a bad situation.
If you are sure that the values in ID are only numeric, why not just CAST them?
WHERE CAST(ID as int) BETWEEN iStartValue AND iEndValue
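A minimal sketch of the cast approach, run against SQLite from Python purely for illustration (the sample IDs are assumed, and SQLite's CAST(... AS INTEGER) stands in for the vendor's cast):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO MyTable (ID) VALUES (?)",
    [(i,) for i in ["11", "95", "100", "102", "110", "1000"]],
)

# Casting the text column makes BETWEEN compare numbers, not strings.
rows = conn.execute(
    """
    SELECT ID FROM MyTable
    WHERE CAST(ID AS INTEGER) BETWEEN ? AND ?
    ORDER BY CAST(ID AS INTEGER)
    """,
    (95, 110),
).fetchall()
cast_matches = [r[0] for r in rows]
print(cast_matches)
```

Note that wrapping the column in a cast usually prevents a plain index on ID from being used for the range.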
EDIT 1: An extension to the casting method that should work is to use a subquery to filter out all non-numeric records first. Please note, I do not think this method is better than the one suggested above; I am including it because it answers the problem.
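SELECT ID
FROM (
    -- drop rows that are non-numeric, negative, or decimal
    SELECT ID
    FROM MyTable
    WHERE ISNUMERIC(ID) = 1
      AND CHARINDEX('.', ID) = 0
      AND CHARINDEX('-', ID) = 0
) a
WHERE CONVERT(bigint, ID) BETWEEN 0 AND 12000
ORDER BY LEN(ID) ASC, ID  -- SQL Server uses LEN; LENGTH is the Oracle name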
The checks for the '-' and '.' characters are needed because ISNUMERIC also accepts negative and decimal values; I am assuming your IDs cannot be negative or decimal.
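The same filter-then-cast idea can be sketched in Python with sqlite3. SQLite has no ISNUMERIC, so a GLOB pattern that rejects any non-digit character is swapped in here; that substitution and the sample data are my assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO MyTable (ID) VALUES (?)",
    [(i,) for i in ["100", "105", "12A", "-7", "3.5", "12000", "20000"]],
)

rows = conn.execute(
    """
    SELECT ID
    FROM (
        -- keep only non-empty IDs made up entirely of digits
        SELECT ID
        FROM MyTable
        WHERE ID <> ''
          AND ID NOT GLOB '*[^0-9]*'
    ) a
    WHERE CAST(ID AS INTEGER) BETWEEN 0 AND 12000
    ORDER BY LENGTH(ID), ID
    """
).fetchall()
numeric_matches = [r[0] for r in rows]
print(numeric_matches)
```

Without the inner filter, SQLite would silently cast '12A' to 12 and '3.5' to 3 and return them too, whereas other databases would typically raise a conversion error.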
I don't know if this might work in your situation, but ...
How about adding an actual numeric column to the table, populated with the numeric value? (In SQL Server you could use a persisted computed column with an index on it.)
With other DB vendors you would use a different mechanism to populate it (trigger, materialized view, etc.)
and then query that column instead of the varchar one ...
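As a rough sketch of the extra-column idea, here is the trigger variant in Python with sqlite3. The column name ID_NUM and the index and trigger names are hypothetical, and only inserts are covered; updates to ID would need a similar trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (ID TEXT PRIMARY KEY, ID_NUM INTEGER);
    CREATE INDEX IX_MyTable_ID_NUM ON MyTable (ID_NUM);

    -- Keep the numeric shadow column in sync whenever a row is inserted.
    CREATE TRIGGER TR_MyTable_ID_NUM AFTER INSERT ON MyTable
    BEGIN
        UPDATE MyTable SET ID_NUM = CAST(NEW.ID AS INTEGER) WHERE ID = NEW.ID;
    END;
    """
)
conn.executemany(
    "INSERT INTO MyTable (ID) VALUES (?)",
    [(i,) for i in ["11", "95", "100", "102", "110", "1000"]],
)

# Range queries now run against the indexed numeric column.
rows = conn.execute(
    "SELECT ID FROM MyTable WHERE ID_NUM BETWEEN ? AND ? ORDER BY ID_NUM",
    (95, 110),
).fetchall()
shadow_matches = [r[0] for r in rows]
print(shadow_matches)
```

The benefit over casting in the WHERE clause is that the comparison column is a real, indexable number.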
Maybe LPAD(id, 12, ' ') will work for you. It should make all the column values 12 characters wide, padded with spaces on the left.
Also, I would be a little concerned about storing numbers in varchar2 columns.
If you do any numeric analysis on them, you may get an exception on non-numeric data.
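To see why left-padding helps, here is a small Python sketch. The lpad helper is only a rough stand-in for Oracle's LPAD (it does not truncate values longer than the width), and the sample IDs and bounds are assumed. Once every value has the same width, string order matches numeric order, provided the bounds are padded in the same way.

```python
def lpad(value, width=12, fill=" "):
    """Rough stand-in for LPAD(value, width, fill) on short strings."""
    return value.rjust(width, fill)

ids = ["11", "95", "100", "102", "110", "1000"]
start, end = "95", "110"

# A space sorts before every digit, so shorter numbers compare as smaller.
padded_matches = [i for i in ids if lpad(start) <= lpad(i) <= lpad(end)]
print(padded_matches)
```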
Another option is to left-pad your numbers with zeros and use the BETWEEN operator. For performance reasons, it is probably best to include this as a second condition (so that any indexes can still be used). Something like this...
SELECT ID FROM MyTable
WHERE ID BETWEEN iStartValue AND iEndValue
AND RIGHT('0000000000' + ID, 10) BETWEEN iStartValue AND iEndValue
I tested this in SQL Server and it returned the correct values. You may need to adapt it to work with Oracle.
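A hedged sketch of the zero-padding comparison in Python with sqlite3, where substr('0000000000' || ID, -10) plays the role of RIGHT('0000000000' + ID, 10). Unlike the query above, this version also pads the two bounds and leaves out the plain BETWEEN condition; both are my assumptions, made so that a pure text comparison gives numeric results for text bounds of different lengths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO MyTable (ID) VALUES (?)",
    [(i,) for i in ["11", "95", "100", "102", "110", "1000"]],
)

# substr(x, -10) keeps the last 10 characters, i.e. the zero-padded ID.
sql = """
    SELECT ID FROM MyTable
    WHERE substr('0000000000' || ID, -10)
          BETWEEN substr('0000000000' || :s, -10)
              AND substr('0000000000' || :e, -10)
    ORDER BY substr('0000000000' || ID, -10)
"""
zero_padded_matches = [r[0] for r in conn.execute(sql, {"s": "95", "e": "110"})]
print(zero_padded_matches)
```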