TSQL Create a string of 5 characters long, all digits [0-9] that do not yet exist in the database
What's the best way to do this?
I need to create a 5 digit long string where all characters are numeric. However, I need to be able to do this "x" the number of times (user variable) and store those random strings in the database. Also, I cannot create the same line twice. Old lines will be deleted after 6 months.
Pseudo-code
DECLARE @intIterator INT,
@intMax
SET @intIterator = 1
SET @intMax = 5 (number of strings to generate)
WHILE @intIterator <= @intMax
BEGIN
-- GENERATE RANDOM STRING OF 5 NUMERIC DIGITS
???
-- INSERT INTO DB IF DOESN'T ALREADY EXIST
INSERT INTO TSTRINGS
SELECT @RANDOMSTRING
IF @@ERROR = 0
SET @intIterator = @intIterator + 1
END
I know this is probably not the best way to do it, so advice is appreciated. But really looking for ideas on how to generate numeric strings of length 5.
source to share
All in one. This should find the remaining @intMax values if you have (100000 - @intMax) rows already with @intMax permutations on the left
INSERT TOP (@intMax) MyTable (RndColumn)
SELECT
RndValue
FROM
(
SELECT DISTINCT TOP 100000 -- covers potential range from 00000 to 99999
RIGHT('00000' + CAST(ABS(CHECKSUM(NEWID())) AS varchar(10)), 5) AS RndValue
FROM
sys.columns c1, sys.columns c2
) foo
WHERE
NOT EXISTS (SELECT *
FROM
MyTable T
WHERE
T.RndColumn = foo.RndValue
source to share
The "obvious" way can be described as " key = random; while (key already selected) { key = random }
". It works, but the birthday paradox implies our chances of collision increasing at an alarming exponential rate in proportion to the number of keys already in use. Thus, choosing a random key averages exponentially longer with each new key, and will most likely end up in an infinite or arbitrarily long cycle.
You are much better at generating your list of keys in the front like this:
-
Hold the table
UniqueKeys
containing all pre-computed rows '00000' .. '99999' in addition to the fieldkeyOrder
, which is always initialized tonewId()
on insert.keyOrder
must be indexed. -
Whenever you need to "generate" a string, you can , which will fetch the next available key at almost constant time. Now that you have a key, you can remove it from it to prevent reuse.
SELECT TOP 1
key
FROM UniqueKeys ORDER BY keyOrder
UniqueKeys
-
Crop and update the table every six months
UniqueKeys
.
This style has the advantage of being relatively simple to implement, almost constant time to generate the next key, and avoiding the nasty "check if exists in loop" scenario described above.
source to share
Do you need logic to check if a number exists?
You can use the following to generate your random number:
CREATE FUNCTION RandNumber()
RETURNS float
AS
BEGIN
RETURN (SELECT RandNumber FROM vRandNumber)
END
CREATE FUNCTION RandNumber2(@Min int, @Max int)
RETURNS float
AS
BEGIN
RETURN @Min + (select RandNumber from RetRandNumber) * (@Max-@Min)
END
Then just call RandNumber on your selection.
Here's the site I found with this script: Here
source to share
Something like that?
CREATE FUNCTION RandNumber2(@Min int, @Max int)
RETURNS float
AS
BEGIN
DECLARE @TheNumber INT
SET @TheNumber = (SELECT CONVERT(INT, Rand()*(@Max-@Min)+@Min))
WHILE (SELECT COUNT(IndexColumn) WHERE CONVERT(INT, IndexColumn) = @TheNumber) > 0
BEGIN
-- Do it again - we have a collision
SET @TheNumber = (SELECT CONVERT(INT, Rand()*(@Max-@Min)+@Min))
END
DECLARE @Result VARCHAR(5)
SET @Result = RIGHT('00000' + CONVERT(VARCHAR(5), @TheNumber), 5)
RETURN @Result
END
source to share
One way to create a string like this:
DECLARE @Foo char(5)
SET @Foo = right(str((checksum(newid()) + 100000), 11, 0), 5)
PRINT @Foo
As far as uniqueness is concerned, you will need to loop around a table containing previously indexed values (indexed!), Only exiting the loop when a "new" identifier is generated. You might run into concurrency issues if two separate processes somehow generate the same value, where the first does not enter it into the table before the second checks for existance ... but a lot depends on when and how this value is actually used.
source to share
Here's a set-based approach using SQL 2005 syntax (will be a little easier with SQL 2008, but you didn't specify). Also, if you have a table of numbers, you can cut out a large chunk of it.
No loops, no duplicates, and should be nearly instantaneous (assuming the target column is indexed).
DECLARE @intMax integer
SET @intMax = 5
INSERT INTO TSTRINGS
SELECT q.nString
FROM ( SELECT ROW_NUMBER() OVER (ORDER BY (newID())) AS N2,
RIGHT(REPLICATE('0', 5) + CONVERT(varchar(5), N), 5) as nString
FROM --the subquery below could be replaced by a numbers table
(SELECT TOP 100000
ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID))-1 AS N
FROM Master.sys.columns ac1
CROSS JOIN Master.sys.columns ac2
CROSS JOIN Master.sys.columns ac3) numbers
WHERE RIGHT(REPLICATE('0', 5) + CONVERT(varchar(5), N), 5)
NOT IN (SELECT nString FROM TSTRINGS) --check to see if reused
) q
WHERE q.N2 <= @intMax
source to share