Remove duplicate values in a cell in MySQL

I have a table with a column search_text of type text.

In this field I have values:

 1. 'MyBook MyBook PDF PDF'
 2. 'Example 1 Example 2 Example 3'
 3. 'John Snow John Snow'

      

I would like to clean up these fields by removing the repeated words.

Expected Result:

 1. 'MyBook PDF'
 2. 'Example 1 2 3'
 3. 'John Snow'

      

The approach I came up with is this: read the field for each record, split it on spaces (' '), put each word into an array, run array_unique in PHP, and then turn the array back into a string with join in PHP.
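
For illustration, the split / unique / join steps described above can be sketched in a few lines. This is a minimal sketch in Python rather than PHP, and the function name dedupe_words is made up for this example. It assumes that words are separated by whitespace and that the first occurrence of each word should be kept.

```python
def dedupe_words(text: str) -> str:
    """Return text with repeated words removed, keeping first occurrences."""
    # split() with no argument splits on any run of whitespace and ignores
    # leading/trailing whitespace, so no empty "words" are produced.
    words = text.split()
    # dict.fromkeys keeps insertion order (Python 3.7+), so it behaves like
    # an order-preserving "unique".
    return " ".join(dict.fromkeys(words))


if __name__ == "__main__":
    for sample in ("MyBook MyBook PDF PDF",
                   "Example 1 Example 2 Example 3",
                   "John Snow John Snow"):
        print(repr(dedupe_words(sample)))
```

Note that the comparison here is case-sensitive, so 'PDF' and 'pdf' would both be kept.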

The thing is, this is a PHP-based solution, and I would prefer a MySQL one. I have over 180,000 records to clean up, and I don't know how well PHP would cope with that.

I found a solution for MS SQL: Remove duplicate values on SQL Server core

Any help is appreciated.

SQL of my test data:

CREATE TABLE IF NOT EXISTS `test` (
`id` int(10) unsigned NOT NULL,
  `search_text` text COLLATE utf8_unicode_ci NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

INSERT INTO `test` (`id`, `search_text`) VALUES
(1, 'MyBook MyBook PDF PDF'),
(2, 'Example 1 Example 2 Example 3'),
(3, 'John Snow John Snow'),
(4, 'test test test test formula test test test formula test test test formula test test test formula test test test formula test test test formula '),
(5, '');

ALTER TABLE `test`
 ADD PRIMARY KEY (`id`);

ALTER TABLE `test`
MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=6;

      



3 answers


I went for a PHP solution here:

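$s = 'John Snow John Snow';
// Remove duplicate values in the string.
// 1. Split the string into words on single spaces.
$tmpArray = explode(" ", $s);
// 2. Keep only the first occurrence of each word (order is preserved).
$tmpArray = array_unique($tmpArray);
// 3. Glue the remaining words back together.
$s = join(" ", $tmpArray);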

      



This runs before the INSERT, and it does what I wanted.
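
For rows that are already stored, the same logic could be applied in one pass over the table. The sketch below is an illustration under stated assumptions, not the code from this answer: it is written in Python, uses the standard-library sqlite3 module with an in-memory database as a stand-in for a MySQL connection, and the helper names dedupe_words and clean_table are hypothetical. The table and column names follow the test data in the question.

```python
import sqlite3


def dedupe_words(text: str) -> str:
    """Drop repeated words, keeping the first occurrence of each."""
    return " ".join(dict.fromkeys(text.split()))


def clean_table(conn: sqlite3.Connection) -> int:
    """Rewrite test.search_text without repeated words; return rows changed."""
    rows = conn.execute("SELECT id, search_text FROM test").fetchall()
    # Only rows whose text actually changes are written back.
    changes = [(dedupe_words(text), row_id)
               for row_id, text in rows
               if dedupe_words(text) != text]
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "UPDATE test SET search_text = ? WHERE id = ?", changes)
    return len(changes)


if __name__ == "__main__":
    # In-memory stand-in for the real table (assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER PRIMARY KEY, search_text TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO test (id, search_text) VALUES (?, ?)",
        [(1, "MyBook MyBook PDF PDF"),
         (2, "Example 1 Example 2 Example 3"),
         (3, "John Snow John Snow"),
         (5, "")])
    print(clean_table(conn), "rows updated")
    for row in conn.execute("SELECT id, search_text FROM test ORDER BY id"):
        print(row)
```

Collecting the changes first and writing them in a single transaction keeps the number of round trips small, which is usually what matters for a one-off cleanup of a few hundred thousand rows.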



Try to sort by count :)



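-- Split search_text into one word per row by joining against a small
-- numbers table, then keep the distinct words.
-- Note: the numbers table below only covers the first 4 words of each
-- value; extend it to match the longest value you expect.
SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(test.search_text, ' ', numbers.n), ' ', -1) col_name
FROM (
    SELECT 1 n
    UNION ALL SELECT 2
    UNION ALL SELECT 3
    UNION ALL SELECT 4
    ) numbers
-- The number of spaces in the value decides how many words it contributes.
INNER JOIN test ON CHAR_LENGTH(test.search_text) - CHAR_LENGTH(REPLACE(test.search_text, ' ', '')) >= numbers.n - 1
ORDER BY col_name;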
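
To make the nested SUBSTRING_INDEX trick in the query above easier to follow, here is a rough Python emulation of it. It is only a sketch: substring_index mimics the documented behaviour of MySQL's SUBSTRING_INDEX for a non-empty delimiter, and the names nth_word and split_with_numbers are invented for this example.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Emulate MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def nth_word(text: str, n: int) -> str:
    """Inner call keeps the first n words, outer call keeps the last of them."""
    return substring_index(substring_index(text, " ", n), " ", -1)


def split_with_numbers(text: str, numbers=(1, 2, 3, 4)) -> list:
    """Mirror the join condition: a row is produced for n if spaces >= n - 1."""
    space_count = len(text) - len(text.replace(" ", ""))
    return [nth_word(text, n) for n in numbers if space_count >= n - 1]


if __name__ == "__main__":
    print(split_with_numbers("MyBook MyBook PDF PDF"))
    # With only 4 numbers, a 6-word value is cut off after its 4th word.
    print(split_with_numbers("Example 1 Example 2 Example 3"))
```

The emulation also shows that the query yields the words of all rows together, so it does not by itself rebuild a cleaned string per record.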

      



You will need to write a MySQL function to do this for you. That said, I think the PHP approach will be fine: 180,000 records are not that many and should (unless you are using a low-end server) be processed without any extra effort.

I wrote two procedures that you could use:

DROP PROCEDURE IF EXISTS explode;
DELIMITER //
CREATE PROCEDURE explode(str_string TEXT) 
NOT DETERMINISTIC
BEGIN
    DROP TABLE IF EXISTS explosion;
    CREATE TABLE explosion (id INT AUTO_INCREMENT PRIMARY KEY NOT NULL, word VARCHAR(100));
    -- Turn 'a b c' into: INSERT INTO explosion (word) VALUES ('a'), ('b'), ('c')
    SET @sql := CONCAT('INSERT INTO explosion (word) VALUES (', REPLACE(QUOTE(str_string), " ", '\'), (\''), ')');
    PREPARE myStmt FROM @sql;
    EXECUTE myStmt;
    DEALLOCATE PREPARE myStmt;
END //
DELIMITER ;

      

This procedure works like an explode function for MySQL. It splits the space-separated words of the string into rows of a scratch table called explosion.

The second procedure then reads this table and copies the words into another scratch table, temp_words, with duplicates removed:

DROP PROCEDURE IF EXISTS removeDuplicates;
DELIMITER //
CREATE PROCEDURE removeDuplicates(str TEXT) 
BEGIN
    DECLARE temp_word TEXT;
    DECLARE last_word TEXT DEFAULT "";
    DECLARE result TEXT;
    DECLARE finished INT DEFAULT false;
    DECLARE words_cursor CURSOR FOR
        SELECT word FROM explosion;
    DECLARE CONTINUE handler FOR NOT found
        SET finished = true;

    CALL explode(str);
    DROP TABLE IF EXISTS temp_words;
    CREATE TABLE temp_words (id INT AUTO_INCREMENT PRIMARY KEY NOT NULL, t VARCHAR(100));

    OPEN words_cursor;
    loop_words: LOOP

        FETCH words_cursor INTO temp_word;

        IF finished THEN
            LEAVE loop_words;
        END IF;

        -- Insert the word only if it has not been stored yet. Comparing
        -- against just the previous word would miss non-adjacent repeats
        -- such as 'John Snow John Snow'.
        IF NOT EXISTS (SELECT 1 FROM temp_words WHERE t = temp_word) THEN
            INSERT INTO temp_words (t) VALUES (temp_word);
        END IF;

    END LOOP loop_words;
    CLOSE words_cursor;

END //

DELIMITER ;

      

So all that is left is to work out how to get the records from temp_words back into your actual database table.







