Understanding the Difficulty of Removing Duplicates from a Linked List

Question

Understanding the Difficulty of Removing Duplicates from a Linked List

I wrote this program to remove duplicate nodes from an unsorted linked list:

#include<bits/stdc++.h>
using namespace std;

/* A linked list node */
struct Node
{
    int data;
    struct Node *next;
};

// Utility function to create a new Node
struct Node *newNode(int data)
{
   Node *temp = new Node;
   temp->data = data;
   temp->next = NULL;
   return temp;
}

/* Function to remove duplicates from a
   unsorted linked list */
void removeDuplicates(struct Node *start)
{
    // Hash to store seen values
    unordered_set<int> seen;

    /* Pick elements one by one */
    struct Node *curr = start;
    struct Node *prev = NULL;
    while (curr != NULL)
    {
        // If current value is seen before
        if (seen.find(curr->data) != seen.end())
        {
           prev->next = curr->next;
           delete (curr);
        }
        else
        {
           seen.insert(curr->data);
           prev = curr;
        }
        curr = prev->next;
    }
}

/* Function to print nodes in a given linked list */
void printList(struct Node *node)
{
    while (node != NULL)
    {
        printf("%d ", node->data);
        node = node->next;
    }
}

/* Driver program to test above function */
int main()
{
    /* The constructed linked list is:
     10->12->11->11->12->11->10*/
    struct Node *start = newNode(10);
    start->next = newNode(12);
    start->next->next = newNode(11);
    start->next->next->next = newNode(11);
    start->next->next->next->next = newNode(12);
    start->next->next->next->next->next =
                                    newNode(11);
    start->next->next->next->next->next->next =
                                    newNode(10);

    printf("Linked list before removing duplicates : \n");
    printList(start);

    removeDuplicates(start);

    printf("\nLinked list after removing duplicates : \n");
    printList(start);

    return 0;
}

Does finding each item in a hash table affect complexity? If so, what should be the time complexity of this algorithm, given that the set is implemented as a binary search tree, where the cost of finding an item is O (logn) in the worst case. According to me, T (n) = T (n-1) + log (n-1), i.e. The nth element will do log (n-1) comparisons (i.e. the height of the tree with n-1 elements) Please give a mathematical analysis.

+3

c ++ set algorithm time-complexity

abhishek gupta May 25 '17 at 17:23

source to share

1 answer

pstrjds · Accepted Answer · 2017-05-25T17:49:36+0000

Does each element of the hash table detect complexity?

Well, in your code, you are using unordered_set , which has an average complexity of O (1), so the simple answer is No.

... given that the set is implemented as a binary search tree where the cost of finding an item is O (logn) in the worst case.

Again, you chose unordered_set

which is not a binary search. I believe some of the implementations set

use Red / Black trees and you will be looking at O (logN), but unordered_set

that should be constant time. So now the only problem is traversing your linked list. Which, since you are just moving in one direction while visiting each node, it is an O (N) operation.

Understanding the Difficulty of Removing Duplicates from a Linked List

More articles: