Simple web crawler: how do I remove duplicate URLs from an array?
I am using an array to store the URLs found on a page. I need to eliminate any URL that appears more than once in the array, because I do not want to crawl the same URL twice:
self.level = []  # array that holds the URLs
for link in self.soup.find_all('a'):
    self.level.append(link.get('href'))
print(self.level)
I need to eliminate the duplicate URLs before crawling them.
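One possible approach, sketched under a few assumptions: the helper name `unique_links` and the sample values are hypothetical, and the list stands in for the results of `link.get('href')`. Since `get('href')` returns `None` for an `<a>` tag without an `href` attribute, the sketch drops those entries as well. It relies on `dict.fromkeys`, which keeps only the first occurrence of each key and preserves insertion order on Python 3.7+.

```python
def unique_links(hrefs):
    """Return hrefs without None entries or repeats, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so building a dict and listing its keys removes repeats in order.
    return list(dict.fromkeys(h for h in hrefs if h is not None))


# Sample values standing in for link.get('href') results (assumption);
# None mimics an <a> tag that has no href attribute.
level = ["/a", "/b", "/a", None, "/c", "/b"]
level = unique_links(level)
print(level)  # ['/a', '/b', '/c']
```

In the crawler this could be applied as `self.level = unique_links(link.get('href') for link in self.soup.find_all('a'))`. If the order of the URLs does not matter, `list(set(...))` would also remove the duplicates.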