How to remove text from an image

I am trying to remove text from images. For example, given a screenshot of an Instagram post, we want to extract only the picture from it. In our .NET/C# code we read every pixel and check its color to find white space, so that we can strip the unused area and keep only the picture, but that did not work as expected. Does anyone have an idea how to get this right?

+3




2 answers


To extract text from an image, use an OCR library such as Tesseract.

https://github.com/tesseract-ocr/tesseract



If you also need to edit the image, you can use a .NET image processing library such as AForge.NET.

https://github.com/andrewkirillov/AForge.NET
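For the cropping step described in the question, a plain "is this pixel white?" test tends to fail because screenshots contain near-white pixels and because caption text is not white either. One possible heuristic, sketched below in Python for brevity, is to measure how densely each row and column is filled with non-white pixels and keep the longest dense run. This assumes the picture fills most of its rows while text lines leave most of each row white. The image is modelled as a list of rows of (R, G, B) tuples, and all function names, the tolerance and the threshold are illustrative assumptions, not part of any library.

```python
def is_near_white(pixel, tolerance=12):
    """True if every channel is within `tolerance` of 255 (assumed cutoff)."""
    return all(255 - channel <= tolerance for channel in pixel)


def density(line, tolerance=12):
    """Fraction of pixels in `line` that are not near-white."""
    return sum(1 for p in line if not is_near_white(p, tolerance)) / len(line)


def longest_dense_run(densities, threshold):
    """Return (start, end), inclusive, of the longest run of values >= threshold.

    Returns None if no value reaches the threshold.
    """
    best = None
    start = None
    # The -1.0 sentinel closes a run that reaches the end of the list.
    for i, d in enumerate(list(densities) + [-1.0]):
        if d >= threshold and start is None:
            start = i
        elif d < threshold and start is not None:
            if best is None or i - start > best[1] - best[0] + 1:
                best = (start, i - 1)
            start = None
    return best


def picture_bounds(pixels, threshold=0.5, tolerance=12):
    """Return (top, left, bottom, right), inclusive, or None if nothing is dense."""
    rows = longest_dense_run([density(r, tolerance) for r in pixels], threshold)
    if rows is None:
        return None
    top, bottom = rows
    band = pixels[top:bottom + 1]
    # Column densities are measured only inside the chosen band of rows,
    # so text above or below the picture cannot influence them.
    cols = longest_dense_run(
        [density([r[x] for r in band], tolerance) for x in range(len(pixels[0]))],
        threshold,
    )
    if cols is None:
        return None
    left, right = cols
    return top, left, bottom, right


def crop(pixels, bounds):
    """Cut the inclusive rectangle `bounds` out of `pixels`."""
    top, left, bottom, right = bounds
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]
```

The same loops map directly onto per-pixel access in .NET. A picture that itself contains wide white bands will be split by this heuristic, so the threshold needs tuning against real screenshots.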

+3




This is a very broad question. Divide your problem into steps and start solving from the first step.

The best .NET library for this is Emgu CV, a wrapper around OpenCV, which is widely used in image processing.

AForge.NET is another good option. Follow its documentation for working with text in images.



Logic:

1. Detect the text regions in the image.
2. If the font and size are similar and fixed, you can use fixed templates and template matching.
3. There are then several options for deleting the detected objects (here, the text regions).
4. After removing the text you need to repair the image. This requires image restoration (inpainting) algorithms.

All of these are available in Emgu CV.
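Steps 3 and 4 can be illustrated with a deliberately naive sketch. It assumes the text regions have already been found as inclusive (top, left, bottom, right) boxes, marks them in a mask, and fills the masked pixels inward from the border by averaging already-known neighbours. This simple fill stands in for the stronger inpainting algorithms that OpenCV provides. The sketch is in Python for brevity, the image is a list of rows of (R, G, B) tuples, and the names `mask_from_boxes` and `inpaint` are hypothetical helpers, not library calls.

```python
def mask_from_boxes(height, width, boxes):
    """Build a boolean mask that is True inside each inclusive box."""
    mask = [[False] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for y in range(max(top, 0), min(bottom, height - 1) + 1):
            for x in range(max(left, 0), min(right, width - 1) + 1):
                mask[y][x] = True
    return mask


def inpaint(pixels, mask):
    """Return a copy of `pixels` with masked pixels filled from their neighbours.

    Each pass fills every masked pixel that touches at least one known
    pixel (4-neighbourhood) with the average of those known pixels, so the
    hole shrinks from its border towards its centre.
    """
    height, width = len(pixels), len(pixels[0])
    result = [list(row) for row in pixels]
    unknown = {(y, x) for y in range(height) for x in range(width) if mask[y][x]}
    while unknown:
        filled = {}
        for y, x in unknown:
            known = [
                result[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and (ny, nx) not in unknown
            ]
            if known:
                filled[(y, x)] = tuple(
                    sum(p[c] for p in known) // len(known) for c in range(3)
                )
        if not filled:
            break  # the whole image is masked, so there is nothing to copy from
        # Write the new values only after the pass, so every pixel filled in
        # this pass was computed from the same set of known pixels.
        for (y, x), value in filled.items():
            result[y][x] = value
        unknown.difference_update(filled)
    return result
```

On flat backgrounds such as a white caption area this is usually enough. Over textured picture content the fill looks blurred, which is where a proper inpainting routine is the better choice.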

See the documentation.

+1

