In this post, I will tell you about a project that my friend and I did last year: detecting the ICs on a printed circuit board (PCB) and then identifying them using optical character recognition.
This is an application of OpenCV (Open Source Computer Vision Library).
First of all, I will show you a flowchart that describes our methodology, and then I will discuss each step in detail.
Our algorithm has two main parts. First, we localize each IC on the PCB, extract it, and save it. Then we use the Tesseract OCR engine to read the label of each detected IC.
A. PREPROCESSING
Before applying any algorithm to the image, we have to clean it up. To save processing time, we resize the image while maintaining its aspect ratio. After that, we remove noise from the image using a Gaussian blur.
B. SEGMENTATION
First, we convert the RGB image to the HSV (Hue, Saturation, Value) color space. We use the H channel to create a mask for all the integrated circuits on the board, and then threshold it to get a binary image. However, the result is not perfect.
C. MORPHOLOGICAL OPERATIONS
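As we can see, the thresholding is not perfect, so we performed morphological operations on the thresholded image: first we dilated it, and then we eroded the dilated image. (Dilation followed by erosion is known as morphological closing; it fills small holes and gaps in the mask.)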
D. EXTRACTION
To extract the integrated circuits from the PCB, we first have to find the edges of the ICs. We used the Canny edge detection algorithm to find the edges. After that, we found the contours of those edges. These contours also included small areas on the PCB that were not ICs, so we had to exclude them. We excluded every contour whose area was smaller than the mean area of all contours. We then found the bounding rectangle of each remaining contour and cropped that region from the original image. This gave us all the integrated circuits on the PCB.
[Figure: one of the extracted ICs]
E. OPTICAL CHARACTER RECOGNITION (OCR)
Once the ICs were saved to disk, we identified them by reading their labels using OCR. For OCR, we used the Tesseract OCR engine. If we pass the IC images directly to Tesseract, it fails to read the labels, so we pre-processed each image by smoothing and thresholding it. We used Otsu's algorithm to threshold the image. We also eroded the image to make the edges clearer. We then passed this image to Tesseract and got the labels in text format.
The results are not perfect, and our approach has some limitations.
LIMITATIONS:
It can only detect and identify ICs in well-lit images.
Smaller ICs are not detected.
The accuracy of OCR is very low.
I know that I have used many technical terms here without describing them fully, but in future posts I will explain them one by one. If you want to see the code and the paper that we wrote, please let me know in the comments section and I will share a link to my GitHub account, where you can find all the test images and code.
Till then, enjoy learning.


