A while ago we developed a solution for a customer in which PDF documents are automatically processed and imported into a document system. The core of the process is to analyze every page of a document, read its codes (bar codes, QR codes) and merge pages together into a final document, which is then imported into the document system.
In this article I will touch on reading bar codes and its challenges. Let’s start with an overview of the core process.
We run the process as an Azure Function. The PDF document is consumed by the function and split page by page. Each page is then converted to a bitmap image and pre-processed by a few image filters that clean up the background and enhance the foreground. The processed image is fed to the bar code reading library, which spits out the bar codes. Finally the document is imported into the document system.
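To make the "merge pages together" step more concrete, here is a minimal sketch in Python. It is not our production code. It assumes a hypothetical rule in which a page carrying a code starts a new document and pages without a code belong to the document before them; the function name `group_pages` and the sample codes are made up for illustration.

```python
def group_pages(page_codes):
    """Group scanned pages into documents.

    page_codes holds one list of decoded codes per page, in scan order.
    Assumed rule (hypothetical): a page with a code starts a new document,
    a page without a code is appended to the current document.
    """
    documents = []
    for page_number, codes in enumerate(page_codes, start=1):
        # Start a new document on a coded page, or on the very first page.
        if codes or not documents:
            documents.append({"code": codes[0] if codes else None, "pages": []})
        documents[-1]["pages"].append(page_number)
    return documents


if __name__ == "__main__":
    scanned = [["INV-1"], [], ["INV-2"], [], []]
    for doc in group_pages(scanned):
        print(doc["code"], doc["pages"])
```

A leading page without a code ends up in a document whose code is `None`, which is a natural candidate for manual review.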
That’s the theory! There are a few challenges when reading bar codes, so let’s discuss them below.
PDF quality
To begin with, we need scanned documents of optimal quality. This can be achieved by selecting appropriate hardware (and keeping up with its maintenance), choosing suitable PDF settings (such as compression and PDF version) and educating employees to bundle and orient paper documents properly before the scan. We want the best balance of quality and PDF size before the process starts, because that yields the best results. Addressing the points above gets you there.
Image pre-processing filters
After this, we can focus on the “software” part, which takes care of the PDF processing. Before we can read the bar codes, we need to run image pre-processing filters to prepare the page images. For that we used a .NET library called Magick.NET and a combination of image filters, from binarization, despeckle and dilate to erode and others.
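To show what these filters do to the pixels, here is a small, dependency-free Python sketch. It is not Magick.NET and not how that library implements its filters; it is a plain illustration of binarization followed by erode and dilate with a 3x3 neighbourhood, which together remove isolated specks. The threshold of 128 and the function names are assumptions.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 1 for dark ink and 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def _neighbourhood(img, y, x):
    """Yield the 3x3 neighbourhood of (y, x); out-of-bounds counts as background."""
    height, width = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            yield img[ny][nx] if 0 <= ny < height and 0 <= nx < width else 0


def erode(img):
    """Keep a pixel only if its whole 3x3 neighbourhood is ink."""
    return [[1 if all(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is ink."""
    return [[1 if any(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def remove_specks(img):
    """Erode then dilate: small specks vanish, larger shapes are restored."""
    return dilate(erode(img))


if __name__ == "__main__":
    # 7x7 white page with a 3x3 ink block and one isolated dark speck.
    page = [[255] * 7 for _ in range(7)]
    for y in range(1, 4):
        for x in range(1, 4):
            page[y][x] = 0
    page[5][5] = 40
    for row in remove_specks(binarize(page)):
        print("".join("#" if px else "." for px in row))
```

The order matters: eroding first deletes anything smaller than the neighbourhood, and dilating afterwards grows the surviving shapes back to roughly their original size.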
Our algorithm applies the filters in a few different ways to extract the bar codes from problematic document pages. If you are not bound to an “instant” response to your customers, you can go nuts here and process the image in a “long running process”, trying different resolutions and image filters. Better automatically than manually, right?! Of course, if the scans are really poor and the bar codes unreadable, you are left with manual intervention.
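The idea of trying several resolutions and filter combinations can be sketched as a simple fallback loop. This is an illustrative Python sketch, not our actual algorithm: `render`, `decode` and the filter functions are hypothetical callables that you would back with your PDF, image and bar code libraries.

```python
from itertools import product


def read_with_fallbacks(render, decode, resolutions, filter_chains):
    """Try every (resolution, filter chain) pair until codes are found.

    render(dpi)   -> image of the page at that resolution (hypothetical)
    decode(image) -> list of decoded codes, empty if none (hypothetical)
    Returns the codes and the number of attempts that were needed.
    """
    attempts = 0
    for dpi, chain in product(resolutions, filter_chains):
        attempts += 1
        image = render(dpi)
        for image_filter in chain:
            image = image_filter(image)
        codes = decode(image)
        if codes:
            return codes, attempts
    # Nothing worked: the page needs manual intervention.
    return [], attempts


if __name__ == "__main__":
    # Fake stand-ins so the sketch runs without any imaging library.
    def render(dpi):
        return {"dpi": dpi, "filters": []}

    def binarize(image):
        return {**image, "filters": image["filters"] + ["binarize"]}

    def decode(image):
        readable = image["dpi"] >= 300 and "binarize" in image["filters"]
        return ["ABC-123"] if readable else []

    print(read_with_fallbacks(render, decode, [150, 300], [[], [binarize]]))
```

Ordering the combinations from cheapest to most expensive keeps the common case fast and reserves the heavy attempts for the problematic pages.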
Process memory consumption
Image processing is very heavy on computer resources (and the more pixels you have, the heavier it gets), so you really need to find the optimal approach for your problem. In our case memory consumption is very important because we run the process as an Azure Function. And as we know, with Azure Functions we are billed based on “duration of the process” x “memory consumption” x “number of processes”. We addressed this by testing different algorithm parameters, and in the end we reached a 99.5% success rate. After thorough examination, we noticed that the remaining 0.5% is mostly linked to hardware issues or improper document handling.
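The billing product above is easy to turn into a quick comparison between parameter sets. The Python sketch below only multiplies the three factors into GB-seconds; it ignores rounding rules, free grants and prices, and the memory and duration figures are invented for illustration, so check the current Azure pricing page before relying on it.

```python
def gb_seconds(memory_mb, duration_s, executions):
    """Resource consumption as memory (GB) x duration (s) x number of executions."""
    return (memory_mb / 1024) * duration_s * executions


if __name__ == "__main__":
    # Hypothetical numbers: a high-resolution run versus a leaner, slower run.
    high_res = gb_seconds(memory_mb=1536, duration_s=8, executions=10_000)
    lean = gb_seconds(memory_mb=512, duration_s=12, executions=10_000)
    print(f"high resolution: {high_res:.0f} GB-s")
    print(f"lean settings:   {lean:.0f} GB-s")
```

In this made-up example the leaner run takes longer per page but consumes half the GB-seconds, which is the kind of trade-off worth measuring for your own documents.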
Here are the core libraries we use in the process described above:
- Aspose.PDF for .NET – commercial library for reading and manipulating PDF documents.
- Magick.NET – image pre-processing before reading bar codes.
- ZXing.NET – a library to read bar codes from the pre-processed image.
Leave your comments down below :).