By Tait Brown

How I replicated an $86 million project in 57 lines of code

When an experiment with existing open source technology does a “good enough” job

The Victoria Police are the primary law enforcement agency of Victoria, Australia. With over 16,000 vehicles stolen in Victoria this past year — at a cost of about $170 million — the police department is experimenting with a variety of technology-driven solutions to crack down on car theft. They call this system BlueNet.

To help prevent fraudulent sales of stolen vehicles, there is already a VicRoads web service for checking the status of vehicle registrations. The department has also invested in a stationary license plate scanner — a fixed tripod camera which scans passing traffic to automatically identify stolen vehicles.

Don’t ask me why, but one afternoon I had the desire to prototype a vehicle-mounted license plate scanner that would automatically notify you if a vehicle had been stolen or was unregistered. Understanding that these individual components existed, I wondered how difficult it would be to wire them together.

But it was after a bit of googling that I discovered the Victoria Police had recently undergone a trial of a similar device, with an estimated rollout cost somewhere in the vicinity of $86,000,000. One astute commenter pointed out that the $86M cost to fit out 220 vehicles comes in at a rather thirsty $390,909 per vehicle.

Surely we can do a bit better than that.

Existing stationary license plate recognition systems

The Success Criteria

Before getting started, I outlined a few key requirements for product design.
Requirement #1: The image processing must be performed locally

Streaming live video to a central processing warehouse seemed the least efficient approach to solving this problem. Besides the whopping bill for data traffic, you’re also introducing network latency into a process which may already be quite slow. Although a centralized machine learning algorithm is only going to get more accurate over time, I wanted to learn whether a local, on-device implementation would be “good enough”.

Requirement #2: It must work with low quality images

I don’t have a Raspberry Pi camera or USB webcam, so I’ll be using dashcam footage — it’s readily available and an ideal source of sample data. As an added bonus, dashcam video represents the overall quality of footage you’d expect from vehicle-mounted cameras.
Requirement #3: It needs to be built using open source technology

Relying upon proprietary software means you’ll get stung every time you request a change or enhancement — and the stinging will continue for every request made thereafter. Using open source technology is a no-brainer.

My solution

At a high level, my solution takes an image from a dashcam video, pumps it through an open source license plate recognition system installed locally on the device, queries the registration check service, and then returns the results for display. The data returned to the device installed in the law enforcement vehicle includes the vehicle’s make and model (which it only uses to verify whether the plates have been stolen), the registration status, and any notifications of the vehicle being reported stolen.

If that sounds rather simple, it’s because it really is. For example, the image processing can all be handled by the openalpr library. This is really all that’s involved to recognize the characters on a license plate (see the short sketch just below the first result).

A Minor Caveat

Public access to the VicRoads APIs is not available, so license plate checks occur via web scraping for this prototype. While scraping is generally frowned upon, this is a proof of concept and I’m not slamming anyone’s servers. Here’s what the dirtiness of my proof-of-concept scraping looks like.

Results

I must say I was pleasantly surprised. I expected the open source license plate recognition to be pretty rubbish. Additionally, the image recognition algorithms are probably not optimised for Australian license plates.

The solution was able to recognise license plates in a wide field of view. Annotations added for effect.
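To make that concrete, here is a minimal sketch of the recognition step using the openalpr Python bindings. The country code, the config and runtime paths, and the frame filename are assumptions for illustration; this is not the author's exact 57 lines.

import sys

from openalpr import Alpr

# "au" selects the Australian plate patterns; the paths below are the
# usual Linux install locations for OpenALPR (adjust for your setup)
alpr = Alpr("au", "/etc/openalpr/openalpr.conf", "/usr/share/openalpr/runtime_data")
if not alpr.is_loaded():
    print("Error loading OpenALPR")
    sys.exit(1)

# keep the top 5 candidate reads per plate
alpr.set_top_n(5)

# run recognition on a single dashcam frame (placeholder filename)
results = alpr.recognize_file("dashcam_frame.jpg")

for plate in results["results"]:
    # best guess plus its confidence score
    print("plate: {} (confidence: {:.2f})".format(plate["plate"], plate["confidence"]))
    # alternative candidate reads, useful for confidence thresholding later
    for candidate in plate["candidates"]:
        print("  candidate: {} ({:.2f})".format(candidate["plate"], candidate["confidence"]))

alpr.unload()

Everything beyond this is plumbing: pull frames from the dashcam footage, read out the plate text and confidence, and pass the plate on to the registration check.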
Number plate identified despite reflections and lens distortion.

The solution would occasionally have issues with particular letters, though.

Incorrect reading of the plate; it mistook the M for an H.

But the solution would eventually get them correct.
A few frames later, the M is correctly identified and at a higher confidence rating.

As you can see in the above two images, processing the image a couple of frames later jumped from a confidence rating of 87% to a hair over 91%. I’m confident, pardon the pun, that the accuracy could be improved by increasing the sample rate and then sorting by the highest confidence rating. Alternatively, a threshold could be set that only accepts a confidence greater than 90% before going on to validate the registration number (a short sketch of this appears below). Those are very straightforward code-first fixes, and they don’t preclude training the license plate recognition software on a local data set.

The $86,000,000 Question

To be fair, I have absolutely no clue what the $86M figure includes — nor can I speak to the accuracy of my open source tool, with no localized training, versus the pilot BlueNet system. I would expect part of that budget includes the replacement of several legacy databases and software applications to support the high frequency, low latency querying of license plates several times per second, per vehicle. On the other hand, the cost of $391k per vehicle seems pretty rich — especially if BlueNet isn’t particularly accurate and there are no large scale IT projects to decommission or upgrade dependent systems.

Future Applications

While it’s easy to get caught up in the Orwellian nature of an “always on” network of license plate snitchers, there are many positive applications of this technology. Imagine a passive system scanning fellow motorists for an abductor’s car that automatically alerts authorities and family members to their current location and direction. Tesla vehicles are already brimming with cameras and sensors with the ability to receive OTA updates — imagine turning these into a fleet of virtual good Samaritans.
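To make those code-first fixes concrete, here is a minimal sketch of the selection logic, assuming plate_reads is a list of (plate, confidence) tuples collected from the recognizer over several consecutive frames (the plate strings below are invented for illustration):

def best_plate(plate_reads, min_confidence=90.0):
    # keep only reads above the confidence threshold
    candidates = [read for read in plate_reads if read[1] > min_confidence]
    if not candidates:
        return None
    # return the text of the highest-confidence read
    return max(candidates, key=lambda read: read[1])[0]

print(best_plate([("ABC12H", 87.2), ("ABC12M", 91.1)]))  # prints ABC12M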
(A note on the underlying library: OpenALPR is an open source Automatic License Plate Recognition library written in C++ with bindings in C#, Java, Node.js, Go, and Python. The library analyzes images and video streams to identify license plates, and its output is the text representation of any license plate characters. The project trains its recognition networks on real world license plate images along with license plate annotations; since that training data covers a wide range of variations, the networks are designed to adapt to varying situations and be robust to regional differences in license plates.)
Uber and Lyft drivers could also be outfitted with these devices to dramatically increase the coverage area. Using open source technology and existing components, it seems possible to offer a solution that provides a much higher rate of return — for an investment much less than $86M.

Part 2 — I’ve published an update, in which I test with my own footage and catch an unregistered vehicle, over here.
In this tutorial, you will learn how to apply OpenCV OCR (Optical Character Recognition). We will perform both (1) text detection and (2) text recognition using OpenCV, Python, and Tesseract.

A few weeks ago I showed you how to perform text detection using OpenCV’s EAST deep learning model. Using this model we were able to detect and localize the bounding box coordinates of text contained in an image. The next step is to take each of these areas containing text and actually recognize and OCR the text using OpenCV and Tesseract. To learn how to build your own OpenCV OCR and text recognition system, just keep reading!

Figure 1: The Tesseract OCR engine has been around since the 1980s.
As of 2018, it now includes built-in deep learning capability, making it a robust OCR tool (just keep in mind that no OCR system is perfect). Using Tesseract with OpenCV’s EAST detector makes for a great combination.

Tesseract, a highly popular OCR engine, was originally developed by Hewlett Packard in the 1980s and was then open-sourced in 2005. As long as you see tesseract 4 somewhere in the output of tesseract --version, you know that you have the latest version of Tesseract installed on your system.

Install your Tesseract + Python bindings

Now that we have the Tesseract binary installed, we need to install the Tesseract + Python bindings so our Python scripts can communicate with Tesseract and perform OCR on images processed by OpenCV. If you are using a Python virtual environment (which I highly recommend so you can have separate, independent Python environments), use the workon command to access your virtual environment.

Congratulations! If you don’t see any import errors, your machine is now configured to perform OCR and text recognition with OpenCV. Let’s move on to the next section (skipping the Pi instructions), where we’ll learn how to actually implement a Python script to perform OpenCV OCR.
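The bindings themselves are typically installed with pip install pytesseract, and the import check mentioned above can be as simple as this two-line sanity test (exact version numbers will vary on your setup):

import cv2
import pytesseract

# confirm the OpenCV bindings import and that Python can reach the Tesseract binary
print(cv2.__version__)
print(pytesseract.get_tesseract_version())  # should report 4.x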
Install Tesseract 4 and supporting software on Raspberry Pi and Raspbian

Note: You may skip this section if you aren’t on a Raspberry Pi. Inevitably, I’ll be asked how to install Tesseract 4 on the Raspberry Pi. The following instructions aren’t for the faint of heart — you may run into problems. They are tested, but mileage may vary on your own Raspberry Pi. First, uninstall your OpenCV bindings from system site packages.

Figure 3: The OpenCV OCR pipeline.

Now that we have OpenCV and Tesseract successfully installed on our system, we need to briefly review our pipeline and the associated commands. To start, we’ll apply OpenCV’s EAST text detector to detect the presence of text in an image. The EAST text detector will give us the bounding box (x, y)-coordinates of text ROIs. We’ll extract each of these ROIs and then pass them into Tesseract v4’s LSTM deep learning text recognition algorithm. The output of the LSTM will give us our actual OCR results. Finally, we’ll draw the OpenCV OCR results on our output image.

But before we actually get to our project, let’s briefly review the Tesseract command (which will be called under the hood by the pytesseract library). When calling the tesseract binary we need to supply a number of flags. The three most important ones are -l, --oem, and --psm. The -l flag controls the language of the input text.
We’ll be using eng (English) for this example, but you can see all the languages Tesseract supports.

The --oem argument, or OCR Engine Mode, controls the type of algorithm used by Tesseract. You can see the available OCR Engine Modes by executing tesseract --help-oem; we’ll be using --oem 1, which selects the LSTM neural net engine. The --psm flag controls the automatic Page Segmentation Mode used by Tesseract:

$ tesseract --help-psm
Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.

For OCR’ing text ROIs I’ve found that modes 6 and 7 work well, but if you’re OCR’ing large blocks of text then you may want to try 3, the default mode. Whenever you find yourself obtaining incorrect OCR results I highly recommend adjusting the --psm value, as it can have a dramatic influence on your output OCR results.
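These are the same flags we will later hand to pytesseract through its config string. A quick illustrative sketch (roi.png is just a placeholder path to a cropped text region):

import cv2
import pytesseract

# load a cropped text region (placeholder path)
roi = cv2.imread("roi.png")

# --psm 7: treat the ROI as a single line of text, using the LSTM engine
text = pytesseract.image_to_string(roi, config="-l eng --oem 1 --psm 7")
print(text)

# for a large block of text, the default segmentation (--psm 3) is often a better fit
block = pytesseract.image_to_string(roi, config="-l eng --oem 1 --psm 3")
print(block)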
Project structure

Be sure to grab the zip from the “Downloads” section of the blog post. From there, unzip the file and navigate into the directory. The tree command allows us to see the directory structure in our terminal.
1 directory, 7 files

Our project contains one directory and two notable files:

- images/ : a directory containing six test images containing scene text. We will attempt OpenCV OCR with each of these images.
- frozen_east_text_detection.pb : the EAST text detector. This CNN is pre-trained for text detection and ready to go. I did not train this model — it is provided with OpenCV; I’ve also included it in the “Downloads” for your convenience.
- text_recognition.py : our script for OCR — we’ll review this script line by line.
The script utilizes the EAST text detector to find regions of text in the image and then takes advantage of Tesseract v4 for recognition.

Implementing our OpenCV OCR algorithm

We are now ready to perform text recognition with OpenCV! Open up the text_recognition.py file and insert the following code.

Today’s OCR script requires five imports, one of which is built into OpenCV. Most notably, we’ll be using pytesseract and OpenCV. My imutils package will be used for non-maxima suppression, as OpenCV’s NMSBoxes function doesn’t seem to be working with the Python API. I’ll also note that NumPy is a dependency for OpenCV. The argparse package is included with Python and handles command line arguments — there is nothing to install.

Now that our imports are taken care of, let’s implement the decode_predictions function.
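The original code listings did not survive extraction here, so the following is a reconstruction rather than the post's exact code: the five imports described above, followed by a decode_predictions helper that parses the EAST score and geometry maps. The line numbers quoted in the surrounding prose refer to the original full script, which this sketch only approximates.

# import the necessary packages
from imutils.object_detection import non_max_suppression
import numpy as np
import pytesseract
import argparse
import cv2

def decode_predictions(scores, geometry):
    # grab the number of rows and columns from the scores volume, then
    # initialize our set of bounding box rectangles and confidences
    (numRows, numCols) = scores.shape[2:4]
    rects = []
    confidences = []

    # loop over the rows of the score map
    for y in range(0, numRows):
        # extract the probabilities, followed by the geometrical data
        # used to derive the bounding box coordinates surrounding text
        scoresData = scores[0, 0, y]
        xData0 = geometry[0, 0, y]
        xData1 = geometry[0, 1, y]
        xData2 = geometry[0, 2, y]
        xData3 = geometry[0, 3, y]
        anglesData = geometry[0, 4, y]

        # loop over the columns of the score map
        for x in range(0, numCols):
            # ignore weak detections (args is the command line argument
            # dictionary parsed further down in the script)
            if scoresData[x] < args["min_confidence"]:
                continue

            # the EAST feature maps are 4x smaller than the input image,
            # so compute the offset of this cell in the resized image
            (offsetX, offsetY) = (x * 4.0, y * 4.0)

            # extract the rotation angle and compute sin/cos
            angle = anglesData[x]
            cos = np.cos(angle)
            sin = np.sin(angle)

            # derive the width and height of the bounding box
            h = xData0[x] + xData2[x]
            w = xData1[x] + xData3[x]

            # compute the (x, y)-coordinates of the bounding box
            endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
            endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
            startX = int(endX - w)
            startY = int(endY - h)

            # collect the bounding box and its confidence score
            rects.append((startX, startY, endX, endY))
            confidences.append(scoresData[x])

    # return a tuple of the bounding boxes and associated confidences
    return (rects, confidences)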
The decode_predictions function begins on Line 8 and returns a 2-tuple of the bounding box rectangles and their associated confidences.

Our script requires two command line arguments:

- --image: the path to the input image.
- --east: the path to the pre-trained EAST text detector.

Optionally, the following command line arguments may be provided:

- --min-confidence: the minimum probability of a detected text region.
- --width: the width our image will be resized to prior to being passed through the EAST text detector. Our detector requires multiples of 32.
- --height: same as the width, but for the height. Again, our detector requires multiples of 32 for the resized height.
- --padding: the (optional) amount of padding to add to each ROI border. You might try values of 0.05 for 5% or 0.10 for 10% (and so on) if you find that your OCR result is incorrect.

From there, we will load + preprocess our image and initialize key variables.
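Continuing the reconstruction, the command line arguments described above and the image/network setup that the next paragraphs refer to would look roughly like this (the default values and the EAST output layer names are the standard ones for this model, but treat the exact listing as an approximation):

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("--image", type=str, required=True,
    help="path to input image")
ap.add_argument("--east", type=str, required=True,
    help="path to input EAST text detector")
ap.add_argument("--min-confidence", type=float, default=0.5,
    help="minimum probability required to inspect a region")
ap.add_argument("--width", type=int, default=320,
    help="resized image width (should be a multiple of 32)")
ap.add_argument("--height", type=int, default=320,
    help="resized image height (should be a multiple of 32)")
ap.add_argument("--padding", type=float, default=0.0,
    help="amount of padding to add to each border of the ROI")
args = vars(ap.parse_args())

# load the input image, keep a copy for drawing, and grab its dimensions
image = cv2.imread(args["image"])
orig = image.copy()
(origH, origW) = image.shape[:2]

# set the new width and height and compute the ratios used later to
# scale bounding boxes back to the original image size
(newW, newH) = (args["width"], args["height"])
rW = origW / float(newW)
rH = origH / float(newH)

# resize the image (ignoring aspect ratio) and grab the new dimensions
image = cv2.resize(image, (newW, newH))
(H, W) = image.shape[:2]

# the two EAST output layers we need: text probabilities and geometry
layerNames = [
    "feature_fusion/Conv_7/Sigmoid",
    "feature_fusion/concat_3"]

# load the pre-trained EAST text detector
net = cv2.dnn.readNet(args["east"])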
Our image is loaded into memory and copied (so we can later draw our output results on it) on Lines 82 and 83. We grab the original width and height (Line 84) and then extract the new width and height from the args dictionary (Line 88). Using both the original and new dimensions, we calculate the ratios used to scale our bounding box coordinates later in the script (Lines 89 and 90). Our image is then resized, ignoring the aspect ratio (Line 93).

Next, let’s work with the EAST text detector.

# construct a blob from the image and then perform a forward pass of
# the model to obtain the two output layer sets
blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
    (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
(scores, geometry) = net.forward(layerNames)

# decode the predictions, then apply non-maxima suppression to
# suppress weak, overlapping bounding boxes
(rects, confidences) = decode_predictions(scores, geometry)
boxes = non_max_suppression(np.array(rects), probs=confidences)

To determine text locations we:
- Construct a blob on Lines 109 and 110 (the blobFromImage process is covered in more detail elsewhere on this blog).
- Pass the blob through the neural network, obtaining scores and geometry (Lines 111 and 112).
- Decode the predictions with the previously defined decode_predictions function (Line 116).
- Apply non-maxima suppression via my imutils implementation (Line 117).
NMS effectively takes the most likely text regions, eliminating other overlapping regions. Now that we know where the text regions are, we need to take steps to recognize the text! We begin to loop over the bounding boxes and process the results, preparing the stage for actual text recognition.

# in order to apply Tesseract v4 to OCR text we must supply
# (1) a language, (2) an OEM flag of 1, indicating that we
# wish to use the LSTM neural net model for OCR, and finally
# (3) a PSM value, in this case 7, which implies that we are
# treating the ROI as a single line of text
config = ("-l eng --oem 1 --psm 7")
text = pytesseract.image_to_string(roi, config=config)

# add the bounding box coordinates and OCR'd text to the list
# of results
results.append(((startX, startY, endX, endY), text))

Taking note of the comment in the code block, we set our Tesseract config parameters on Line 151 (English language, LSTM neural network, and a single line of text).

Note: You may need to configure the --psm value using my instructions at the top of this tutorial if you find yourself obtaining incorrect OCR results.

The pytesseract library takes care of the rest on Line 152, where we call pytesseract.image_to_string, passing our roi and config string. Boom! In two lines of code, you have used Tesseract v4 to recognize a text ROI in an image. Just remember, there is a lot happening under the hood.

Our result (the bounding box values and actual text string) is appended to the results list (Line 156). Then we continue this process for other ROIs at the top of the loop.

Now let’s display/print the results to see if it actually worked.

cv2.waitKey(0)

Our results are sorted from top to bottom on Line 159 based on the y-coordinate of the bounding box (though you may wish to sort them differently). From there, looping over the results, we:
- Print the OCR’d text to the terminal (Lines 164-166).
- Strip out non-ASCII characters from text, as OpenCV does not support non-ASCII characters in the cv2.putText function (Line 171).
- Draw (1) a bounding box surrounding the ROI and (2) the result text above the ROI (Lines 173-176).
- Display the output and wait for any key to be pressed (Lines 179 and 180).

OpenCV text recognition results

Now that we’ve implemented our OpenCV OCR pipeline, let’s see it in action. Be sure to use the “Downloads” section of this blog post to download the source code, the OpenCV EAST text detector model, and the example images. From there, open up a command line, navigate to where you downloaded and extracted the zip, and run the text_recognition.py script, passing --east the path to the EAST model and --image one of the example images.

Figure 7: Our OpenCV OCR pipeline has trouble with the text regions identified by OpenCV’s EAST detector in this scene of a bake shop.
Keep in mind that no OCR system is perfect in all cases. Can we do better by changing some parameters, though? In this first attempt at OCR’ing the bake shop storefront, we see that “SHOP” is correctly OCR’d, but:

- The “U” in “CAPUTO” is incorrectly recognized as “TI”.
- The apostrophe and “S” are missing from “CAPUTO’S”.
- And finally, “BAKE” is incorrectly recognized as a vertical bar/pipe (“|”) with a period (“.”).

By adding a bit of padding we can expand the bounding box coordinates of the ROI and correctly recognize the text.

Figure 9: With a padding of 25%, we are able to recognize “Designer” in this sign, but our OpenCV OCR system fails for the smaller words due to the color being similar to the background. We aren’t even able to detect the word “SUIT”, and while “FACTORY” is detected, we are unable to recognize the text with Tesseract.
Our OCR system is far from perfect. I increased the padding to 25% to accommodate the angle/perspective of the words in this sign. This allowed for “Designer” to be properly OCR’d with EAST and Tesseract v4. But the smaller words are a lost cause, likely due to the similar color of the letters to the background. In these situations there’s not much we can do, but I would suggest referring to the limitations and drawbacks section below for suggestions on how to improve your OpenCV text recognition pipeline when confronted with incorrect OCR results.
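For reference, the padding being discussed is applied at the top of the OCR loop shown earlier, by expanding each (scaled) box by a fraction of its size before cropping the ROI that gets handed to Tesseract. A rough sketch, reusing the variable names from the reconstruction above:

# loop over the detected boxes and expand each by the requested padding
for (startX, startY, endX, endY) in boxes:
    # scale the bounding box coordinates back to the original image size
    startX = int(startX * rW)
    startY = int(startY * rH)
    endX = int(endX * rW)
    endY = int(endY * rH)

    # compute the padding in pixels as a fraction of the box size
    dX = int((endX - startX) * args["padding"])
    dY = int((endY - startY) * args["padding"])

    # apply the padding while clamping to the image boundaries
    startX = max(0, startX - dX)
    startY = max(0, startY - dY)
    endX = min(origW, endX + dX)
    endY = min(origH, endY + dY)

    # extract the (padded) ROI that is then passed to pytesseract as shown earlier
    roi = orig[startY:endY, startX:endX]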
Hi Adrian, this is a great post! Thanks for sharing! I have the same trouble.
I am working on a project where I am OCRizing documents that are scanned, but they have handwritten dates which are very important to me. What I did first was define the text region, then apply line segmentation and send each line to the Tesseract network to extract the text. The problem is that these dates are in the middle of a specific line that has other important information, and the neural net gets really confused when trying to predict the dates and sometimes the rest of the text.

I think your suggestion of training a simple CNN would work, but I’m still a kind of newbie. How could I do that? Would it be retraining the Tesseract NN? Do I have to find these lines in each document I run, or would the neural net recognize them by itself?

I also would like to know if my approach is good:
1 - Define the text region and crop the image;
2 - Apply line segmentation;
3 - Send each line to Tesseract.

Thank you again!
Lucas from Brazil 😊
Dear Dr Adrian,

The above examples work for fonts with serifs, e.g. Times Roman, and without serifs, e.g. Arial. Can OCR software be applied to detecting characters of more elaborate fonts, such as the Old English fonts used, for example, in the masthead of the Washington Post? There are other examples of Old English fonts at.

To put it another way, do you need to train or have a dataset for fancy fonts such as Old English in order to have recognition of fonts of that type?

Thank you,
Anthony of Sydney

Hi Adrian,

Thank you for this. I’ve messed with Tesseract in the past but have struggled to get good results out of it (and I think I was using the LSTM version, but I’m unsure) on data for work. Our data is under varying lighting conditions and can have significant blur. We use GCP’s OCR solution at the moment, which works really, really well on this data but of course can get costly.

One thing I’ve repeatedly tried to do and failed is figure out how to train Tesseract on my own data (both real and synthetic). So much so that I gave up and (for the one part of our pipeline that Google doesn’t work well on) built my own deep learning based OCR system, which works quite well (but incurs significant R&D overhead).
If you know how to train Tesseract and would be willing to write that down, I would deeply appreciate that.

Hi Adrian, even though it’s not related to this post, I had thought about NN/AI security. I’m not currently working on CV myself so I’m unsure if I’m up to date, but you would probably know. There were methods (like pixel attacks) that allowed someone who was familiar with the architecture of a CNN to create or modify images to get a desired output: change x and let the model classify an airplane as a fish. The big “let down” here is that I could only do that with my own NN, so it’s pretty pointless and the security risk pretty low. But now that I think about how CV is implemented by semi-experts and without clear rules and standards, I would imagine a lot of the CV software solutions out there, and those that are about to be built, will make use of the state of the art nets of the big researchers and will base their nets on them.
I don’t know what I am doing wrong, but I’ve tried this about 100 times now and keep getting the ‘NoneType’ error where image.copy is used on line 83. Do I need to add the location of the image on the preceding line (line 82)?
Coz I’ve done that now in at least 8 different ways and still keep getting that error. Also, where does the code actually refer to the image location and also the location of the EAST file? If I’ve followed the code correctly, then this should be line 88 for the image location and line 111 for the EAST file.
So, do I change the string value to the locations of the respective files? Any help on this matter will be highly appreciated. Thanks for sharing the code though. Coming from a different coding language, this page has been a lot of help in translating the image processing principles.
Using a stylized font with exaggerated serifs (not as exaggerated as the Old English typeface typical of newspaper brands), the Tesseract text detection bounding boxes are cutting off significant parts of some letters, rendering the text recognition inaccurate. Even when embedding the very font by using a traineddata file trained by ocr7.com and using perfect text examples created with the very same font, this problem occurs. Is it possible to tweak Tesseract’s bounding box parameters? Shouldn’t Tesseract produce excellent results when exclusively using training data created with the one font it is asked to detect/recognize? Your text detection tutorial describes how to do so, but I don’t believe that part of the text recognition process is exposed when using Tesseract to do all processing.
Hi Adrian,

Great tutorial and many thanks. I am a novice in the image processing field. After carefully following all the installation steps and compiling the code, I was able to run it successfully. One can simply use your tutorial and start working out of the box in minimal time. I do have some doubts:

1. I would like to know more about the min-confidence parameter.
2. What type of algorithm/method does imutils use for the non-maxima suppression?

3. The detected text areas, in the form of rectangles, are stored in the variable boxes as an Nx4 matrix, where N is the number of text boxes detected and each row contains the coordinates of one rectangle. Please clarify if my assumption is wrong.
Are there any official fixed dimensions (in pixels, or length and width) for the image to be used for text detection? I tried googling “Official ICDAR dataset format” but couldn’t get any result. I have seen in some papers that the performance of a text-detection method is computed on the area of detected text. So, how should I approach the evaluation process on my image dataset using the values stored in ‘boxes’? Are there any specific open source tools that I could feed the values of boxes to? Then again, it seems I have to define the coordinates of the text areas in each image manually, but how?
Sorry for asking too many questions. Your work helped me a lot. Thank you again.
Regards,
Haruo

Hello Adrian,

Currently I am facing an issue whereby my script runs Tesseract (with a thread) on a video frame every 6 seconds to extract information from the frame. But every time the video almost ends, the process slows down significantly and all the CPU cores suddenly spike to 100% usage. Then there are extra processes produced (which end up as zombie processes) and a lot of xxx.png and xxxout.txt files produced in the /tmp directory. Do you or anyone else ever face this issue? Hope to hear from you guys soon. Thanks in advance and have a nice day.
Regards,
Gordon