Tesseract makes characteristic errors. Some of these, such as "\/\/" or "\X/" substituted for "W", can be globally replaced, as it is highly unlikely that they would occur on their own on a label. Others, such as "O" substituted for "0", "1" or "!" substituted for "l", or "Z" substituted for "2" (or vice versa), can be replaced in a context-dependent manner in dates, latitudes and longitudes, etc. For instance, a string containing multiple errors such as "0ct. !Z, ZOlZ" can be programmatically located with a regular expression and changed to "Oct. 12, 2012" or even "12-October-2012" so that it can be entered into a database.
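The two kinds of correction described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names (`fix_global`, `fix_dates`, `correct_label`), the confusion tables, and the "Mon. D, YYYY" date layout are hypothetical choices made for this example, not part of Tesseract or of any existing workflow, and a real label set would likely need more patterns.

```python
import re

# Global fixes: sequences very unlikely to appear legitimately on a label.
GLOBAL_FIXES = {r"\/\/": "W", r"\X/": "W"}

# Context-dependent fixes (assumed confusion tables, extend as needed).
TO_DIGIT = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "!": "1", "Z": "2"})
TO_LETTER = str.maketrans({"0": "o", "1": "l", "!": "l"})

MONTHS = {m.lower(): m for m in
          "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()}

# Assumed layout: three-character month, optional period, day, comma, 4-character year.
DATE_RE = re.compile(
    r"(?<![A-Za-z0-9])([A-Za-z0-9!]{3})(\.?)\s+"
    r"([0-9OolI!Z]{1,2}),\s*([0-9OolI!Z]{4})(?![A-Za-z0-9])"
)


def fix_global(text):
    """Apply the context-free replacements everywhere in the text."""
    for bad, good in GLOBAL_FIXES.items():
        text = text.replace(bad, good)
    return text


def _fix_date(match):
    """Repair one date candidate; leave it untouched if the month is unknown."""
    month_raw, dot, day_raw, year_raw = match.groups()
    month = MONTHS.get(month_raw.translate(TO_LETTER).lower())
    if month is None:
        return match.group(0)
    day = int(day_raw.translate(TO_DIGIT))
    year = year_raw.translate(TO_DIGIT)
    return f"{month}{dot} {day}, {year}"


def fix_dates(text):
    """Apply digit/letter swaps only inside strings that look like dates."""
    return DATE_RE.sub(_fix_date, text)


def correct_label(text):
    return fix_dates(fix_global(text))


print(correct_label("0ct. !Z, ZOlZ"))
```

Restricting the digit and letter swaps to text matched by the date pattern is what keeps them context-dependent: a "Z" or "O" elsewhere on the label is left alone. Latitudes and longitudes would need their own patterns along the same lines.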
<br>Misc notes:
Tesseract will often recognize vertical text.<br> Image input can be tif, jpeg, or gif.