OCRing Music from YouTube with Common Lisp: Difference between revisions

Jump to navigation Jump to search
no edit summary
No edit summary
No edit summary
Line 13: Line 13:
Of all the classical OCR libraries out there, Tesseract is probably the most famous. There are a few knobs to tweak on it, but in general you just chuck your image at it and let it rip. I honestly figured this would work really well, since this is monospaced, easily readable characters that should theoretically be a perfect match for these old-skool OCR techniques. There's a Lisp binding out there for it: https://github.com/GOFAI/cl-tesseract, so I quickly grabbed it, pointed it at a sample image, and...
Of all the classical OCR libraries out there, Tesseract is probably the most famous. There are a few knobs to tweak on it, but in general you just chuck your image at it and let it rip. I honestly figured this would work really well, since this is monospaced, easily readable characters that should theoretically be a perfect match for these old-skool OCR techniques. There's a Lisp binding out there for it: https://github.com/GOFAI/cl-tesseract, so I quickly grabbed it, pointed it at a sample image, and...


[[File:Article2.png|500px]]
[[File:Article2.png|600px]]


Wtf? I tried all sorts of techniques to pre-process the image, align the text, whatever, and Tesseract sucked every time. I'm not sure if it's optimized for print or what, but I just could not for the life of me get it to produce correct scans more than like 50% of the time. So it seems to be nearly worthless in this day and age.
Wtf? I tried all sorts of techniques to pre-process the image, align the text, whatever, and Tesseract sucked every time. I'm not sure if it's optimized for print or what, but I just could not for the life of me get it to produce correct scans more than like 50% of the time. So it seems to be nearly worthless in this day and age.

Navigation menu