OCRing Music from YouTube with Common Lisp: Difference between revisions

1 byte added, 5 January 2025

no edit summary

@@ Line 25: / Line 25: @@
 [[File:Article3.png|600px]]
-Urghhh, kinda better than Tesseract, but still, wtf. Again I tried a bunch of methods to enhance the readability of this, but nothing really worked perfectly. It gets it right most of the time but then occasionally just goes nuts and puts something totally wrong. I guess that's what you get when you have "intelligence" interpreting visual data. I also tried Gemini, same situation. Looking back, maybe I should have cranked the temperature down, but regardless, this solution is a bit overkill anyway, since it's doing a separate HTTP request to a massive GPU-based model for every little chunk of text, cost a (relative) fortune, and took forever.
+Urghhh, kinda better than Tesseract, but still, wtf. Again I tried a bunch of methods to enhance the readability of this, but nothing really worked perfectly. It gets it right most of the time but then occasionally just goes nuts and puts something totally wrong. I guess that's what you get when you have "intelligence" interpreting visual data. I also tried Gemini, same situation. Looking back, maybe I should have cranked the temperature down, but regardless, this solution is a bit overkill anyway, since it's doing a separate HTTP request to a massive GPU-based model for every little chunk of text, costs a (relative) fortune, and took forever.
 = Attempt 3: Oldskool Pixel Diffing =