Daily Programming: OCR with Tesseract, reading images

Tags: daily-programming, ocr, python, tesseract

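I have been working in this department for almost three years,

and something more interesting has finally come along~

The task: record the user's screen and extract the text shown on it.

Even the tallest building rises from level ground.

A video is just a sequence of still images dancing one after another.

So this time we start by reading a single image; next time we will string the whole video scenario together.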

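Installation
Tesseract: OCR engine
- GitHub
Python:
- pytesseract
  - Standing on the shoulders of giants: it uses tesseract as the core and lets you handle OCR from Python
  - I ran into this message: pytesseract requires Python '>=3.7' but the running Python is 3.6.8, so use Python 3.7 or later
  - pip install pytesseract
- opencv-python
  - pip install opencv-python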
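
Before running the example, it can help to confirm that all the pieces above are in place. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper name, and the `(3, 7)` default is taken from the version message quoted above.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 7)):
    """Report whether the pieces needed for the OCR example appear to be present."""
    return {
        # The pytesseract install complained when run on Python 3.6.8.
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec returns None when a top-level package is not installed.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "cv2": importlib.util.find_spec("cv2") is not None,
        # Path of the tesseract executable, or None when it is not on PATH.
        "tesseract_on_path": shutil.which("tesseract"),
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If `tesseract_on_path` comes back as `None`, the executable has to be pointed at explicitly through `pytesseract.pytesseract.tesseract_cmd`, which is what the example further down does.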
Example
Choosing the recognition language
GitHub
lang="chi_tra" --> I additionally selected Chinese (Traditional) when installing tesseract
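
To see which language data is actually installed before passing `lang=`, one option is to ask the tesseract command line with `--list-langs`. This is only a sketch: `parse_list_langs`, `build_lang`, and `installed_languages` are hypothetical helper names, and it assumes the first line of the output is a header and that `tesseract` is on PATH (or that a full path is passed in).

```python
import subprocess


def parse_list_langs(output):
    """Parse the text printed by `tesseract --list-langs` into language codes."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Assumption: the first non-empty line is a header, the rest are codes.
    return lines[1:]


def build_lang(wanted, installed):
    """Join language codes with '+', the separator tesseract accepts in -l / lang=."""
    missing = [code for code in wanted if code not in installed]
    if missing:
        raise ValueError(f"language data not installed: {missing}")
    return "+".join(wanted)


def installed_languages(tesseract_cmd="tesseract"):
    """Run the tesseract CLI and return the installed language codes."""
    result = subprocess.run(
        [tesseract_cmd, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merged in case the list is printed on stderr
        check=True,
    )
    return parse_list_langs(result.stdout.decode("utf-8"))
```

For a screenshot that mixes Chinese and English, `build_lang(["chi_tra", "eng"], installed_languages())` would give `"chi_tra+eng"`, which can then be passed as `lang=` to `pytesseract.image_to_string`.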

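   import cv2
   import pytesseract
   
   # Tesseract is installed here on my machine:
   # C:\Program Files\Tesseract-OCR
   # Raw string, so each backslash is written once
   pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
   img_cv = cv2.imread(r'yutingBlog.JPG')
   # cv2.imread returns None instead of raising when the file cannot be read
   if img_cv is None:
       raise FileNotFoundError('yutingBlog.JPG')
   # OpenCV loads images as BGR; convert to RGB before handing them to pytesseract
   img_rgb = cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB)
   # lang="chi_tra" --> I additionally selected Chinese (Traditional) when installing tesseract
   print(pytesseract.image_to_string(img_rgb, lang='chi_tra'))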