Text recognition with PaddleOCR

For some problems you only need to locate the regions of an image that contain text, without recognizing the characters. PaddleOCR is a good fit for that, and it can also read the text. Unfortunately, the model used in this article (lang="en") does not handle Vietnamese: text with diacritics is read without them, so “Thị giác máy tính” is recognized as “Thi giac may tinh”.
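When checking PaddleOCR output against Vietnamese ground truth, it can help to strip the diacritics from the expected text first, so that the comparison matches what the model can actually produce. Here is a minimal sketch using only the Python standard library. The helper name strip_diacritics is made up for this illustration and is not part of PaddleOCR.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with Vietnamese tone and vowel marks removed (illustrative helper)."""
    # "đ" and "Đ" have no Unicode decomposition, so map them by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Thị giác máy tính"))  # Thi giac may tinh
```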

This article shows how to run the code on Windows 10.

If you need to recognize Vietnamese text, see the article “Nhận diện văn bản bằng Tesseract” (Text recognition with Tesseract).

Part 1: Testing text recognition accuracy

The image to extract text from

The extraction result contains the coordinates of each text region, the recognized text, and a confidence score
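To make that structure concrete, the sketch below unpacks entries shaped like the ones the script in this article indexes (line[0] is the box, line[1] is the text and its confidence). The sample values are made up, and unpack_line is a hypothetical helper rather than a PaddleOCR function. Some later PaddleOCR releases may wrap the list in an extra per-image level, so print the result from your installed version before relying on this shape.

```python
# Made-up entries in the shape used in this article:
# [four corner points, (recognized text, confidence)]
sample_result = [
    [[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ("Hello world", 0.98)],
    [[[24.0, 90.0], [180.0, 90.0], [180.0, 120.0], [24.0, 120.0]], ("PaddleOCR", 0.91)],
]


def unpack_line(line):
    """Split one result entry into named fields (hypothetical helper)."""
    box, (text, score) = line
    return {
        "box": [(int(x), int(y)) for x, y in box],  # integer pixel coordinates
        "text": text,
        "confidence": float(score),
    }


for line in sample_result:
    entry = unpack_line(line)
    print(entry["text"], entry["confidence"], entry["box"][0])
```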

Trying another image, this time using the result to draw bounding rectangles

Recognizing text in a photograph

As you can see, the recognition results are fairly good

Part 2: Applications of PaddleOCR

Reading an electricity meter with OCR
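The article does not describe how the meter value is picked out of the OCR output, so the following is only one possible post-processing step: keep the confident lines that consist solely of digits and take the longest one. The function name, the 0.8 threshold, and the sample data are all assumptions made for this sketch.

```python
import re

# Made-up OCR lines in the [box, (text, confidence)] shape used in this article.
sample_lines = [
    [[[10, 10], [60, 10], [60, 30], [10, 30]], ("kWh", 0.95)],
    [[[20, 50], [180, 50], [180, 90], [20, 90]], ("012345", 0.97)],
    [[[10, 100], [70, 100], [70, 120], [10, 120]], ("220V", 0.91)],
    [[[80, 100], [110, 100], [110, 120], [80, 120]], ("50", 0.40)],
]


def extract_meter_reading(lines, min_confidence=0.8):
    """Return the longest all-digit text among confident lines, or None (heuristic)."""
    candidates = []
    for _box, (text, score) in lines:
        compact = text.replace(" ", "")  # digits are sometimes split by spaces
        if score >= min_confidence and re.fullmatch(r"[0-9]+", compact):
            candidates.append(compact)
    return max(candidates, key=len) if candidates else None


print(extract_meter_reading(sample_lines))  # 012345
```

A real meter would likely need more than this, for example cropping to the display area first, but the filter shows how the text and confidence fields can be combined.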

Part 3: How to build the PaddleOCR text recognition program

Step 1: install Python 3.7.3 x64
Step 2: install Visual Studio 2015 or later to get the Visual C++ 14.0 (v140) toolset, which is used to compile the code
Step 3: install Paddle with the command: pip install paddlepaddle
Step 4: clone the code from https://github.com/PaddlePaddle/PaddleOCR
Step 5: build the package with python3 setup.py bdist_wheel; after this step the file paddleocr-x.x.x-py3-none-any.whl appears in the dist folder
Step 6: install the whl file with the command pip3 install dist/paddleocr-x.x.x-py3-none-any.whl
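After the steps above, a quick way to confirm that the packages the script needs can be found is to look them up by import name. This is a small sketch using only the standard library; missing_packages is a made-up helper. The import names are paddle (installed by paddlepaddle), paddleocr, and cv2 (OpenCV, which the script in this article imports).

```python
import importlib.util


def missing_packages(names=("paddle", "paddleocr", "cv2")):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages can be imported.")
```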

If you want to install the GPU version, see the original guide
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md

Source code

from paddleocr import PaddleOCR
import cv2

# Change the lang parameter to switch the recognition language
ocr = PaddleOCR(lang="en")  # The model files are downloaded automatically on the first run
img_path = 'example.jpg'
result = ocr.ocr(img_path)
# Detection and recognition can also be run separately through these parameters:
# result = ocr.ocr(img_path, det=False)  # recognition only
# result = ocr.ocr(img_path, rec=False)  # detection only

# Print each detected box together with its recognition result
for line in result:
    print(line)

# Visualization
mat = cv2.imread(img_path)
if mat is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError(img_path)

boxes = [line[0] for line in result]      # four corner points per text region
txts = [line[1][0] for line in result]    # recognized text
scores = [line[1][1] for line in result]  # confidence score

for box in boxes:
    # Corners 0 and 2 are the top-left and bottom-right points of the region
    top_left     = (int(box[0][0]), int(box[0][1]))
    bottom_right = (int(box[2][0]), int(box[2][1]))

    cv2.rectangle(mat, top_left, bottom_right, (0, 255, 0), 2)

cv2.imshow("result", mat)
cv2.waitKey(0)
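The script draws each rectangle from corners 0 and 2 of the detected box, which works well when the text is roughly horizontal. For tilted text those two corners can cut off part of the region. One possible alternative, sketched here with a made-up helper name and made-up coordinates, is to take the minimum and maximum over all four points:

```python
def bounding_rect(box):
    """Return (top_left, bottom_right) of the axis-aligned rectangle around box."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return (int(min(xs)), int(min(ys))), (int(max(xs)), int(max(ys)))


# A tilted quadrilateral (made-up coordinates): corners 0 and 2 alone would
# give (10, 20) and (115, 45), which misses the top and bottom of the text.
tilted_box = [[10.0, 20.0], [110.0, 5.0], [115.0, 45.0], [15.0, 60.0]]
top_left, bottom_right = bounding_rect(tilted_box)
print(top_left, bottom_right)  # (10, 5) (115, 60)
```

The two returned tuples can be passed to cv2.rectangle in place of top_left and bottom_right in the loop above.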

Source code

https://github.com/thigiacmaytinh/TextExtractor

Good luck!
