Open Source Python API to Integrate OCR Capabilities

An open source Python library that lets software developers easily integrate optical character recognition (OCR) capabilities into their applications.

PaddleOCR is a powerful open source Python library that lets software developers easily integrate optical character recognition (OCR) capabilities into their Python applications. It is built on top of PaddlePaddle, an open source deep learning platform, and uses state-of-the-art deep learning models to achieve high accuracy and performance. PaddleOCR simplifies the OCR process by providing a high-level API that abstracts away many of the low-level details, which makes it easier for developers to add OCR capabilities to their applications.

PaddleOCR offers comprehensive support for multiple languages and scripts. It currently supports more than 80 languages, including Arabic, Chinese, English, French, German, Japanese, Korean, Russian, Spanish, and many more. This makes it a good tool for developers who need to work with multilingual text. Beyond its core OCR capabilities, the library also includes several useful utilities for working with images and text. For example, it provides image pre-processing functions, such as deskewing and binarization, as well as post-processing functions that improve the accuracy of the OCR output.
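To make the binarization step mentioned above concrete, here is a minimal pure-Python sketch of Otsu thresholding on a grayscale image stored as nested lists. It is an illustration of the general technique only; the function names `otsu_threshold` and `binarize` are hypothetical and are not part of the PaddleOCR API.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for an iterable of 0-255 grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor)
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Map each pixel to 0 (at or below the threshold) or 255 (above it)."""
    t = otsu_threshold(p for row in image for p in row)
    return [[255 if p > t else 0 for p in row] for row in image]


# Hypothetical 2x3 grayscale image: dark text pixels and bright background
sample = [[10, 12, 200],
          [11, 210, 205]]
binary = binarize(sample)
print(binary)
```

In practice an image library would do this far faster; the point of the sketch is only to show what "binarization" means before text is passed to an OCR engine.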

PaddleOCR provides several different OCR models, each optimized for a different use case. For example, the Text Detection model is designed to locate and extract text regions from an image, while the Text Recognition model is designed to recognize the actual text within those regions. There is also a Model Ensemble feature that lets developers combine multiple models to achieve even higher accuracy. Overall, PaddleOCR is a powerful and easy-to-use library for adding OCR capabilities to your Python applications. Its support for multiple languages and scripts, together with its customizable models and post-processing tools, makes it a valuable tool for developers working with OCR.
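The idea of combining several models can be sketched as a score-weighted vote over the candidate strings that different recognizers return for the same text region. This is an assumption-laden illustration of ensembling in general, not a description of how PaddleOCR implements its ensemble feature; `pick_by_weighted_vote` is a hypothetical helper.

```python
from collections import defaultdict


def pick_by_weighted_vote(candidates):
    """Pick one text from (text, score) pairs produced by different models.

    Scores for identical strings are summed, and the string with the
    highest total wins. Returns None when there are no candidates.
    """
    if not candidates:
        return None
    totals = defaultdict(float)
    for text, score in candidates:
        totals[text] += score
    return max(totals, key=totals.get)


# Hypothetical outputs of three recognizers for the same region
votes = [("Invoice", 0.80), ("lnvoice", 0.90), ("Invoice", 0.75)]
winner = pick_by_weighted_vote(votes)
print(winner)
```

Two models that agree with moderate confidence outweigh a single model that is confident in a different reading, which is the usual motivation for voting schemes of this kind.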

At a Glance

An overview of PaddleOCR features.

Features overview

Perform OCR
Add OCR capabilities
Recognize text in images
Convert images of text
Recognize text fonts
Searchable PDF
Other languages
Build OCR applications
Embed in browser
Transcribe text
Multi-threading support

PaddleOCR

PaddleOCR supports the popular file formats listed below.

Reading

PNG, JPEG, BMP, TIFF, TGA, DICOM

Writing

PNG, JPEG, BMP, TIFF

Platform Independence

PaddleOCR is a Python library and is listed as working with Python 2.7 and above; recent releases require Python 3.

Getting Started with PaddleOCR

The recommended way to install PaddleOCR is with pip. Use the following command for a smooth installation.

Install PaddleOCR via pip

pip install paddleocr
You can also install it manually by downloading the latest release files directly from the GitHub repository.
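After installing, it can be useful to confirm that the package is importable from the interpreter you intend to use. The sketch below relies only on the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("paddleocr"):
    print("paddleocr is available")
else:
    print("paddleocr not found; try: pip install paddleocr")
```

The check does not import the package, so it stays fast even though PaddleOCR itself loads large dependencies on import.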

`Image Text Recognition via PaddleOCR API`

Image text recognition is the process of extracting text from images. It is a useful technique for many applications, such as document scanning, digitization, and OCR (Optical Character Recognition). The open source OCR API provides a set of state-of-the-art OCR models that can recognize text in a wide range of images, including scanned documents, signs, and photographs. The library supports several important functions related to image text recognition, such as loading the image, initializing the OCR model, detecting text in the image, recognizing text from the image, extracting the text from the result, and many more. The following example shows how to recognize text from an image in Python applications.

`Perform Image Text Recognition in Python Projects`

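from paddleocr import PaddleOCR

# Create the OCR engine (this sketch assumes the PaddleOCR 2.6/2.7 API)
ocr = PaddleOCR(lang='en')

# Run detection and recognition directly on the image file
result = ocr.ocr('example.jpg')

# result holds one list per input image; each entry is [box, (text, score)]
for line in result[0] or []:
    box, (text, score) = line
    print(text)   # the recognized text
    print(score)  # the recognition confidence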
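Once the recognized lines are available, a common next step is to drop low-confidence entries and join the rest into plain text. The sketch below assumes each line has the shape `[box, (text, score)]`, which is how PaddleOCR 2.x reports results; `extract_text` and the sample data are hypothetical.

```python
def extract_text(lines, min_score=0.5):
    """Return the recognized strings whose confidence is at least min_score."""
    return [text for _box, (text, score) in lines if score >= min_score]


# Hypothetical OCR output: each box is four (x, y) corner points
box = [[0, 0], [100, 0], [100, 20], [0, 20]]
sample_lines = [
    [box, ("Invoice", 0.97)],
    [box, ("N0.", 0.32)],          # low confidence, likely misread
    [box, ("Total: 42.00", 0.91)],
]

kept = extract_text(sample_lines, min_score=0.5)
print("\n".join(kept))
```

The threshold of 0.5 is an arbitrary starting point; a suitable value depends on the image quality and on how costly a misread is for the application.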

`OCR Document Recognition using Python API`

Document recognition has been one of the prominent research areas for OCR. Documents are used almost every day in our lives. When software applications perform OCR on a document, they can extract key information, capture form fields, analyze the layout, archive the document digitally, and even read old manuscripts. The open source PaddleOCR library lets software developers load many types of documents, perform OCR operations, and recognize and extract text from them using Python code. Text recognition is highly accurate, and the library can easily detect special characters and spaces.

`Perform OCR Document Recognition using Python API`

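from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en')  # assumes the PaddleOCR 2.6/2.7 API
img_path = './input_images/11-document-1.jpg'
result = ocr.ocr(img_path)

# Display the output: one [box, (text, score)] entry per detected line
for line in result[0] or []:
    print(line)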
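For documents, the detected lines are not guaranteed to arrive in reading order. A minimal sketch of re-ordering them top to bottom and left to right is shown below, again assuming the `[box, (text, score)]` shape with four corner points per box. The helper `reading_order` and the `row_tolerance` parameter are hypothetical, and the approach only suits simple single-column layouts.

```python
def reading_order(lines, row_tolerance=10):
    """Sort OCR lines top-to-bottom, then left-to-right within a row."""
    def top(line):
        return min(y for _x, y in line[0])

    def left(line):
        return min(x for x, _y in line[0])

    rows = []
    for line in sorted(lines, key=top):
        # Lines whose top edges are close together share a row
        if rows and abs(top(line) - top(rows[-1][0])) <= row_tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [line for row in rows for line in sorted(row, key=left)]


# Hypothetical OCR output in arbitrary order
doc_lines = [
    [[[120, 50], [200, 50], [200, 70], [120, 70]], ("world", 0.95)],
    [[[10, 52], [100, 52], [100, 72], [10, 72]], ("Hello", 0.98)],
    [[[10, 10], [150, 10], [150, 30], [10, 30]], ("Title", 0.99)],
]

ordered_texts = [text for _box, (text, _score) in reading_order(doc_lines)]
print(ordered_texts)
```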

`Support for Table Recognition in Python Apps`

The open source PaddleOCR library lets software developers recognize table data in their Python applications. Table recognition mainly relies on three models: single-line text detection (DB), single-line text recognition (CRNN), and table structure and cell coordinate prediction (SLANet). The following example shows how to recognize an image that contains a table and how to use the draw_ocr method, which takes the image, the bounding boxes, the texts, the scores, and the path to a font file. It returns an image, as an array, with the bounding boxes and the detected text drawn on it. Once converted to a PIL image, the result can be displayed with the show method.

`Load an Image and Detect Text in It via Python API`

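from paddleocr import PaddleOCR, draw_ocr
from PIL import Image

# Load the image that contains the table
img_path = 'table_image.png'
image = Image.open(img_path).convert('RGB')

# Create an instance of the PaddleOCR object (assumes the PaddleOCR 2.6/2.7 API)
ocr = PaddleOCR()

# Run OCR; the first element holds the [box, (text, score)] entries for this image
result = ocr.ocr(img_path)[0] or []

# Draw the bounding boxes around the detected text
boxes = [line[0] for line in result]
texts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, texts, scores, font_path='arial.ttf')

# draw_ocr returns a NumPy array, so convert it to a PIL image before showing it
Image.fromarray(im_show).show()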
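When only plain OCR lines are available, rather than the output of a dedicated table structure model, a rough table can still be rebuilt by grouping boxes into rows by their vertical centers. The sketch below is a simplified heuristic under that assumption, not PaddleOCR's table recognition pipeline; `cells_to_rows`, `rows_to_csv`, and the sample data are hypothetical.

```python
import csv
import io


def cells_to_rows(lines, row_tolerance=10):
    """Group [box, (text, score)] entries into rows of cell texts."""
    def center(box):
        xs = [x for x, _y in box]
        ys = [y for _x, y in box]
        return sum(xs) / len(xs), sum(ys) / len(ys)

    # Sort cells by vertical center, then horizontal center
    items = sorted(
        (center(box)[1], center(box)[0], text) for box, (text, _score) in lines
    )

    rows, current, current_y = [], [], None
    for cy, cx, text in items:
        if current and abs(cy - current_y) > row_tolerance:
            rows.append(current)
            current = []
        if not current:
            current_y = cy  # the first cell anchors the row
        current.append((cx, text))
    if current:
        rows.append(current)
    return [[text for _cx, text in sorted(row)] for row in rows]


def rows_to_csv(rows):
    """Serialize rows of strings as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical OCR output for a 2x2 table
table_lines = [
    [[[100, 2], [150, 2], [150, 22], [100, 22]], ("Qty", 0.96)],
    [[[0, 40], [50, 40], [50, 60], [0, 60]], ("Apple", 0.98)],
    [[[0, 0], [50, 0], [50, 20], [0, 20]], ("Name", 0.99)],
    [[[100, 41], [150, 41], [150, 61], [100, 61]], ("3", 0.93)],
]

table_rows = cells_to_rows(table_lines)
table_csv = rows_to_csv(table_rows)
print(table_csv)
```

Merged cells and skewed scans break this heuristic, which is where a structure model such as the SLANet model mentioned above becomes necessary.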