Issue with number format

bkmarzec
Discoverer
0 Kudos

We are using Document Information Extraction to process Purchase Orders in PDF format. Since we supply the European market, numbers in the documents are specified with a decimal comma, e.g. 12.345,67.

However, we have noticed that the period used as a thousands separator in these numbers is consistently misinterpreted as a decimal point by the Document Information Extractor.

For example, if the PDF contains a delivery quantity of 13.460 in a line item, the Document Information Extractor returns the following (13.46):

{
   ...
   "documentType": "purchaseOrder",
   "languageCodes": ["de"],
   ...
   "extraction": {
      "lineItems": [
         [
            {
               "name": "quantity",
               "category": "details",
               "value": 13.46,
               "rawValue": "13.46",
               "type": "number",
               "page": 1,
               "confidence": 0.9902912621359221,
               "label": "quantity"
            },
            ...
   }
}

whereas I would expect 13460.
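
To make the expected interpretation concrete, here is a minimal Python sketch of how a number string in this format would be read. The function name `parse_european_number` is hypothetical and not part of Document Information Extraction; it assumes the period is always a thousands separator and the comma always a decimal separator.

```python
from decimal import Decimal


def parse_european_number(text: str) -> Decimal:
    """Parse a number that uses '.' for thousands and ',' for decimals.

    Assumption: the input always follows this convention, so every
    period is dropped and the comma becomes the decimal point.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(parse_european_number("13.460"))     # 13460
print(parse_european_number("12.345,67"))  # 12345.67
```

Note that this only works on the original text from the PDF. The `rawValue` in the response above is already "13.46", so the trailing zero is lost and the value cannot be reliably corrected afterwards.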

I tried changing the template and confirming multiple documents with the corrected value, but it didn't work.

Is there any other option to specify or train the number interpretation?

zeise
Advisor
0 Kudos

Hi Bartlomiej,

Can you please open a ticket on the component CA-ML-BDP, briefly describe the context of your scenario, and attach one or two example documents so we can investigate the issue?

Best regards,
Manuel

bkmarzec
Discoverer
0 Kudos

Thank you for your response Manuel! I created a new ticket.