on 03-28-2023 7:18 PM
We are using Document Information Extraction to process Purchase Orders in PDF format. Since we supply the European market, numbers in the documents are specified with a decimal comma, e.g. 12.345,67.
However, we have noticed that the Document Information Extractor consistently misinterprets the thousands separator (a period) in these numbers as a decimal point.
For example, if the PDF contains a delivery quantity of 13.460 in a line item, the Document Information Extractor returns the following (13.46):
{
...
"documentType": "purchaseOrder",
"languageCodes": ["de"],
...
"extraction": {
"lineItems": [
[
{
"name": "quantity",
"category": "details",
"value": 13.46,
"rawValue": "13.46",
"type": "number",
"page": 1,
"confidence": 0.9902912621359221,
"label": "quantity"
},
...
}
}
whereas I would expect 13460.
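To make the expected interpretation concrete, here is a minimal Python sketch of how a number in the European format could be parsed. The helper `parse_european_number` is hypothetical and not part of the Document Information Extraction API. It also assumes access to the original text as printed in the PDF, since the `rawValue` returned above is already "13.46".

```python
from decimal import Decimal


def parse_european_number(text: str) -> Decimal:
    """Parse a number that uses '.' as thousands separator and ',' as decimal separator.

    Hypothetical helper for illustration only; it assumes the input
    really follows the European convention (e.g. "12.345,67").
    """
    # Drop the thousands separators, then turn the decimal comma into a point.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(parse_european_number("12.345,67"))  # 12345.67
print(parse_european_number("13.460"))     # 13460
```

Under this convention, the quantity 13.460 from the purchase order is read as 13460 rather than 13.46.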
I tried changing the template and confirming multiple documents with the corrected value, but it didn't work.
Is there any other option to specify or train the number interpretation?
Hi Bartlomiej,
Can you please open a ticket on the component CA-ML-BDP, briefly describe the context of your scenario, and attach one or two example documents so we can investigate the issue?
Best regards,
Manuel