tomasz_janasz
Product and Topic Expert
December Update - General Availability

The GA of Document Information Extraction, premium edition took place on December 6th. The new service plan is now available on all productive BTP landscapes of the Document Information Extraction service.

The BTP Service Description Guide reflects the new service and the metric under section 7.

Official commercial and technical information can be found in SAP Discovery Center and on SAP Help Portal.

November Update - Trial Availability

Document Information Extraction, premium edition was featured at TechEd 2023 as part of a virtual jump-start session. Get a glimpse of the new features and get enabled in only 25 minutes: Jump-Start Your Document Processing Use Case with Generative AI.

Here you will find dedicated step-by-step tutorials to kick-start your work on SAP BTP Trial.




 

Foundation models, a technological breakthrough in AI, have ushered in a new paradigm with disruptive capabilities. The rise of applications like ChatGPT, built on these foundation models, has rapidly caught the world's attention. Particularly in the context of intelligent document processing (IDP), they make it possible to extend pre-defined capabilities with emerging ones that were previously either unforeseen or very difficult to implement.

Today, we are thrilled to announce the launch of the new version of our product lineup for IDP: Document Information Extraction, premium edition. It is a groundbreaking addition to our software suite that is set to revolutionize the way businesses extract valuable insights from unstructured data. With the power of Large Language Models (LLMs) and Generative AI (GenAI), the premium edition will unlock a whole new level of capabilities, enabling organizations to streamline their operations and further increase their process-related efficiency. The premium features in the new edition encompass:

Auto Extraction of Unstructured Data
Gone are the days of laborious annotation and template creation. The premium edition will offer a schema-based extraction of unstructured data by leveraging the unprecedented capabilities of LLMs. With a simple description of required fields, the solution will automatically extract and organize data, eliminating the need for manual intervention and drastically reducing time-to-value.

Extended Language Support
In an increasingly interconnected world, language barriers should never hinder adoption. That's why we will introduce extended language support for over 40 languages. Businesses will be able to effortlessly extract information from documents written in different languages, enabling faster go-to-market strategies, and expanding global coverage of their IDP functions.

Extensions of SAP Standard Schemas (roadmap outlook 2024)
To cater to the diverse needs of our customers, the premium edition will extend the support for SAP Standard Schemas. This enhancement will allow for a seamless integration with SAP systems and their data models, providing quick extensibility and higher business value.

Immediate Improvement (roadmap outlook 2024)
The premium edition will also come with immediate improvements in accuracy based on feedback data from users. By harnessing the power of LLMs and GenAI, the solution will ensure better accuracy in data extraction, minimizing errors and maximizing the reliability of extracted information. This will lead to higher automation rates and hence, greater operational efficiency and productivity.
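
To make the schema-based extraction idea described above a bit more tangible, here is a minimal Python sketch of what "a simple description of required fields" could look like as data. Please treat it as an illustration only: the `FieldSpec` class, the `build_schema` helper, the key names (`schemaName`, `fields`, `data_type`), and the sample fields are all assumptions made for this example and are not the actual schema format or API of the service. The official format is described on SAP Help Portal.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FieldSpec:
    """One field to extract. Hypothetical structure, not the service's format."""
    name: str
    description: str  # plain-language hint about what the field contains
    data_type: str = "string"


def build_schema(schema_name: str, fields: list[FieldSpec]) -> dict:
    """Assemble a plain dict describing the fields to extract.

    Rejects duplicate field names, since each extracted value needs a
    unique key in the result.
    """
    names = [f.name for f in fields]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate field names: {duplicates}")
    return {"schemaName": schema_name, "fields": [asdict(f) for f in fields]}


# Illustrative schema for a delivery note; field names are made up.
schema = build_schema(
    "delivery_note",
    [
        FieldSpec("supplierName", "Legal name of the company that issued the document"),
        FieldSpec("deliveryDate", "Date the goods were delivered", "date"),
        FieldSpec("totalWeight", "Total gross weight of the shipment in kilograms", "number"),
    ],
)

print(json.dumps(schema, indent=2))
```

The point of the sketch is that no template or annotation per layout appears anywhere: the only input is a list of field names with short descriptions, which is the part a user would adjust when trying to improve extraction results.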

The benefits of Document Information Extraction, premium edition are far-reaching. By drastically reducing time-to-value, businesses will be able to quickly extract insights from unstructured data. With the global coverage and faster go-to-market strategies, companies can stay ahead of the competition. Additionally, the quick extensibility and higher business value provide a solid foundation for further growth and innovation.

To give you a sneak preview of the remarkable capabilities of our upcoming release, we have prepared a demo video:



This launch represents a significant stride forward in our commitment to providing relevant, reliable, responsible business AI that delivers real-world results. Let's embark on this adventure together! Feel free to connect with us.

 

Disclaimer:
The intellectual property belongs to SAP and the content is copyrighted. We kindly remind you that the roadmap-related statements are not a commitment and are subject to change.




Learn more

Read more about the news of Document Information Extraction on the help portal!

What is Document Information Extraction?

Document Information Extraction is one of the SAP AI Business Services on the SAP Business Technology Platform (SAP BTP). This ML-enabled service is available through the Cloud Platform Enterprise Agreement (CPEA) and also in the Pay-As-You-Go (PAYGO) model.

Tutorials & Learnings

Blog posts:

SAP Community Page:
20 Comments
gaurang_gujar
Active Participant
0 Kudos
Hi Tomaz,

 

That is great news. I believe this will be game-changing in the world of OCR, with immense capabilities.

 

Will the DOX premium edition also be part of the SAP Build Process Automation service license?

 

Regards,

Gaurang
tomasz_janasz
Product and Topic Expert
0 Kudos
Hi Gaurang,

this is a commercial aspect that I cannot comment on yet. Please note that GenAI is an emerging technology, and we are currently building up know-how around its commercial and legal implications, specifically in the enterprise context.

We will keep the community posted in that regard in the upcoming months.

Best regards,
Tomasz
AlexDong
Product and Topic Expert
0 Kudos
hi tomasz.janasz,

Exciting news! I immediately tried it with a trial account, but it seems the document cannot be uploaded, even with several browsers.

Is there any limitation here?


Thanks and best regards,

Alex
tomasz_janasz
Product and Topic Expert
Hi Alex,

the "pending" status might be misleading. You need to run through the wizard and the upload happens as the last step.

Regards,
Tomasz
AlexDong
Product and Topic Expert
0 Kudos
Hi Tomasz,

Exactly. Now I got it.

Thanks for the reply!

Best regards,

Alex
mcarunnie
Employee
0 Kudos
Hi,

I don't see the AI aspect of it. Maybe I'm missing something. I'm able to create a template, upload documents, and get the values as per the template. What is the AI part of it?
tobias_weller
Advisor
0 Kudos
Hi Arun,
For this new functionality, you don't need to create any templates. Instead, you only define your schema with the names of the fields you would like to extract. The rest is taken care of by AI.

Best regards,

Tobias
mcarunnie
Employee
0 Kudos
Hi Tobias,

Thanks for your response.

Yes, I realised it when I tried without using a template. If I want to correct the results, I see that using a template provides an option. Will the 'Auto' functionality provide some flexibility to re-wire the results?

My second question: can we make the results more accurate?
tobias_weller
Advisor
0 Kudos

Hi Arun,
At the moment, you can try to improve the results by adjusting your schema definition; we listed some ideas here.

But we also plan to deliver an automated improvement functionality based on user feedback (i.e. your corrections).

mcarunnie
Employee
0 Kudos
Hi Tobias,

Thanks again!

I have one last question: how reliable is it if we want to derive a common format but have several input formats? Our goal is to convert several input formats into one uniform format.
tobias_weller
Advisor
0 Kudos
Hi Arun,
That's exactly the purpose, to be able to process diverse documents without having to provide every layout upfront for training.
For an exact estimate of how well it works in your case, we would need to look at some of your samples, or you could test it directly.
keean_ferreira
Explorer
0 Kudos
Hi there. I would like some further clarity around this "Premium Feature". Will we still need to manually configure every field we wish to extract? We work with Manufacturing Batch Records, they have hundreds of fields, and there are differences between each doc. Will this feature still require us to input every field we want extracted? Thanks in advance.
tobias_weller
Advisor
0 Kudos
Hi Keean,
You need to configure the fields that you would like to extract once. Is there a superset of fields in your scenario, or a generic way of defining them so that they fit all documents?

So far I have not encountered a scenario like the one you described, so please feel free to reach out with more details.

Best regards,

Tobias
keean_ferreira
Explorer
0 Kudos
We could likely define generic categories. However, it would still be very time-consuming, and I fear that we might lose precision if we take that approach. Is this tool capable of scanning the document and extracting all the information it detects?
tobias_weller
Advisor
0 Kudos
So far we do not offer such a functionality, but I will note it down to consider it for future enhancements. However, I'm curious: how would you process the result of such a functionality? Wouldn't you still need to map the extracted information into some DB table?
keean_ferreira
Explorer
0 Kudos
Awesome, thanks for letting me know! Yes exactly, we need to find any variances between the records, so the information would be extracted into a form that the computer can analyze, like a database, CSV, or JSON. Then the app should detect the differences between records and point them out to us for further inspection. It would be great to be able to have some graphs and charts representing the information visually as well. This is more or less what we are looking for. Would we be able to leverage foundational models within the AI Foundation playground to design an app like this ourselves?

Best regards,

Keean.
00022111734
Participant
0 Kudos
Hi Tomasz,

 

Thanks for the insightful blog, well explained.
ThomasFeyer
Explorer
0 Kudos

Hi @tomasz_janasz,

Is there any information on when the premium edition will be available in "Pay-As-You-Go for Partners" BTP accounts? According to the Discovery Center, it is not yet available, and we would really love to try it out.

Thanks and Best Regards,
Thomas

krishnam_prasanth
Discoverer
0 Kudos

Hi @tobias_weller ,

Can a user of Document Information Extraction train or map the fields from an invoice to get the required extraction results? How can training be done for the premium edition of the DOX service?

 

 

tobias_weller
Advisor
0 Kudos

@krishnam_prasanth we are currently working on a feature that would allow the service to automatically learn from feedback data. This feature is planned to be released in Q3; see the Roadmap Explorer.