Julian_B

Vectorize Your Data to Infuse AI into Your Business

Using SAP HANA Vector Engine, RAG & BTP Gen-AI

1. Introduction:

SAP Generative AI Hub and SAP HANA Vector Engine enable businesses to harness the power of AI.

SAP HANA Vector Engine is a capability of the in-memory SAP HANA Cloud database that stores text embeddings as vectors next to your business data and runs similarity searches over them. Once business data is vectorized, the engine can retrieve the content that is semantically closest to a query, which is the basis of the RAG approach described in this post.

This blog post explains how we adapt SAP's partner-built foundation models to our business cases. We also discuss how we use the tools from SAP Gen AI Hub and SAP HANA Vector Engine, together with RAG techniques, to bring AI into our business use case.

The blog offers a comprehensive, step-by-step guide for implementing the GenAI RAG feature into your business operations.

2. Adapt GenAI into Business

To adapt GenAI to our business, we started with task-specific techniques such as RAG and zero-shot prompting.

AI capabilities are already embedded in most SAP business processes through the digital assistant Joule, which helps users with text summarization, writing, question answering, and code generation.

Within the BTP AI Foundation, the SAP AI Launchpad serves as the gateway to the GenAI Hub, ML Operations, and the prompt editor. From the GenAI Hub, users can easily access any partner-built or SAP-built foundation models.

Users can also access the prompt editor, which is already configured with the foundation models. From this tool you can bring Gen AI capabilities such as text summarization, writing, question answering, code generation, sentiment analysis, and automated customer service responses to your business.

Furthermore, AI can identify potential risks and fraudulent activity by analyzing data and detecting anomalies quickly. It can also improve customer satisfaction by responding to customer queries with a seamless, personalized experience.

Moreover, by analyzing extensive customer data, AI can offer recommendations and experiences tailored to each business's needs.

3. Authors

Julian Bellarmin works as an SAP-certified cloud solution BTP architect at Tech Mahindra.

Kedar Kulkarni leads SAP BTP competency solutions.

Saranya Sampath works as a BTP consultant.

4. Vectorization, Retrieval and Generation:

4.1 Process Flow Diagram:

New column data type: REAL_VECTOR
New vector constructor: TO_REAL_VECTOR
New similarity search functions: L2DISTANCE(), COSINE_SIMILARITY()
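
To make the two similarity functions concrete, here is a minimal pure-Python sketch of what they compute. It only illustrates the math on plain lists; it is not how HANA implements the functions internally, and the sample vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # undefined for all-zero vectors


def l2_distance(a, b):
    """Euclidean distance between two vectors (0.0 means identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 2-dimensional vectors; real embeddings have many more dimensions.
query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]
doc_orthogonal = [0.0, 3.0]

print(cosine_similarity(query, doc_same_direction))  # larger is more similar
print(l2_distance(query, doc_orthogonal))            # smaller is more similar
```

Note the opposite ordering: with cosine similarity you rank results from highest to lowest, with L2 distance from lowest to highest.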

4.2 Process Steps:

Upload function:

Collect the text and PDF documents from the directory.
Split all the documents into chunks.
Convert all the chunks into text embeddings using the Gen AI Hub proxy LangChain embedding libraries and the text-embedding-ada-002 model.
Convert all the embeddings into vectors with the TO_REAL_VECTOR constructor.
Store the vectors in the HANA Cloud vector DB.

Semantic Search function:

Get the user query.
Convert the query into an embedding.
Perform the cosine similarity search by passing the query vector.
Extract the top-ranked results.

Text Generation through LLM:

Refine the prompting context according to the specific business requirements.
Pass the context and search results to the partner-built foundation models (GPT-35-TURBO / GPT-4).
Capture the LLM response and display it to the user.

5. Configuration of BTP AI Services

5.1 Pre-Requisites:

SAP BTP Enterprise Account.
SAP AI Core in your global account, on the extended service plan.
SAP AI Launchpad on the standard service plan.
SAP HANA instance, to configure the REAL_VECTOR column.
If you are using a trial version, you can upgrade it to the extended / standard service plan.

5.2 Configuration in BTP Launchpad

5.2.1 Roles Assignment:

You need the right roles and permissions to access the Gen AI Hub, AI Launchpad, ML Operations, the workspace, and AI Core administration.

5.2.2 Service Key Creation:

Create the service keys for the AI Core instance. The service key provides the credentials needed to connect AI Core with a Python tool such as Google Colab.

5.2.3 Connection and Resource Group Creation:

Create your AI Core connection and resource group. Here ‘default’ is the resource group name.

5.2.4 Validate your Foundation Model - AZURE-OPENAI:

In the AI Launchpad, go to ML Operations -> Scenarios.

Check whether the foundation-models scenario is present in your AI Core workspace.

Click on the azure-openai link to see which models AI Core supports by default.

Here we are using gpt-35-turbo, gpt-4 and text-embedding-ada-002.

You can see the list of models supported in AI Core as of April 26, 2024.

5.3 Model Configuration:

Go to ML Operations -> Configurations. Click on the Create button to configure the models.

Enter the parameters and click on the Next button.

Enter the required foundation model name and version.

Once it has been created, you can see your configured model.

5.4 Creating your deployment model:

Create a deployment for the configured foundation model.

Choose your executable from the options below.

Once you have created the deployment, wait for the current status to change to RUNNING.

Note down your deployment ID, which you will need whenever you call the LLM.

5.5 Validate the prompts with the deployed model

In the Generative AI Hub, you can manage your prompts on the Prompt Management page. Go to Generative AI Hub -> Prompt Management. Here you can click on a prompt to go to its details page.

Configure the parameters for the model response.

Frequency Penalty:

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.

Presence Penalty:

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.

Max-Tokens:

The maximum number of tokens allowed for the generated answer.

Increase this value if you need longer responses.

Temperature:

The sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.

You can validate your prompts with different contexts.
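
If you later call the model from Python instead of the prompt editor, it helps to keep these parameters in one place and check them against the ranges described above. The sketch below is only an illustration: the function name `build_model_params` and the default values are our own assumptions, not part of any SAP SDK.

```python
def build_model_params(max_tokens=256, temperature=0.0,
                       frequency_penalty=0.0, presence_penalty=0.0):
    """Validate the response parameters and return them as a dict.

    Hypothetical helper: the ranges follow the descriptions above,
    the defaults are arbitrary example values.
    """
    if not isinstance(max_tokens, int) or max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    for name, value in (("frequency_penalty", frequency_penalty),
                        ("presence_penalty", presence_penalty)):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")
    return {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }


# A focused, mostly deterministic setup with room for a longer answer.
params = build_model_params(max_tokens=500, temperature=0.2)
print(params)
```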

6. Create the HANA Cloud Vector Instance

From April 2024 onwards, SAP supports REAL_VECTOR columns in HANA DB containers. Create your HANA Cloud instance and note down your DBADMIN credentials, host name and port number, which you will need later to make the connection. Start the instance from HANA Cloud Central. Once it has started, open the Database Explorer.

7. Vectorize your PDF and TEXT files:

7.1 Pre-requisites:

Python 3 installed on your system (you can also use Google Colab).

The generative-ai-hub-sdk installed on your system.

[With this SDK you can use all the generative models available in SAP's Gen AI Hub.]

7.2 Create the folder directory and upload the text / PDF files.

7.3 Install the HANA ML and GenAI Hub SDK libraries

With this SDK you can leverage the power of generative models, such as the GPT models, available in SAP's generative AI hub.
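
As a rough sketch of this step, the snippet below checks which of the required libraries are missing and can install them with pip. The pip names generative-ai-hub-sdk, hana-ml and hdbcli and their import names are what we assume here; depending on your SDK version you may need additional extras for the LangChain integration.

```python
import importlib.util
import subprocess
import sys

# pip package name -> top-level module it provides (assumed mapping)
REQUIRED = {
    "generative-ai-hub-sdk": "gen_ai_hub",
    "hana-ml": "hana_ml",
    "hdbcli": "hdbcli",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of all packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


def install(packages):
    """Install the given packages into the current Python environment."""
    if packages:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


print("Missing packages:", missing_packages())
# install(missing_packages())  # uncomment to install what is missing
```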

7.4 Set the environment variables.

Update your AI Core service key credentials and run the script below to set the environment variables.
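
A minimal sketch of such a script could look like the following. It assumes that the SDK reads the `AICORE_*` environment variables shown and that your service key JSON contains the fields `url`, `clientid`, `clientsecret` and `serviceurls.AI_API_URL`; please check both against your own service key and SDK version. All values below are placeholders.

```python
import os


def aicore_env_from_service_key(service_key, resource_group="default"):
    """Map an AI Core service key (as a dict) to AICORE_* variables.

    Assumption: field and variable names follow the layout described
    in the text above; verify them for your landscape.
    """
    return {
        # Some SDK versions may expect the auth URL without /oauth/token.
        "AICORE_AUTH_URL": service_key["url"].rstrip("/") + "/oauth/token",
        "AICORE_CLIENT_ID": service_key["clientid"],
        "AICORE_CLIENT_SECRET": service_key["clientsecret"],
        "AICORE_BASE_URL": service_key["serviceurls"]["AI_API_URL"].rstrip("/") + "/v2",
        "AICORE_RESOURCE_GROUP": resource_group,
    }


def set_aicore_env(service_key, resource_group="default"):
    """Export the variables into the current process environment."""
    os.environ.update(aicore_env_from_service_key(service_key, resource_group))


# Placeholder service key; paste the values from your own key instead.
example_key = {
    "clientid": "<your-client-id>",
    "clientsecret": "<your-client-secret>",
    "url": "https://<your-subdomain>.authentication.example.com",
    "serviceurls": {"AI_API_URL": "https://api.ai.example.com"},
}
set_aicore_env(example_key, resource_group="default")
```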

7.5 Make the connection context to the HANA vector store

Import the dbapi library to open the connection.
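
A minimal sketch of the connection, assuming the `hdbcli` driver is installed and you use the host, port and DBADMIN credentials noted earlier. The helper names are our own; `hdbcli` is imported inside the function so the snippet can be loaded even where the driver is not installed yet.

```python
def hana_connect_kwargs(host, user, password, port=443):
    """Build the keyword arguments for hdbcli's dbapi.connect()."""
    return {
        "address": host,
        "port": port,            # HANA Cloud SQL endpoints listen on 443
        "user": user,
        "password": password,
        "encrypt": True,         # HANA Cloud only accepts TLS connections
        "sslValidateCertificate": True,
    }


def connect_to_hana(host, user, password, port=443):
    """Open a connection to the HANA Cloud instance."""
    from hdbcli import dbapi  # imported lazily, needs `pip install hdbcli`
    return dbapi.connect(**hana_connect_kwargs(host, user, password, port))


# Usage (placeholders, replace with your own instance details):
# conn = connect_to_hana("<your-host>.hanacloud.ondemand.com", "DBADMIN", "<password>")
# cursor = conn.cursor()
```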

7.6 Converting all the documents into chunks

Use the libraries below for the PDF documents.

Use the libraries below for the text documents.

You can check the number of chunks in your file by running len(text_chunks).

Based on the business data, you have to carefully set the chunk size and the overlap size.
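
To show how chunk size and overlap interact, here is a simplified character-based splitter in plain Python. It is a stand-in for illustration only, not the LangChain splitter used in this blog; real splitters also try to break at paragraph and sentence boundaries.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap.

    Simplified stand-in for a library text splitter: each chunk starts
    (chunk_size - chunk_overlap) characters after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Tiny example so the overlap is easy to see.
text_chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(len(text_chunks), text_chunks)
```

A larger overlap keeps more context across chunk boundaries but also stores more duplicate text, so it is worth trying a few values on your own documents.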

7.7 Converting all the chunks into embeddings and storing them as vectors in the HANA vector DB

To convert the text / PDF files into embeddings, we use the LangChain init embedding libraries.

For that we use the text-embedding-ada-002 model.

Once you have executed the above script, you can see that the “MyTest_VecTable” table has been created in the HANA DB with a REAL_VECTOR column.

You can open the table data and see that all your chunks have been converted to vectors in the REAL_VECTOR column.

Note: Don’t use the data upload option to load these vectors; it results in an error.
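
The storing step can be sketched roughly as follows. The table name comes from this blog, but the column names `ID`, `CHUNK_TEXT` and `EMBEDDING` are our own assumptions, and `embed_documents` stands for any function that turns a list of texts into a list of embeddings (for example the embedding model from the Gen AI Hub SDK). The cursor is expected to be an `hdbcli` cursor.

```python
import json

TABLE = "MyTest_VecTable"
EMBEDDING_DIM = 1536  # vector size returned by text-embedding-ada-002


def to_vector_literal(embedding):
    """Serialize an embedding as '[0.1,0.2,...]' for TO_REAL_VECTOR."""
    return json.dumps([float(x) for x in embedding], separators=(",", ":"))


def create_table_sql(table=TABLE, dim=EMBEDDING_DIM):
    """DDL for a table with a REAL_VECTOR column (column names assumed)."""
    return (f'CREATE TABLE "{table}" ('
            '"ID" INTEGER PRIMARY KEY, '
            '"CHUNK_TEXT" NCLOB, '
            f'"EMBEDDING" REAL_VECTOR({dim}))')


def insert_sql(table=TABLE):
    """Parameterized insert; the vector is passed as text and converted."""
    return (f'INSERT INTO "{table}" ("ID", "CHUNK_TEXT", "EMBEDDING") '
            'VALUES (?, ?, TO_REAL_VECTOR(?))')


def store_chunks(cursor, chunks, embed_documents, table=TABLE):
    """Embed all chunks and insert them; returns the number of rows."""
    embeddings = embed_documents(chunks)
    rows = [(i, chunk, to_vector_literal(vec))
            for i, (chunk, vec) in enumerate(zip(chunks, embeddings))]
    cursor.executemany(insert_sql(table), rows)
    return len(rows)


# Usage with a real connection (not executed here):
# cursor.execute(create_table_sql())
# store_chunks(cursor, text_chunks, embedding_model.embed_documents)
```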

8. Extract query results from vector DB:

Now it’s time to validate the vector store and the Gen AI deployment by passing a user prompt query.

8.1 Getting the user query and converting it into embeddings.

We use the text-embedding-ada-002 / text-embedding-ada-002-v2 model to convert the input query string into an embedding.
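
A minimal sketch of this step is shown below. It assumes that the Gen AI Hub SDK provides `init_embedding_model` in `gen_ai_hub.proxy.langchain.init_models` and that the returned object offers the usual LangChain `embed_query` method; please verify this against your SDK version. You can also pass in any other embedder object with an `embed_query` method.

```python
def embed_query(text, embedder=None, model_name="text-embedding-ada-002"):
    """Convert a user query into an embedding (list of floats).

    If no embedder is passed, one is created through the Gen AI Hub SDK
    (assumed API, requires the AICORE_* environment variables to be set).
    """
    if embedder is None:
        from gen_ai_hub.proxy.langchain.init_models import init_embedding_model
        embedder = init_embedding_model(model_name)
    return [float(x) for x in embedder.embed_query(text)]


# Usage (needs a running text-embedding-ada-002 deployment in AI Core):
# query_vector = embed_query("What is our return policy?")
# print(len(query_vector))
```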

8.2 Performing the cosine similarity search by passing the query vector.

Pass the query embedding to the vector store in a SELECT statement to fetch the closest matches. You can use the L2DISTANCE or COSINE_SIMILARITY function based on your need.
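
The SELECT statement could be built roughly like this. The column names `ID`, `CHUNK_TEXT` and `EMBEDDING` are assumptions on our side, so adjust them to your own table. Keep in mind that cosine similarity is sorted descending, while L2 distance is sorted ascending.

```python
def similarity_search_sql(table="MyTest_VecTable", k=3, metric="COSINE_SIMILARITY"):
    """Build a top-k similarity query for the given metric."""
    order = {"COSINE_SIMILARITY": "DESC", "L2DISTANCE": "ASC"}
    if metric not in order:
        raise ValueError("metric must be COSINE_SIMILARITY or L2DISTANCE")
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    return (f'SELECT TOP {k} "ID", "CHUNK_TEXT", '
            f'{metric}("EMBEDDING", TO_REAL_VECTOR(?)) AS "SCORE" '
            f'FROM "{table}" ORDER BY "SCORE" {order[metric]}')


def search(cursor, query_embedding, k=3, metric="COSINE_SIMILARITY",
           table="MyTest_VecTable"):
    """Run the query with the embedding passed as a '[...]' text parameter."""
    literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    cursor.execute(similarity_search_sql(table, k, metric), (literal,))
    return cursor.fetchall()


print(similarity_search_sql(k=2))
```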

8.3 Extracting the top ranked results.

You can verify the top-ranked chunks returned from the HANA vector store.

9. Text Generation through LLM:

9.1 Configure the prompting context based on the business requirement.

The prompt guides the GenAI model to produce the desired output.

9.2 Pass the prompt to the GenAI Hub deployed model and fetch the response.

In this use case, we fetch the results based on the RAG [Retrieval Augmented Generation] flag. Make sure that you are using the right deployment, the one you configured in the Gen AI Hub.
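
The idea of the RAG flag can be sketched as below. The prompt wording and the function names are our own illustration; `llm_invoke` stands for any callable that sends a prompt to your deployed model and returns its text, and `retrieve` for the vector search from the previous section.

```python
def build_prompt(question, context_chunks, use_rag=True):
    """Return the prompt for the LLM, with or without retrieved context."""
    if use_rag and context_chunks:
        context = "\n\n".join(context_chunks)
        return ("Answer the question using only the context below. "
                "If the context does not contain the answer, say so.\n\n"
                f"Context:\n{context}\n\nQuestion: {question}")
    return question  # plain prompt: the LLM answers from its own knowledge


def ask_llm(llm_invoke, question, retrieve, use_rag=True):
    """Retrieve context only when the RAG flag is set, then call the LLM.

    llm_invoke: callable(prompt) -> answer text (hypothetical wrapper
                around your gpt-35-turbo / gpt-4 deployment)
    retrieve:   callable(question) -> list of text chunks
    """
    chunks = retrieve(question) if use_rag else []
    return llm_invoke(build_prompt(question, chunks, use_rag))


# Offline example with stand-in functions instead of real services.
print(ask_llm(lambda prompt: f"(prompt had {len(prompt)} characters)",
              "What is the warranty period?",
              lambda q: ["The warranty period is 24 months."],
              use_rag=True))
```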

9.3 Capture the LLM response from the model and display it.

Here I have set the RAG flag to True.

You can see the response generated by GPT-35-Turbo based on the context returned from the vector store.

Now set the RAG [Retrieval Augmented Generation] flag to False. You then see the response that comes directly from the LLM without referring to the HANA vector results.

Now your model can answer any question with or without RAG, depending on the flag.

10. Create a user interface to demo the RAG technique

Here is the script to create the interactive components. Put all your queries into the options list.

Set RAG to False by clearing the check box, click the ASK LLM button, and see the response.

Set RAG to True by selecting the check box, click the ASK LLM button, and see the response.
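
A minimal sketch of such a UI for a notebook, assuming `ipywidgets` is available (it is in Google Colab). The function names and the sample questions are hypothetical; `answer_fn` is whatever function returns the LLM answer for a question and a RAG flag. The `echo_answer` stand-in lets you try the UI before the real services are connected.

```python
def echo_answer(question, use_rag):
    """Stand-in for the real RAG call so the UI can be tried offline."""
    mode = "RAG" if use_rag else "LLM only"
    return f"[{mode}] {question}"


def build_demo_ui(questions, answer_fn):
    """Create a dropdown, a RAG check box, an ASK LLM button and an output area."""
    import ipywidgets as widgets  # imported lazily, only needed in a notebook

    question_box = widgets.Dropdown(options=list(questions), description="Question:")
    rag_box = widgets.Checkbox(value=True, description="RAG")
    ask_button = widgets.Button(description="ASK LLM")
    output = widgets.Output()

    def on_click(_button):
        # Clear the previous answer and show the new one in the output area.
        output.clear_output()
        with output:
            print(answer_fn(question_box.value, rag_box.value))

    ask_button.on_click(on_click)
    return widgets.VBox([question_box, rag_box, ask_button, output])


# In a notebook cell (sample questions are placeholders):
# from IPython.display import display
# display(build_demo_ui(["What is the warranty period?",
#                        "How do I return a product?"], echo_answer))
```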

11. Conclusion:

Infusing AI capabilities ultimately results in improved business outcomes, better decision-making, and a competitive edge in the market. This simple use case shows that we can apply classification, question answering, content generation, zero-shot prompting (task automation), creative writing, language translation, personalized interaction, sentiment analysis, summarization, and content extraction to our business use case.

This blog walks through the data loading and retrieval process with the HANA vector DB and the Gen AI Hub to implement the RAG feature.

You can also see how accurately the HANA vector engine fetches the results and how your LLM deployment generates text for end users based on the context provided.
