Despite the range of exciting use cases, AI tools can become a liability, putting sensitive and proprietary data at risk when they rely on external components such as computational libraries or when models are trained on cloud infrastructure. At Symphony Labs, we decided to invest in a new field of research, Privacy-Preserving Machine Learning (PPML), because we want to make sure customer data is never exposed to external threats. Our first goal is to create a secure pipeline integrating Amenity with LLMs, in which AI APIs provide the base-layer conversational AI and extend the power of Amenity’s algorithms for financial analytics research with reasoning, input-parameter personalization, and logic capabilities.
But what are the threats that Machine Learning faces?
Securing machine learning against attacks
In a typical machine learning scenario, a central server first collects data from multiple sources, then trains a model on the combined data, and finally sends the model back to the sources for deployment. When data is gathered on a central server without additional protection, the computation host can directly access the incoming data.
Moreover, all three stages of the ML life cycle can be the target of attacks such as model stealing, model inversion, model and data poisoning, or data reconstruction through breached model parameters. According to Fan Mo, Zahra Tarkhani, and Hamed Haddadi, in their publication Machine Learning with Confidential Computing (2022), attacks fall into two types: 1) attacks on confidentiality and 2) attacks on the integrity of the ML process. Attacks on confidentiality put the privacy of data and the intellectual property of models at risk; attackers are mostly interested in extracting unauthorized sensitive information or the model itself. Integrity attacks aim to actively corrupt the model, for example by adding calibrated noise to the data so that it produces wrong predictions.
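To make the integrity threat concrete, here is a minimal, self-contained sketch of a label-flipping poisoning attack on a toy scikit-learn classifier. The dataset, model, and 30% poisoning rate are illustrative assumptions, not Symphony's setup.

```python
# Toy illustration of an integrity (data-poisoning) attack: flipping a
# fraction of training labels degrades the trained model's accuracy.
# Dataset, model, and poisoning rate are arbitrary, illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Clean model: trained on the data as collected.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Poisoned model: an attacker silently flips 30% of the training labels.
rng = np.random.default_rng(0)
poisoned_labels = y_train.copy()
idx = rng.choice(len(poisoned_labels), size=int(0.3 * len(poisoned_labels)), replace=False)
poisoned_labels[idx] = 1 - poisoned_labels[idx]
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned_labels)

print(f"clean accuracy:    {clean_model.score(X_test, y_test):.3f}")
print(f"poisoned accuracy: {poisoned_model.score(X_test, y_test):.3f}")
```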
Privacy measures must be reliable in order to guarantee that no breach can occur while sensitive data is handled or models are trained and deployed.
Symphony will develop PPML models that protect the entire machine learning life cycle of Symphony’s AI-based tools, ensuring data and client privacy.
Ensuring data security
Historically, data has been secured at rest and in transit, but has remained at risk while in use. Only two scenarios can protect data in use: we can either keep the data on local hardware (so it is never shared), or centralize it in a confidential, secure space.
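As a minimal sketch of that gap (using the open-source cryptography library and made-up values), the record below is protected on disk and over the wire, yet has to sit in plaintext in the host's memory the moment it is actually processed:

```python
# Minimal sketch: data can be encrypted at rest and in transit, but a
# conventional host must decrypt it in memory in order to compute on it
# ("in use"). Uses the `cryptography` package; values are illustrative.
from cryptography.fernet import Fernet

key = Fernet.generate_key()
fernet = Fernet(key)

record = b"client_id=42;position=LONG;notional=1000000"

# At rest / in transit: only ciphertext is ever stored or sent.
ciphertext = fernet.encrypt(record)

# In use: the plaintext reappears in the host's memory, where the
# operating system, hypervisor, or a co-tenant attacker could read it.
plaintext = fernet.decrypt(ciphertext)
fields = plaintext.decode().split(";")  # any computation needs the raw data
print(fields)
```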
The Trusted Execution Environment (TEE) is the technology PPML uses to close this gap: it lets sensitive data be centralized without being shared with, or exposed to, the host.
A TEE is a hardware-based confidential computing approach that relies on specialized hardware features to create secure, isolated enclaves within a computer. These enclaves are small regions of memory protected from all other software and hardware on the system, including the operating system itself. When a program executes within an enclave, it can only access the data and code explicitly loaded into the enclave, while all other data and code remain inaccessible. Data is kept encrypted in memory while it is processed within the enclave. This means that even if attackers gain access to the operating system, they won’t be able to access the enclave.
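The resulting programming pattern looks roughly like the sketch below. The Enclave class here is a hypothetical placeholder defined inline so the example runs; in a real deployment, the isolation and memory encryption would be enforced by TEE hardware and its SDK, not by Python.

```python
# Conceptual sketch of the enclave pattern described above. The Enclave
# class is a hypothetical stand-in so the example is runnable; in a real
# TEE the isolation is enforced by hardware, not by this code.
class Enclave:
    def __init__(self):
        self._code = None
        self._data = None  # would live in encrypted, isolated memory

    def load(self, code, data):
        # Only what is explicitly loaded is visible inside the enclave.
        self._code = code
        self._data = data

    def run(self):
        # Computation happens on protected memory; only the result is
        # released back to the untrusted host.
        return self._code(self._data)


def train_step(records):
    # Placeholder "training" computation on the sensitive records.
    return sum(records) / len(records)


enclave = Enclave()
enclave.load(code=train_step, data=[3.2, 4.1, 5.0])  # illustrative values
model_update = enclave.run()
print(model_update)  # the host sees the output, never the raw records
```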
The principal cloud providers, Google, Microsoft and Amazon, offer confidential computing as a service, enabling developers to use their technology without needing to set up and manage their own infrastructure.
Ensuring model performance and security
Though the security of TEEs is promising, integrating ML in such an environment raises technical challenges. TEEs can create compatibility problems with the software (programming languages, libraries) and the hardware (notably GPUs) that play an important role in ML. Furthermore, secure enclaves can introduce memory and computation overhead, which further impacts the performance of ML workloads. Nonetheless, cloud providers are continuously optimizing their confidential computing offerings to deliver better performance for the secure processing of sensitive data.
While data and models are protected inside the TEE, it is during the third step of ML deployment (extracting results from the enclave) that model parameters can be traced back and attacked. These so-called ‘inference attacks’ enable attackers to reconstruct clients’ local data. To protect privacy, careful algorithm design and privacy analysis are a necessity. Typically, calibrated noise can be added to the data or the released results to guarantee their privacy, or a specific deployment architecture can be designed.
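One standard way to add such calibrated noise is the Laplace mechanism from differential privacy. The sketch below (with illustrative values, not Symphony’s implementation) perturbs an aggregate before it leaves the enclave so that no single client’s record can be confidently inferred from the released result:

```python
# Sketch of the Laplace mechanism from differential privacy: calibrated
# noise is added to a result before it is released from the enclave, so
# that no individual record can be reliably reconstructed from it.
# epsilon, the clipping bounds, and the data are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

def private_mean(values, lower, upper, epsilon):
    """Release the mean of `values` with epsilon-differential privacy."""
    values = np.clip(values, lower, upper)       # bound each contribution
    sensitivity = (upper - lower) / len(values)  # max effect of one record
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

client_exposures = np.array([0.12, 0.55, 0.33, 0.91, 0.47])  # made-up data
print(private_mean(client_exposures, lower=0.0, upper=1.0, epsilon=0.5))
```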
In a nutshell
Symphony is currently evaluating how we can best guarantee the performance and security of our ML models. Their integrity will need to be thoroughly analyzed and tested against benchmark attacks. We aim to deliver high-performing AI tools without security trade-offs by designing secure architectures and robust models, guaranteeing the accuracy and privacy of results on top of data and model confidentiality.
Securing the entire ML pipeline will allow us to integrate conversational AI into Symphony and build innovative models, pushing the boundaries of existing AI techniques. Our AI tools will run on messages and calls exchanged on Symphony, as well as on public data such as social media and on private files and documents submitted by our clients. Our solutions will drive actionable insights for portfolio managers, research professionals, analysts, and other financial markets participants.
The goal is for Symphony AI tools to help our customers cut through the noise and deliver the business intelligence they need in real time, on a trusted infrastructure.