
Data Security & Privacy Challenges in the AI Era

At the ISMG Financial Services Cybersecurity Summit in New York, Ron Arden spoke about the data security challenges the financial industry faces with the rise in AI use, especially in the growing hybrid workspace.

 

Today, I want to address the challenges and potential solutions surrounding data security and privacy. Fasoo specializes in unstructured data, the documents and files we use every day, as compared to structured data in databases. This data is growing exponentially, scattered across various locations, and often poorly understood. Managing and securing it has become a critical challenge.

Organizations are grappling with increasingly complex data security challenges. The rise of generative AI, private LLMs, and public LLMs only adds layers of complexity. I’ll explore how companies are trying to cope and adapt to these new realities.

The Way We Work is Changing… and More Complexities with the Emergence of GenAI

A rapidly changing IT environment has been a constant in every presentation for the past 20 years, and it’s no different today. One of the biggest shifts in recent years is the rise of the hybrid workforce, which has become the norm. Global companies now have employees working everywhere, anytime, requiring access to sensitive data.

This introduces a number of challenges, particularly from insider and external threats.

The cyber threat landscape can be broken down by the direction from which the threat comes: insider threats and external threats.

  • Insider Threats:
    • Malicious insiders – These include disgruntled employees or hackers targeting organizations from within.
    • “Oops” moments – Accidental data leaks, like sending sensitive information to the wrong recipient, are common and can be just as damaging.
  • External Threats: Hackers, nation-states, and others with motives to steal, disrupt, or damage.

There are only a few reasons someone would want your data: financial gain, damaging reputations, or disrupting operations. All this data typically falls into three categories: sensitive and regulated (e.g., financial or health data), intellectual property (IP), or low-value information/non-sensitive data. These are the realities organizations must navigate.

Are You Vulnerable? Minimize Consequences by Protecting the Data Itself

Organizations generally take two approaches to data protection: perimeter-based or data-centric. A perimeter approach is essentially trying to put a moat around the data. The problem is that there is no moat anymore. A data-centric approach changes the perspective and focuses on protecting the data itself, regardless of location. This means that security teams don’t have to play “whack-a-mole” trying to secure every place data goes. The security travels with the data wherever it goes, whether the location is corporately managed or not. Since data winds up on personal devices, in cloud locations, and on unmanaged networks, controlling the data itself is a safer approach.

  • Perimeter-based protection:
    • Focuses on securing boundaries around data.
    • Ineffective in today’s environment, where data resides everywhere—on phones, in the cloud, and across various systems.
    • Often compared to playing “Whack-a-Mole” as new vulnerabilities continually emerge.
  • Data-centric protection:
    • Focuses on protecting the data itself, regardless of location.
    • This approach becomes especially critical for compliance with regulations like NYDFS, GDPR, and HIPAA, as well as mitigating financial losses from breaches.
    • Generative AI introduces additional risks, such as hallucinations and misuse of sensitive data.

In heavily regulated industries like finance, compliance is a major driver for adopting robust controls. Regulations such as NYDFS in New York require encryption and other measures. Beyond compliance, there are financial implications, such as losing customer trust after a breach or incurring costs from negative publicity.

Generative AI adds further complexity, whether dealing with public or private large language models (LLMs). AI’s “hallucinations” can introduce risks. A recent example involved lawyers who were sanctioned after filing legal briefs that cited AI-generated cases that were entirely fabricated. AI is just another layer of complexity in managing ingress and egress points.

The Era of Advanced Data Security

The era of advanced data security has been shaped by the cloud and hybrid work. It used to be easier to control where data resided, but that’s almost impossible today.

Some organizations attempt to lock down data in specific storage locations or use numerous technologies to identify sensitive data. But perimeter-based methods often fall short. Instead, it’s important to focus on the data itself, not its location, since that can change frequently.

Compliance remains a global issue. For instance:

  • GDPR governs data in the EU.
  • In the U.S., there are constantly changing state and federal regulations (e.g., HIPAA for healthcare, SOX for public companies).

Exceptions also make it hard to enforce policies. For example, a major bank once abandoned its DLP system due to excessive exceptions. At that point, they didn’t have security.

Cyber threats are also increasing in volume and complexity. Many organizations rely on detection and response tools, but these often react after the fact—once data has already been exfiltrated. A better approach is ensuring exfiltrated data has no value through proactive measures like encryption.

Know Your Data, Protect and Manage It

Understanding your data is critical. Most organizations don’t know what data they have because it’s spread all over the place, like shadow IT, cloud environments, and forgotten storage. For example, someone might create an S3 bucket in AWS for testing purposes and leave it unmanaged.

To secure data effectively, organizations should focus on these steps:

  1. Identify Sensitive Data:
    • Understand what data you have and where it resides.
    • Include shadow IT resources (e.g., forgotten S3 buckets or unauthorized cloud storage).
  2. Label and Tag Data:
    • Classify data based on sensitivity (e.g., confidential vs. public).
  3. Apply Controls:
    • Use encryption, dynamic access controls, and policies to secure sensitive data.

This approach is what we refer to as advanced data security.
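The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Fasoo or any other product works: the two regular expressions, the labels, and the control names are hypothetical, and real discovery tools use much richer detection than pattern matching.

```python
import re
from dataclasses import dataclass

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d{4}[- ]){3}\d{4}\b"),
}


@dataclass
class Finding:
    path: str
    label: str      # step 2: "confidential" or "public"
    matches: list   # names of the patterns that fired
    controls: list  # step 3: controls to apply (names are made up)


def classify(path: str, text: str) -> Finding:
    # Step 1: identify sensitive content in the document.
    matches = sorted(name for name, rx in PATTERNS.items() if rx.search(text))
    # Steps 2 and 3: label the document, then attach controls to the label.
    if matches:
        return Finding(path, "confidential", matches, ["encrypt", "restrict_access"])
    return Finding(path, "public", matches, [])


# Made-up sample documents.
docs = {
    "loan_application.txt": "Applicant SSN: 123-45-6789",
    "press_release.txt": "We are opening a new branch in May.",
}
findings = [classify(path, text) for path, text in docs.items()]
for f in findings:
    print(f.path, f.label, f.controls)
```

Starting with two labels keeps the sketch in line with the advice to begin with broad categories and refine later.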

Advanced Data-Centric Security

All data exists in three phases:

  1. Data at rest – Stored on devices like hard drives, phones, or cloud servers. Many technologies, such as endpoint encryption, protect data at this stage.
  2. Data in transit – Data as it moves between locations, whether over networks or through copying to external drives. TLS and similar technologies protect data moving over networks.
  3. Data in use – Actively being accessed or edited.

Classifying data is a foundational step. Start simply—e.g., non-public vs. public data—and layer controls over time.

One emerging approach is Data Security Posture Management (DSPM), which governs data in both cloud and on-prem environments. DSPM emphasizes data lineage, tracking who accesses data throughout its lifecycle.

The ultimate level of protection comes with data encryption and digital rights management (DRM). This ensures access controls are dynamic—validating users each time they attempt to open a file and adjusting permissions based on real-time policies. This shifts the focus away from perimeters to the data itself.
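To make the idea of dynamic access control concrete, here is a minimal sketch of a check that runs on every open rather than once at download time. The `PolicyServer` class, its method names, and the action names are all hypothetical, and the decryption step that would follow a successful check is left out.

```python
class PolicyServer:
    """Hypothetical central policy store, consulted on every open attempt."""

    def __init__(self):
        # doc_id -> {user: set of permitted actions}
        self._policies = {}

    def set_permissions(self, doc_id, user, actions):
        self._policies.setdefault(doc_id, {})[user] = set(actions)

    def revoke(self, doc_id, user):
        self._policies.get(doc_id, {}).pop(user, None)

    def is_allowed(self, doc_id, user, action):
        # Deny by default: unknown documents and unknown users get nothing.
        return action in self._policies.get(doc_id, {}).get(user, set())


server = PolicyServer()
server.set_permissions("doc-1", "alice", ["view", "edit"])

before = server.is_allowed("doc-1", "alice", "edit")
server.revoke("doc-1", "alice")  # policy changes after the file was shared
after = server.is_allowed("doc-1", "alice", "view")
print(before, after)
```

Because the check happens at open time, revoking access takes effect on copies that have already left the organization, which is the point of tying the control to the data instead of the perimeter.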

Start with Data Security Governance

From a business perspective, sensitive data protection revolves around risk. Organizations must balance risk while maintaining productivity. Start with the business strategies that are intended to grow the business, then move to the governance strategies needed to meet regulatory and security requirements. Within the financial industry, there is a lot of compliance to consider. Beyond that, I want to make sure my customer and internal data are safe, including from a breach. Everything in IT and security is about balancing risk. I want to minimize my risk, but I don’t want to impact productivity to the point where my customers or my user base can’t work.

Once you finish that, you want to start identifying and prioritizing your data. Focus on high-value data first (e.g., customer data or intellectual property) and avoid overcomplicating classification; start with broad categories and refine over time.

From there, you want to set manageable policies. Avoid spending years on overly granular data classification that delays progress. Start simple and adopt a zero-trust approach that validates and authorizes every access attempt. Policies should dynamically adapt to changing business conditions, such as a document’s sensitivity shifting over time.

With the security side, we break it into a couple of different buckets.

  • Discovery – how do I find the information that is important to me
  • Classification, Tag – label or tag to identify if the file is sensitive
  • Encryption – encrypt the files themselves and assign access controls to them
  • Audit, Alert – audit to see who accessed what, when, and where

The last thing, at the very bottom, is to look at where the data is. At the top, I was talking about finding it, but now I need to look at where it physically is. Is it in the cloud, on-prem, or on endpoints? One thing we found is that no matter how much an organization tries to get its employees to save their documents into SharePoint or to a file share somewhere, they still end up with files everywhere. The goal is to be able to secure and address the documents no matter where they are.

Minimize Risks with a Data Security Platform

This is a high-level look at a platform approach that includes all the phases I mentioned before. Starting with Universal Control:

  • At rest – when an encrypted document is sitting on a hard drive somewhere, it’s protected.
  • In transit – TLS technologies protect data in transit over the network, but “in transit” could also mean copying a document to a thumb drive. Protecting the data at all times means I don’t have to worry about that.
  • Access – control access to a document at all times, regardless of where it is.
  • In Use (Copy, Print, Capture) – protects the document while being viewed or edited, as well as controlling who has access and what they can do with it.
  • After Use (Monitoring and Auditing) – this is important, especially for regulatory compliance, because a lot of organizations need to validate document access. This is essential for compliance and maintaining an audit trail.

An Ideal Platform – Data Security Posture Management

Then below are security capabilities that provide the ability to find, identify, protect, and manage documents securely in an easy-to-use way.

  • Discover
  • Classify
  • Manage
  • Protect
  • Share
  • Audit
  • Monitor
  • Analyze

With these capabilities, you don’t need specialized tools. Nor do you want to disrupt people and their work. You should be able to open your files while still maintaining all of the security that is in place.

Additional controls address unconventional risks, such as physical exfiltration via printouts or screen captures. Most security and IT teams do not think about paper as a major threat. When it comes to screens, employees could take photos of sensitive data displayed on a screen. Security can be layered into the idea of data protection.

To track data lineage, Fasoo uses a “content ID” or unique identifier for each encrypted file. This ID follows the document, even if copied or converted, allowing real-time tracking. Our CRO likes to compare it to a GPS tag. Any time a document gets encrypted, there’s a unique identifier associated with that document. If you make a derivative or a copy of that document, “save as” a PDF, or just make a copy, that tag follows it. And that’s how to track data lineage. The next time a user opens a document somewhere, I can see that and track that particular tag.
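The content ID idea can be illustrated with a small sketch. It is an assumption-laden toy, not Fasoo’s implementation: the `LineageTracker` class and its methods are hypothetical, and a random UUID stands in for whatever identifier a real product embeds in the encrypted file.

```python
import uuid
from datetime import datetime, timezone


class LineageTracker:
    """Hypothetical tracker: one content ID follows a file and its derivatives."""

    def __init__(self):
        self.files = {}   # path -> content ID
        self.events = []  # (timestamp, content ID, user, action, path)

    def _log(self, content_id, user, action, path):
        self.events.append((datetime.now(timezone.utc), content_id, user, action, path))

    def protect(self, path, user):
        # A new ID is issued only when a document is first encrypted.
        content_id = str(uuid.uuid4())
        self.files[path] = content_id
        self._log(content_id, user, "protect", path)
        return content_id

    def derive(self, src, dst, user):
        # A copy or "save as" inherits the ID of its source.
        content_id = self.files[src]
        self.files[dst] = content_id
        self._log(content_id, user, "derive", dst)
        return content_id

    def open(self, path, user):
        self._log(self.files[path], user, "open", path)

    def history(self, content_id):
        return [(user, action, path)
                for _, cid, user, action, path in self.events
                if cid == content_id]


tracker = LineageTracker()
cid = tracker.protect("q3_forecast.xlsx", "alice")
tracker.derive("q3_forecast.xlsx", "q3_forecast.pdf", "alice")
tracker.open("q3_forecast.pdf", "bob")
history = tracker.history(cid)
print(history)
```

Querying by content ID rather than by file name is what lets the PDF opened by a second user show up in the history of the original spreadsheet.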

Having flexible centralized policies is key. They minimize reliance on end-users as security experts and allow department-level managers to set policies for their teams. Consolidating logs into a single source of truth simplifies regulatory reporting and incident response.

Change management and exception management are two other important areas. We need to allow flexibility with the lower levels and their ability to control things. Policies for one department need to be set by that department. You don’t always want to go back to IT or the security organization to change policies that are going to affect people on a day-to-day basis. Having flexible policies that are central but can be decentralized, I think, is very important.

The other thing that I think is a challenge for most organizations is every tool has a log. Even if you consolidate everything into your SIEM, you still have a million pieces of data that you have to sort through. At least in the case here, if I can say that I’ve got one set of logs for all of my sensitive data, I have one source of truth, which I think just makes everything a lot simpler. If you need to prove something to a regulator or you’re just trying to understand how people are using sensitive documents, you have everything in one location.
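As a rough illustration of why a single log helps, the sketch below answers two typical questions from one list of events. The `AccessEvent` fields and the sample entries are made up for the example and do not reflect any particular product’s log format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    timestamp: str  # ISO 8601 text, kept as a string for brevity
    user: str
    document: str
    action: str
    location: str


# Made-up sample events standing in for one consolidated log.
LOG = [
    AccessEvent("2024-05-01T09:12:00Z", "alice", "customer_list.xlsx", "open", "laptop"),
    AccessEvent("2024-05-01T09:40:00Z", "bob", "customer_list.xlsx", "print", "office"),
    AccessEvent("2024-05-02T14:03:00Z", "alice", "merger_plan.docx", "open", "home"),
]


def accesses_for(document, log=LOG):
    """Who touched this document, when, and from where."""
    return [e for e in log if e.document == document]


def actions_by_user(log=LOG):
    """How often each user performed each action across all documents."""
    return Counter((e.user, e.action) for e in log)


report = accesses_for("customer_list.xlsx")
summary = actions_by_user()
print([(e.user, e.action) for e in report])
print(summary[("alice", "open")])
```

With every sensitive-document event in one place, both a regulator’s question about one file and a usage overview come from the same data.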

Fasoo Data-Centric Security Platform

As I mentioned, Fasoo focuses on data security at the file level. We really don’t do anything with databases. We’re focused on the documents and the files that everybody uses every day. We have a suite of tools that focus on the entire life cycle of a document, from finding it to tagging it to managing it and from the creation to destruction of a document.
