What Unstructured Data is Sensitive?
Sensitive Unstructured Data

“Threat actors are having more success with breaching and exfiltrating sensitive unstructured data targets.”


Your organization’s sensitive unstructured data is a rapidly growing threat surface increasingly targeted by cybercriminals and threat actors. While more attacks are directed at structured databases, cybercriminals are having greater success in stealing sensitive unstructured data.

This is because this type of data poses a unique set of security and privacy regulation challenges, many of which are not addressed by today’s investments in network, device and application security, cybersecurity frameworks or traditional vulnerability management strategies.

Unlike structured data that resides in well-protected IT perimeters, sensitive content exists in unstructured formats such as office documents, CAD/CAE files, or images and is distributed and published via file sharing, social media and email. You generate it when HR collects personal employee information, your sales teams add customer contact information to your customer relationship management (CRM) system, your engineering or security teams work with third parties on intellectual property (IP), and so on.


Examples of sensitive unstructured data include:

  • New product plans
  • Product designs
  • Customer information
  • Supplier information/third-party contracts
  • Competitor research
  • Customer surveys
  • Software code
  • Job applications and employee contracts
  • Internal processes and procedure manuals
  • Data analytics: Google Analytics, Tableau and Salesforce reports


Regulations that may govern this data include:

  • California Consumer Privacy Act (CCPA)
  • General Data Protection Regulation (GDPR)
  • Health Insurance Portability and Accountability Act (HIPAA)
  • Gramm–Leach–Bliley Act (GLBA)
  • Personal Information Protection and Electronic Documents Act (PIPEDA)
  • New York State Department of Financial Services Cybersecurity Regulation (23 NYCRR 500)
  • Payment Card Industry Data Security Standard (PCI DSS)

“A dangerous gap has emerged …”

Sharing and storing sensitive information in free-form documents that live outside carefully monitored or secured databases is now a widespread practice. This creates a gap that presents countless opportunities for unauthorized disclosure through inadvertent mishandling by employees, actions of malicious insiders, and cyberattacks.

Businesses are mobilizing to combat these threats. The first step is to ensure your organization understands the character, significance and challenges surrounding sensitive unstructured data. Focus on these topics to build organizational insight into why the gap exists and what can be done now to close it.

  • Who cares about it?
  • What are the types?
  • How are sensitivity levels determined?
  • What are the next steps?

Who cares about sensitive unstructured data?

Unauthorized access or loss of sensitive data hurts your competitive advantage, damages your brand, and can incur significant regulatory penalties.




After a breach, some customers will stop spending for several months, and some will never return to your brand.



Beyond customers and the potential loss of revenue, several other groups have a stake:

  • Breach of partner information exposes the business to legal damages and seriously impacts the relationship and reputation of both parties.
  • Regulators are responding to increased threats and individual rights. Over 80 countries now have published privacy laws. Non-compliance penalties are increasing and more strictly enforced. Your data may be subject to overlapping and often conflicting requirements.
  • Corporate Governance, Risk and Compliance (GRC) committees define sensitivity levels and handling policies for sensitive information. New threats and trends must be reflected in these policies so that they guide the systems and procedures that safeguard this content.
  • Security and IT professionals have spent considerable time on network perimeter tools, yet gap analysis shows shortfalls in safeguarding unstructured data. To fix this, they are turning to data-centric approaches and tools that protect the data itself rather than its location.
  • Employees create and share unstructured content such as office documents, PDFs, and CAD/CAE files, internally and externally, every day. They should protect that content in a manner appropriate to its sensitivity level (e.g., confidential, internal, public).

What are the types of sensitive unstructured data?

Sensitive data is any information that must be safeguarded from unauthorized disclosure. The broadest categories are regulated and unregulated. The former must, by law, be handled as sensitive. Unregulated data includes both business-sensitive and publicly known information. It’s up to the business to determine which unregulated content it deems sensitive.

Regulated data arises from:


Privacy Regulations: Information that personally identifies an individual and associates that individual with financial, healthcare, and other data.


Industry Regulations: Industry-sensitive data. Examples include weapon system data governed by the International Traffic in Arms Regulations (ITAR) and critical infrastructure data governed by North American Electric Reliability Corporation (NERC) standards.


Protected Health Information (PHI), Personally Identifiable Information (PII), and payment card data covered by the Payment Card Industry Data Security Standard (PCI DSS) remain the traditional categories of individual privacy data. By gaining access to this valuable data, cybercriminals can steal identities and/or compromise bank accounts to earn a quick profit.

Modern-day privacy regulations, such as GDPR and CCPA, have broadened the definition of regulated information to include individual interactions in the digital space, placing significant new obligations on companies.

Unregulated data of a sensitive nature is determined by the business. It is data the business doesn’t want exposed and can be strategic, competitive, financial or operational in nature. Examples include:

  • IP: Patents, trademarks, formulas, R&D programs, source code
  • Strategic: Pending financial releases, ongoing M&A transactions, internal risk deliberations
  • Operations: Inventory levels, pricing policies, customer lists

Today’s cybercriminals are opportunistic. They look for companies that are involved in a current event or that have an obvious vulnerability they can exploit for the most value. Examples include stealing data about important drugs or vaccines under development, or exposing damaging information from an ongoing legal proceeding.

Interestingly, breaches of unregulated sensitive content often stay hidden. Unlike regulated data, this content is not subject to disclosure requirements, so organizations often choose to avoid the reputational damage associated with publicizing a breach.

How are sensitivity levels determined?

Regulated data is always sensitive. Most unregulated data is not, because it includes publicly known information.

Your corporate GRC team or chartered committee determines what data is sensitive. They consider all internal and external mandates, the nature of the data, how it is being used, the likelihood of a breach, and its overall impact on your organization (financial and reputational).

Helpfully, policies have become standardized across industries, with “templates and toolkits” that leave little to chance and that you can implement with reasonable effort.

Best practices recommend three classification levels (e.g., confidential, internal, public), four at most. With more levels, the distinctions become too fine for employees to assess, which results in subjective and inconsistent application.
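
As a minimal sketch of the three-level scheme, the Python snippet below models the levels as an enumeration and applies one possible decision rule. The names `Sensitivity` and `classify`, and the rule itself (regulated or business-sensitive content is confidential, publicly known content is public, everything else is internal), are illustrative assumptions rather than a prescribed policy. Your GRC team defines the real criteria.

```python
from enum import Enum


class Sensitivity(Enum):
    """The three example levels named in the text."""
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


def classify(regulated: bool, business_sensitive: bool, publicly_known: bool) -> Sensitivity:
    """Hypothetical decision rule; a real policy comes from the GRC team."""
    if regulated or business_sensitive:
        # Regulated data is always sensitive; the business decides the rest.
        return Sensitivity.CONFIDENTIAL
    if publicly_known:
        return Sensitivity.PUBLIC
    return Sensitivity.INTERNAL


# A file containing patient records is regulated, so it is confidential.
print(classify(regulated=True, business_sensitive=False, publicly_known=False).value)
```

Keeping the rule to a handful of yes/no questions is one way to avoid the subjective, inconsistent application that comes with too many levels.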

To move from templated policies to meaningful execution, it’s critical that the GRC team help security and IT professionals in your organization prioritize sensitive unstructured data tasks by directing attention to factors such as:

  • Not all data leaks are equal: The business impact varies depending on the sensitivity of the data and the extent of exposure. Determine what sensitive data, if lost, would hurt your company’s finances and reputation the most.
  • Identify how your sensitive data is shared and stored: What data is at highest risk of being stolen? Not all threats are external. Insider threats are responsible for some of the costliest breaches.
  • Employees: Verizon’s 2020 Data Breach Investigations Report states “employees mistakes account for roughly the same number of breaches as external parties who are actively attacking you.” Education, automation, and centralized controls are critical.
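
One common way to act on these factors is to score each data set by business impact and likelihood of exposure, then work through the list from the highest score down. The sketch below assumes a simple 1 to 5 scale for both values and multiplies them. The scale, the `DataSet` type, and the sample inventory are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    impact: int       # assumed scale: 1 (minor) to 5 (severe) financial and reputational harm
    likelihood: int   # assumed scale: 1 (rare) to 5 (frequent) chance of exposure


def prioritize(datasets: list[DataSet]) -> list[DataSet]:
    """Order data sets from highest to lowest risk score (impact * likelihood)."""
    return sorted(datasets, key=lambda d: d.impact * d.likelihood, reverse=True)


# Made-up inventory for illustration only.
inventory = [
    DataSet("customer lists on a shared drive", impact=4, likelihood=4),
    DataSet("published marketing brochures", impact=1, likelihood=5),
    DataSet("M&A deliberations in email", impact=5, likelihood=2),
]

for d in prioritize(inventory):
    print(d.name, d.impact * d.likelihood)
```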

What are the next steps?

The dynamics surrounding sensitive unstructured data can be daunting. Focusing on a few key steps provides a meaningful path forward:


Consider current trends and update best practices. Most organizations have some form of GRC policies, but the focus has been on structured data security and handling. Locate all potential sources of unstructured data, independent of sensitivity. This helps operationalize the process and keeps your project on task.


Look for gaps in the security infrastructure, and take advantage of data-centric approaches, processes, and tools that safeguard the data itself rather than the place where it resides (servers, laptops, mobile devices).


Employees need one thing – to get their work done. They will benefit most from automated sensitive data classification that minimizes impact to their workflows. They will be more receptive and committed to the effort if the policies are clearly communicated and outlined for them.


Six Vulnerable Points In Your Data Security Architecture and How You Can Protect Them

Do you know where you are most vulnerable? Now is the time to check these key trends:



  • Hybrid and Multi-Cloud
  • Privacy
  • Insider Threat
  • Security Gaps
  • Remote Workforce
  • Third-Party Collaboration

1. Hybrid and Multi-Cloud Environment

According to Flexera’s “State of the Cloud, 2020 Report”, organizations use an average of 2.2 public and private cloud providers. This exposes your data to the following risks:


Identity and Access Management (IAM): You may have heard the phrase, “identity is the new perimeter”. This “new perimeter” is the intersection of users, devices, and cloud services. Due to the COVID-19 pandemic and increasing regulations, many companies across the globe have had to reconsider how much access their employees have to their systems, applications, and data.


Security: Educate your Governance, Risk and Compliance (GRC), IT security, and Human Resources (HR) teams on the latest risks and make sure they have the data-centric tools they need to combat them. Ultimately, a breach will significantly impact your organization’s reputation and finances.


Data Residency: Cloud environments are boundless, and your data can be located anywhere in the world. Legal and regulatory requirements are imposed on data according to the country or region in which it resides. Review where your sensitive unstructured data is stored (on or off premises) and make updates accordingly.


A data-centric approach identifies files and secures them in a centralized management system to provide consistency across all channels. Discovery tools help locate your data and classify it with specific tags that control where in the cloud it may be stored.
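
To make the discovery step concrete, here is a minimal sketch that walks a folder, scans text files for two simple patterns, and returns a tag set per file. The patterns (a US Social Security number shape and an email address), the tag names, and the restriction to `.txt` files are all simplifying assumptions. Commercial discovery tools inspect many more formats and use far richer detection.

```python
import re
from pathlib import Path

# Illustrative patterns only; real discovery tools use far richer detection.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def tag_text(text: str) -> set[str]:
    """Return the names of all patterns found in the text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def discover(root: Path) -> dict[Path, set[str]]:
    """Walk a directory tree and tag every .txt file that matches a pattern."""
    results = {}
    for path in root.rglob("*.txt"):
        tags = tag_text(path.read_text(errors="ignore"))
        if tags:
            results[path] = tags
    return results


print(sorted(tag_text("Contact jane@example.com, SSN 123-45-6789")))
```

The resulting tags could then feed a placement rule, for example keeping any file tagged `ssn` in a specific region.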


2. Privacy

Today’s privacy regulations demand greater visibility and control over an individual’s data.

Regulation types include:

  • Responding to the Rights of Individuals: Regulations such as General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) give individuals greater rights to their personal data. Data subject and consent rights must be associated with all information collected on an individual.
  • Access and Revoke: Every file access (system and user) must be traceable for the data collected. Individuals can elect how and when their data is used. The “right to be forgotten” requires total removal of all data and most transactions. Your organization’s staffing department must respond promptly to any individual privacy and audit requests. Breach notification timelines have tightened (GDPR, for example, calls for notifying the supervisory authority within 72 hours of becoming aware of a breach, where feasible).


Deep visibility tools accumulate access information during the entire lifecycle of the sensitive unstructured data. You should avoid traditional tools that provide limited visibility and require forensic action to correlate and search across multiple log files.
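
The sketch below shows the idea behind a single access trail: every event is recorded in one place, so a privacy or audit request can be answered with one query instead of a search across multiple log files. The `AccessEvent` and `AccessTrail` names and fields are hypothetical, and a production system would persist events in tamper-resistant storage rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    file_id: str
    actor: str      # user or system account
    action: str     # e.g. "open", "share", "delete"
    at: datetime


@dataclass
class AccessTrail:
    """One store of access events covering each file's whole lifecycle."""
    events: list[AccessEvent] = field(default_factory=list)

    def record(self, file_id: str, actor: str, action: str) -> None:
        self.events.append(AccessEvent(file_id, actor, action, datetime.now(timezone.utc)))

    def history(self, file_id: str) -> list[AccessEvent]:
        """All events for one file, in the order they were recorded."""
        return [e for e in self.events if e.file_id == file_id]

    def files_touched_by(self, actor: str) -> set[str]:
        """Every file a given user or system account has accessed."""
        return {e.file_id for e in self.events if e.actor == actor}


trail = AccessTrail()
trail.record("contract.docx", "alice", "open")
trail.record("contract.docx", "backup-service", "open")
trail.record("design.dwg", "alice", "share")
print([e.actor for e in trail.history("contract.docx")])
```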


3. Insider Threat

While external threats from hackers and cybercriminals make the headlines, trusted insiders can pose a greater threat to your sensitive unstructured data. A traditional security infrastructure focuses on external threats using firewalls, anti-malware, intrusion detection, and other security solutions. These solutions may not prevent an employee, contractor or third-party vendor who has access from sharing sensitive data with unauthorized users.

There are three types of insider threats that require your attention:


Accidental: An employee or contractor may accidentally share a document with the wrong person, exposing sensitive data. Once out of the person’s control, the information could go anywhere, violating privacy regulations and compromising your competitive position.


Negligence: An IT or security administrator forgets to apply a security patch or update a firewall rule, exposing your sensitive unstructured data to theft. This is most likely an oversight, since many IT and security groups are overworked and understaffed. Another example would be a user who deliberately circumvents security policies.


Malicious: Employees, contractors or partners may want to harm your organization or make money by selling valuable information to competitors. This type of insider threat is difficult to stop because many of these people have a legitimate need to access sensitive unstructured data.


Encrypt files and apply rights management to decrease the likelihood of unauthorized users accessing your sensitive unstructured data. If hackers and cybercriminals exfiltrate protected sensitive data, it will be useless to them. The same goes for employees or contractors who want to take sensitive data.


4. Security Gaps

Despite significant investments in security infrastructure and the deployment of data loss prevention capabilities, breaches are at all-time highs. Threat actors have greater success exfiltrating information on endpoints and servers where sensitive unstructured data is common.

What you need to acknowledge and have teams address:

  • Beyond prevention: Data Loss Prevention (DLP) blocks risky activity involving sensitive data but doesn’t protect the data itself. Data breaches continue. Organizations and regulators are recommending the increased use of encryption to address the challenge.
  • Not a breach: Many regulations take into account whether the exposed data was encrypted when deciding whether an incident counts as a breach. Fines can be significantly reduced as a result.
  • Ransomware: While companies may still be subject to disruption, often the most significant risk is sensitive data being exposed to the public or provided to others for financial gain. Encrypting the data mitigates this risk. Encryption is called for in modern-day regulations such as GDPR, CCPA, and New York State Department of Financial Services (23 NYCRR 500).


Enhance existing DLP investments by encrypting files with sensitive data. Use centralized encryption key management to maintain protection and control wherever the file travels.


5. Remote Workforce

This is a significant trend that’s been recently accelerated by COVID-19. Security and privacy implemented in corporate offices can’t be replicated at each home. Review your current policies to see if they address:


Home office/Virtual Workspaces: Work is more likely to happen on unmanaged and shared devices, over insecure networks, and in unauthorized or non-compliant apps.


Increased downloads: Slow networks and the convenience of working on and sharing local files both increase the volume of sensitive unstructured data on endpoints.


Insider threat: Without safety precautions, unintentional errors that disclose sensitive content increase. Malicious intent from at-risk employees with access to home-based, non-sanctioned portable drives and printers is particularly concerning.


Use strong data-in-use tools like rights management capabilities that restrict printing and storing content on removable media.
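
A rights management policy of this kind can be pictured as a set of usage rights granted per sensitivity level and checked before each action. The sketch below is an assumption-laden illustration: the `Rights` flags, the `POLICY` table, and the specific grants per level are hypothetical, and real rights management products enforce such rules inside the viewing application rather than in a standalone function.

```python
from enum import Flag, auto


class Rights(Flag):
    VIEW = auto()
    EDIT = auto()
    PRINT = auto()
    COPY_TO_REMOVABLE = auto()


# Hypothetical policy: usage rights granted per sensitivity level.
POLICY = {
    "public": Rights.VIEW | Rights.EDIT | Rights.PRINT | Rights.COPY_TO_REMOVABLE,
    "internal": Rights.VIEW | Rights.EDIT | Rights.PRINT,
    "confidential": Rights.VIEW | Rights.EDIT,
}


def is_allowed(level: str, requested: Rights) -> bool:
    """True only if every requested right is granted for the level."""
    granted = POLICY[level]
    return (granted & requested) == requested


# Printing a confidential file at home is refused under this policy.
print(is_allowed("confidential", Rights.PRINT))
```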


6. Secure Third-Party Collaboration

Customer information shared with others remains your responsibility, regardless of who leaks the data. The challenges here are:


Loss of control: Once outside your organization, highly sensitive information can be shared either unknowingly or for improper business advantage that hurts your competitiveness.


Screen sharing: Zoom, Skype, WebEx, Google Chat and Google Meet, Microsoft Teams, Free Conference Call, and similar applications expose sensitive information to screen capture by others.


End of project: Sensitive information often remains with third parties long after the project or relationship ends, often unprotected.


Deploy agentless, browser-based collaboration with file tracking and protection. Blocking screen capture of sensitive information during collaboration sessions helps prevent data loss. Revoke third-party access to sensitive files once it is no longer needed.
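
The end-of-project problem can be sketched as access grants that carry an expiry date and can also be revoked on demand. The `SharedFile` class below is a hypothetical in-memory model, not the API of any particular product. It only shows why access tied to the file, rather than to a copy, can be withdrawn after sharing.

```python
from datetime import date


class SharedFile:
    """Tracks which third parties may open a file and until when."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._grants: dict[str, date] = {}  # partner -> last day of access

    def share(self, partner: str, until: date) -> None:
        self._grants[partner] = until

    def revoke(self, partner: str) -> None:
        """Remove access immediately, e.g. when the project ends early."""
        self._grants.pop(partner, None)

    def can_open(self, partner: str, today: date) -> bool:
        expiry = self._grants.get(partner)
        return expiry is not None and today <= expiry


spec = SharedFile("product-spec.pdf")
spec.share("supplier-a", until=date(2021, 6, 30))
print(spec.can_open("supplier-a", date(2021, 6, 1)))   # within the project window
spec.revoke("supplier-a")
print(spec.can_open("supplier-a", date(2021, 6, 2)))   # access withdrawn
```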


Proactive organizations stay ahead of these vulnerabilities by acting early to evaluate the impact of safeguarding their sensitive unstructured data.

Recommended best practices include:



  • Update GRC policies to reflect new guidance
  • Perform security gap analysis of current infrastructure
  • Implement employee awareness training as new risk and threat vectors emerge

Educate and empower your organization to stay one step ahead of hackers, cybercriminals, threat actors, and those with malicious intent.

