Information classification versus data classification – We were here before IT!
Because information varies in value, people learned very early to categorize it – to „separate the wheat from the chaff” – and to focus their efforts on protecting what mattered most. Whether it was Rome’s struggle against Hannibal, the medieval trade in amber imported from Polish territories, or the list and proportions of ingredients needed to produce gunpowder in ancient China – all this information was available only to a select few. People intuitively sensed whom to entrust it to and from whom to hide it. This „feeling” for the confidentiality of information was a major underlying cause of the concept of information classification as we know it today.
Information classification is the practice of applying labels to data that indicate who may process it and how (together with the information it contains).
Attributes – the popular CIA triad (though the ACID model is also worth knowing):
- Confidentiality – or „secrecy”: the higher this attribute, the stronger the measures needed to protect the information from unauthorized access.
- Integrity – or „immutability”: determines how strongly the content must be protected against unauthorized modification.
- Availability – describes who the information may be shared with and on what terms, and ensures that authorized users can reach it when they need it.
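The three attributes can be sketched as a simple label structure. This is an illustrative assumption only – the level names and the 1–3 rating scale below are made up for demonstration and are not part of any standard:

```python
from dataclasses import dataclass

# Illustrative sketch: the sensitivity levels and the 1-3 rating
# scale are assumptions for demonstration, not part of any standard.
LEVELS = ["public", "confidential", "sensitive", "protected"]

@dataclass
class ClassificationLabel:
    level: str            # one of LEVELS
    confidentiality: int  # 1 (low) .. 3 (high): protection from unauthorized access
    integrity: int        # 1 .. 3: protection against unauthorized modification
    availability: int     # 1 .. 3: how widely the information may be shared

    def rank(self) -> int:
        """Numeric rank of the sensitivity level, for comparisons."""
        return LEVELS.index(self.level)

payroll = ClassificationLabel("protected", confidentiality=3, integrity=3, availability=1)
press_release = ClassificationLabel("public", confidentiality=1, integrity=2, availability=3)
```

Ranking labels numerically makes it easy to compare two documents’ sensitivity, or to enforce a rule such as „everything above rank 1 stays off the public share”.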
Classification is a fundamental element of securing information in any organization. It is the process of identifying and assigning predetermined levels of sensitivity to different types of information.
Individual organizations define the types of information that fit each category in their own way.
Usually, as good practice dictates, they use a unified sensitivity rating system:
If your organization doesn’t classify its data properly, you can’t protect it properly.
Information classification requires knowledge of its location, content, volume and context. In the era of forced technological transformation, IT resources are of course the priority location. This is where IT security comes in: the process of ensuring compliance with the security policy for the computerized part of the information system. Information leaks, knowledge of vulnerabilities enables their exploitation, data exfiltration becomes the goal of attacks, and so on.
Each of the aforementioned scenarios results in financial, operational and reputational losses. If you want to learn more, a good starting point is the risk analysis prepared by your IT department.
Today, modern companies store a major share of their information in the form of data – they process it, store it, share it, lose it and obtain it. These data are distributed across many repositories, accessed immediately (and often with poor control) from various devices by various (not necessarily authorized) users:
- Databases – local or in the cloud
- Microsoft SharePoint platforms
- Cloud Storage services
- Files such as spreadsheets, PDFs, Word documents and e-mail
Let’s say you are a security analyst in a financial or public institution – an organization whose users create millions of files containing information every day. Some of this information is highly confidential, and if it falls into the wrong hands, you could lose anywhere from hundreds of thousands to millions in penalties, damages and lost sales opportunities. That does not change the fact that most of the data created each day could run in a TV news ticker without causing any incident.
The purpose of data classification is to capture the few percent of critical data hidden in the organizational „noise” and make them visible. However, this is not the only goal.
In its Market Guide for File Analysis Software (a category that includes data classification systems), Gartner lists four general areas where this software is useful:
Security
- Restrict access to information containing personal data (PII)
- Control the location of, and access to, intellectual property (IP)
- Reduce the attack surface around sensitive data
- Provide an additional rule-enforcement parameter for other tools, e.g. DLP
Governance / Compliance
- Help identify data subject to GDPR, HIPAA, CCPA, PCI, SOX and future regulations
- Apply metadata tags to protected data to allow for additional tracking and control
- Provide for quarantine, legal hold, archiving, and other regulatory actions
- Make it much easier to implement the „right to be forgotten” and data subject access requests (DSARs)
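The metadata-tagging idea above can be sketched minimally. Storing tags in a sidecar JSON file is an assumption made purely for illustration – real products typically write tags into document properties or a central index instead:

```python
import json
from pathlib import Path

def tag_file(path: str, tags: dict) -> Path:
    """Persist classification tags next to the original file so that
    archiving, legal-hold or DSAR tooling can find and act on them.
    Sidecar-file storage is an illustrative assumption."""
    sidecar = Path(str(path) + ".tags.json")
    sidecar.write_text(json.dumps(tags, indent=2))
    return sidecar

def read_tags(path: str) -> dict:
    """Read the tags previously written for a file."""
    return json.loads(Path(str(path) + ".tags.json").read_text())
```

For example, `tag_file("contract.docx", {"classification": "confidential", "regulation": "GDPR", "legal_hold": True})` leaves a machine-readable trail that downstream compliance tooling can query.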
Performance and optimization
- Provide effective access to content based on type, usage, etc.
- Locate outdated or redundant data
- Help optimize processes – e.g. identify heavily used data for migration to faster technologies or cloud infrastructure
- Tag metadata to optimize business operations
- Inform the organization of the location and use of data
Different organizations define the types of information that fit each category and choose which of the above areas they want to improve. Their schemes often share a common hierarchy of sensitivity: protected, sensitive, confidential and public. However, data classification is a much broader topic.
Data classification can be done based on content, context, or user choices:
- Content-based – involves scanning files and documents and classifying them based on what they contain or represent
- Context-based – classifies files based on metadata such as the application that created the file (e.g. MS Word), the person who created the document (AD account), or the location where the files were created or modified (e.g. a specific repository)
- User-based – relies on the user creating or editing a document/file to choose the classification label manually
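The three schemes can be combined in a toy classifier. Everything here – the PII patterns, the label names and the `hr-share` repository – is an assumption for illustration, not any product’s actual rule set:

```python
import re
from typing import Optional

# Deliberately simplistic example patterns for content-based detection.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(content: str, metadata: dict, user_label: Optional[str] = None) -> str:
    # User-based: an explicit label chosen by the author wins.
    if user_label:
        return user_label
    # Content-based: scan the text for sensitive patterns.
    if any(p.search(content) for p in PII_PATTERNS.values()):
        return "sensitive"
    # Context-based: fall back to metadata, e.g. the source repository.
    if metadata.get("repository") == "hr-share":  # hypothetical repository name
        return "confidential"
    return "public"
```

Real products layer these schemes in exactly this order of precedence far more elaborately, but the principle – explicit user choice, then content scanning, then contextual defaults – is the same.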
The above classification schemes are used in File Analysis Software tools, which allow companies to take the first step toward protecting information. You cannot protect something that has not been precisely located and defined. And it has long been known that an inventory – especially a data inventory – is a prime object of auditors’ interest.