What is PII?

Learn industry best practices for protecting PII through data governance strategies.

PII (Personally Identifiable Information) is any information that someone else can use to impersonate you or affect your life without your consent. Your PII can be captured through interactions with companies, healthcare providers, banks, or anywhere else your identity needs to be verified. It is important that we, as a data community, think about protecting not only our own PII but also the PII of our customers, clients, and colleagues. This is not just about protecting your company from reputational, legal, or compliance risks; it is also about protecting each of us as individuals.

Protecting PII Data

By now, most of us understand that an overwhelming number of data leaks occur each year, across all industries. It may be just an email address from one company, credentials from another, and cell phone or computer information from yet another company you did not even know had your information. Because so many organizations are trying to learn about prospective customers through AI/ML modeling, the ease with which these programs can scrape the internet for all these leaks and assemble a little personal portfolio would scare most people. Articles for light reading on the subject will be presented at the bottom.

There are several challenges in safeguarding personal information, and they fundamentally start with the requirement to positively identify a person in order to prevent impersonation. How meta is that? Systems thinking tells us that when we plug a hole in one place, the pressure sometimes sprays out through another. The general rule for relieving the pressure is to reduce the volume, so let us get into that.

The common data elements that most companies focus on are:

Common PII - Full Name, Date of Birth, Address, Email Address, Phone Number, Social Security Number, Account Number, Passwords, IP Address, Device ID, Biometrics, Full-face Photographs, Driver’s License Number
PCI (Payment Card Industry) data - Cardholder’s Name, Credit Card Number, CVV
PHI (Protected Health Information) - Demographic Information, Medical History, Test and Lab Results, Mental Health Conditions, Insurance Information, Genome Sequencing, Genetic Markers, Ancestry

Reducing Risk

The first step is ALWAYS to tokenize, obscure, or remove sensitive information so that it is not exposed to the general data community within your company. This can be done through processes, tools, permissions, or architecture. There are MANY articles and best practices out there already. The plan that I recommend in most scenarios is:

1. Get rid of all the information you do not need. Not only does that reduce financial and customer exposure risk, but it typically will also reduce data processing overhead and cost.

2. Replace it with more appropriate data. For example, if you are segmenting on age or generational preferences, convert the birthdate to an age in flight, before committing the data to a hard disk or cloud account.

3. Keep the raw data as close to the source as possible. There is little reason to use most forms of PII within reporting and analytics layers, especially if the focus is on unbiased interpretation of how the data behaves.

4. Collect as little sensitive data as possible. If it is not there, it is not at risk.

5. Use AI (Artificial Intelligence) to identify areas where AI is biased. Recursively check whether patterns are self-fulfilling, or whether there is room to challenge the model's training up to this point. Think of it as AI personal growth. AI is not just for targeting revenue opportunities; it can also be used to target reputation opportunities.

6. Ask yourself and your colleagues questions during design:

If we all hate the onslaught of marketing emails and ads, why do we continue to participate in sending them out?
If the only information an AI model needs is spending habits and the location of the store to be able to pinpoint an individual, are we still protecting them by not providing the model with their name?
Will the data that I am collecting really provide enough advantage to outweigh the cost of risk to the customer?
Is there an obligation to let the individual know how we are using their data and if they consent to being targeted in a certain way? How would that simple action change things?
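
As a rough illustration of steps 1 and 2, plus the tokenization mentioned at the start of this section, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the field names (email, birthdate, ssn, store_zip), the FIELDS_TO_DROP set, the demo key, and the choice of HMAC-SHA256 as the tokenization method. A real deployment would more likely rely on a managed key store or a dedicated tokenization service. The last function is a crude check related to the design question about spending habits and store location: it reports how small the rarest group of records is for a given set of columns, since a group of one can point to an individual even without a name.

```python
import hashlib
import hmac
from collections import Counter
from datetime import date

# Hypothetical field names that downstream users do not need (step 1).
FIELDS_TO_DROP = {"ssn", "phone", "street_address"}


def age_on(birthdate: date, today: date) -> int:
    """Whole years between birthdate and today (step 2)."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def tokenize(value: str, secret_key: bytes) -> str:
    """Keyed one-way hash: the same input always gives the same token,
    so joins still work, but the token cannot be recomputed without the key."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize(record: dict, secret_key: bytes, today: date) -> dict:
    """Return a reduced copy of the record for use outside the source system."""
    cleaned = {k: v for k, v in record.items() if k not in FIELDS_TO_DROP}
    # Swap the birthdate for an age before anything is persisted.
    cleaned["age"] = age_on(cleaned.pop("birthdate"), today)
    # Swap the direct identifier for a token.
    cleaned["customer_token"] = tokenize(cleaned.pop("email"), secret_key)
    return cleaned


def smallest_group(records: list, columns: list) -> int:
    """Size of the rarest combination of values across the given columns.
    Expects at least one record."""
    groups = Counter(tuple(r[c] for c in columns) for r in records)
    return min(groups.values())


# Example with made-up data.
record = {
    "email": "pat@example.com",
    "birthdate": date(1990, 6, 15),
    "ssn": "000-00-0000",
    "store_zip": "80202",
}
safe = minimize(record, secret_key=b"demo-key-only", today=date(2023, 6, 14))
# safe now holds only store_zip, age (32) and customer_token.

sample = [
    {"store_zip": "80202", "age": 32},
    {"store_zip": "80202", "age": 32},
    {"store_zip": "80301", "age": 45},
]
rarest = smallest_group(sample, ["store_zip", "age"])  # 1: one person stands alone
```

A result of 1 from smallest_group means at least one person is unique on those columns alone, which is a signal to coarsen the data (for example, age bands instead of exact ages) or to drop a column before sharing it.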

Next Steps

If you are interested in more creative ways to reduce or eliminate the risk of PII, let Curate Insights know and we can create a custom solution for you. Ultimately, we hope that fellow dataticians will start taking on the responsibility of making sure we are headed in the right direction on a daily basis. Small corrections to our trajectory will have major impacts the further into the future we go.


Related Articles


© Curate Insights 2023