Originally Posted Here

18 Data Scientists & Security Pros Reveal The Most Common Pitfalls To Data Discovery And Classification

A data classification process makes it easier to locate and retrieve data, and it’s an important process for any data security program, as well as for risk management and compliance. By leveraging tools that can automatically locate and identify your sensitive data, companies can gain a deeper understanding of what data they possess, where it exists within the organization, and how sensitive it is, allowing them to apply the appropriate level of security to protect the company’s most sensitive information.

Despite its importance, many companies struggle with the data discovery and classification process. To gain some insight into the most common pitfalls companies face when it comes to discovering and properly classifying data, we reached out to a panel of data scientists and security leaders and asked them to answer this question:

“WHAT ARE THE MOST COMMON PITFALLS TO DATA DISCOVERY AND CLASSIFICATION AND HOW CAN YOU AVOID THEM?”

MEET OUR PANEL OF DATA SCIENTISTS & SECURITY LEADERS:

APARAJEETA DAS
@AparajeetaDas

Aparajeeta is Co-Founder & CDO of ThirdEye Data. She is also the Founder & CEO of ClouDhiti, an analytical insights company targeted to SMBs. Aparajeeta has 20+ years of hands-on experiences in data warehousing, business intelligence, historical, real time and predictive analytics combined with project delivery and management skills.

“The number one issue is that…”

Companies don’t spend enough time in data discovery to profile the data sets that they deal with on a daily basis. Most of the time, we don’t get the time to do it as businesses are over-demanding about knowing high-level knowledge from the data sets, which is understandable. The second pitfall is a lack of qualified data analysts or data scientists who love data more than algorithms or the technologies. If one doesn’t know to profile the data, how will they how to classify the data? They blindly use some random processes and algorithms and normally miss out on 20 to 25% of the truth.