Data Classification Concepts

Here are my thoughts on how data can be classified at a University.

Definitions:

  • Data or Information is anything recorded in an electronic format or on paper.
  • Data or Information security consists of ensuring that only authorized persons have proper access to data.
  • Access is the ability to view and/or modify data when needed.
  • Security is validated by proper audits.  The audits can be done manually or through automated programs that monitor actual access and report on who has what access to the data.
  • A role has specific access to a specific set of information.
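The automated audit described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not a real audit tool: the `AUTHORIZED` table, the log format, and both function names are hypothetical, and the table is assumed to have already been derived from each person's roles.

```python
from collections import defaultdict

# Hypothetical authorized access, assumed to be already derived from roles:
# person -> set of (data set, action) pairs.
AUTHORIZED = {
    "alice": {("grades", "view"), ("grades", "modify")},
    "bob": {("grades", "view")},
}


def who_has_access(authorized):
    """Invert the table: (data set, action) -> sorted list of people."""
    report = defaultdict(list)
    for person, rights in authorized.items():
        for right in rights:
            report[right].append(person)
    return {right: sorted(people) for right, people in report.items()}


def unauthorized_entries(access_log, authorized):
    """Return logged accesses that no authorization covers."""
    return [
        (person, data, action)
        for person, data, action in access_log
        if (data, action) not in authorized.get(person, set())
    ]


# Hypothetical access log: (person, data set, action).
log = [
    ("alice", "grades", "modify"),
    ("bob", "grades", "view"),
    ("bob", "grades", "modify"),  # bob may only view
]

report = who_has_access(AUTHORIZED)
violations = unauthorized_entries(log, AUTHORIZED)
```

The first function answers "who has what access" and the second compares actual access against it, which are the two halves of the audit in the definition.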

Data Classification POVs:

There are many different kinds of data at a University.  Data can be classified by looking at

  • how sensitive the information is
  • how it is regulated and
  • how it needs to be protected.

The above are NOT classifications by themselves but different points of view when looking at data that needs to be classified.

While these points of view overlap, they are different.  The same piece of data may be public, highly regulated and still need a lot of protection.

A simple example is a student’s name.  Usually a student’s name is not sensitive information unless there is an order of protection or the student has chosen to assert their rights under FERPA to keep it private.  A student’s name is highly regulated by many laws.  It is (usually) not protected from viewing but is highly protected from modification by almost everyone.

Another example is a Social Security number (SSN).  An SSN is sensitive and highly regulated, but it becomes highly protected from viewing only when it is associated with another piece of information like a name.

A press release may not be regulated at all but may contain time-sensitive information and be highly damaging if released prematurely.  The campus web page does not have sensitive or regulated information but needs to be highly protected from modification by unauthorized persons.  Finally, regulation does not always mean secrecy: many regulations require us to publish data (e.g. certain research data, Clery Act disclosures).
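One way to picture these independent points of view is as separate fields on a single record. The sketch below is only an illustration: the class names, the three `Protection` levels, and the specific values assigned to each item are my assumptions, chosen to mirror the examples above.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1
    RESTRICTED = 2
    CONFIDENTIAL = 3


class Protection(Enum):  # hypothetical levels, for illustration only
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class DataItem:
    name: str
    sensitivity: Sensitivity
    regulations: frozenset  # e.g. {"FERPA"}; empty if unregulated
    view_protection: Protection
    modify_protection: Protection


# The usual case for a student's name: not sensitive, regulated,
# open for viewing, tightly protected from modification.
student_name = DataItem(
    "student name", Sensitivity.PUBLIC, frozenset({"FERPA"}),
    Protection.LOW, Protection.HIGH,
)

# The campus web page: not sensitive, not regulated,
# but still tightly protected from modification.
campus_web_page = DataItem(
    "campus web page", Sensitivity.PUBLIC, frozenset(),
    Protection.LOW, Protection.HIGH,
)
```

Both items share a sensitivity level and a protection profile, yet only one is regulated, which is why no single axis is enough to classify data.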

Roles:

In addition to classifying data, we also need to mention the various roles that different people play at a University.  A person usually has multiple roles, and in that case the person’s access is additive: it is the sum of the access granted by every role that the person has.  Most roles are granted in an ad-hoc manner and are only tangentially related to a person’s official HR title or job description.  An example of ad-hoc role granting is the role granted to a specific instructor to grade a specific course. Another example is the role granted to a student to view their own grade.
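Ad-hoc, additive roles of this kind can be sketched as a flat list of scoped grants. Everything here is hypothetical (the `Grant` fields, the people, the course codes, and the `"*"` wildcard); it is meant only to show that access is the union of whatever grants a person holds.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    person: str
    action: str   # e.g. "grade" or "view_grade"
    course: str
    student: str  # "*" means every student in the course


# Hypothetical ad-hoc grants, unrelated to anyone's HR title.
GRANTS = [
    Grant("pat", "grade", "CSE101", "*"),         # instructor for one course
    Grant("pat", "view_grade", "MAT200", "pat"),  # same person is also a student
    Grant("sam", "view_grade", "CSE101", "sam"),  # student sees own grade
]


def allowed(person, action, course, student):
    """Access is additive: any single matching grant is enough."""
    return any(
        g.person == person
        and g.action == action
        and g.course == course
        and g.student in ("*", student)
        for g in GRANTS
    )
```

For example, `allowed("pat", "grade", "CSE101", "sam")` is true, while `allowed("sam", "view_grade", "CSE101", "pat")` is false because no grant lets sam see anyone else's grade.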

 

Data Sensitivity

One way to look at data at a University is by how sensitive the data is.  Some popular classifications are:

  • Public
  • Restricted (FOUO)
  • Confidential

Data can and often does move between these sensitivity levels.

Public data

Most public data starts out restricted or confidential while being drafted but is then released for public consumption. Examples are public web sites, books, brochures, class lectures.  While this information is not protected from viewing, it is still protected from modification and there is reputational harm that can happen if information is modified in an unauthorized manner.  Because of this, the location of this data should be highly protected from modification.

Restricted (FOUO)

A vast majority of the internal deliberations of the University including administrative emails, budgets, research work and class discussions, homework, class projects etc. could be classified as what the Federal Government calls “For Official Use Only”.  It is not confidential information and release of it most probably will not cause any serious repercussions or reputational harm but it is just not anyone’s business to pry into.  In quite a few instances this information would be available under Freedom of Information Laws. Some of the information under this category could be regulated by various laws. This information needs a moderate level of protection from unauthorized access, both for viewing and modification.

Confidential

Release of Confidential information could cause serious reputational and/or legal harm to the University or its members.  Most items under this classification are highly regulated, although it could include items like pending press releases or donor relations that, while not regulated, are very sensitive.  Regulated items include personnel actions, HIPAA, FERPA, FISMA, export control data and related data.  However, not all regulated data is confidential, as some regulations actually require or allow disclosure.  Another category of unregulated but highly sensitive data under this classification is credit card (PCI) data.  Passwords are yet another example.  In all of these cases unauthorized access will be very harmful, so the data requires a high level of protection.

Regulated Data

Data can also be classified by what regulation(s) it is covered under, and at a University we have a lot of regulations to contend with.  These regulations require us to collect, store and protect data.  In all cases we need to make sure that regulated data is protected from unauthorized modification, and in almost all cases very few persons are to be allowed access to even view the data.  The data is also highly compartmentalized, e.g. a patient should be able to view their own record but not another patient’s record.  Major regulations that affect us are:

  • HIPAA
  • FERPA
  • Financial aid and banking data
  • NYS Archives laws
  • NYS Education laws
  • FISMA
  • Export Control

 

Data Protection

From a user’s point of view, data ultimately has to be stored somewhere, and the level of protection it is afforded is another way to classify it.  Data is protected at a University based on various laws and on the University’s own perception of risk and cost.  Confidential and regulated data is usually accorded the highest and most expensive level of protection.  This data is usually kept in systems like ERPs, on protected file shares, and in email systems.

Data protection is achieved both by proper access control and by proper systems.  It consists not only of protecting the data from unauthorized access but also of protecting it from loss because of fire, theft, disasters, hardware failure etc. so that it is available for access when needed.  Typical categories are:

  • Public access but highly protected from modification (e.g. web sites)
  • Highly protected (e.g. PeopleSoft database, File shares, most regulated data, PCI)
  • Well protected but accessible by the University members and guests (e.g. Google Apps, BlackBoard, Yammer, You, ePortfolio)
  • Long term archival (e.g. Research data)

Roles

Every piece and every kind of data will have different access roles associated with it.  Most of the roles will have explicit or implicit delegations associated with them.  The term “person” below may include a group of people with overlapping responsibilities for different parts of the same data.  From a risk management perspective, there are a few generic roles that can be summarized as follows:

  • Data Custodian: The person legally liable for the data.
  • System Admin: The person who actually controls access to the data and is responsible for securing it.
  • Developer: The person who makes the data usable.
  • End user:  The person who accesses the data to actually use it.

References

Stony Brook Policies:

Stony Brook FERPA policy
Stony Brook Export Controls

Data classification policy based on sensitivity of data (Public, restricted, confidential):

Carnegie Mellon University Data Classification Policy
California State Data Classification
Columbia University Data Classification Policy
University of Georgia Data Classification Standard
Kansas State Data Classification and Security Policy
University of Florida Data Classification
Purdue University Data Classification and Governance
University of Iowa Institutional Data Access Policy
Indiana University Standards for the Management of Institutional Data
Boston College Data Security Policy
University of Notre Dame Highly Sensitive Information Handling Standard

Regulated data guidelines based on type of data (Export control, HIPAA, etc.):

Georgetown University Data Classification Chart
Cornell University Regulated Data Chart
Cornell University Regulated Data Definitions
Boston College Regulated University Data Chart
University of Michigan Sensitive Data Guide to IT Services
University of Texas Austin Data Classification Standard
University of Oklahoma Data Classification Standard
Indiana University Classifications of Institutional Data
Stanford University Data Classification, Access, Transmittal, and Storage

 

A risk embracing organization

Yesterday I wrote about control and service.  Increased control can reduce risk.  However, to really move forward and provide better service we need to embrace risk: “Nothing ventured, nothing gained.”

All ventures are risky.  An organization has to embrace the taking of risks by its employees and reward them for taking “reasonable” risks.  Few organizations encourage risk takers because they are afraid that risk takers will eventually commit blunders.  The definition of what a “reasonable” risk is will vary widely.

Most organizations are risk averse.  Universities are especially risk averse as most of their value is in their reputation.  A reputation built over centuries can be destroyed by one stupid mistake.  Tenure and permanent appointments are a small way to overcome this attitude, but the time involved and the process of earning that status make most employees sclerotic and very risk averse.  A probationary employee is too scared to take risks to begin with, and seven years of training in not taking risks embeds that attitude “permanently”.

This loss of enterprise is especially significant as organizations and people get older (you can call me ageist) because they have more invested and have more to lose.  This tendency has to be countered by a risk-embracing organizational culture, or it invites stagnation and decay.

Information Technology: Service versus Control

In my post yesterday I had written about security and efficiency.  Closely related to that topic is service and control.

I like to tell my staff that IT is in the service business and we are not in the control business.  IT staff has to exercise control but the goal of that control should always be to provide better service. The goal should never be “I will control because I can”.  Certain areas of IT tend to attract control freaks and I have to confess that the power that control gives can be quite alluring.

IT employees also like control because it makes their life easier.  It is easier to say “go away” than to provide service by actually listening to what the user needs and taking on the inherent risks in implementing anything.

Organizations delegate a lot of control to IT for many reasons, including security and cost containment, but that control usually keeps the rest of the organization from getting its job done as efficiently as it would like.  The solution is openness and communication.  If the rest of the organization knows why IT is required to say “no” to their latest idea, it will hopefully lead to a friendly joint exploration of alternative solutions that would be better for the organization as a whole.

On Security and Efficiency

Security and Efficiency are closely tied.  When I use the word “efficiency”, it includes terms like productivity, work done, profitability, performance or any other term that indicates results that an organization wants.

Qualitatively, the relationship between efficiency and security can be described by a bell curve (http://www.rosshudgens.com/wp-content/uploads/2010/12/Bell_Curve.gif), where efficiency is the vertical axis and security is the horizontal axis.

When you have no security, no work gets done and efficiency is zero, because all your work is stolen or compromised.  When security is very high, no work gets done either and efficiency is again zero.  A powered-down computer system is very secure and immune to hacking, but nobody can use it.

The trick is to find the level of security that maximizes efficiency.  This will vary with the organization, the individual, with the kind of security and the kind of work being done.
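To make the idea of “the level that maximizes efficiency” concrete, here is a toy model in Python. The formula and its exponents are purely my assumptions, picked only because they give a curve that is zero at both extremes with a single peak in between; nothing here is measured.

```python
def efficiency(security, a=2.0, b=3.0):
    """Toy model: zero at security 0 and 1, single peak in between.

    The shape s**a * (1 - s)**b and the exponents are assumptions chosen
    only to reproduce the qualitative curve; they are not measured.
    """
    return security ** a * (1.0 - security) ** b


def best_security(a=2.0, b=3.0, steps=1000):
    """Grid-search the security level in [0, 1] that maximizes efficiency."""
    levels = [i / steps for i in range(steps + 1)]
    return max(levels, key=lambda s: efficiency(s, a, b))
```

With these exponents the peak sits at a / (a + b) = 0.4 on a 0-to-1 scale. Changing `a` and `b` moves the peak left or right, which is one way to picture how the best level differs between organizations and kinds of work.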

Traditional security personnel tend to be on the right-hand side of the curve.  Users, developers and traditional user support tend to be on the left-hand side of the curve.

The goal of an organization should be to be at the top of the curve.  Not too much security, not too little security but the right amount to maximize productivity across the organization.