Sunday, July 26, 2009

Data Governance

A recently sponsored survey of 50+ Global 5000-size businesses looked at their investments in “data governance” and the challenges they are facing. Among the findings:

  • 84% believe that poor data governance can cause: limited user acceptance, lower productivity, reduced business decision accuracy, and higher total cost of ownership
  • Only 27% have centralized data ownership
  • Fully 66% have not documented or communicated their program, and
  • 50% have no KPIs or measurements of success

What is Data Governance?

By its formal definition, data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be held accountable for any adverse event that happens because of poor data quality. It is about putting people in charge of fixing and preventing data issues so that the enterprise can become more efficient.

Data Governance is the application of policies and processes that:

  • Maximize the value of data within an organization
  • Manage what data is collected and determine how it is used

Why Data Governance?

“You can't protect data if you don't know what it is worth.”

To know what it is worth, you have to know where it is, how it is used, and where and when to integrate and federate it.

Data governance initiatives are often driven by a desire to improve data quality, but even more often they are driven by external regulations such as Sarbanes-Oxley, Basel II, HIPAA, and various data privacy regulations. To achieve compliance with these regulations, business processes and controls require formal management processes to govern the data they cover.

Common themes among the external regulations center on the need to manage risk. The risks can be financial misstatement, inadvertent release of sensitive data, or poor data quality for key decisions. When management understands the value of data and the probability of risk, it is then possible to evaluate how much to spend to protect and manage it, as well as where investments should be made in adequate controls.

A best practice within companies successfully implementing data governance is the collaboration between IT management and business leadership to design and refine “future state” business processes associated with data governance commitments. Moreover, a strong data governance function is very important to deliver reliable and usable business information.

Such a corporate data governance function can help businesses avoid these symptoms of poorly executing IT organizations:

  • Overly complex IT infrastructure
  • Silo-driven, application area-centric solutions
  • Slow-to-market delivery of new or enhanced application solutions
  • Inconsistent definitions of key corporate data assets such as customer, supplier, and product masters
  • Poor data accuracy within and across business areas
  • Line-of-business-focused data with inefficient or nonexistent ability to leverage information assets across lines of business (LOBs)
  • Redundant IT initiatives to re-solve data accuracy problems for each individual LOB

With an operational data governance program, businesses are more likely to benefit from:

  • Uniform communications with customers, suppliers, and channels due to the accuracy of key master data
  • Common understanding of business policies and processes across LOBs and with business partners/channels
  • Rapid cross-business implementation of new application solutions requiring shared access to master data
  • Singular definition and location of master data and related policies to enable transparency and auditability essential to regulatory compliance
  • Continuous data quality improvement as data quality processes are embedded upstream rather than downstream
  • Increased synergy between horizontal business functions via cross-business data usage – e.g., each LOB is able to cross-sell and upsell its products to the other LOBs’ customers

What are the components of the Data Governance Framework?

1. Organizational Bodies and Policies
  • Governance Structure
  • Data Custodianship
  • User Group Charter
  • Decision Rights
  • Issue Escalation Process

2. Standards and Processes

  • Data Definitions and Standards (metadata management; a sample repository entry follows this list)
  • Third-Party Data Extracts
  • Metrics Development and Monitoring
  • Data Profiling
  • Data Cleansing
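
As a rough sketch of what the data definitions and standards work produces, here is one hypothetical entry in a metadata repository, written as a Python record. Every field name and value is an invented example, not a standard.

    # A toy metadata repository entry for one governed data element.
    # All field names and values here are invented for illustration.
    customer_id_metadata = {
        "element": "customer_id",
        "definition": "Unique identifier for a customer master record",
        "data_type": "string",
        "format": "CUST-#########",          # assumed format convention
        "custodian": "Customer Data Custodian",
        "source_system": "CRM",
        "quality_rules": ["not null", "unique", "matches format"],
    }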

3. Technology

  • Metadata Repository
  • Data Profiling tool (a minimal sketch of such a tool follows this list)
  • Data Cleansing tool
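
To give a feel for what a data profiling tool does, the sketch below computes per-column blank counts, distinct counts, and the most common value for a CSV extract. It uses only the Python standard library, and the file name is hypothetical; real profiling tools go much further.

    import csv
    from collections import Counter

    def profile(path):
        # Load the extract and report simple quality statistics per column.
        with open(path, newline="") as f:
            rows = list(csv.DictReader(f))
        print("rows:", len(rows))
        for col in rows[0]:
            values = [row[col] for row in rows]
            blanks = sum(1 for v in values if not v.strip())
            top_value, top_count = Counter(values).most_common(1)[0]
            print(f"{col}: {blanks} blank, {len(set(values))} distinct, "
                  f"most common {top_value!r} ({top_count}x)")

    profile("customer_extract.csv")  # hypothetical file name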

The Data Governance Structure
A Data Governance (DG) structure is defined based on the following roles and responsibilities:

Data Governance Council

Membership of this council consists of executives from various divisions who have an interest in the management of asset data. They are responsible for endorsing policies, resolving cross-divisional issues, engaging the IT council at the strategic level, strategically aligning business and IT initiatives, and reviewing budget submissions for IT and non-IT projects.

Data Custodian

Asset data is managed by the data custodian on behalf of Company A, and the custodian is responsible and accountable for the quality of asset data. The data custodian resolves issues raised in user group meetings; if issues become political and impact stakeholders from other divisions, they are escalated to the DG council. The custodian is also responsible for endorsing the data management plan and the data cleansing plan, ensuring data is fit for purpose, converting strategic plans into tactical plans, change management, and stakeholder management.

Data Steward

Data Stewards have detailed knowledge of the business processes and data requirements, along with enough IT knowledge to translate business requirements into technical requirements. They are led by the Data Custodians and are responsible for carrying out the tactical plans. They also act on behalf of the Data Custodians in stakeholder management, change management, asset-related information systems management, and project management, and they run user group meetings and train and educate data users.

User Groups

Data stakeholders from various divisions are invited to the user group meetings. These key stakeholders include the people who collect the data, process it, and report on it. Technical IT staff are also invited so that their expertise is available during the meeting. The meetings are also a venue where urgent operational data issues can be tabled. Data users are responsible for reporting any data-related issues, requesting functionality that would help them collect data more efficiently, and specifying reporting requirements.

The Data Governance structure should engage the business with IT at the strategic, tactical, and operational levels. This level of engagement keeps IT and the business informed of each other's work and ensures that IT initiatives align with the business's data governance objectives.

Other Related Terms (Source: CDI Institute)

Data Governance

The formal orchestration of people, processes, and technology to enable an organization to leverage data as an enterprise asset.

Master Data Management (MDM)

The authoritative, reliable foundation for data used across many applications and constituencies with the goal to provide a single view of the truth no matter where it lies.

Customer Data Integration (CDI)

Processes and technologies for recognizing a customer and its relationships at any touch-point while aggregating, managing and harmonizing accurate, up-to-date knowledge about that customer to deliver it ‘just in time’ in an actionable form to touch-points.

Master Data Integration (MDI)

Process for harmonizing core business information across heterogeneous sources, augmenting the system of record with rich content by cleansing, standardizing and matching information to provide high data quality in support of a master data management initiative.
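
The standardize-and-match step at the heart of CDI and MDI can be sketched in a few lines of Python. This is a toy illustration only: the standardization rules and the match threshold are assumptions, and real MDM tools use far richer matching logic.

    import difflib

    def standardize(name):
        # Toy standardization rules, assumed for illustration only.
        return name.strip().lower().replace("corp.", "corp").replace("inc.", "inc")

    def is_match(a, b, threshold=0.85):
        # Fuzzy-match two customer names after standardizing them.
        score = difflib.SequenceMatcher(None, standardize(a), standardize(b)).ratio()
        return score >= threshold, round(score, 2)

    print(is_match("Acme Corp.", "ACME Corp"))  # -> (True, 1.0)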

Good Articles on Data Governance:
http://www.b-eye-network.com/view/630
www.hds.com/pdf/wp_199_data_governance.pdf

Saturday, July 11, 2009

Cloud Computing

Gartner Says “Cloud Computing will be as Influential As E-business.”

Forrester’s advice to CFOs: Embrace Cloud computing to cut costs.

Is cloud computing a new evolution of Software-as-a-Service? How is it different from SaaS, PaaS, grid, and utility computing? How will it impact BI?

What is Cloud computing?

The Wikipedia entry states: "Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the 'cloud' that supports them."

Cloud computing refers to accessing computing resources that are typically owned and operated by a third-party provider on a consolidated basis in data center locations. Consumers of cloud computing services purchase computing capacity on demand and are generally not concerned with the underlying technologies used to achieve the increase in server capability. There are, however, increasing options for developers in the form of platform services in the cloud, where developers do care about the underlying technology.

Other Related Terms
Software-as-a-service (SaaS): A software solution hosted by a vendor as a fee-based BI service.

On demand: The ability for users to have instant access to a BI service and pay for it based on usage.

Cloud computing: The computing capacity for supporting SaaS BI processing. This may be provided by the SaaS vendor or by a third party.

Hosted: An alternative approach to running a BI application in house.

Subscription based: A business model for a pay-as-you-go BI service.

Platform: A set of integrated BI software tools that may be offered as a SaaS or on-premises solution.

Grid computing: A large virtual computing platform that provides scalable cloud computing for on-demand SaaS BI processing. Grid computing is a technology approach to managing a cloud. In effect, all clouds are managed by a grid but not all grids manage a cloud.

So a grid is a huge amount of scalable computing power made up of multiple systems that may, or may not, be in the same data center. Grid computers are used to provide the resources for cloud computing, which in turn supports SaaS computing needs.

What does it mean to BI Customers?

There are several cloud-based data warehouse options available to customers in the market today. From pure SaaS or DaaS (data-as-a-service) offerings that provide a full software stack, to PaaS (platform-as-a-service) and IaaS (infrastructure-as-a-service) solutions on which you can build your own data warehouse, the cloud has quickly shown itself to be fertile ground for managing increasingly large volumes of data.

Several SaaS or DaaS providers offer data analysis services. Using these services, you can employ powerful, full-stack data warehousing technologies with little effort and pay only for what you use. Companies such as 1010data, LogiXML, and LucidEra offer focused solutions for everything from a really big spreadsheet in the cloud, to frameworks with built-in extract, transform, and load (ETL) and dashboards, all the way through to fully developed analysis tools customized for different verticals.

These solutions require no large outlay to get started. You can sign up using a Web browser, in some cases with a free trial, and start uploading data right away. These full-stack solutions include ETL tools for migrating data and automatically build out visualizations for slicing and dicing your data.

If you need full control over your data warehousing, or the volumes are larger than SaaS providers can handle, you have the option of rolling your own. If building a data warehouse sounds daunting, building one in the cloud would seemingly only complicate matters. But, in fact, the cloud is simple by comparison.
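
For a feel of what "rolling your own" involves at the smallest scale, here is a sketch of the extract-transform-load step that the full-stack platforms automate, using only Python's standard library, with a local SQLite database standing in for the cloud warehouse. The file, table, and column names are made up for illustration.

    import csv
    import sqlite3

    # Extract: read raw records from an exported CSV file (name is made up).
    with open("sales_export.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Transform: normalize region names and cast amounts to numbers.
    records = [(r["region"].strip().upper(), float(r["amount"])) for r in rows]

    # Load: append into a fact table (SQLite stands in for the warehouse).
    con = sqlite3.connect("warehouse.db")
    con.execute("CREATE TABLE IF NOT EXISTS fact_sales (region TEXT, amount REAL)")
    con.executemany("INSERT INTO fact_sales VALUES (?, ?)", records)
    con.commit()
    con.close()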

Some questions that remain unanswered in my mind:
1. How will data be stored and secured in a cloud environment?
2. Is this environment really suited to data warehousing applications, where the volume of data is very high?
3. How will cloud-based BI applications be licensed? Will it be the same as SaaS licensing?
4. What kind of bandwidth is required to use a cloud-based application?

I am sure we will find answers to all of these questions once this technology matures.

Some interesting links:

http://en.wikipedia.org/wiki/Cloud_computing

http://www.cio.com/article/192701/Cloud_Computing_Tales_from_the_Front

http://www.cio.com/article/426214/The_Dangers_of_Cloud_Computing

Wednesday, July 8, 2009

Dashboard vs. Scorecard

If scorecards are supposed to be balanced, are dashboards innately unbalanced? What is the difference between scorecards and dashboards?

The popular conception seems to be that there is no difference; the terms are used interchangeably in most marketing collateral and performance articles. Perhaps there should be a distinction, as a scorecard for a college semester feels like it addresses a different problem than a dashboard for an automobile.

What is a Scorecard?
A scorecard is an application or custom user interface that helps you manage your organization's performance by understanding, optimizing, and aligning organizational units, business processes, and individuals. It should also provide internal and industry benchmarks, goals, and targets that help individuals understand their contributions to the organization. This performance management should span the operational, tactical, and strategic aspects of the business and its decisions. You can use a methodology derived from internal best practices or an external industry methodology. (For example, the term "Balanced Scorecard" is a specific reference to the Kaplan & Norton methodology.)

What is a Dashboard?
A dashboard is an application or custom user interface that helps you measure your organization's performance in order to understand organizational units, business processes, and individuals. Conceptually a subset of a scorecard, it focuses on communicating performance information. Just like an automobile dashboard, it has meters and gauges that represent underlying information. A dashboard may also have some basic controls or knobs that provide feedback and collaboration abilities.

Industry Conceptions
Although many people use the terms "dashboard" and "scorecard" synonymously, there is a subtle distinction that is worth understanding.

Dashboards Monitor and Measure Processes.
The common industry perception is that a dashboard is more real-time in nature, like an automobile dashboard that lets drivers check their current speed, fuel level, and engine temperature at a glance. It follows that a dashboard is linked directly to systems that capture events as they happen and it warns users through alerts or exception notifications when performance against any number of metrics deviates from the norm.

Scorecards Chart Progress Toward Objectives.
The common perception of a scorecard, on the other hand, is that it displays periodic snapshots of performance associated with an organization's strategic objectives and plans. It measures business activity at a summary level against predefined targets to see if performance is within acceptable ranges. Its selection of key performance indicators helps executives communicate strategy and focuses users on the highest priority tasks required to execute plans.

Whereas a dashboard informs users what they are doing, a scorecard tells them how well they are doing. In other words, a dashboard records performance while a scorecard charts progress. In short, a dashboard is a performance monitoring system, whereas a scorecard is a performance management system.

A scorecard can assess the quality of execution, whereas a dashboard provides tactical guidance. Scorecards inherently measure against goals; dashboards need not.
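
The "performance within acceptable ranges" idea is easy to make concrete. Below is a toy scorecard status check in Python; the KPI names, targets, and tolerance band are invented for illustration.

    # Toy scorecard status check: compare a KPI snapshot to its target.
    # KPI names, targets, and the tolerance band are invented examples.
    TARGETS = {"on_time_delivery": 0.95, "first_call_resolution": 0.80}

    def status(kpi, actual, tolerance=0.05):
        target = TARGETS[kpi]
        if actual >= target:
            return "green"                      # on plan
        if actual >= target * (1 - tolerance):
            return "yellow"                     # within acceptable range
        return "red"                            # off plan, needs attention

    print(status("on_time_delivery", 0.91))     # -> yellow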

Industry Perceptions

[Chart comparing industry perceptions of dashboards versus scorecards]
Bringing Balanced Scorecards & Dashboards Together
Customer relationship dashboards use lots of measures that give you data about how your team is operating, but they provide little insight into progress toward your goal of maximizing resolutions. That is measuring and monitoring, but not managing. Likewise, a customer relationship scorecard presents a quick picture of which strategy you need to concentrate on to improve customer satisfaction, but it lacks any detail as to why you are struggling to maximize resolutions.

However, there are ways to ensure that dashboards include the critical connections to strategy. Once you have identified the troublesome measure on the scorecard, you can drill down into a resolutions dashboard that contains detailed measures such as average call resolution time, call queues, and hold time.

Sunday, July 5, 2009

Operational Analytics


Operational BI is no longer simply theory, as teams (not necessarily on the bleeding edge of technology advancement) are starting to investigate how to more closely link analytics to their operational activities.

The objective of the operational BI environment is to provide the analysis infrastructure from which people both inside and outside the organization can make better, faster, and more informed decisions. "Operational" and "analytics," once opposite ideas now comfortably joined, represent the techniques by which organizations are leveraging the BI infrastructure. Companies employ operational analytics to improve business decisions by directly supporting specific operational processes and activities with analytics. They provide an environment where the organization can learn and adapt based on analysis of operational processes. And they reduce the latency between business events and the organization's ability to react to those events by closing the loop between analytics and operations.

How does this actually work? First, we require an event detection or BI service that can continually monitor business events or transactions happening in the operational world. These events can be generated by applications, by customer transactions, or by service personnel. Once an event has been detected, the related information must be pushed through analysis applications to validate the event and determine the course of action. This generally involves moving the event information into the data warehouse in real or near-real time. More traditional data warehouse analyses will also be required before starting the initiative, in order to determine which events are going to be part of the program. Lastly, the results of the analysis must be married with an operational process so the relevant action can be taken.
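
Put as a minimal sketch, the closed loop looks something like the following Python. Every function name, event field, and rule here is a placeholder for illustration, not a real product API.

    import queue

    events = queue.Queue()   # stands in for the stream of operational events

    def is_relevant(event):
        # Validation step: a real system would check warehouse history here.
        return event["type"] == "insufficient_funds"

    def decide_action(event):
        # Analysis step: choose a response to push back to operations.
        return {"customer": event["customer"], "offer": "overdraft protection"}

    def process_one():
        event = events.get()
        if is_relevant(event):
            print("push to operations:", decide_action(event))  # close the loop

    events.put({"type": "insufficient_funds", "customer": 42})
    process_one()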

An example: a bank wants to serve an immediate need for customers who need cash for emergency purchases and have insufficient cash reserves. With timely, relevant communications, the bank plans to offer these customers a credit card with cash advance capability, overdraft protection or a personal loan. First, the bank must detect that the event has happened - a customer has attempted to withdraw funds at the ATM or at a bank branch and has been rejected due to insufficient funds in the account. This requires continuous data mining for rejected transactions at ATMs and branches. Once the event occurs, information about a specific transaction must be pulled from the transaction system and moved through the data warehouse into a special analysis application designed to determine the significance and relevance of the event for that particular customer.

Habitual offenders, potential fraud perpetrators, and people who have simply misjudged their account balance (and can get the funds from other accounts within the bank) must be eliminated from consideration. Eliminating these customers requires a look at past behavior patterns and recent transactions to determine that the account has not had an insufficient-funds transaction in the past six months, that the card has not been reported lost or stolen, and that the customer does not have another account from which they could withdraw a similar amount in the 24 hours following the initial attempt. Further analysis to ensure an honest need for emergency cash can include verifying that the account does not have a paycheck direct deposit that is due but simply arriving late (again requiring past account transaction detail mixed with current-state information). Once the bank confirms the need for cash, it must run a credit score on the customer to ensure that the products offered are appropriate. Next, the bank must deliver the resulting product offer back to the operational process where the customer is experiencing the rejection - in this case, to the ATM or branch interaction.
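
These screening rules translate naturally into code. The sketch below is one hedged reading of them in Python; the account field names and the credit-score threshold are assumptions made for illustration.

    from datetime import datetime, timedelta

    def eligible_for_offer(account, now=None):
        # Apply the screening rules described above; field names are assumed.
        now = now or datetime.now()
        six_months_ago = now - timedelta(days=182)
        if any(d > six_months_ago for d in account["past_nsf_dates"]):
            return False                 # habitual offender
        if account["card_reported_lost_or_stolen"]:
            return False                 # potential fraud
        if account["other_account_balance"] >= account["requested_amount"]:
            return False                 # funds available elsewhere in the bank
        if account["direct_deposit_due"]:
            return False                 # paycheck is simply arriving late
        return account["credit_score"] >= 650   # threshold is an assumption

    print(eligible_for_offer({
        "past_nsf_dates": [],
        "card_reported_lost_or_stolen": False,
        "other_account_balance": 50.0,
        "requested_amount": 200.0,
        "direct_deposit_due": False,
        "credit_score": 700,
    }))  # -> True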

Implementing the operational analytics program in this example requires a closed-loop environment between the operational world and the analytical one. It offers the possibility of tremendous benefit and will become the norm as organizations harness the intelligence in the data warehouse for competitive advantage.