February 28, 2011

Why Companies Have To Trade “Perfect Data” For “Fast Info”

Companies have been trained to think about data all wrong, say Ali Riaz and Sid Probstein, CEO and CTO of Attivio. "Some topics require complete accuracy, but in many cases analytics don't have to be based on super-precise data," they say. "Some reports don’t have to be perfect. They need to capture the essence of behavior, not the totality of it, but you do need to get them fast – faster than your competitors."


Your company collects data. You want to act on it. First, though, you really, really, want to make sure that data is accurate. So you focus on getting it right. Better to wait on a decision until you have the absolutely correct information than act based on partial information.

That might make sense, but it’s the wrong way to go, say the top two executives at Attivio, a privately-held enterprise software company that focuses on unified information access to help its customers find and understand vast amounts of content and data. The problem with concentrating on getting the numbers too right is that most companies sacrifice speed for accuracy.

Ali Riaz, Attivio CEO, and Sid Probstein, CTO, are “practically relatives” at this point, according to Riaz. “I think I saw his first child being born, the second child and the third child,” he says. They met when Probstein interviewed at FAST Search & Transfer, where Riaz was president and COO, and initially “refused to work with” him (FAST is now owned by Microsoft).

Probstein “understood something I didn’t understand right away, that FAST, at the time, didn’t have its strategy quite right — something I didn’t understand because I’m kind of a hopeless romantic,” says Riaz. “When I realized that he actually got it, he got that this company was not yet on the right path, I thought, ‘That’s a smart guy.’ I called him personally and begged him. He was a big contributor to FAST’s success, and we’ve been together since.”

Riaz and Probstein spoke with MIT Sloan Management Review editor-in-chief Michael S. Hopkins about the stifling downside of the quest for perfect data, why “eventually consistent” is a concept every company should take to heart, and how to deal with the need for speed.

Where do you think tech-driven information and data trends stand in terms of how companies understand them? How has the capture and use of information changed most in recent years?

Ali Riaz: Let me go back in history. I used to work at Novartis Pharmaceuticals, and one of the things that was really bothersome for me at the time was that we could never agree on the data. We got to the management team meetings and one system would say we have 17,500 employees and another would say we have 17,300 employees. Or one system would say we have 400 patients enrolled on this trial and another would say 800. These might not seem like big issues, but they ended up consuming a lot of our leadership time and frustration.

We never got to really be an intelligent company, in the sense that we were seeing the right things and being able to act and collaborate based on them. But this isn’t unique – I’m not throwing Novartis under the bus. I would say that this is a problem that most companies have had, and still have.

Sid Probstein: I think that’s exactly right. I worked in financial services, and 20 years ago the issues were all around the things Ali’s talking about. We couldn’t agree on how many units were sold, because there were 12 different products and 12 different systems storing the information on them. How could we get a unified view of our customers?

One of the first projects I worked on at a big financial services firm was to do the traditional Pareto breakdown, looking for the 20 percent of customers who were providing 80 percent of the revenue, to figure out if we could eliminate focus on some unprofitable customers. Classic modern business theory, right? It was a decade-long project just to unify the data.
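For readers who want to see the Pareto breakdown itself, the arithmetic is simple once the revenue figures sit in one place. Getting them there was the decade-long part. Here is a minimal Python sketch; the function name `pareto_share` and the revenue figures are made up for illustration and are not from the interview.

```python
def pareto_share(revenues, target=0.80):
    """Fraction of customers, largest first, needed to reach `target` of total revenue."""
    total = sum(revenues)
    running = 0.0
    # Walk through customers from highest to lowest revenue, accumulating as we go.
    for count, revenue in enumerate(sorted(revenues, reverse=True), start=1):
        running += revenue
        if running >= target * total:
            return count / len(revenues)
    return 1.0


# Hypothetical revenue per customer (illustrative numbers only).
revenue_by_customer = [520, 300, 40, 35, 30, 25, 20, 15, 10, 5]
share = pareto_share(revenue_by_customer)
print(f"{share:.0%} of customers provide at least 80% of revenue")
```

With these invented numbers, two of the ten customers cross the 80 percent mark. Real customer data will rarely split so neatly.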

But then the company grew. And part of the challenge of what makes it so difficult to achieve an intelligent enterprise is change. That financial services firm bought another company and they had yet another ERP and another CRM system. Resolving all that becomes a huge challenge.

So what I think we’ve seen developing over the last ten years is the value of what I’ll call interim steps. The idea is, look, let’s not try to move all the data together, let’s not worry too much about putting it together in one coherent way. Instead, let’s figure out what lives where as an interim first step, so that when we perform an analysis we can know the provenance of data.

Let me make sure I understand the concerns about where the data came from, what you call the provenance of data.

Sid Probstein: Well, that’s definitely another thing I’d say has changed. People today are very concerned with provenance. Before, you used to argue about which report is right. Now, you want to know where that piece of data comes from.

People are focused on understanding if data is trustworthy. What assumptions might this source have made? For instance, it’s very common in a company that has newly acquired another company to trust its reporting less than their own. That’s a very natural, human effect. You think, “Well, that seems interesting, but I don’t know how they calculated revenue.”

The thing is, if two companies, before they even get into discussions about how their pieces fit together, start asking how did they collect the data and what led to the data, they’re probably going to convince themselves very quickly that it’s going to be hard to put this all into one view.

That’s why an interim set of representations is so appealing. It’s all about how the intelligent enterprise responds to the need to move faster. It’s important to integrate and understand the data, but managers are accepting that they can start to do all that without necessarily having to push all the data into the same technology stack.

I think companies thought 10 or 15 years ago that the systems they were putting in place would deliver uniform, universally accessible, trustworthy, analyzable data. And yet, here they are all these years later, after significant investment, often feeling no better off.

What’s your sense of what people expected back then and what they’re most or least frustrated about now?

Ali Riaz: First, I think we can’t ignore human nature in corporations. If I get data that says, “Ali, you did a really good job this month,” then I trust it. If the data says, “Ali, you did a bad job this month,” I may not trust it. I may question it; I may want to know more about it. People only select the information that supports their beliefs, so using dispassionate analytics is the only way to dispel this problem. The early transaction systems didn’t contain the “why” of information, just the “what”. It’s the more recent ability to merge all the sources that makes for better information and better decisions. Triangulating on a fact or an event validates it and also lets you discover what you might never have known by looking at all your data sources separately. Where we are now is that if I get information that says, “Ali, you did a good job on A, B and C, but you could have done a better job on X, Y and Z,” and if that information is complete, analyzed, and presented in a timely fashion, there’s not a lot of places I can hide.

And none of us expected two things: the amount of scale that we need, and the speed that we need. Companies like Comcast and Verizon have millions of clients, and every day, hundreds of clients move from them to competitors. There’s no point in finding out tomorrow why my customers left me yesterday, but it would be great to know who is about to leave me a week or two from now.

Sid Probstein: That’s a really key point. Early reporting was backward looking. We need to use reporting to predict what is going to happen, and how to act on it. That means massive amounts of data so that we get a good sampling of what’s going on, and it also requires speed. People thought they were going to fix reporting: before, maybe it would take a week to run a report but we didn’t know if the data was correct or not, so our focus was on getting the data to be accurate. Today, managers don’t just want the report to be accurate, they want it accurate and they want it every ten minutes or in a dashboard that updates continuously. Or they want it plus a report analyzing the hundreds of millions of emails inside the company. The systems that have to start to address that kind of performance are not changing fast enough to keep up.

Even if I’m an old brick-and-mortar company, I start up a website where I’m making sales; all of a sudden the tempo of my business has changed dramatically. I have a store that’s open 24/7. I collect information about what these people are doing on my site, but if I don’t crunch it and analyze it and come up with the best offer for people each time they arrive at the site, they’ll go to another website that does a better job, and they’ll do it for free since there’s no switching cost.

These are things that we didn’t even know to ask about 10 years ago.

So let’s look at where we are now. It sounds like you’re saying that even if you solve the challenge of making your data perfect, you might be doing it too slowly to act on. Should we be asking different questions of our data, and therefore of the tools we use to parse and analyze it?

Sid Probstein: Yes, you’re exactly right. One of the most important questions is whether we should even worry about whether this report is exactly right or not.

There’s a term called “eventually consistent” that grew up around a whole fleet of open-source-type technologies for crunching the huge amounts of data generated by website click-throughs. If you’re an e-commerce site, you want to understand the convergence of what the user is looking at and why he is clicking on it. Amazon, of course, is really good at this, asking, “For this customer at this very moment, what’s the best thing to show them?” They have high, high rates of success on recommendations, on product bundles, on follow-on advertising.

Amazon is good at this because they don’t worry about everybody. They develop a model where they’re eventually going to get a consistent model of the world, but at the moment they need to do it, they don’t care that they can’t roll it out for everyone. They’ve got hundreds of millions of clicks a day, and they figure, why don’t we just look at 20 percent of them? The key thing is to do it quickly and to make sure that whatever we conclude, that there are many observations for it.
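The statistical point behind looking at only 20 percent of the clicks can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a description of how Amazon or any other company actually does it: the click log is simulated, the 5 percent conversion rate is invented, and the helper names `conversion_rate` and `sampled_rate` are hypothetical.

```python
import random


def conversion_rate(clicks):
    """Share of clicks that led to a purchase (clicks is a list of booleans)."""
    return sum(clicks) / len(clicks)


def sampled_rate(clicks, fraction, rng):
    """Estimate the conversion rate from a simple random sample of the clicks."""
    k = max(1, round(len(clicks) * fraction))
    return conversion_rate(rng.sample(clicks, k))


rng = random.Random(0)  # fixed seed so the run is repeatable
# Simulated click log: each click converts with probability 5% (made-up figure).
clicks = [rng.random() < 0.05 for _ in range(100_000)]

full = conversion_rate(clicks)              # answer using every click
estimate = sampled_rate(clicks, 0.20, rng)  # answer using one click in five
print(f"all clicks: {full:.4f}  20% sample: {estimate:.4f}")
```

With this many observations the sampled figure lands very close to the full one while touching a fifth of the data, which is the trade the interview describes: many observations, processed quickly, rather than every observation.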

This is when the term “analytics” becomes interesting. Analytics doesn’t have to be based on super-precise data. Again, that doesn’t mean wrong data, but it might mean some outcome that wins for the customer. If you profile a jazz CD that people didn’t know they wanted, and some people buy it, great. The fact that some of the 100,000 people that you showed it to didn’t buy it is irrelevant.

I think of that as an incredible innovation, to be able to say that the report doesn’t have to be perfect. It needs to capture the behavior, not the totality of it.

Ali Riaz: For this to actually work, we need a whole new philosophy around leadership and decision-making and performance management. People spend a lot of time worrying, “Hey, did I earn my bonus? Was I at 103 percent of the target, or 97 percent?” That worrying takes a lot of energy. Those conversations take a lot of time.

Now, really, as a CEO of the company, should I really be focusing on whether a valued employee’s bonus is 97 or 103 percent? Don’t I just want the employee to be happy? Personally, I don’t want a disgruntled employee, I want them to get the benefit of the doubt and go out and be happy and meet clients and be productive. But we are trained; we have this in our DNA, that we fight about 103 versus 97. Our boards want to know if it’s 103 or 97. Our management wants to know. Our line managers want to know. That’s just the way this tail is curled.

But for us to live with the realities of information growing more and more and speed getting faster and faster, we need a new way of thinking about not having precision but having a good understanding. And being able to live with that.

This is really interesting. I guess the obvious question at this point is how to bridge these gaps?

Sid Probstein: One thing I’m hopeful about is that I think managers get that they need to understand the frame a lot better. Say you’re a brand manager and one item is selling well and then slows down. You need to consider if that’s because you stopped promoting it, or because a competitor has a better product, or because users’ social media comments and blog entries that cover this stuff are negative.

Yes, the sales figures are relevant. Yes, a breakdown of features is relevant. But understanding the outside context is huge, too. What do the customers think? What’s the trend in the marketplace? What’s the buzz?

You’re saying there is an understanding among the executives you have contact with of those distinctions?

Sid Probstein: Absolutely. Ten years ago, it would be perfectly normal to participate in a meeting where nobody had done any — and I’ll use the term directly — “Googling” of the larger environment. They wouldn’t have looked up news stories, or looked up trends, or tracked down what people were saying about it. Now I think it’s rare to have a get-together where people haven’t educated themselves on the larger frame. And that’s a significant change.

Ali, can you say more about the leadership challenge that you began to describe, which is at odds with the very metrically driven way that people evaluate performance and lead organizations? The 97 percent versus the 103 percent is driven by trying to parse distinctions which, in the end, don’t really matter to a company’s overall thriving and success.

Ali Riaz: Generally speaking, most corporations are inefficient in a lot of ways, given the human factor, the data factor, the change factor. There’s a lot of factors involved. But in order to abandon that, managers have to come to believe that having a range of information is better than having one piece of information.

Attivio’s chairman and main investor, Per-Olof Söderberg, is an instigator for this type of dialogue. There have been times when one person or one team didn’t reach their goals, and we talked to him and we said, “They didn’t reach that quantitative goal, but qualitatively speaking, they’ve done a tremendous job.” They got their full bonus. So we are living it and breathing it today. But it has to come from the top.

I would love to see MBA programs that talk about what to do when two people come with two different sets of data for the same issue. I don’t think we have baked the reality into our education programs that this is going to happen, that you may never get the exact number right, and that as a leader, as a manager, as somebody who has to actually deal with this process, you have to figure out how to still move forward.

I want to play devil’s advocate for a minute. If I’m an executive, maybe I know how to capture customer feedback and external information about the competitive landscape. I get all this stuff. But it was hard enough for me to take my apples-to-apples report and make meaningful choices based on it. How the heck am I going to take all this other stuff and actually put it to meaningful use?

Sid Probstein: Right. That is exactly the role of leadership, which is to deal with uncertainty in this information age. Maybe the question is whether you should be so concerned with comparing apples to apples. Maybe the better question should be, “What is the result?” If I can produce an analytic that ignores 10 percent of my customer data but increases my conversion rate 2 percent, should I focus on fixing the problem so that I include that extra 10 percent of customer data, or should I just try to get that extra 2 percent?

At the end of the day, and you cite this in the intelligent enterprise survey, innovation is a key driver. Dealing with uncertainty in innovative ways. You don’t throw out the analytic that’s producing a 2 percent improvement just because it’s not 100 percent thorough.

And what happens — if I can just follow up on that — when you talk to a company about the kind of approach you’re describing, and they have previously been focused on trying to get their data right. What’s their response?

Ali Riaz: There are enough stories today about companies that are focusing on trends, not perfection, and winning against their competition as a result, so that we can demonstrate the value of this approach. We haven’t told our customers not to compare apples to apples. What we have done is to say, “Sure, look at how many apples you have, but what does that tell you about the market for apples?” So you may have grown 10 percent, but every competitor grew 30 percent. That’s important context. Or your apples have a shelf life of five days, but everybody else’s have a shelf life of 15 days. Or the clients you’re acquiring have a drop-off rate of 30 percent, while other companies have 2 percent.

Setting quantitative goals and measuring quantitative goals is human nature. I don’t think the capitalistic world would function without goals. I couldn’t function without them; I have personal goals that are quantitative, and I monitor them. That’s just the way we work. But providing more context, providing more sources of intelligence so that you are not only looking at the apples, is better. An Atlanta team delivering 97 without any local support offices may be fantastic compared to a Boston team delivering 103 with headquarters right behind it.

You’ve got to get the data right, and not just data, but the range of data, and then you have to have context for what the data means. Then you have to have leadership and business processes that allow for a dialogue. Do all that and you’ll actually be making intelligent decisions and not political CYA, all those things that happen every day in organizations and governments. Having a wider set of data and content, structured and unstructured, will allow you to learn to paint with colors that are new and old.
