Watch the closing keynote from Zero Gravity 2022: a 30-minute fireside chat between Microsoft Corporate VP of Azure Data Rohan Kumar and Incorta Executive VP of Strategy Steve Walden. They discuss key trends emerging across the data analytics industry and the impact these shifts are having on companies, data teams, and the future of data analytics.
Steven: I'd like to welcome Rohan Kumar, CVP of Azure Data at Microsoft. Rohan?
Rohan: Hey, thank you, Steven. It's great to be here. Thanks for having me.
Steven: You bet, Rohan, it's so great to have you here. One of the things that I find really fascinating about Rohan: 24 years at Microsoft, and he's a smart guy. He started as a software engineer working on the Windows kernel. Is that right?
Rohan: That's right.
Steven: Yeah. So, you know, he's been here from the beginning. You said you shifted into databases in the early days, as file systems and databases were starting to converge a little bit at Microsoft. And we talked about how, in the early days, things were in a lot of ways so much simpler than they are today, right? There were a couple of databases out there, ETL was typically getting data from a flat file, and the analytics was typically done through SQL. But it's so great to have you here today. You've seen a ton in your long career at Microsoft, and I think it gives you a unique perspective, because you've seen it from the beginning. So how do you see the analytics ecosystem evolving? And what are some of the key trends that you're seeing starting to emerge?
Rohan: Thank you, Steve. If I reflect on it, it's interesting. As you mentioned, back in the day you had maybe one or two databases. A lot of companies, I remember, used one database for their operational stores and another one just to get the analytics done, operational reporting and such. As the volumes increased, the whole MPP architecture came into play. When you look at the world now, though, it's fascinating, right? I joke about this: decisions around the data platform are now made at the board level. Everyone is so worried about how data is going to be used in their specific industry to foundationally transform what their business looks like, right? So it's no longer just, I'm doing some operational reporting and analytics, and maybe it helps me with some decision making later on.
It's literally becoming life itself. With that as the background, to your point, and I think you made a very interesting one, the expectations have gone up very significantly. Business requirements that used to be "get this to me, maybe in the next six months to one year" have boiled down to "I need it right now," and everything has to happen in real time. So that's a huge trend: as decision making and data culture come along, how do you significantly simplify what it takes to get the data in the right place, in the right form, to serve the requirements of decision making? We see that trend very, very clearly.
The other big thing, Steve, is that there's a lot of data, especially data collected on behalf of customers, that's being used to figure out newer products, newer business models, newer ways to do things. And there's this increasing secular trend that we see around data privacy and governance, which started off as GDPR in Europe. It's no longer just the large social media companies; really, every enterprise, to stay relevant, has to rely on predictive analytics in this manner.
But how do they do that without violating any of these regulations? Every country, essentially, is becoming very, very careful about the privacy of these data sets. In some sense, these are the big things now, and they lead to interesting architectures. When you map that out across every business, you have scaled-out architectures, and data mesh, we see, is becoming extremely popular. And all up, I'd say, for each persona, whether data engineers, data scientists, or business analysts: how do you significantly reduce the amount of time it takes to get any job done? So those are the trends that we see for sure.
Steven: Yeah, those are great points. And those expectations, to me, feel like one of those things that really have changed. You know, we've all become a little bit more ADD as we sit in front of screens all day long. We expect, because of search engines and just the nature of the internet, that we should be able to get those answers immediately. And I think that's really translated into the business. For this audience, I think the expectations are really on their shoulders, because the business is expecting this agility, the speed to be able to get the answers that it needs. And often those expectations are unrealistic, given the volumes of data that people are facing. Okay, that's excellent. So one of the things that we hear from a lot of our customers, and that I've heard in some of the sessions earlier today, is that data architecture is really just too complex. There are too many choices. Are you seeing any fundamental shifts in how companies are thinking about overcoming that complexity in these data architectures today?
Rohan: No, absolutely. One of the interesting trends that happened as a part of this cloud transition, and I joke about this, is that there is a service for everything, right? If you want to move data, there's a service for that. Data integration, there's a service for that. Transformation, too, there's a service for that. Warehousing, business analytics, ML Ops, you name it. And by the time all these get stitched together to produce an end-to-end solution, oh my god, you've gone from having a simple system, maybe a single database or an MPP system, to this plethora of services that have to be stitched together to make the solution work. Now, you need that. You need the transition that we see in the cloud, just based on the volume of data, the scale, and the speed at which things need to happen.
But we went from having a single monolithic system, if you will, to this disparate set of services that need to work together. And what we hear very clearly, and hopefully the audience is going to appreciate this, is: man, the amount of time it takes for a data engineer to get the data integration piece right! Just being able to move data reliably is extremely hard. While the whole notion of doing machine learning ops and AI and getting predictive analytics gets a lot of attention, and there's a lot of value that gets derived from that, one of the things we've heard is that about 70 to 80% of the actual time gets spent doing that integration and making sure that it's reliable, it runs all the time, it never fails.
Because it's important, right? If your company essentially relies on it, then it can never go down. How do you build that? So the big thing that we're seeing our customers ask us is: how do you reduce this? There are so many things that we need to stitch together, and that causes fragility in the system. Can't you make a unified product where all these become capabilities, really, instead of different services that we need to stitch together? That's a trend that we see often, and that's one of the things we've been working on on our side: how do we create maybe a single product for analytics that gives you that very simple experience, that's very easy to secure, very easy to update without taking any downtime, and things like that? Because a lot of those problems are falling into the hands of the data engineers, mostly, today. And then you look at the whole collaboration element between the systems that data engineers manage and the data scientists who are trying to use the data to train machine learning models. There's a lot of tension there, like, "Hey, don't do anything to take my system down." Right? Because there are so many other things.
So how do you ensure that the right isolation happens, and do that in a turnkey manner? And then the final thing I'd say is around the operational reporting piece. It still remains a very important element; it's just that the expectations have become a lot more real time. But wherever data engineers and data scientists leave off, and however the business analysts are the ones creating the_____, etc., that the business users are using, all three personas have to really start working a lot more closely together. That's where we believe the world is going. So that's one part, Steve, that we see. The other thing I'd say is that there's a lot of expectation that analytics itself is not just a back-end thing, not just where you're doing operational reporting or maybe some predictive analytics for future decisions. It's becoming an integral part of the product experience.
So you look at the applications that are built on our databases: they collect a lot of interesting information as a part of their transactions. How do you learn from that in real time? That, I believe, is going to become much more of an expectation that the developer community is going to have, based on what they are hearing, right? Just to give you a concrete example: you go through a mobile app or an e-commerce site and you search for something, and let's say that doesn't convert into a sale. You want that learning to happen within seconds or minutes, not days, because you'd lose a lot more customers in that timeframe. So these are some of the trends we see. The complexity: how do you significantly reduce that? And how do you really increase the collaboration element, if you will, between data engineers, data scientists, and business analysts? I think that's going to be key for them to meet the expectations of the business in the future, for sure.
Steven: Yeah, that's a great point. We talked about this in a call a couple of weeks ago: I think collaboration amongst the team is one of those things that, at the core of a data culture, needs to become a lot more prevalent than maybe it has been in the past. Great points. Okay, so you're out in the field talking to a lot of customers all day. You've probably been speaking with many of the folks that are in this audience. What do you think is top of mind for customers that are moving their data pipelines to the cloud? Is it something around maybe simplicity? Is it around maybe speed, or access to the information, or bringing innovation to the business? Or is it something else completely?
Rohan: Yeah, I'll share my perspective, and I'd love to hear from you as well, in terms of what you've heard, because it's an interesting one, right? The biggest thing that we hear about what motivates them is that, in the past, it was always about cost, right? Any decision that needs to be made has to be about saving something to go invest somewhere else, and that's always the case, right? I always joke about this: reducing costs never goes out of fashion.
But there is a lot more desire to really reinvent, like I said, what products are being offered to customers, and business models. We do these envisioning sessions with a lot of our customers, where they come and talk to us about "this is really what we want to do." It's got nothing to do with technology; that comes later. It's "how do we get the insights to make decision A versus decision B?" Then we back it into "these are the systems that you need to create for analytics," and things of that nature. So what I see is that cost is definitely a factor, like, "Hey, can you run the things we're running right now a lot more efficiently?" Because it's not like budgets are increasing significantly, right?
But they want to save money and go invest in more of the transformative elements, which is where investing in newer pipelines comes in, investing in things that are a lot more real time to learn from, which helps them give a much better product experience to their customers. So it's both: how do we transform ourselves, our business, our products, and do that in a way where we're saving costs compared with what we currently use? I mean, some of these systems just wouldn't even meet the scale that they're moving towards. But I'd love to hear from you as well, Steven.
Steven: Yeah, Rohan, those are great points, and we certainly see some of the same. From where we sit, one of the things that we're seeing is that people often don't realize how bad the problem has become. There's a lot of fragmentation of systems. It is extremely complex, and the complexity often causes architectures to be somewhat brittle. So I think simplicity is a big part of this. But I think the other thing that people are looking for, and they may not even know it, is speed. Speed is one of those things that is a bit subjective. It's always comparative against something else. If people are used to getting information back in a few minutes, that creates one set of expectations: that I can't rapidly follow up with repeated questions. When you get sub-second responses to the questions that you ask, that often leads to the next question, and it causes people to want to explore deeper. So I think our customers are looking for simplicity on the one side, and a real difference in the way that they're interacting with their data becomes the other big piece of what we're seeing out there in the market.
Rohan: Got it. Yeah, that absolutely makes sense, and the speed aspect is interesting, because that speaks even to the whole real-time nature of, like, how do I learn from customer actions, as an example, in real time?
Steven: Yeah, that was a great use case that you brought up just a bit ago. Okay, so one of the things that's near and dear to me is this whole topic around AI, ML, and predictive, right? Everybody wants to do it, everybody's talking about it. But I was just at another event, and one of the key themes that came up there was: as much as we want to do that, we're not ready yet. We don't even have descriptive properly enabled in our own business, so getting to predictive becomes quite a bit of a challenge. So why do you think that there's this disconnect between what people want to do and what they feel that they're capable of doing today?
Rohan: Yeah, it's a great question, and I completely agree with that sentiment. Like I said, while there's a lot of talk about everything being predictive, and I do believe that's going to change, and you see it in some industries where it's happened, the reality is that the complexity of even getting the architecture right, like you pointed out, of setting up the pipelines right, is so daunting that we see a lot of time basically being spent on it. The stuff that I'm doing today on premises, the core things to just run the business: as we modernize to the cloud, the basic expectation is that all of that needs to work, even during the transition. And how do you do that?
As we've worked with a lot of customers, and I've spent a lot of time getting feedback, especially from the data engineer and data scientist community, what we hear is that it's very hard to build reliable pipelines, right? You're collecting data from multiple sources: not just your line-of-business applications, your CRM and ERP, but maybe your e-commerce site, and, if you're a manufacturing company, maybe devices, physical IoT devices, which are sending data. And now you look at the back end of it: some of them are batch, some of them are streaming pipelines. You look at the pieces that need to get put together just to get the data in the right format, and that has been very hard. It's where we've actually seen a lot of time being spent.
Now, once all of that stuff is done, there are clear business processes, which is the work that's happening today. What we do see is that once that level of maturity comes in, which takes time, and different companies are at different levels of maturity and have their own journey that they're going through, then the next step is: okay, how do we leverage this whole notion of doing predictive analytics? Then come the discussions around the ML Ops pipeline: on the datasets that we have, how do you create that? How do you empower data scientists to really experiment in ways that help us figure out what the better products are? Are we learning from the actions that our customers are taking? The challenge, Steve, that we see is: how do you do that in ways where you don't end up impacting the systems that are running your business? This is where things like creating the right isolation at the compute level, to make sure that any job being done by a data scientist doesn't interfere with, say, the warehouse that's serving the operational reports the business users are using at any time, end up becoming important.
And where you see the trust break is when that doesn't happen; then they're like, "Oh, you can't do this." That's where there's a lot of friction, and we see customers challenged. What I will say, though, is that if we play out the next 18 months to three years, our belief is that the amount of investment our customers are going to make in having ML Ops workflows deeply integrated with their analytics systems is going to significantly increase. We're seeing that trend very clearly. And if you look at the investments that we're making as a company as well, they're around having these collaboration boundaries be really clearly called out.
Steven: Yeah, that's really useful, and really helpful, I think, for a lot of folks in the audience. As you're thinking through your own AI and ML journeys, to your point, 80 to 90% of most organizations' time is spent getting the data ready to push into training various predictive models. So, really great insight. Now I'm going to ask for a prediction. I signaled early on in the conversation that I was gonna ask you to predict what might happen in the future. The cloud has obviously had a really big impact on data architectures. What do you see as the next big shift, maybe in the next five years or so, that could fundamentally impact data architectures?
Rohan: Yeah, it's interesting, Steve. As I reflect on it, there's so much innovation that's happened in the industry in the last 10 years, right? And when you look at the cloud, it's fascinating: you have analytics systems running on tens of petabytes of data. You couldn't even fathom anything like that when you were doing on-prem. So the state of the art clearly has been pushed. What I will say is, if you look at the way the innovation has happened, really, it's around ensuring that you have a clear separation of compute and storage, right? Where the volume of data that you need to manage doesn't really drive your cost; it's the usage. So I think there is a lot of work that has happened in that dimension, which is great.
I mean, there's a lot of scalability, a lot more flexibility, a lot more simplicity that every company is building towards. But foundationally, if you look at the jobs to be done, what the data engineers do, what the data scientists do, what business analysts do, that hasn't fundamentally changed. (Developers, though, I could argue, are seeing their lives become better at a much faster pace.) Yes, they're dealing with very large volumes of data. Yes, they're dealing with very complex systems, and things like that. But foundationally, what they do hasn't become very easy, in my humble opinion. And I think that's the paradigm shift. If you look at what every industry is expecting, when I talk to customers, they're like, look, horizontal platforms are just not going to be enough, right? If I'm a retail customer or a financial services customer, help me with everything that you know about my space, whether it's datasets, whether it's the schemas I need to organize things, or whether it's models that have been created, turnkey ones that work on these public datasets. And how do you make that into a turnkey experience?
So if I'm setting up my pipeline, boom, boom, boom, just make that super simple, which is not the case today, right? Then the other big area, where we believe change is definitely going to happen, is the governance piece. Like I said, it's going to become increasingly important. If you follow the trends, today none of the tools that we have understand what it means to comply with GDPR, right? You could be using data for which you don't have use rights, as an example. And that's a problem that I see a lot of our customers solving in a very bespoke manner, with custom code and custom scripts, and there's a huge opportunity to create a unified platform that makes that really easy. And finally, what I'd say is, if I play out the next five years, the dichotomy that we see today between databases, analytics, BI, and the higher-level things is, I actually think, going to significantly reduce. You're gonna see a lot of simplicity come there.
Steven: Yeah. So really a convergence, then, of a number of technologies to simplify what the stack looks like. And from my perspective, if we look out at that sort of five-year time horizon, I think there's a whole bunch that's still a mess in five years. If I look backwards, it was a mess. The technology continues to get better. It just does: the compute's cheaper, it's faster. But the challenge that people are seeing is that the data is growing at such an enormous clip, and people's expectations are ever increasing. And I think that adds this extra level of complexity. So I think for the folks in this audience, your jobs are all safe.
So I'll start with that, because there's never going to be a shortage of demand for folks that really understand the data in the business. It's going to be complex. The added complexity that you mentioned around data governance is, I would agree, one of those things that people should definitely be paying close attention to. That's great stuff. Okay, so Rohan, there's a ton of advice, I'm sure, that you could give to folks in the audience. But if I asked you to narrow it down to just one thing, what is one piece of advice that you could give to this audience that they might be able to implement tomorrow in their day-to-day job that's gonna set them up better for the future?
Rohan: Wow, if it's just one... I mean, it's probably a cliche, Steve, but I'd say it is important to understand that a lot of where the time is being spent is going to change, right? Cloud itself has done that with managed services. If you look at where, typically, a data engineer, a DBA, or a developer would spend time, that is going to drastically change based on the innovation that is going to come from not just the hyperscale cloud providers, but even a lot of the ISVs, etc., who are building on top of the platforms that you provide, right? So the thing I'd say is, and I 100% agree with you, that the value equation that comes from data is only going to increase. One of the biggest challenges that we see with our customers is that they don't have enough data engineers, they don't have enough data scientists, they don't have enough business analysts.
I mean, there are a lot of projects that are stalled because there's just basically not enough people. But in each of these roles, my one piece of advice would be: see what's becoming a commodity, and where you can add value, right? Because my expectation is that a lot of the repetitive stuff is going to get automated. A lot of the complexity that exists with systems, which Steve called out, and I agree with you, a different set of complexities will be there in five years. I mean, that won't change, but the nature of it is going to evolve, right? So then the question is, if you take a look at it as, hey, I have so many hours in the day to spend, my belief is that where you spend the time is going to significantly change. Just being ahead of that, and really understanding that, is going to be important as you all think about your careers.
Steven: Yeah, really great advice. And maybe just amplifying one of those points: the question "why?" is, I think, one of the best questions that people can ask in business. So for the data engineers, the data architects, the data analysts, the data scientists, it's about trying to get to the underlying "why are we doing this?" Cutting costs and increasing revenue obviously are good things. But so is knowing the customer, or making sure that the company stays out of trouble because they're doing their financial reporting correctly, whatever it might happen to be. It's just understanding the fundamental why behind some of the questions that may come up. Hey, one quick thing I just wanted to add in there: you just made a big announcement at Build that relates to some of the stuff that we were just talking about. Do you want to spend 30 seconds on what you guys just did?
Rohan: Yeah, thank you. Yes, we announced what we've called the Microsoft Intelligent Data Platform. It's something that we've been working on for the last, I'd say, three or four years. If I take a look at the evolution across databases, analytics, BI systems, and data governance, our focus has been to build out the best in class in each of these categories. But the friction really comes when end-to-end solutions that span them get put together, right? And like I said, one of the big trends that we heard is developers really saying, hey, I want to learn in real time from my transactions as they get committed in my databases.
So that's a huge challenge today. How do we go fix that? And governance, essentially, should be embedded wherever your data is, right? Not just as an afterthought, where you scan your entire estate and create a map and a catalog. We believe it needs to happen in a synchronous manner, almost as data is being changed, because there's so much valuable information that's going to be needed while you're using data to train. So the Intelligent Data Platform really is all the work that we've done to enable that real-time learning and the embedded governance we're working on. And it's also a very strong signal about our point of view, our belief that this dichotomy, or the silos that exist today, need to be broken.
Steven: Yeah, no, Rohan, that's amazing. Many in this audience are probably already using many of the components that you have today, and putting those together in a more seamless platform, I think, is going to be of great value to a lot of folks. Just wrapping up, Rohan, I really want to thank you for joining me today. I took down a few highlights from some of the comments that you made that I wanted to reiterate. One is around the high expectations that the business has for folks that are in this audience. It's often a thankless job, I know, to be able to assemble the right information for the decision makers or the business people in the organization. So, heroic work there. This trend towards privacy, I think, is a big trend that we're going to start seeing a lot more of: making sure, number one, that the data doesn't end up in the hands of bad actors, but also that it's being used in a responsible way that's within the bounds of the law.
Then there are your comments about there now being a service for everything. These services just seem to proliferate and maybe add a little bit to the complexity. And then the final thing is, again, for this audience: there is never going to be enough of you out there in the world. And so, thank you for everything that you do. We're really glad to have had Rohan in this particular session today. So from all of us here on the keynote stage, we thank you so much, Rohan, and I look forward to our next conversation. Thanks so much.
Rohan: Thank you Steven. Thank you so much for having me.
Steven: You bet.