Watch the opening keynote from Zero Gravity 2022: a 30-minute fireside chat between Google Cloud CEO Thomas Kurian and Incorta CEO Scott Jones. They discuss the rapidly evolving cloud ecosystem, where it’s heading, and what it all means for data architects, engineers, and end users.
Nick: Welcome, everyone. Thank you so much for joining us at the first-ever Zero Gravity conference for modern data pipelines in the cloud. We've got an amazing lineup for you today, taking in all that's new and exciting in the fast-moving world of data and analytics in the cloud. I'm Nick Jewell, part of the team here at Incorta, and I'm going to be your guide for the event today. As you know, our industry is constantly evolving, and we are thrilled to share with you three dedicated tracks to help you learn from industry pioneers and thought leaders in diverse topics such as data engineering and data architecture, and from the folks who have delivered amazing value to their organizations, genuinely something for everyone. You'll also get to hear from luminaries such as Thomas Kurian, CEO of Google Cloud, on the growing momentum of cloud migration and development, and Rohan Kumar, corporate VP of Azure Data, to learn more about how we all become stronger through partnerships in the cloud ecosystem. Now, before we get the event started, a few housekeeping points for a virtual event today. Our keynote stage is where you are right now, and our keynotes will start in just a moment or two. From there, Zero Gravity will break out into three tracks with many great sessions. Be sure to check out the agenda to see what's coming up live over the next couple of hours. Now, I'm sure you're going to have some tough decisions on what to watch. But don't worry; every session at Zero Gravity will be recorded and available for replay after the event. So there is zero fear of missing out. Now, let me give you a few technical tips for a great experience today. First, we recommend using Google Chrome and turning off any VPN software you might have for the best streaming experience. Now, if you do encounter any issues, try giving your browser a quick refresh. This usually fixes most of the problems.
But if an issue persists, please email firstname.lastname@example.org and we'll get things fixed. Now, in the live sessions themselves, we want to hear from you. Session chat will be open for every track, including sending private messages, to really keep the conversation flowing. As always, let's make sure Zero Gravity is an inclusive and welcoming place to discuss topics of interest. If you have a question for one of our speakers, we'll be opening the Q&A panel during each breakout session. So make sure you get your questions in early, and we'll try and answer them live. And finally, a big thank you to all our sponsors for making this event possible. Now, I'd strongly recommend checking out the solutions over at the expo during one of the session breaks to see what our partners have been up to. But with that, it's my great pleasure to get Zero Gravity well and truly underway by introducing the CEO of Incorta, Scott Jones.
Scott: Thank you, Nick, thank you so much. I am so fired up to be with everybody today. We have an amazing day planned for everybody. Zero Gravity is designed to be an industry conference. Not a vendor conference, not a commercial for any one company or set of companies. This is not a seminar or a lecture on any particular data strategy. Our goal is simple: to bring together a variety of ideas, innovations, and innovators to share insights that you, the data engineers, data architects, data analysts, and data scientists, may use to build and manage your modern data pipelines in the cloud. It's a conference built for practitioners like you, with content delivered by practitioners who really understand the unique real-world challenges of managing data today. What a crowd we have with us: over 7,000 participants from around the world. In fact, 140 countries are represented today. Amazing. So why did we choose to call this event Zero Gravity, and what does it mean? How does it apply to data? Well, as you all know, the importance of data has grown exponentially with no end in sight. More people working with data, more services and applications dependent upon data, and still more data flying in from everywhere. All this naturally creates weight and gravitational pull that actually serves to restrict the power and flow of data. This gravitational pull creates a more complex orbit, which becomes tougher to understand, tougher to manage, and more challenging to deliver data with the speed and agility that is needed today. So you may be wondering, who bears the brunt of all this? Well, that's a rhetorical question. Of course, it's you, the data experts, who bear the brunt of this challenge, the challenge of orchestrating all of these moving parts to allow data to move faster and more efficiently. The movement of data to and within the cloud is fundamentally changing how we think about data architectures.
We can access that data today from many different places, naturally reducing its gravity. Making data available to many unique architectures and applications opens up a whole new set of possibilities: applying the best approaches and techniques based upon the particular tasks at hand and the nature of the data you're working with. Allowing for a weightless data atmosphere is really all we see. And one thing we have found, and I'm sure you know as well, is that as this generation of data we all live in continues to evolve and accelerate, there is no one-size-fits-all when it comes to managing data pipelines in the cloud. In fact, we are seeing a cosmic explosion of innovation and creativity in this area. Data lakehouses, Kafka, dbt, and direct-to-source data mapping are just a few. With digital transformation becoming an urgent requirement for every organization, along with economic uncertainty, geopolitical unrest, and the impact of pandemics, the pressures around accessing data and delivering data are greater than ever before. As you'll see here at Zero Gravity, there are some key themes that are taking hold and gaining tremendous momentum. So what are these key themes? The first is data mesh. Many companies are investing in next-generation data lakes in the hope of democratizing data at scale. However, data lake architectures have largely been unable to fulfill their promise of data availability and accessibility at scale with speed and cost efficiency. To address these challenges, we need to shift to a paradigm that draws from more modern distributed architecture. That's where the concept of data mesh comes in: a type of data architecture that embraces the pervasiveness of data in the enterprise by leveraging a domain-oriented, self-service design. Imagine, if you will, a design that treats data as a product, and each data product with its own uniquely designed, purpose-built data pipeline.
There are several sessions at the conference that explore best practices for implementing a data mesh. As you can see, we have three great ones that you're not going to want to miss. "Data Mesh from Concept to Code" with Ryan from ThoughtWorks; "From Data Mess to Data Mesh" with Roy from Upsolver; and lastly, "Can Data Mesh Help Fix Your Data Mess?" How's that for a mouthful? And that'll be presented by Craig from PMsquare. The second major theme is that of purpose-built data pipelines. Today, companies are operating in a world where urgency and uncertainty are at all-time highs, creating a surge in demand for real-time access to data for analysis. Delivering data to meet this demand, as you know, is no trivial matter, particularly when considering the variety of data types and workloads you're faced with. For instance, you can build an ETL data pipeline to get 300, 500-plus petabytes of clickstream data ready for analysis. However, billions of rows of operational data with hundreds of table joins will crush that same pipeline. We have several sessions at Zero Gravity that will allow you to explore whether you need to consider implementing purpose-built pipelines based on data types, workloads, and use cases. We have a great technical session that will be led by Shane Collins, a data engineer at Meta. We have another session around operational reporting that will be led by Bharath from Keysight Technologies. And of course, our closing keynote that you are not going to want to miss: "Deliberate Data Engineering: New Solutions to Notorious", let me underline that, notorious, "Data Challenges". And that will be led by Matthew Halliday, who's the co-founder and EVP of product at Incorta, and Jacques Nadeau, co-creator of Substrait and Apache Arrow, and co-founder of Dremio. Jacques, what do you do in your spare time? Oh, my gosh, don't miss that session. And the third major theme is about building real-time data pipelines.
As you know, from an engineering perspective, making data available to the business in real time requires a new way of thinking in how you build and maintain data pipelines. Streaming ETL is used when data freshness is of paramount importance, when data is continuously generated and in very small bursts. However, even when batch processing data from departmental operational systems such as finance, CRM, HR, and supply chain, it used to be acceptable to have a delay of a day or half a day, or sometimes, when you're reporting financials, a month or even a quarter. That is no longer acceptable. The timeframes for delivering data have shrunk dramatically. We have several sessions here that dig into the requirements for designing and managing a data pipeline for real-time or near-real-time analytics. We have a great session on building near-real-time reporting from Comcast, led by Prabha. We have another session led by Dhruba, the CTO of Rockset. And then lastly, a session on real-time analytics going beyond stream processing with Apache Pinot, led by Mark and Karin from StarTree. So that's just a sampling of the incredible sessions we have. I'm sure you'll find them really, really insightful. We've designed Zero Gravity to help you demystify this complex landscape, while informing and educating you on the latest investments. I know you're going to enjoy the sessions. I'm really, really excited to kick off the day with our first guest, Thomas Kurian, President of Google Cloud. Thomas, welcome and thank you. Thomas, it's great to be here.
Thomas: Good to be here. Thank you for having me.
Nick: Oh, you bet. So you have... you and your team at Google Cloud have a front-row seat with some of the leading companies in the world who are making this transition to more of a digitally oriented company...
Thomas: That's right.
Nick: ...and really putting data at the center of this digital transformation. Tell me a little bit about what you're hearing from companies. What are some of the challenges? How are they thinking about this?
Thomas: I mean, this shift to digital is fundamentally a technology and business shift, right? First of all, in almost every industry, we see people wanting to go direct to customer. If you're a media company, you used to build entertainment that was delivered through networks, and increasingly, they're streaming. If you're a car company, increasingly, they're allowing the products and services to be bought online. If you're a doctor, increasingly, you're doing telehealth and serving your patients directly. So in every industry, not just in retail, we see that organizations are going direct to their customer. And because of that, they want to use data to understand their customer. They want to use data to understand what products and services to offer. They want to understand their supply chain, if they're manufacturing products, given all the volatility that everybody is facing. They want to understand how to link all of these elements of data better. And it's that shift to digital that gives them much greater customer intimacy and much faster ability to serve and differentiate themselves. But they need to stand on the foundation of an amazing data platform to be able to understand and to optimize and transform their business.
Nick: Yeah, it's interesting. Customers' expectations of the businesses they do business with have grown exponentially, based upon some of the experience they've had with some of the leading companies. And so really knowing everything you can about your customer is really based on how accurate the data is and what kind of access and agility you have to the data. Then there's the data ecosystem and all of the innovations that we've seen over kind of the last five years, and we continue to see that; I think it's actually accelerating. How do you think about the data ecosystem, sort of where we're at today, and how it's evolving and how your team plays a role in that?
Thomas: Yeah, we at Google have looked at that data ecosystem through four different lenses. First of all, a lot of data analytics historically has focused purely on structured data. We now bring in all the unstructured data. We're helping credit card companies understand customer satisfaction by taking their voice records from the call center and allowing them to compare that with what they're doing with their actual billing, right? So, bringing together structured and unstructured data. Second, you were just talking about real time: we believe that you should not have any delay in understanding data. And we're helping bring streaming and traditional forms of data analysis together. So that allows, for example, a telecommunications company to monitor in real time 1.5 billion network elements, second to second. The third is, we think that when data is a core asset, people should be able to access data from wherever it's stored. They shouldn't have to move it all to our cloud in order to do analysis, whether that's in an ERP system, in another cloud environment, etc. So we want to allow you to analyze data no matter where it's stored. And then lastly, everyone's got a favorite style of analysis, right? Because data has become such a central asset to an organization, you have data scientists in one part of the company wanting to run analytical or AI models. You've got, you know, analysts using their favorite analytical tool, running query processing. So we've integrated into the data platform all the styles of analysis, from query processing to self-service to heavy AI models to prediction and inferencing. And so these four themes: if you can put in all your data, structured, unstructured, real time, you can open the platform to an ecosystem of different styles of analysis, and allow it to be accessed no matter where it's stored. We think that's the core of making data the real asset that it can be to every organization around the world.
Nick: Awesome. That's great. Well, you know, you mentioned the importance of really understanding the customer, and all of the data you can get about the customer being really at the heart of successful companies that are transitioning into more of a digital player. And most of that really important data is unstructured, right? It's voice, it's video, it's digital, so that represents an enormous challenge. Let's sort of shift gears for a moment and talk about multi-cloud strategies. How do you see that evolving? And why is that really important for companies that are putting, you know, sort of, data at the heart of everything they do?
Thomas: I mean, for us, if data is the core asset that enables digital transformation, you've got to make it open in three different ways. First, you should be able to use a data platform that can access data and allow you to run calculations on it, no matter where it's stored. Johnson & Johnson, for example, is a large customer of ours. They use BigQuery, which is our core analytical and data platform, for their core data environment. But they access and analyze data from AWS and other clouds, using our platforms. That's an example of something where we felt you shouldn't have to move all the data to do analysis. The second thing is, if you want people to be able to use the platform, you should support a variety of open technologies for people to use, so they do not feel that their data is locked into a particular programming model. So we at Google have invested in many open source technologies, Apache Flink, Spark, etc. We have our own contributors to a number of those projects, and we've integrated them into the platform so people can use these open tools to do analysis.
Thomas: And the third thing we feel is that it's important that we integrate with a whole ecosystem of analytical capabilities, and Incorta has been a great partner of ours, to allow people to access this from their preferred tool or platform of choice. We don't want people to feel like their data is locked into one particular platform, one particular silo. Even two years ago, there was a big debate: is a data warehouse or a data lake the right solution? In our view, those are all different styles of calculations that people want and platforms that people want. And so we've integrated into one system an open architecture to allow people to use any style of analysis, no matter where their data sits, and to open it up through both open source and collaboration with a great ecosystem of partners, so that anyone can access this data and, most importantly, find the insight they want to run the business and change it.
Nick: Great. What about security? Big issue out there today. How should our audience and the companies that you're working with think about, you know, securing their data, securing their systems, securing their users? And what role do you guys play in that?
Thomas: We've always felt that as data becomes the core asset of a company, you know, cyber attacks to get access to that data are just going to go up, because there's more value in it, and cyber attacks generally go wherever there are valuable assets, right? Our approach is threefold, just as you said. One, protect the infrastructure. So we have very, very strong capability in protecting the infrastructure from cyber attacks: encrypting data, ensuring that customers have sole control of the encryption keys, preventing access. You can actually deny access to anybody in the system. All of these are meant to secure that infrastructure. Second, protecting users. Since 2006, Google has operated on a zero trust architecture. It's designed to say: trust nothing, verify everything. And it has kept our systems secure. All of our cloud systems operate in that model. And the value of that is that even if your infrastructure is secure, we don't want people to have a compromise of a username or password, or some other way that, through the front door, somebody can come in and get access. So that's the second thing. The third area that we spend a lot of time on: when you have a myriad of tools, people struggle with how you govern the data and manage permissions on who has access to what. Governance of who has authorized access to what data is becoming a bigger and bigger issue, so we've consolidated all of that in a single place. And then finally, our view is that certain kinds of technology have now evolved far enough that you should be able to encrypt data at rest, encrypt data in transit, and encrypt data even when it's being processed. So these are all things that we have put into our platform. And one of the things we firmly believe is that the reason so many security breaches happen is that it's hard to make your system secure.
So a big part of our focus has been simplifying, simplifying, simplifying, so that if you make security really simple, people can just use it as part of their day-to-day, and you can secure all your data all the time and not have to worry about, did I leave this, you know, security hole, or...
Nick: ...sort of automate more of the process.
Thomas: Automate, simplify. If you simplify, everybody gets access to it.
Nick: Right, great. We've talked a lot about some of the challenges with data, some of the challenges with companies becoming more digitally oriented. You know, there are issues with culture, there are issues with business model versus the technology: are they aligned, and are they in sync? Are there other sort of data management issues or challenges that you guys are thinking about, or that customers are really struggling with?
Thomas: I mean, there's a lot of things that we are working on simplifying for people. Two or three different examples. The first one is, if you can make data processing really fast, then people can apply it in so many more contexts than they used to think of before. You know, recently there was a study and a project we did with UPS, working in close collaboration with UPS's engineers, and we enormously improved the way that they route and schedule their trucks, which bring packages to everybody every day. And, you know, part of it was just that they could compute things so much faster that they could bring a lot more factors into calculating their routes, you know, whether it was traffic or other kinds of parameters. So the first thing we say is, if you can make things really fast and efficient, you can now allow people to use their data in a much more effective way. The second thing we see a lot of organizations struggling with is data definitions. Right?
Thomas: You know, I run my report, and it shows revenue is growing and he runs his report and shows revenue is not growing.
Nick: Single version of the truth. That's right.
Thomas: So the second thing we're working on is, how do you make data definitions, governance, all of these things easy? And also, how do you allow things like, for example, AI models? Historically, people have had to extract the data from the data platform and run an AI model separately.
Thomas: We have integrated into BigQuery the ability to run an AI model within BigQuery itself. There's over a million people using it every day, and the reason is that it's so much easier to administer. And we've always felt that if you make AI so easy that there is no, you know, intelligent analytic system and non-intelligent analytic system, but AI is just another tool that you can use to understand your data, I think it'll be enormously powerful. And so all our work is focused on this: make it easy for people to get a single version of truth, make it really fast and efficient for them to run calculations so they don't have to wait to get an insight, and integrate the most powerful analytical capabilities into the platform, but make it so easy that it becomes part of everybody's ability to express what they want to find.
Nick: Yeah. So you get people beyond that concern about, hey, this is something that I don't know about, or I'm afraid of. So really help them flatten out that learning curve. We've talked a bit about ecosystem. And we've talked about, you know, your adoption of our open standards. We've got a lot of folks in the audience right now that either work in the Google ecosystem or want to. Give us a little bit of insight into how to work with the Google Cloud ecosystem, how to navigate it, and what you're looking for when you, you know, have partners join the ecosystem and work with you.
Thomas: That's a great question. We look at our ecosystem, particularly around data, in three important ways. And they're all shaped by what customers are asking us for. Right? So part one: data is the core asset of a company. How do you give people the broadest array of tools to be able to use, whether it's to load data, to analyze it, to understand it, or to build models? We have a very open ecosystem. And we actually have technical engineers who integrate and optimize them on the Google platform. So if there are partners who are like, hey, I have a great, you know, new AI model I want to build, or a new platform, we'd love to hear from them on how they want to integrate into the platform. The second thing we've seen is, there are many, many places where people want datasets, you know, whether that's in the financial markets or sustainability indexes. There's all kinds of datasets. So we have a data exchange where people can get predefined datasets, making it super easy for a customer who wants to do analysis to say, I want to combine this data with my data to understand things. Very simple example: when people in consumer packaged goods are trying to understand whether they are sourcing products in a sustainable way, they want geospatial data about the farms from which they're sourcing, and whether they are causing an impact on the environment. So we have public datasets that make it extraordinarily easy for people to combine public data with their data to do the analysis. Again, if you've got a public dataset you'd like us to integrate with, we'd love to hear. The third one is, many organizations are still building the core competency of analysis, whether that's data engineers or AI experts. We also have projects underway to bring in a broad ecosystem. We have training, we have enablement, we have certifications. And remember, this is a global issue. So we are doing it in over 50 countries around the world.
And if you have a need, you can reach anyone from our Google team, and we'll be happy to support you.
Nick: Yeah, Thomas, I love what you said. It's all about listening to your customers, and supporting the customer's needs is at the heart of how you think about this. So, last question for you. I'm sure our audience would love to hear, you know, a couple of cool stories, a couple of cool company stories, customer stories, things that you've worked on, or your team's worked on, in the last couple of years that are really exciting and inspiring.
Thomas: Yeah, you know, first of all, it's a joy to be here at Incorta with you. I've known several of the founders from many years ago, almost 20 years ago. So it's a real thrill to see all the success you guys are having. You know, two great stories. One, very topical: Broadcom, a company that's in the news today, a joint customer of Incorta and Google Cloud. They've acquired a number of companies, wanted a big one.
Nick: It's been a big one.
Thomas: Yes, and including a big one today. They've wanted to do analysis to understand how the different companies are working, their whole, you know, M&A portfolio, and how efficiently those organizations are operating. Using Incorta and Ark and Google Cloud and BigQuery, they're able to get insights to help their executives run the business. Similarly, a smaller company, but an example in financial services: you know, Redstone Credit Union, great story, is a company that's serving lots of consumers in a time of great financial volatility that all of us are experiencing. They are giving everyone in their organization the ability to see the data, to understand it, and to serve their customers better. And that's an example of an organization that's using Incorta and Google Cloud together, using data as the power enabling them to serve customers better. So we're thrilled with the partnership. There are so many other customers, Henkel, Keysight, etc., who are all using our platforms together. It's a testament to the great work your teams have done. And, you know, our teams find it really great to work with you.
Nick: Yeah. And we really appreciate the partnership. I mean, one of the things that's really been unique about our partnership with Google Cloud is that we don't have unlimited resources to build all the things that our customers are looking for. Your team has been really, really helpful at really partnering with us to build out capabilities to support our joint customers. Again, thanks so much for joining us. I really appreciate the great insights, and thanks for making your way over here today.
Thomas: Congratulations on the conference. Thank you. Thank you again for the partnership.
Nick: Yeah, you bet. I think we're going to take a quick break. One of our sponsors has a quick message, and then we'll kick off the rest of the sessions. Alright. Thanks, everybody. Thank you.