Is your company burdened by legacy BI platforms and unable to deliver the right data to its operational users at the right time to run the business today? You might have heard that operational reporting platforms have become key enablers for delivering insights into your organization's performance. But how do we define what operational reporting is, how does it fit in your analytics stack, and how does it impact your data architecture? Learn from Keysight's experience with modernizing our operational reporting to deliver insights to our users across the company. We discuss how we were able to accelerate reports with 40+ joins across tables with hundreds of millions of records each to deliver operational reports for finance, supply chain, service, sales and other functions that run or refresh in only one minute (versus the 6+ hours required before). All without physical star schemas or unnecessary data transformations.
Transcript:
Ashleigh: I'd like to go ahead and introduce our speaker, Bharath Natarajan. Bharath is the head of Business Intelligence and Intelligent Automation at Keysight Technologies. He has extensive experience building business intelligence, data engineering and robotic process automation solutions and teams. Let's go ahead and give him a warm welcome to the stage. Welcome, Bharath.
Bharath: Thank you. Thank you, Ashleigh. Looking for my slides. I'm ready to go. Thank you. Morning. So today I'm going to talk about how to modernize your operational reporting platform. Before that, I'd like to talk a bit about my company, Keysight Technologies. Keysight Technologies is a leader in electronic test and design, and we have been in business for over 80 years. Everybody, I'm sure, knows Bill Hewlett and Dave Packard; they pretty much started Silicon Valley, and the products they invented are now the modern Keysight. We are in hardware, software, and services. A bit about Keysight: in FY 2021, our revenue was close to 5 billion, we have 14,500 employees, and we are a heavily research-and-development-focused company producing very innovative products, especially in the communications area, where we have 12-plus market-ready 5G products.
And we operate in 100-plus countries; 29 of the top 30 tech companies use Keysight, and 78 of the global 100 companies are Keysight customers. So just to give some background, we have a very complicated ERP and a complicated set of supply chain and financial requirements, and that leads to complexity in our business intelligence as well. So as a data leader today, you are faced with this when you try to pick the right solution. This is from a venture capital company, FirstMark, and they peruse the whole landscape of data solutions today. As you can see, if you want to pick a BI solution, you have to pass through all of this to find out what it is. So you want to be very clear: what is it you're trying to do? What are your goals, and how do you want to approach the right solution?
So let's dive into what operational reporting is. Operational reporting is reporting on your current business activity that supports your day-to-day operations. For example, your supply chain team can ask: do we have enough parts? What is our current backlog? Finance might want to do reconciliation reporting, and they have to do external financial reporting. There is regulatory reporting, there is tax, and there is daily and monthly business performance reporting as well. Your sales team might want to know: what orders are booked? Are my orders booked correctly? Are they on time? How is my team doing against the target and forecast? Your service team might want to know: are we closing the cases on time? And marketing can ask: what kind of leads are we generating? So operational reporting, even though it is not really considered a modern or very cool thing to do, is absolutely essential to run your business.
And operational reporting needs to be published and shared with an entire organization or function. So why is it so challenging to do? Ten years back, you had one ERP or one big CRM, but nowadays your data is coming from multiple ERPs, multiple cloud systems, and also from machine data or external data that comes from outside your company. As you all know, the volume of data is growing, and it is growing exponentially. And each of these sources has disparate data shapes, because the data is not coming nicely as tables. It can be files, it can be XML, it can be JSON, and they are all coming from multiple sources that need to be joined to produce any single insight. So it has been very clear, even four or five years back, that traditional star-schema-based ETL will not work for this scenario anymore. You absolutely have to find a way to modernize your platforms.
But before you modernize, you should set up the goals for your modernization strategy. The first thing is, you need to establish goals for speed of change. For example, it is not acceptable anymore to take three months to add a new column, and it is not acceptable to take nine months or a year to add a new source.
So businesses are expecting extremely fast change from your BI platform. Secondly, we need to be able to combine data from multiple sources. New sources are constantly coming in as new systems, especially new SaaS systems, and we need to be able to combine data from all of them. The third thing is that reports cannot take hours or even minutes to run; we are expecting seconds and sub-second performance. Number four is self-service. Our BI team is just 20 people, and we are supporting 5,000 users. There is no way we can support that many users if we don't allow self-service, and this is pretty much common across the industry. Number five is that you need to be able to support multiple data latency requirements. For example, you need to be able to support real-time reporting, operational reporting, and also analytical reporting; I will talk a bit more about this soon. Then you need to be able to reduce the overall cost. You might have terabytes and petabytes of data coming in, and you cannot have exponential storage and performance costs. It has to cost much less than it did 10 years back, because we do have the tools now. And finally, you need to have a strategy to move your entire data stack to the cloud. I don't have to explain this; I'm sure everybody on the call understands that.
So moving on. Before you modernize, you need to be able to segment your BI reports and use cases. Let's say you have all your existing BI reports and use cases: how are you going to segment them, so that you know which architecture each one should go to? The first segment is real time. By real time I mean you usually have just one data source, and the latency is none, because the business user wants to see data as it is right now. The volume of data is usually low, and the transformation is usually pretty simple. The UI needs are also very simple; it's usually just a tabular report. So once you determine that a report belongs to real time, you need to have a real-time modern data stack for it. The users are usually line-level employees, people who are shipping, or people in the call center looking at how many cases they need to work on. In our case, only 10% of our use cases fit into the real-time scenario.
As you can see, the second one is the major use case: 60% of our BI use cases fall into operational reporting. So what is operational reporting? Operational reporting usually has one or more data sources, and they need to be joined at a very detailed level. The acceptable latency of this data is a few hours to a few days.
So you need to be able to load your data several times a day, and you need to be able to load it from multiple systems and join them. The volume of data has just exploded over the last few years, so you not only need to be able to bring this data in, you need to be able to join it. The transformations are usually of medium complexity, because you need to reshape the data before you join it. And finally, the UI needs are, I would say, of medium complexity: mostly tabular reports, pivot reports, basic charts, and filters. That's what our operational reporting needs. You need to have an operational reporting data stack, and this is what we're going to look at in this presentation.
Finally, there is analytical reporting, where you're looking at long-term trends. You're snapshotting your data. The latency of data could be days, weeks, months, or years. The transformation is usually pretty complex, because you're taking data over many years and then aggregating it, and the presentation especially needs to be very advanced, because the users are usually your senior managers and executives. What I've seen in our case is that about 30% of our use cases belong to the analytical data stack. So what we're going to see in this presentation is what we achieved as part of our operational reporting data stack.
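Editor's sketch: the three-way segmentation described above (real time, operational, analytical) can be expressed as a small rule-based classifier. This is only an illustration under stated assumptions: the `UseCase` fields, the `segment` function, and the exact rules are hypothetical and are not Keysight's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """A BI report or use case to be segmented (hypothetical fields)."""
    name: str
    source_count: int                  # number of source systems feeding the report
    max_latency_hours: float           # oldest data the users will tolerate; 0 means "as of now"
    long_term_trends: bool = False     # snapshots and multi-year aggregation


def segment(use_case: UseCase) -> str:
    """Assign a use case to one of the three data stacks named in the talk."""
    # Analytical: long-term trends over snapshotted data, for managers and executives.
    if use_case.long_term_trends:
        return "analytical"
    # Real time: a single source and no acceptable latency.
    if use_case.source_count == 1 and use_case.max_latency_hours == 0:
        return "real time"
    # Operational: one or more sources joined at detail level,
    # refreshed every few hours to every few days.
    return "operational"


if __name__ == "__main__":
    for uc in (
        UseCase("open call-center cases", source_count=1, max_latency_hours=0),
        UseCase("AR aging", source_count=3, max_latency_hours=4),
        UseCase("five-year revenue trend", source_count=2, max_latency_hours=720, long_term_trends=True),
    ):
        print(f"{uc.name}: {segment(uc)}")
```

Running the rules over an inventory of existing reports gives a rough count per segment, comparable to the 10% / 60% / 30% split mentioned in the talk.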
So in our case, and I think this can be widely applied as well, you can take two approaches to setting up your operational reporting modernization. The first approach is the single stack approach. In a single stack approach, you have a unified data analytics platform.
And what you see here is that you load the data from your enterprise sources, like on-premise ERP or cloud apps like Salesforce, or the data can come from files. You need to stage it in staging tables, then you have to prepare your data, and then you load it into your warehouse. Once the data is in the warehouse, you need to model it. Modeling is where you're reshaping the data and also joining the data between systems. Once you do the data modeling, you can do the next level of data modeling, where you build a flat presentation and expose business-friendly schemas. And finally, you have the presentation layer. Now, in this scenario, all these capabilities are available in just one tool. This tool can also offer an administration layer where you can do monitoring, governance, and security, and it also exposes an API for external applications to consume. So this is the single stack approach.
The second approach for modern operational reporting is best of breed. In this case, you are using a modern pipeline tool, like Matillion or Fivetran, to extract data from all the sources; you can use any of those tools. Then you ingest that into a modern cloud data warehouse, like Snowflake, you stage the data in this cloud data warehouse, and then you transform it. For transformation there are many tools, like dbt, that are pretty advanced today.
And then you do the transformation and load it into target tables. Then you can use best-of-breed BI tools, whether it is Tableau, Power BI, or whatever BI tool you choose, on top of the modern cloud data warehouse. So the difference between single stack and best of breed is that with single stack, all the features you need for your entire stack are available in one tool, whereas with best of breed you're using multiple tools, but you're achieving the best performance and capabilities in each of these three layers.
Now, what did we do at Keysight? We use both. For the single stack, we use Incorta. We use Incorta Direct Data Mapping to load from our primary enterprise systems: we load from Oracle EBS, we load from Siebel, and we load from our consolidation system, OneStream. We also load our revenue data, and Salesforce and CPQ and tankers, all very common enterprise applications in all companies. Then we load them into Incorta. Incorta provides all the four layers that I mentioned: the physical schema, the transformation layer, which is the alias schema and business schema, and finally the visualization.
So when do we actually use the single stack architecture? This is very important to understand. First, the major sources are on-premise ERP, CRM, and cloud enterprise systems. These all have very fixed data models, so you're not going to have 20 new tables, and the entire data model is not going to change all the time, only when Oracle is upgraded. So major changes are going to come into the system in a planned way. Second, there are large volumes of data from each of these systems, and to do their operational reporting our business needs a refresh at least six times a day.
And we also have very complex data model requirements, where we join 40-plus tables, and at least 10 of them could be 100 million rows or more. The requirement from a visualization perspective is relatively simple for us; it is just pivot tables, reports, and charts. In this case, we go with a single stack.
So what actual benefits did we achieve by moving our operational reporting into the single stack? We were able to replace Oracle BI, Brio System 11, Informatica, and Oracle Exadata, which is a very extensive database, with one system in Incorta. Critical month-end queries used to run for eight hours on our legacy platforms. So imagine you're waiting to close the month. A query runs for seven hours and fails, then we rerun it, and it takes another five or six hours to run. So the finance team is already waiting half a day for the critical data. From there, we moved to less than 10 seconds. We used to have 20-plus support resources to maintain the legacy platform; now we just have two people part time. And changes used to take us three months.
It was all star schema based. So imagine a new system coming in: you need to go redo all your star schemas. Just to add one column, it used to take three months, and complex changes used to take nine months or never happened. We had a system for sales reporting where we didn't make any major changes for 10 years. Those kinds of scenarios were not going to work anymore. Now we have monthly releases, for very complex changes as well. We also had multiple legacy tools, and we needed multiple legacy tool skill sets just to keep building new solutions. Right now, on our Incorta platform, we just need one person who can build an entire end-to-end solution, from ETL to data transformations to data modeling to reporting. So one single person can build an entire complex solution.
So here is an example: our AR Aging report. Anybody in finance will know that the aging report is very critical for closing a month. This was joining 44 tables in our Oracle ERP, out of which 10 are easily above 100 million rows. This is one of the reports that used to run for eight hours, or mostly fail, in our previous reporting, and now it comes back in just 10 seconds. Here is the data model of the AR Aging report from Incorta. You can try something like this SQL in your traditional Oracle database, and it is never going to come back.
And then the second one is the best-of-breed architecture. How are we using that? As we move on from traditional reporting, there are a lot of new use cases coming into the system. For example, our supply chain has data coming from so many different partners. It could be external APIs like Findchips or Resilinc. It could be hundreds of files of different shapes, like XML, JSON, and CSV. We have data coming from Oracle or SQL Server databases, and also from enterprise apps like Incorta. And new sources are getting added all the time. So what we are doing here is using the best-of-breed architecture: we're using Matillion, and we're using Python to do complex transformations, and loading the data into Snowflake. This allows us to export the data for very advanced visualization using Power BI and Spotfire, as well as use machine learning on top of the data that is in Snowflake. The key use case here is supply chain visibility. As you all know, we have huge global issues in supply chain, and we need to react fast to any change in parts visibility. That includes events like the pandemic, and the pandemic in Shanghai is going to impact our supply chain quite a bit. It could be natural disasters, it could be pricing changes, it could be available inventory.
So what used to take three days to react to any of these is now taking three hours. The reason is that we are loading all of this data into a central data lake, so that we can react and get insights on top of it using modern best-of-breed visualization tools. So when do we use the best-of-breed architecture? When the major sources are external to Keysight; when the data comes from hundreds of structured or unstructured sources; when we need to augment the external data with enterprise data; and when we also need very advanced visualization. That is a good use case for best of breed.
Now, there are advantages and disadvantages to each approach. The single stack is very easy to develop; as I mentioned, just one person can build the entire solution. Best of breed is complex, because you need at least three skill sets to do that: the pipeline skill, the Snowflake skill, and the visualization skill, like Tableau or Power BI. I wouldn't say it's very easy, but it's much easier to support a single stack. So when you're doing complex operational reporting from enterprise sources, I would say the single stack is best. Best of breed is pretty complex to support, because you now need to monitor multiple tools. On visualization capabilities, with single stack you use whatever the tool provides, whereas with best of breed you can go pick what you want out of the market, whatever is best for your use case. On loading and transformation, the capabilities of the single stack are what you will be using, whereas with best of breed you can go shop for the best transformation tool; if it's dbt, you can use that.
And finally, data science and ML are very similar: with single stack you use whatever the stack provides, whereas with best of breed you can use the best package out there for your machine learning.
So just to summarize the steps to modernize your operational reporting. First, it is very critical to define your goals for modernization. Second, you need to segment your use cases into real time, operational, and analytical. At least that's what we did at Keysight; you might have a different segmentation that you need to look at, and then see what the best stack is for each, because one stack is not going to work for all use cases. Third, define your single stack or best-of-breed architecture for your operational reporting. And number four, start moving your legacy reporting to the modern data stack in an agile manner. Do not attempt to do this as a big bang. It took us three years to achieve this, so have a plan and start doing it in an agile manner. That's all. I'm ready for Q&A.
Ashleigh: There we go. All right. Thank you so much for all. I really appreciate that. That was a great presentation. We do have a few questions now from the Q&A panel here. First question is when you are thinking about a modernization strategy for operational reporting, how do you prioritize the list of criteria between single stack and best of breed?
Bharath: Yeah, so actually, this insight didn't come before we started. After reflecting on our three-year roadmap, it's now becoming clear: single stack works when you have big operational source systems, whereas best of breed works when you have lots and lots of sources coming in every month, like different kinds of files. Single stack works when you need to keep loading rapidly every day to get insights. So that's how I'm looking at segmenting the two.
Ashleigh: Great, thank you so much. Last question for you here. How do you manage data governance in your architecture?
Bharath: So both of the stacks are very good for data governance. What we do is we have row-level security and content-level security in both Incorta and our best-of-breed platforms. So there's no issue in either of the tools regarding data governance.
Ashleigh: Great. I see another question just came in. We'll go ahead and answer this one here. Question is, are you supporting ML use cases using this architecture?
Bharath: Yeah, so we do ML use cases on the best-of-breed architecture, on top of Snowflake. We have use cases in supply chain where they want to predict what they actually need to ship on a daily basis. They've been using machine learning, and they have closed the gap using different algorithms to predict very narrowly what they actually need to ship on a daily basis. So it's really working out very well.
Ashleigh: Awesome. We do have a few minutes here. Are there any other questions? Please feel free to drop those in the Q&A panel.
All right, one more question here for you, Bharath. How did you explore the benefits Incorta could provide to your business? For example, did you run a proof of concept? What did that look like, and how long did it take? And finally, what advice would you have for other companies who have yet to explore Incorta?
Bharath: So we were customer number five for Incorta. We started in 2017, and we definitely did a proof of concept. Incorta is a fabulous tool for your operational reporting, I would say, when you have large ERPs, CRMs, all these kinds of enterprise systems. Its power is in the joining of these tables, 40-plus tables, and we went from eight hours to four to 10 seconds. So it's a really good tool when you have large reports where you need to join across multiple applications. And especially, I would say, the speed of change is fabulous, because you're retaining the shape of the source system; you don't have to modify it into a star schema. And the biggest benefit I see is one skill: one person is all we need to build the entire solution. You don't need three people, four people, 10 people to build the solution.
Ashleigh: Great, thank you so much. Questions keep rolling in. Next question we have is, What are you using for visualizations and analytics dashboards?
Bharath: We use Incorta for 60% of all our reporting. We have multiple BI tools, all the usual suspects: MicroStrategy, Power BI, Tableau, Spotfire. So a modern BI platform needs to be able to support all of them, because nobody's going to use just one system from a visualization perspective.
Ashleigh: Great, thank you. You brought up the AR Aging example. Could you describe the methodology you used to convert from legacy ETL, data model, and BI to Incorta?
Bharath: Yeah, so in that case, we load all the 40 tables that are coming from Oracle ERP as is; we don't change the shape. And then we create a data model on top of that. Across all those 40 tables, we have a central materialized view, which kind of acts like the fact table, and we build a huge data model on top of that. So the main goal is to keep the shape of the ERP tables as much as possible. Without reshaping, we can move faster.
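Editor's sketch: the pattern described in this answer (load source tables as is, then put one central materialized result on top that acts like a fact table) can be illustrated in miniature with Python's built-in `sqlite3` module. Everything here is an assumption made for illustration: the two toy tables, their columns, and the aging buckets are hypothetical, SQLite has no materialized views so a `CREATE TABLE ... AS SELECT` stands in for one, and none of this reflects Incorta's engine or Keysight's real 40-plus-table model.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Load two source-shaped tables unchanged, then materialize a central join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Source tables kept in their original shape (hypothetical columns).
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoices (
            invoice_id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL,
            days_outstanding INTEGER
        );
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO invoices VALUES
            (10, 1, 100.0, 15),
            (11, 1, 250.0, 45),
            (12, 2, 75.0, 95);

        -- Stand-in for the central materialized view that acts like a fact table.
        CREATE TABLE ar_fact AS
        SELECT i.invoice_id, c.name AS customer, i.amount, i.days_outstanding
        FROM invoices AS i
        JOIN customers AS c ON c.customer_id = i.customer_id;
        """
    )
    return conn


def aging_buckets(conn: sqlite3.Connection) -> list[tuple]:
    """Report open amounts per customer in 0-30, 31-90, and over-90-day buckets."""
    return conn.execute(
        """
        SELECT customer,
               SUM(CASE WHEN days_outstanding <= 30 THEN amount ELSE 0.0 END),
               SUM(CASE WHEN days_outstanding BETWEEN 31 AND 90 THEN amount ELSE 0.0 END),
               SUM(CASE WHEN days_outstanding > 90 THEN amount ELSE 0.0 END)
        FROM ar_fact
        GROUP BY customer
        ORDER BY customer
        """
    ).fetchall()


if __name__ == "__main__":
    for row in aging_buckets(build_demo()):
        print(row)
```

The point of the sketch is the ordering of work: the source tables are never reshaped into a star schema, and reports read from the single joined result instead.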
Ashleigh: Okay, thank you so much, Bharath. Last question here. Which architecture do you suggest for the companies who want to opt for Incorta?
Bharath: Which architecture? I would say you probably need two different architectures. For the most part, if you're a large company, just one architecture may not be enough. So you'll end up with one or two. If you can keep it at two, you're in really good shape.
Ashleigh: Great. Thank you so much, Bharath. We understand there are a few questions we weren't able to get to, but we will do our best to follow up with you within 24 hours to get these questions answered. Thank you all so much for taking the time to do that. And thank you so much, Bharath. That was a fabulous presentation. We really appreciate your time today.
Speaker:
Bharath Natarajan
Head of Business Intelligence and Intelligent Automation