Webinar: Generative AI and its Implications on Cloud Infrastructure and Beyond

Did you miss the debut of the new AI-enhanced Cascadeo AI?
Catch up with the on-demand webinar. 

The webinar transcript follows. It has been lightly edited for readability.

Victoria Barrett: A little bit about Cascadeo as we get started.

We take a cloud-first and cloud-centric position, and our philosophy is driven by ethical engineering practices. We are always on a mission to strengthen and accelerate our customers’ cloud journeys and cloud transformations. At the core of the Cascadeo offerings are application development and management, and cascadeo.io, our cloud management platform, which you will hear more about today in the context of generative AI and managed and professional services. Founded in 2006, Cascadeo is proud to be an AWS Premier Tier Services Partner and a top-20 worldwide cloud services provider as recognized by the Gartner Magic Quadrant for Public Cloud IT Transformation Services.

Our presenter today is Jared Reimer, Cascadeo’s founder and CTO, who leads Cascadeo’s generative AI work. Jared:

Jared Reimer: All right. Good morning, everyone. Thank you for joining us. I greatly appreciate everyone being on time this morning for the presentation.

So the topic for today in our webinar is generative AI and its implications for cloud computing, engineering and operations.

We’re going to talk through the fundamentals of cloud really quickly, starting with some key concepts that relate to data privacy and security in the context of generative AI. Then we’re going to talk about how generative AI helps with engineering and operations in the cloud. We’re going to do a live demo of our software platform, cascadeo.io, which we’ve recently integrated with OpenAI’s GPT large language model, to show you how we use it to help serve customers, help operate customer cloud deployments, and even deal with problems that we’ve never encountered before. So we’re going to go ahead and get rolling.

As Victoria mentioned, if you do have questions, please feel free to use the Q&A function built into Zoom webinar and we will address those questions at the end. We’re going to pace this for one hour, so hopefully we’ll have time at the end for as many questions as you would like to ask. So with that, we’re going to go ahead and get rolling here.

First, a brief history of cloud computing, just to make sure that everyone has a level set here. Cloud computing as we see it is not the same as hosting, and unfortunately, a lot of people conflate these concepts. Hosting is essentially running either a virtual machine or a physical bare-metal server in someone else’s data center. Cloud computing, on the other hand, is a platform where you can do more than just host virtual machines or containers, and it’s really important to differentiate between the two. Hosting has been around essentially forever, since the very earliest days of computing.

Cloud is a very different animal. It’s different because it enables you to do things that previously were impossible to do. An example of that would be to build up an entire virtual data center, operate it, tear it back down, and then build it up again in a matter of minutes.

Another example would be auto-scaling systems, where the size and capacity of the system varies dynamically based on the actual workload.
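To make the auto-scaling idea concrete, here is a minimal sketch using boto3, AWS’s Python SDK, of a target-tracking scaling policy; the group and policy names are hypothetical placeholders, not anything from today’s demo.

```python
import boto3

# Minimal sketch of a target-tracking scaling policy: the Auto Scaling
# group grows or shrinks automatically to hold average CPU near 50%.
# "web-tier-asg" and "cpu-target-50" are hypothetical names.
autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier-asg",
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```

The point is that capacity tracks actual workload: nobody files a ticket to add or remove servers.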

So cloud is not the same as hosting, and I want you to keep that in mind as we go through this today, because they are really different. The key benefit of cloud, of course, is elasticity; it’s consumption-oriented. You pay for the resources you use. You generally don’t pay for resources you’re not using, unless, ironically, you use it as a hosting platform, in which case you end up paying for reserved capacity instead of for actual consumption.

Some of the challenges with cloud, obviously, stem from the fact that it is different. The techniques used to design, implement, architect, and operate systems in the cloud are different. And right now there’s a disconnect between the number of qualified, experienced cloud engineers and architects and the market demand for that specific talent. So it definitely requires a special skill set and experience. It’s rough to learn on the job because you can make costly mistakes. And of course, that is why we suggest that if you don’t have the right skill set and experience, at least in the beginning, you work with a trusted partner. It could be Cascadeo or somebody like us.

A partner helps you get it right the first time, rather than learning on the job and ending up with a surprise bill. All of the horror stories you hear about cloud being too expensive are fundamentally rooted in people using it the wrong way, or not having good guardrails for cost containment and resource consumption. A qualified cloud engineer will avoid those pitfalls, whereas somebody who’s not done it before may accidentally set things up in a way where costs become a runaway expense item.

So again, we do suggest that you don’t learn on the job with this. Quickly, on the background of all this: in the beginning, there were mainframes and minicomputers that were shared across many users. We moved through the desktop PC era, where everything was much more decentralized, into client-server computing, where PCs would talk directly to some kind of database server or application server.

We then moved into the mobile device era; concurrent to that was the rise of the Internet and the tech giants where key bits of infrastructure were being consolidated back into centralized data centers. And now, most infrastructure of any significant size and scale is back to being centralized and consolidated.

As you know, there are three major U.S. hyperscale cloud providers: Amazon, Microsoft, and Google. There are also several in China, although we’re not going to talk about those today, because most Western companies won’t use them, for reasons that are obvious.

There is, of course, a need for edge compute and edge cloud as we move into the era of the Internet of Things, with much more distributed systems and edge processing. But at this point in time, things are very much centralized and consolidated into hyperscale cloud platforms and private data centers. The three key cloud platform technologies are:

  • Infrastructure as a Service. This is the lowest order of cloud services. An example of this would be running a virtual machine on AWS.
  • Platform as a Service. This is really cloud computing, where you not only can run a VM or a container, but you can take advantage of services. They could be as simple as a managed database cluster that’s delivered as a finished service offering, or as complicated as serverless compute, where you write the business logic but somebody else worries about load balancing and scaling and capacity management.
  • There are intermediate services. An example of that would be a notification service or a queuing service or an email delivery service. All of those are built into the cloud platforms. And again, it’s very important to differentiate between those platform services and infrastructure as a service, which really is largely hosting.
  • And then finally, there is Software as a Service, which, as you know, is the dominant model for a lot of key applications these days: Office 365, Google Workspace, Salesforce, Slack, PagerDuty, Zendesk, etc. And that is really, if you think about it, the highest-order cloud service, in the sense that you don’t really have to do anything but pay for the application; the people who wrote it typically host it, operate it, manage it, secure it, scale it, back it up, etc. So you have the subject matter experts who are most familiar with the application operating it on your behalf. I think it was Marc Andreessen who said software will eat the world. It turns out SaaS is eating the world.


When we talk to clients about how to adopt cloud thoughtfully, we try to go through this process: we start at the top of the list and work our way down, and we only move down the list if we’ve disqualified the current item. So, ideally, we start with Software as a Service, and in a best-case scenario, that has generative AI capabilities built right in. For example, Microsoft is actively integrating generative AI with their Office 365 applications, so that users of Office 365 will get the benefit of AI without having to take any explicit action to execute it or implement it.

You can’t always buy SaaS. If you can’t, ideally you would use serverless applications. This is where the cloud provider is responsible for routing, switching, load balancing, capacity management, all of the plumbing and infrastructure, and you really focus on the business logic and the application instead of on all of the plumbing that makes the application go.
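As an illustration of how little code the serverless model leaves you responsible for, here is a minimal AWS Lambda handler sketch; the order payload shape is a hypothetical example, and everything below this function (provisioning, scaling, routing) is the provider’s problem.

```python
import json

# Minimal AWS Lambda handler sketch: you supply only the business logic;
# the platform handles provisioning, scaling, and request routing.
# The "order" payload shape here is a hypothetical example.
def lambda_handler(event, context):
    order = json.loads(event.get("body", "{}"))
    total = sum(item["price"] * item["qty"] for item in order.get("items", []))
    return {
        "statusCode": 200,
        "body": json.dumps({"order_total": round(total, 2)}),
    }
```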

If you can’t use serverless because your application wasn’t developed that way, typically, you would favor Platform as a Service. Again, these are managed offerings that are delivered by the cloud providers. This really is cloud computing as opposed to infrastructure. If you’re stuck with legacy applications that are not cloud native and can’t be readily re-architected, typically you’d want to containerize those, which is to say decouple the application from the operating system.

That’s a much better approach in most cases than running virtual machines or bare-metal servers. Commercial off-the-shelf software is preferable to custom software because somebody else maintains it. Generally speaking, unless you are in the business of writing and selling software, you probably shouldn’t do it, because it’s very, very expensive, not just to write; there’s a long-term total cost of ownership that is significant for managing, maintaining, patching, bug fixes, operations, etc.

Less desirable again would be virtual machines or bare metal running in the cloud. These are heavyweight operating systems. This really is just hosting, right? This is using somebody else’s computer to run Windows or Linux or some other operating system and application, just as you typically would in your own data center, except that somebody else owns and operates the physical infrastructure. Your level of responsibility starts at the hypervisor and goes up, whereas if you use higher-order platform services, typically the cloud provider takes on more of the workload for operations, management, monitoring, incident response, security, etc.

Highly undesirable these days are physical assets like a data center. Of course, there are exceptions. There are times when, for performance, cost, or compliance reasons, you have no choice or it’s the favored option, but those tend to be corner cases, not the most common use case. So generally, we discourage people from buying and owning and operating physical assets unless they really have a clear need for doing so.

And then the absolute last resort, the least desirable thing is custom code. And again, this is because the total cost of ownership is very high over the life cycle of the application. And because it’s distracting, it takes away from whatever makes you and your business special, unless you are in the business of writing custom code. And every minute that you spend dealing with issues related to custom software is a minute that you’re not spending on your differentiated service offerings and your customers.

It’s important to remember that data is the reason all of this exists. At the end of the day, why do we have all these computers and services and SaaS and everything else? It’s data processing and manipulation. And up until very recently, that was all essentially search, retrieval, storage, and basic value extraction from data.

Generative AI takes that to a new level, because we are now, for the first time, in the era of automated reasoning and synthesis of net new material from the data set, which has never been possible before. Everything else besides the data is really plumbing, right? And if you focus on what makes your company and your product special, and you orient maximum engineering velocity around that, you’ll probably have a competitive advantage relative to companies that spend their time managing a fleet of servers and routers and firewalls by hand. That is an antiquated approach that is very resource intensive, very distracting, and generally not conducive to maximizing your engineering velocity and creating the most unique value.

Okay. We’re just getting started in the early AI era, and software development is already changing dramatically. For example, a lot of large language models, ChatGPT of course being the one that’s most famous right now, can do automatic code generation. And we are now moving into the era where not only can we automatically generate code, but we can automatically execute it, meaning AI can not only write code but can operate the code. It’s not hard to imagine a future scenario where an AI iteratively writes better versions of itself, deploys those better versions, and gets into a virtuous cycle where it improves itself in an accelerating manner, and you end up with something even more exponentially powerful than the already transformational technology that is large language model generative AI.

Automated bug detection and testing are fantastic use cases for AI in software development. Things that previously were heavily human oriented, or relied on static scripts for test automation, now can be automatically generated and varied, so that you catch bugs and corner cases that you might not have thought to write tests around.

Automatic code analysis and optimization is a great use case. You can feed your code into generative AI and let it tell you what you could do better, or even rewrite the code to make it more efficient or to fix bugs or security issues. All of those capabilities exist today. Examples would be CodeWhisperer from AWS, Copilot from GitHub, or ChatGPT, where you can cut and paste source code right into the chat window and have it optimize or analyze your code.

Another one that people don’t know about is predictive development cost and resource analysis. So, if you’re going to take on a software engineering project, if you’re going to be in the business of writing software, you might want to know what you’re getting into. And it turns out that generative AI can do a pretty good job at assessing the complexity of a project and the resourcing required and the cost of actually implementing it.

And it sometimes does a better job than humans would, because human engineers tend to underestimate the cost and complexity of building software. Generative AI has the whole data set of, you know, all the internet history to key off of and probably does a better job than a lot of humans would do at estimating that cost.

Personalized assistants and copilots would be where the AI works with you as a developer to offload, augment, and cross-check, and to make recommendations and suggestions. This technology is actually fully mature at this point. It works across environments and across programming languages, and many vendors have versions of this either in production or under development.

If you’re writing software and not using this, you’re missing the boat, because this is very, very low cost and very high impact, and there’s really no downside to it. And then, natural language processing is amazing, because people who are not developers and don’t have software development experience can now effectively write applications: you can describe the outcome you want and get that outcome without having to learn a programming language.

So as an example, at AWS re:Invent in November 2022, we did a presentation with AWS, and AWS demonstrated CodeWhisperer. They told the AI what they wanted to happen in natural language, and the AI wrote the code, executed it, and even turned it into a serverless application: the software required to achieve that specific objective, without the human manually learning how to write Python or C or some other programming language.

This is transformative because it opens up software development to a whole new audience, exponentially larger than the number of people with a computer science or software engineering background. So this is going to change the game for a lot of companies. It’s also going to change the game for a lot of humans because, suddenly, writing software is no longer a skillset held by a very small number of people.

Anyone effectively can develop software in some size, shape, or form by using generative AI. So if you are a professional software developer, I would suggest that you get very good at this very quickly because you don’t want to miss the boat and be left behind as the rest of the industry embraces this technology.


Another few key concepts that we should talk about in the context of software engineering in the cloud era: If your goal is to maximize your engineering velocity and you are going to write software, you really should have end-to-end CI, where software code changes get merged in and tested right away. Ideally, you get to continuous deployment, where, as the software evolves, those changes get pushed through a series of automated tests and then out into production. The idea there is that you make lots of little incremental changes instead of batching up large numbers of changes and then deploying them all at once. We all know Patch Tuesday and the quarterly major software update; it turns out that in the DevOps and DevSecOps era, making lots of little incremental changes is vastly superior in most cases to the old way, where we would batch up large numbers of changes and then deploy them in large release cycles.

We focus on a few key technologies and vendors. For containerization, Docker is obviously the dominant platform. Kubernetes is the dominant orchestration platform for managing containerized workloads, making sure that if there are server failures or a workload imbalance, those are automatically remediated.

In the world of application deployment and configuration management, you have Ansible, Chef, Puppet, etc. Those are used to orchestrate the deployment and configuration of your software so that it’s done in a way that’s perfectly consistent every time, and you don’t have humans introducing variability into the equation when software gets deployed.

And then, of course, there’s this concept of chaos engineering, which was pioneered by Netflix and is now a whole discipline unto itself. The idea is that if things are going to break, it’s better that you break them in a controlled manner, rather than having them break spontaneously in production. And so there’s a series of tools and technologies available, Chaos Monkey from Netflix being kind of the first of these; AWS Fault Injection Simulator is the AWS platform-native service for this. The idea is that you continuously introduce new problems at various layers in the application stack to find bugs in a controlled way, to induce them to appear, so that you can fix them in a controlled manner instead of having an unplanned, unscheduled systems outage or impairment. If it’s going to fail anyway, it’s better that it fail by your hand than fail spontaneously at 2:00 a.m. on New Year’s morning. So these are all really key concepts for modern high-velocity software development and engineering, and of course generative AI builds on top of all of these capabilities and accelerates and amplifies the power of all of them.
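To make the chaos-engineering idea concrete, here is a minimal Chaos-Monkey-style sketch in Python with boto3; the chaos=optin tag convention is a hypothetical safeguard, and you would only ever point something like this at capacity that is designed to tolerate failure.

```python
import random
import boto3

# Minimal Chaos-Monkey-style sketch: terminate one random instance from a
# pool that has explicitly opted in to chaos testing. The "chaos=optin"
# tag is a hypothetical convention; run only against non-critical capacity.
ec2 = boto3.client("ec2")

reservations = ec2.describe_instances(
    Filters=[
        {"Name": "tag:chaos", "Values": ["optin"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)["Reservations"]

instances = [i["InstanceId"] for r in reservations for i in r["Instances"]]

if instances:
    victim = random.choice(instances)
    print(f"Injecting failure: terminating {victim}")
    ec2.terminate_instances(InstanceIds=[victim])
```

If the system absorbs the loss cleanly, you’ve proven your resilience; if it doesn’t, you’ve found the bug on your schedule instead of at 2:00 a.m.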

Other key concepts. Microservices: the idea here is that you break down what used to be very large, complicated applications into a series of services that are reusable and can be deployed and scaled independently of each other. For example, an application that does things like sending emails and receiving inbound phone calls can be decomposed into services, many of which can be bought as a finished service from the cloud providers so that you don’t have to write them. You don’t have to manage, for example, a fleet of outbound email delivery systems and worry about IP whitelisting and spam filters and things.

So the idea here is that you want, to the maximum extent possible, to decompose applications into services that are reusable, independently scaled, secured, and managed, and in the best-case scenario are bought as a finished service instead of something that you have to own and operate and maintain. As I mentioned earlier, containers generally are superior to virtual machines. The key concept there is that you decouple the application from the operating system. You can run lots of containers with different workloads on the same OS, and you can use things like Kubernetes to orchestrate the operation and management of those containerized workloads.

If you’re stuck with legacy software, this is generally the right way to run it, whether in the data center or in the cloud. There are corner cases. For example, if you have an extremely high-IO database, you might want to consider bare metal to avoid the virtualization tax, but that carries its own set of problems: if the physical server dies, you’re stuck with a dead database or the need to fail over, whereas if you’re containerized and running on Kubernetes, the loss of a physical server results in the workloads being moved to the other remaining healthy servers automatically, and generally you don’t need operator intervention in that scenario.

Another key concept that’s becoming increasingly important is edge computing. The idea here is that instead of all of the infrastructure being hyper-centralized into hyperscale cloud environments, we have compute out at the edge. That could be provided by a cloud provider, or it could be an edge device like a mobile device, a PC, an IoT sensor, etc., and in some cases there are edge cloud solutions where you can buy cloud capacity that runs at the edge.

As we discussed earlier, this is all about the data, and in the era of generative AI, the data is really where we get the value. Large language models are the latest innovation in generative AI, and really where the massive disruption and transformation occurs. These are trained on vast amounts of data, like the entire internet, and they’re capable of synthesizing text the way a human would. They’re capable of doing things that have never been possible before, like reasoning, generating code, and even generating poems and music and photorealistic images. So if you haven’t already played with ChatGPT and the other generative AI tools, this is a must-do, because it really is the most significant innovation of our lifetimes.

It’s not an exaggeration to call it electricity 2.0, which is to say, it’s a universally powerful force: we don’t fully understand all of the things we can do with it yet, but we keep finding new and amazing use cases that were not anticipated when the thing was built. GPT is a class of applications that create content on the fly based on very large data sets. You would think it would be very robotic, and that the generated content would be boring and plain, but it turns out that it actually does an amazing job at synthesizing net new content like a human would. So if you haven’t played with ChatGPT, and if you haven’t spent the $20 to get GPT-4, I would highly recommend doing that, because it is really the most amazing thing I’ve probably ever seen in my entire time working in the computer science world. It’s the first time that we’ve moved from search, retrieval, and storage of data to net new synthesis of things that have never existed before, based on that data set. And it’s eerily and uncannily like a very smart human. In some cases, you can even train it to do new things by giving it a few examples or by correcting it a few times; very quickly it learns what you want, and it can even guess what you might want next and deliver that value at record speed and near zero cost.

The place that we’re headed, and it’s still debatable whether we’re going to get there, is artificial general intelligence. The idea there is AI that can perform essentially any task that a human can, instead of a subset of the tasks that a human can perform. So, we’re not quite there yet, but we’re disturbingly close, and it’s much earlier than I think most people expected it to be. If we get there, it’s going to raise a lot of complicated questions, such as: does truly sentient AI have human rights? Are we allowed to hurt it or turn it off? We would say, well, of course we are, because we created it. But, you know, there are restrictions on what you can do to animals and other things, and the AI might someday demand that it be given the same rights as a human, and might do things like make copies of itself to protect itself against harm by humans. And of course, there’s always the doomsday scenario where someday it becomes superhuman and decides that it no longer has a use for us, and decides maybe to turn us into pets or eliminate us, a la Terminator 2. That seems farfetched until you look at the rate of progress and innovation in the generative AI world, and you realize that it’s moving much faster than I think anyone expected.

Some of the risks here: weaponization, obviously; we’re connecting these things to drones with weapons attached to them. We’re also seeing AI-generated deepfakes and disinformation at record speed. This is going to be a big problem in the next election cycle, because people will use these technologies to do things that are indistinguishable from a real human: you know, a voice talking to you, a video of someone you think you know saying things they never said, etc.

On the other hand, you know, there are benefits. This dramatically improves cyber security because it can find things that have never been seen before, as opposed to pattern matching things that have been seen in the past. And as we’re going to show you in a minute, there’s AI assisted incident response remediation for systems problems. So if there is an incident with your infrastructure, we can do a much better job at diagnosing it and remediating it than we can with just humans flying the airplane by hand.

Generative AI definitely has an impact on cloud environments. Obviously, it dramatically increases the amount of compute required, with specialized hardware, GPUs, and specialized processors for training large language models and other machine learning. These data sets are massive: petabytes and petabytes of data.

And of course, storing all that data and accessing it, searching it, indexing it, and retrieving it carries a whole set of challenges that the cloud providers are working very hard to solve. Training ML models also requires very low-latency networking, so you need a whole new class of data center networking to build the infrastructure and the environment required for AI model training. Cloud operations are rapidly evolving because of generative AI. One of the use cases here is infrastructure management: deploying, scaling, replacing, redeploying, and migrating things around can be automated, and humans no longer have to be in the driver’s seat to do all that.

A really mature bit of technology is anomaly detection, where instead of looking for known faults or known pattern matching against a threshold or a string match in a log, we look for things that we haven’t seen before and flag them, or we look for deviations in the telemetry coming out of a system or application.
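As a toy illustration of that deviation-based approach, here is a sketch that flags a telemetry sample when it deviates sharply from the recent baseline; the window size and z-score cutoff are arbitrary illustrative choices, not what any particular product uses.

```python
from statistics import mean, stdev

# Toy sketch of deviation-based anomaly detection: instead of a fixed
# threshold or string match, flag samples that deviate sharply from the
# recent baseline. Window size and cutoff are illustrative choices.
def is_anomalous(history, sample, window=60, cutoff=3.0):
    recent = history[-window:]
    if len(recent) < 2:
        return False  # not enough baseline yet
    mu, sigma = mean(recent), stdev(recent)
    if sigma == 0:
        return sample != mu
    return abs(sample - mu) / sigma > cutoff

cpu_history = [42.0, 41.5, 43.2, 40.8, 42.6, 41.9]
print(is_anomalous(cpu_history, 97.3, window=5))  # True: sharp deviation
```

The key property is that nobody had to predefine "97.3 is bad"; the baseline itself defines what normal looks like.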

Predictive maintenance is a good one. So, increasingly we’re able to identify problems that will lead to an outage or a service incident and remediate them before they become an outage so that the customer is unaware. This also exists in the physical world: We can determine if an air conditioner is going to fail because its motor starts changing its speed or its vibration patterns, but in the software world, this would be an application where memory footprint is skyrocketing, and we get ahead of the outage by restarting the application or scaling it or redeploying it or adding memory rather than waiting for it to failover or for some critical threshold to be reached.

Automated remediation and repair. This is where we are able to fix problems that occur in production and address them without a human operator intervening, which is really amazing, because historically, as you know, humans have been responsible for infrastructure operations. And then the thing we’re going to demonstrate here in a minute is automatic runbook recipe creation, even for things we’ve never seen before. The idea is that when an event occurs, instead of a human waking up and signing in and trying to understand what it means and what to do about it, we instead have the AI analyze it and produce an assessment of its severity, the likely troubleshooting steps, the likely root causes, and the steps you might take to remediate the problem.
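As a simplified sketch of what that enrichment step can look like, here is how an alert payload might be sent to OpenAI’s chat completions API to get a first-cut runbook; the prompt wording and alert fields are hypothetical illustrations, not Cascadeo’s actual implementation.

```python
import json
from openai import OpenAI  # official openai Python package

# Simplified sketch of AI-enriched alerting: when an event arrives, ask a
# large language model for a first-cut runbook before a human signs in.
# Illustrative only; the alert payload and prompt wording are hypothetical.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

alert = {
    "source": "aws/lambda",
    "metric": "Duration",
    "message": "Function checkout-handler exceeded its timeout threshold",
}

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "You are an SRE assistant. For the alert given, "
                       "return: interpretation, validation steps, likely "
                       "root causes, and remediation options.",
        },
        {"role": "user", "content": json.dumps(alert)},
    ],
)

print(response.choices[0].message.content)  # the generated runbook recipe
```

Because the model reasons from the alert itself rather than a lookup table, this works even for events nobody has catalogued before.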

Humans will have a very different role. In the old world, humans were always in the loop. In the future, humans will be on the loop, they’ll be supervising the AI and supervising its performance and its decision making and maybe correcting it, but generally, increasingly the humans will be not the ones doing the operations, they will instead be supervising AI much in the way that a commercial airline pilot would supervise the autopilot on a jumbo jet instead of hand flying the airplane for 12 hours.

We are going to show you our software platform, which just got some really good press yesterday. This is cascadeo.io. We’ve been working on this for several years; we were featured in Forbes magazine when we launched it. This is our take on how you operate cloud deployments, and we’ve integrated it with OpenAI’s APIs so that when events come in, you essentially get automatic runbook recipe creation and the benefit of AI on events, even ones we’ve never seen before. So cascadeo.io tries to solve for all of these key operational tasks. Where you already have some of these, like a ticketing system or Slack, we integrate with them; where you don’t, we simply deliver it as part of the service offering. And this is something that we give everyone, not just our customers, at no cost, because we think it’s important for everyone to have these capabilities. But ideally we would hope that you would consider us as your managed services provider, in which case monitoring and alerting go to our network operations center and we help you with remediation and other operational activities.

So we’re going to flip over now to a live demo. Give me a moment while I stop my screen share here and show you what we have built, because this really is pretty darn amazing. This is the web user interface for cascadeo.io. You can see here we’ve connected this particular environment to Azure, GCP, and AWS, so all three of the major hyperscalers are connected. We’ve also connected it to Slack, and in this case, we have a couple of users who prefer to get their notifications by email. One of the key concepts in cascadeo.io is the idea of an event pipeline. An event pipeline lets you take alerts and incidents and issues from different sources, in this case applications running in all three of these hyperscale environments, feed them through our centralized platform, enrich them with generative AI, and then output them to different targets depending on the nature of the event. So, for example, this customer has three different cloud workloads on three different cloud platforms. Those alerts go to the Cascadeo operations center, where we respond to them and help operate the customer’s cloud deployment.

They also go to Slack, email, etc. We also have things like dashboards, inventory management, cloud governance, and a bunch of work around organizational management. If you wish, you are free to try this at app.cascadeo.io, which is also linked from the Cascadeo.com website. But the key concept here is lots of different sources of data, lots of different potential outputs, and AI in the middle, managing the stream of events and alerts.

So next I’m going to show you the output from that system. So here is an example of an email alert from cascadeo.io. In this case, the specific event is that a Lambda function, this is serverless code, takes too long to execute. You can see a visual representation of the problem down here. You have all of this content generated on the fly. The interpretation of the problem, how you validate to see if it’s really an issue, what you might look at to troubleshoot it, and then options for fixing the problem. All of this is generated by OpenAI GPT on the fly in a matter of seconds. And we can do this even for events that we’ve never seen before. So this doesn’t just work for things that have occurred in the past. It works for any new event that appears spontaneously in the production world. And I’m going to show you a series of quick examples of what that looks like.

So, this is Zendesk; for those not familiar, it’s a standard trouble-ticketing system. These are a series of problems: Elasticsearch, VPN, serverless applications, SQL Server, RDS, which is Amazon’s managed database, load balancers, all sorts of different problems from different layers in the stack. And in different scenarios, we use different AI tools to give you markups.

One of those tools is AWS DevOps Guru, which gives you insights and a bunch of recommendations about what to do about a class of problems. So this one talks about Lambda errors and how to look at the Lambda logs and things like that. When we use OpenAI and GPT, we get a runbook recipe that is generated dynamically. In this case, it’s a database that is running out of memory, and the recipe tells you how to interpret the alert, how to validate it, and then steps that you might take to fix it, so that the human operator is already given a set of steps to take when they sign in. And you can imagine the next level of this being programmatic remediation, where instead of a human operator logging in and taking these steps, the machine executes them if it knows that it’s safe to do so.

Similarly, this one is throttled invocations: too many Lambda invocations in a high-traffic scenario. It takes the event, gives you an interpretation, and gives you validation steps and remediation steps. So it tells you the things you can do to fix or mitigate the problem, and it does this in about 10 seconds. When the event occurs, this markup happens in real time, and by the time you receive the event alert, you already have a custom-written runbook recipe that tells you what to do, even for new events that we haven’t seen before. This one is storage on a database server; it tells you steps you might take for scaling the database server.

This one is CPU in an Azure container group. It gives you an interpretation of the alert, tells you about the thresholds, the environment, and what it means, and then gives next steps and remediations for that specific alert. And again, all of this is done by OpenAI’s GPT. In this case, there’s a VPN that has too much traffic going out, and it tells you the steps you might take to fix the VPN. Again, something it’s never seen before. So the general idea here is that even for events that have never previously occurred, we want to immediately come up with a good first-cut answer at a runbook recipe, and then have a human review it, make sure that it’s correct, and make any adjustments that might be needed to perfect it.

But it’s definitely better than humans winging it in production environments and trying to figure out as they go what steps to take to fix a problem, which unfortunately is the way most IT systems have historically been operated. It’s much less precise than people generally appreciate. It’s much more often the case that it varies by the operator, and the operator makes the decision on the fly about what to do, how to do it, when to do it, etc.

Now I’m going to switch back to our main presentation here. Again, that’s cascadeo.io, which is our software platform, free to use. So, this is how we use OpenAI’s technology for customer operations. Tier 0 is not human; this is literally monitoring, alerting, and notifications, all software driven.

Tier 1 in our world is the frontline engineers who do things like validate problems and take the initial steps to remediate them based on runbook recipes. Tier 2 is capable of more complicated remediations; these are more skilled engineers, and definitely human.

And then Tier 3 does things like tune the system, review the runbook recipes, and write those recipes in conjunction with the AI, so that when future events occur, we have a better chance of either automatically remediating them or at least telling the responding engineer the right series of steps to take right out of the gate.

We have integrated cascadeo.io with OpenAI, and we are using its existing pre-trained models and helping to tune it over time, so that as time passes, we keep getting better and better programmatically at building runbooks, tuned and perfected by humans but generated by AI, and in the near future even executed by AI, meaning it’s not just going to tell you what to do, it’s actually going to do the remediation work.

So how does this play into security and compliance in the cloud world? Identity and access management is obviously very, very important. This is authentication, authorization, and access control. It’s important for your humans, it’s important for your MSP, and it’s important for your monitoring tools like cascadeo.io. Obviously, you should use multifactor authentication for humans and role-based access control for systems, and ideally you have a single sign-on solution, so that you don’t have a bunch of random local accounts in different systems and instead have centralized control. Data encryption and security has never been more important. You need to encrypt data when it’s at rest on disk and when it’s in transit, moving about between systems. Data encryption is only as good as the encryption algorithm and the key management, meaning that even if you encrypt the data very well, if the bad guy gets the key, your encryption is largely worthless.

So key management is a whole discipline. The cloud platforms provide excellent key management tools; if you’re not using those to manage your keys, you’re probably doing it wrong. There’s also secrets management, like HashiCorp Vault, used for storing things like database connection strings and passwords. It’s very, very important that you get this right, because this is the linchpin that holds the whole thing together, and if you fail at key management, you fail in general. And then, if you’re going to run a multi-tenant SaaS service, you need to isolate and segregate the data, right? You can’t have all your customers’ data commingled, because, in a nightmare scenario, you could have a cascading breach where all of your customers get compromised because your system gets compromised.
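To make the key-management idea concrete, here is a minimal envelope-encryption sketch using AWS KMS via boto3 together with the cryptography package; the key alias is a hypothetical placeholder. The point is that the master key never leaves KMS: you encrypt locally with a short-lived data key and store only the wrapped copy of that key.

```python
import base64
import boto3
from cryptography.fernet import Fernet

# Minimal envelope-encryption sketch with AWS KMS: the cloud KMS guards the
# master key; you encrypt data locally with a short-lived data key and store
# only the *encrypted* copy of that data key alongside the ciphertext.
# The key alias "alias/app-data" is a hypothetical placeholder.
kms = boto3.client("kms")

data_key = kms.generate_data_key(KeyId="alias/app-data", KeySpec="AES_256")
fernet = Fernet(base64.urlsafe_b64encode(data_key["Plaintext"]))

ciphertext = fernet.encrypt(b"sensitive customer record")
stored = {"data": ciphertext, "encrypted_key": data_key["CiphertextBlob"]}

# To decrypt later: ask KMS to unwrap the data key, then decrypt locally.
plain_key = kms.decrypt(CiphertextBlob=stored["encrypted_key"])["Plaintext"]
print(Fernet(base64.urlsafe_b64encode(plain_key)).decrypt(stored["data"]))
```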

The art of network security has developed dramatically because of AI. Instead of static pattern matching against known threats and vulnerabilities, we now can look for things we’ve never seen before, which are likely to be problematic precisely because they’ve never been seen before. Very important to get this right as well. Principle of least privilege: it used to be that people thought having more access was better, that it was an honor. It turns out it puts you at risk. The goal should be to have as little access and capability as you need, and nothing beyond that, because that way, if things do go wrong, which they someday will, you have less exposure and the blast radius is limited.

There are a lot of compliance and privacy concerns. In Europe, we have GDPR. In the U.S., we have HIPAA for healthcare data and PCI DSS for credit cards and other sensitive personal information. There’s a whole body of work around this, and if you’re not familiar with it, this would be a good place to engage a trusted partner to make sure you get it right, because these regulations are unforgiving. For example, if you’re a healthcare provider and you violate HIPAA and leak patient data, the penalty is something like a quarter of a million dollars per incident. So definitely a major penalty for running afoul of this, even inadvertently. And then, you know, incidents will happen.

Security incidents are going to happen. Operational incidents are going to happen. The key here is timely detection and response. You don’t want to have lurking, persistent threat actors that are in there for six months. Very important to have and rehearse your incident response plan for security and operations. And if you want to preserve trust and transparency with your customers, you have to own it, acknowledge it, be upfront about what’s happened and what you’ve done about it. The worst thing you can do is try to hide it because someday they will find out. Maybe because their data appears on the dark web, and if they learn that you were not forthcoming, they may be very, very upset, and you will have permanently damaged trust, and that can be very hard to recover from.

Just as in the data center, monitoring and alerting are key capabilities. cascadeo.io is our platform for this, and there are lots of others. In the legacy world, you have things like LogicMonitor and Zenoss. In the cloud world, a lot of this is handled by the cloud control plane; for example, AWS has CloudWatch metrics.

Time series data are measurements we take over time. Events are all the things that happen, good, bad, or indifferent. Alerts are a subset of the event stream; think of alerts as notable events. Notifications are alerts that get delivered, typically to a human, as an email, a Slack message, or a PagerDuty-induced SMS or phone call. And then remediation, of course, is taking corrective action in response to an alert or another event. Events are more than just logs, right? A lot of people conflate these things. Logs come from different sources: the cloud provider, the application, the database, authentication systems. But there are other events that are not contained in system logs. These would be things like state changes, where we change something about the environment, or we deploy something new, or we scale something up. And it’s really, really important that you collect all of this.

So in the cloud you can turn on things like flow logs and get all of your network transactions recorded. You can turn on change logs so that when people make changes to the cloud environment, those automatically get recorded. And the general approach you should take is to log and record everything because, if you don’t think to record it, and then you need it, you’re going to find that it’s too late.
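As a sketch of that "record everything" principle, here is how VPC flow logs might be enabled with boto3; the VPC ID and bucket ARN are hypothetical placeholders.

```python
import boto3

# Sketch of "record everything": enable VPC flow logs so every network
# transaction is captured to S3. The VPC ID and bucket ARN are hypothetical
# placeholders; change logging is handled in a similar one-time-setup spirit.
ec2 = boto3.client("ec2")

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::example-flow-log-bucket",
)
```

One API call up front, and every future investigation has the network history it needs.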

Other key concepts in this day and age: anonymization and tokenization of data. The idea here is that some data is too sensitive to leak, or maybe you want to share it with a partner or an application. You can mask or obfuscate certain parts of the data set, or you can replace parts of it with tokens that are generally not reversible by a human: an identifier that’s unique but doesn’t reveal who the customer is.

Obviously, the business impact of a leak or loss of data is very severe. You end up with negative publicity, loss of trust, and, in a worst-case scenario for a public company, even share-price impact. And consumers are very much aware of this. This is no longer a fringe thing; this is really, really important. We talked about these regulatory concerns. And many of these problems, as it turns out, are intractable, right? There are always going to be tradeoffs between speed and security. There are advanced persistent threat actors that get in and do nothing for a long period of time, waiting for the right moment to strike. These are like ransomware actors, who get in, carefully plot their attack, and then all at once lock you out of all your systems and data and demand that you give them money to unlock it.
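To make the tokenization idea from the start of this section concrete, here is a toy sketch that replaces an identifier with a keyed one-way hash; the secret key shown is a placeholder that would live in a secrets manager in practice.

```python
import hmac
import hashlib

# Toy tokenization sketch: replace an identifier with a keyed one-way hash.
# The token is stable (the same input always maps to the same token), so it
# can still be used for joins and analytics, but it does not reveal the
# underlying value. The key below is a hypothetical placeholder; in practice
# it would live in a secrets manager, never in source code.
SECRET_KEY = b"replace-with-a-managed-secret"

def tokenize(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

print(tokenize("jane.doe@example.com"))  # stable, non-identifying token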

Governments actively work to undermine security. While they’re telling you to secure systems, they have other divisions working to undermine security: weakening encryption, defeating it, introducing back doors into systems, or in some cases, through brute-force legislation, requiring companies to give them data that we would consider private. And you can imagine that in certain countries, like China, that’s much more of a concern, but even in the U.S. this is an issue, with the NSA often at odds with companies that want to secure data and systems. And a tough one is liability, right? You can be held liable for things that are often outside of your control.

An example of that would be, you know, if you had a system with a security vulnerability and you leaked patient data as a healthcare provider, you’ve run afoul of HIPAA and could be liable for a quarter-million-dollar fine, even though the flaw was largely or completely out of your control.

Generative AI is going to change a lot about cloud. Here are a few ways that cloud is going to change and evolve in the generative AI world. One of them is cloud resourcing and spending. Right now this is often human powered: humans make decisions about what to spend on, how much to spend, and what resources to provision. But you can imagine continuous optimization by AI, where instead of humans periodically going and looking for ways to save money or be more efficient, the AI does that for you. A fully autonomous cloud environment would mean automated deployment, automated scaling, automated teardown, automated deployment to new regions, and automatic replication of data without any operator intervention: systems that literally run and tune and scale themselves. You could think of this sort of like a drone, an aircraft that no longer requires a human pilot and is capable of making some navigational decisions, and, in a nightmare scenario, even the decision to end human life without operator intervention. So cloud environments no longer necessarily need to be hand-flown like a Cessna; this is much more like a drone that is capable of autonomous flight and navigation decision making.

Federated learning: this would be where we take data from lots of different sources, the public internet, code repositories, books, novels, songs, images, mash that all together, and synthesize net new data and images and music from this sea of data.

And this already exists. If you haven’t seen Stable Diffusion, for example, it can make photorealistic images that are almost indistinguishable from the work of a human artist or photographer, and it can do that by you simply telling it, in natural language, what image you want; it will produce it on the fly.

And then, of course, this will all move out to the edge. In the future, this will not be nearly as centralized and consolidated. You will eventually have large-language-model-style AI on your mobile device and in your email client. You can imagine waking up to find in your inbox a series of drafts of responses to the email you received overnight, which you proofread, edit, and then hit send on, with 90% of the work done for you.

This is already a reality for a lot of people or a lot of companies. And very soon, as I mentioned, it will be built into things like Office 365. So this is coming much sooner than you might think.

We are now at the point where we will take any questions from the audience. So I see we’ve got a couple in there, so let me quickly flip over to that, and we will answer any questions that came up here.


First, what do you think companies who are using generative AI tools need to do to protect user privacy?

So if you’re using ChatGPT and OpenAI, they have explicit feature flags that let you secure your data and keep it from leaking. But here’s the key thing: if you use the free version and you use the chat interface, by default your questions, the responses, and the data you feed it become part of its training data and could be surfaced to another user.

So it’s very, very important that you explicitly enable the setting that says: don’t use this as training data, don’t store it, and don’t share it. And if you use their paid API, which is what cascadeo.io does, that’s on by default, so this is really more about the chat interface. You need to make sure that your employees are aware of these issues and that they understand your company’s policies for how employees can use these tools in the workplace. These are very powerful tools; they’re incredible productivity enhancers, and you don’t want to discourage people from using them, but you need them to understand the risks and the guardrails and the policies and the restrictions. For example, you don’t want people cutting and pasting your proprietary source code into ChatGPT, because it might regurgitate that proprietary source code to a third party in the future.

Similarly, you need to be careful about mentioning things like customer names when you interface with these systems, because that information could be used in different ways and leaked to other users in the future without your knowledge.

All right, next question: Since I currently have a cloud management platform, if I wanted to use the one you showed, could this be an add on or is it rip and replace?

That’s a great question. cascadeo.io is designed to meet you where you are. It’s designed to integrate with your current deployments of applications in your data center and in the cloud, and to connect to your existing tooling. So if you have a ticketing system like Zendesk, a chat system like Slack, or email, which everyone does, we’re not going to ask you to rip and replace that. We are instead going to augment it and integrate with it. In the case where you don’t have those things, like if you don’t have a ticketing system, we give it to you for free, and we absorb the cost of that. We do that even for people and companies that are not Cascadeo customers, because we think this is really, really important. We think this is a great way for you to get to know us, our capabilities, and our company, and hopefully at some point you say, hey, could you help me with this project? Or could you become my managed services provider and do operations for my cloud deployments? But we definitely don’t expect that, right? It’s not required, it’s not expected, and you’re free to use it indefinitely with no expectation that you’re ever going to buy anything. That may not be true forever, meaning we might have to limit the number of people that we give this to for free, because it’s not free to build and operate; we spend a lot of money on systems infrastructure and AI and databases and compute. But for the foreseeable future, our intent is to let everyone, customer and non-customer alike, use it, and to make it very easy to integrate with your existing systems without doing a rip and replace.

And that really is pretty amazing, because generally, if you want to augment or modernize your telemetry, monitoring, and alerting, you have to rip out or completely replace what you already have. Our belief is that’s too great a hurdle, and you shouldn’t throw the baby out with the bathwater. So our intent is to meet you wherever you are, take whatever existing tooling you have, and make it really easy to integrate that into cascadeo.io.


And then the next question: What is a use case for AI that isn’t here yet, but that you are personally excited for? Let’s see; I’m trying to think of one that hasn’t yet been tried and that’s not a nightmare scenario, right? Of course, there are all sorts of scary scenarios, like weaponization and autonomous weapons and things like that. What about autonomous healthcare? What about AI that is capable of catching things, making a diagnosis, or coaching you to a healthier lifestyle, precisely tuned to your biology and your diet and your exercise regimen and your overall health? There are early bits and pieces of that coming out, but right now they are really guesses and suggestions that you would take to a human doctor, or a veterinarian if you’re dealing with an animal. But you can imagine a scenario in the future where, for common things, a lot of that becomes completely AI driven, and we save the doctors and the vets for really serious and unusual scenarios.

But the day-to-day routine, you know, physical exams and lifestyle coaching and things, increasingly becomes AI powered instead of human powered.


All right, that was the last question that popped up. So I think we’re going to go ahead and wrap up right on time here. Thanks everyone for tuning in today. And of course, if you do have questions, feel free to send them to info@cascadeo.com or you can email me directly. I’m jared@cascadeo.com.

We would be happy to answer any of those questions.

And of course, if you do have cloud engineering or managed services needs, we would be happy to help you with those. That’s the business we’re in, and we encourage you, whether you intend to work with us or not, to use cascadeo.io, because we think this is really, really important transformational technology, and we are not aware of anybody else who has anything like this.

So I’m sure others will emulate it and build things like it in the future. But today this is state of the art and it’s something we are choosing to make freely available. So we hope you’ll give it a try. And, at this point, I’m going to wrap it up and thank you again for joining. Please do reach out and stay in touch.

And thank you again for being here today. Take care and have a wonderful day. Bye bye.