Can Emerging AI Services Stand Up to Outsized Expectations and Demand?


ChatGPT represents the tip of an iceberg for AI performance, scalability, and security challenges accompanying the next wave of internet experiences. For AI to be successful, best practices to assure AI use cases must be in place from the beginning, including an advanced testing strategy.

OpenAI’s launch of ChatGPT in November 2022 took the web by storm, ushering in a new era of internet tech. (BTW, "GPT" stands for "generative pre-trained transformer.")

The fledgling service gave the masses a first look and hands-on experience with the power of AI. Intuitively, users began testing the chatbot’s ability to compose lyrics, write university papers or just answer basic questions.

Serious commercial exploration is happening as well, such as using ChatGPT to write code or embedding it into products via open APIs. Microsoft just invested $10 billion in OpenAI, upping its initial $1 billion infusion. That puts the global software leader in a strong position to incorporate ChatGPT intelligence into its ubiquitous enterprise products. The company also plans to let companies create their own custom versions of ChatGPT. It’s easy to imagine infinite potential.

This is just the beginning of the coming AI wave.

Ubiquitous, personalized AI may struggle to meet demand

ChatGPT is the fastest-growing consumer application in history, logging 100 million active users just two months after launch. Other services are positioning to join the fray, and many will face the same demand pressures. For all its success, ChatGPT has also been persistently prone to long access delays and system crashes.

ChatGPT will not be the only chatbot that tunes large language models with supervised and reinforcement learning techniques. AI will become ubiquitous, just like spell check software, when it is built into applications such as word processors, where the AI app can offer background information to enhance a paper or article. Microsoft has already announced ChatGPT-powered features on Teams.

In an AI-enabled future, an application or service will morph dynamically to be tailored to individuals. With deep, automatic personalization, one person’s instance of an app may look different from a colleague’s, depending on customization and preferences.

The technology powering ChatGPT is inherently cloud- and microservices-based. So, for example, Kubernetes needs to dynamically expand and contract worker capacity to support user traffic. At the moment, the frequent “come back later” error messages shown to users suggest that this scaling is not yet well optimized.
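To make the idea of expanding and contracting capacity concrete, here is a minimal sketch of the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation, where the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name, the replica limits, and the sample numbers are illustrative assumptions, and nothing here describes how ChatGPT itself is actually deployed.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100, tolerance=0.1):
    """Suggest a replica count using the HPA-style ratio rule.

    All parameter names and defaults are illustrative assumptions; the
    0.1 tolerance mirrors the documented Kubernetes default.
    """
    ratio = current_metric / target_metric

    # Inside the tolerance band, keep the current count to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally to the load ratio, rounding up.
    wanted = math.ceil(current_replicas * ratio)

    # Clamp to the configured floor and ceiling.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Hypothetical readings: average CPU utilization (%) against a 60% target.
    for observed in (30, 63, 90, 900):
        print(observed, "->", desired_replicas(10, observed, 60))
```

A test plan can drive traffic up and down and compare the replica counts the platform actually reaches against what a rule like this predicts, which is one way to spot scaling that lags behind demand.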

These growing pains put a spotlight on how unprepared the industry is to assure AI-driven services in the face of robust demand. This is a critical gap to close as stakeholders pursue commercialization.

Best practices to assure AI use cases

ChatGPT and similarly powerful AI offerings are complex and dynamic networks that, like any network, need to be tested to ensure they will work under a variety of loads.

When ChatGPT is incorporated into a product, the embedded ChatGPT must be tested, along with its impact on surrounding product features. The same is true regarding security.

Anything entered into or generated by ChatGPT should be treated as potentially public, meaning data could leak, and bad actors may use AI to generate attacks. According to recent reports, ChatGPT is already being used by hackers to create low-level cyber tools, including malware and encryption scripts.


Beyond basic conformance to interfaces and protocols, AI services must be assessed across:

  • Performance, encompassing reliability, predictability, quality of experience (QoE), and question response size. This includes microservice capacity, performance, and workflow behavior under various load conditions, and the impact of node expansion and contraction on QoE.

  • Scalability, measuring the ability to dynamically handle a variety of realistic high and low traffic loads.

  • Security, covering the ability to protect against and withstand known and potential attacks, Kubernetes and Docker hardening, and the impact of traffic loads on security.

With this in mind, ChatGPT and AI test plans must incorporate capabilities that span:

  • Emulation of realistic (sunny day and rainy day) scenarios at scale, including traffic, protocols, and user actions. For example, workloads should be tested both when a thousand people ask the same question and when a thousand people ask unique questions.

  • Reproducibility, since continuous AI software updates demand regression test suites that can reproduce workloads and tests.

  • Automation to speed time-to-market and manage continuous updates, based on an automated test lifecycle that integrates with developer CI/CD processes.
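As a rough illustration of the emulation and reproducibility points above, the sketch below builds the two workloads mentioned (a thousand users asking the same question, and a thousand users asking unique questions) from a fixed random seed, so that a regression run can replay exactly the same inputs after each software update. The question pool, function name, and scenario labels are hypothetical; sending the questions to a real service and measuring responses is left to whatever load tool a team already uses.

```python
import random

# Hypothetical sample prompts; a real suite would use a far larger corpus.
QUESTION_POOL = [
    "Summarize the causes of network congestion.",
    "Write a haiku about load balancers.",
    "Explain what a microservice is.",
    "What does QoE stand for?",
]


def build_workload(users, scenario, seed):
    """Return one question per emulated user for the given scenario.

    scenario "same":   every user asks one identical question.
    scenario "unique": every user asks a distinct question.
    The seed makes the generated workload repeatable across test runs.
    """
    rng = random.Random(seed)  # Private generator, so global state is untouched.

    if scenario == "same":
        question = rng.choice(QUESTION_POOL)
        return [question] * users

    if scenario == "unique":
        # The user index guarantees uniqueness; the random suffix adds variety.
        return [
            f"{rng.choice(QUESTION_POOL)} (variant {i}-{rng.randrange(10**6)})"
            for i in range(users)
        ]

    raise ValueError(f"unknown scenario: {scenario!r}")


if __name__ == "__main__":
    same = build_workload(1000, "same", seed=42)
    unique = build_workload(1000, "unique", seed=42)
    print(len(set(same)), "distinct question(s) in the 'same' workload")
    print(len(set(unique)), "distinct question(s) in the 'unique' workload")
```

Because the workload depends only on the scenario, user count, and seed, the same call can be checked into a CI/CD pipeline and rerun unchanged against each new build.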

The future—and demands—of AI looms large

ChatGPT is just the tip of the iceberg. The coming decades could realistically see AI used by anyone to build customized, commercial applications like word processors or software instruments based simply on a description.

Hyperscalers might create an AI-powered interface layer that can be used to request a virtual chain or network. The AI would quickly use its knowledge of zones, data centers, and scripts to build the detailed technical code.

The more important these services become, the more critical it will be for them to be accurate, secure, and scalable. The only way to assure AI can deliver on expectations is rigorous testing that holistically addresses the demands placed on each service.

Spirent is applying its deep and broad expertise in automated cloud-native and traditional network testing, emulation, and services to emerging AI services, working with enterprises and vendors as they prepare to test these services and solutions.

Learn more about Spirent’s cloud and virtualization test solutions.




Chris Chapman

Senior Methodologist, Spirent

With over 20 years in telecommunications and 11+ years of network performance theory, Chris has extensive knowledge in testing and deployment of L1-7 network systems. His expertise includes performance analysis of QoS, QoE, TCP, IP (v4 and v6), UDP, HTTP(S), FTP, WAN acceleration, BGP, OSPF, IS-IS, MPLS, LDP, RSVP, VPLS, firewalls, and load balancers. His specialties are centered on testing L1-7.