ChatGPT, Bard, and other prominent Large Language Models (LLMs) have dominated our news feeds over the last year. And rightly so. These exciting technologies offer us a glimpse into the future, the power, and the possibilities of AI.
While much of the public excitement has centered around creating text, images, and video, these tools can also be applied to many other disciplines, like software automation.
This article will act as a deep dive into how prompt engineering can help us with software automation. However, our first port of call should be an examination of prompt engineering itself.
What is prompt engineering?
Large language models like ChatGPT produce outputs based on the prompts or sentences we provide them. However, the results vary greatly depending on the words or instructions we use. When we input vague and imprecise instructions, the output might not hit the mark.
Prompt engineering refers to the considered design of inputs that help elicit more precise, accurate, and ultimately usable content from these exciting AI systems.
Large Language Model (LLM) systems use natural language processing (NLP) to interpret the statements we give them. The models break these questions or instructions (i.e., prompts) into tokens and use the patterns learned from their vast training data to produce content in whatever format we specify (i.e., text, images, code).
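To make the difference between a vague request and an engineered prompt concrete, here is a minimal sketch in Python. The `ask_llm` helper is a hypothetical stand-in for whichever LLM API you use; it simply echoes the prompt so the example runs end to end.

```python
# A minimal sketch of prompt engineering: the same request phrased vaguely
# versus with an explicit role, constraints, and output format.

VAGUE_PROMPT = "Write something about overdue invoices."

ENGINEERED_PROMPT = """You are an accounts-receivable assistant.
Write a polite reminder email for an invoice that is 14 days overdue.
Constraints:
- Maximum 120 words.
- Mention the invoice number exactly once.
- End with a clear call to action.
Output format: plain text, no subject line."""

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call.
    It echoes part of the prompt so this sketch runs without any API key."""
    return f"[model response to: {prompt[:60]}...]"

if __name__ == "__main__":
    print(ask_llm(VAGUE_PROMPT))       # likely generic, unpredictable output
    print(ask_llm(ENGINEERED_PROMPT))  # constrained, far more predictable output
```

The engineered version spells out the role, the constraints, and the output format, which is what makes the response predictable enough to reuse.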
ChatGPT was trained on over 570 GB of data. The training material consists of books, articles, web texts, and so on. In other words, these data sets contain an unimaginable amount of knowledge.
While we may understand the process, much of what happens underneath the hood of these systems happens out of our sight. Sure, we control the inputs and outputs, and we train the system, but precisely how these algorithms work and make the decisions is still something of a mystery. In the words of Sam Bowman, an AI professor at New York University, “We built it, we trained it, but we don’t know what it’s doing.”
Prompt engineering helps us manage that chaos by crafting inputs that produce predictable and usable results. Well-designed prompts offer us a pathway to unlock the vast amounts of knowledge inside these applications. The discipline is emerging as a new career, with courses springing up everywhere as businesses work out how they can harness this powerful technology.
How can prompt engineering help with software automation?
Software automation and LLMs have much in common. They both offer a glimpse of a future where machines will augment human creativity to create faster, more productive workplaces.
There are several exciting areas where both of these technologies can converge. Here are three ways that we can use prompt engineering in software automation.
#1. Generating code
Writing code is one of the most promising applications of Large Language Models. That said, LLMs are still in their infancy, and the next few years should see the technology improve as more resources are poured into both computing and training.
In the long run, these advances could see AI write whole programs with limited or no human intervention. However, for now, LLMs have some limitations. The quality of the output of LLM coding depends mainly on the quality of the input. Garbage in, garbage out, as they say.
Of course, the need for effective prompt engineering isn’t the only roadblock. As suggested in ChatGPT and Large Language Models in Academia: Opportunities and Challenges (Meyer, 2023), “Currently, ChatGPT is more likely to be successful in accurately writing smaller blocks of code, whereas its reliability in writing larger/more complex programs (e.g., a software package) is questionable.”
Furthermore, in a recent article in the journal Nature, some computer scientists warned that we should approach code generation with LLMs with some caution. Another contemporary paper, Large Language Models and Simple, Stupid Bugs (Jesse, 2023), demonstrated that Codex, the popular LLM that powers GitHub Copilot, produces “known, verbatim SStuBs as much as 2x as likely than known, verbatim correct code.”
While these problems can’t be ignored, there is still a lot of justifiable excitement about how these programs can help democratize software development by supporting technical and non-technical teams alike.
Perhaps the most impressive thing to consider is that tools like ChatGPT can produce functional code very quickly. With the right prompt, engineers can reduce the time it takes to program certain types of code, ensuring a swifter software development life cycle.
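As a rough illustration, the sketch below shows what such a prompt might look like when sent through the openai Python package (version 1.0 or later). The model name, the prompt wording, and the zero temperature are illustrative assumptions rather than a recommended setup, and the generated code still needs human review and testing.

```python
# A minimal sketch of prompting an LLM to generate a small, well-specified
# function. Assumes the `openai` package (>=1.0) is installed and an API key
# is available in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

prompt = """Write a Python function `parse_iso_date(value: str)` that:
- returns a `datetime.date` for strings in YYYY-MM-DD format,
- raises ValueError with a clear message for anything else,
- includes a docstring and type hints.
Return only the code, no explanation."""

response = client.chat.completions.create(
    model="gpt-4o",          # illustrative model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,           # lower temperature for more deterministic code
)

generated_code = response.choices[0].message.content
print(generated_code)        # review and test before using the generated code
```

Notice how the prompt pins down the function name, the expected behavior, and the output format; that specificity is what turns a quick request into usable code.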
At the end of 2022, the popular programming hub Stack Overflow banned AI-generated answers on its forum. They cited the high error rate and inaccuracies associated with the application. However, the technology is in a nascent stage; furthermore, the dissatisfaction with AI-generated output owes as much to poor prompt engineering as it does to the technology itself.
Despite the misgivings over the tech, a recent piece by McKinsey highlights the impact that prompt engineering is already having in the world of programming. The consulting firm’s The state of AI in 2023: Generative AI’s breakout year shared two interesting trends. Firstly, 7% of organizations that have invested in AI are hiring prompt engineers. Secondly, companies that are using AI have reduced AI-related software engineering roles from 38% to 28%.
One way to interpret these trends is that businesses are comfortable with this setup and ready to hand off software automation to their machines. While these figures might startle existing engineers, the McKinsey survey suggests that “only 8 percent say the size of their workforces will decrease by more than a fifth.” Overall, engineers will probably need to reskill to take advantage of the trend toward AI-generated software automation.
One obvious application for AI-generated software automation includes creating automation bots. However, while prompt engineering is an ostensibly user-friendly interface thanks to its focus on conversation, it remains to be seen whether it can supplant existing solutions.
In many ways, software like ZAPTEST has already democratized the software automation market. No-code tools now exist that allow nontechnical teams to build high-quality RPA bots. While software like ChatGPT can build bots, implementation and maintenance could prove tricky for anyone who is not a software engineer, and even for those who are.
Recording human-computer interactions from your GUI and converting these movements into code is far more user-friendly than using prompts. When coupled with LLMs’ potential to produce unstable and error-strewn code, it’s fair to say that RPA software isn’t going anywhere for the foreseeable future.
#2. Converting unstructured data
Unstructured data is not Robotic Process Automation’s strong suit. The tech was not built to handle things like emails, pictures, audio, and more. RPA tools need predefined data models with organized structures.
A huge proportion of unstructured data involves natural language text. Large language models are built to “understand” this information and extract semantic meaning from it. As such, this creates a considerable opportunity for teams that want to interpret these texts and convert them into a format that RPA tools can work with.
Many teams have been using natural language processing (NLP) for years to help them with sentiment analysis. This process, also known as opinion mining, helps organizations keep on top of consumers’ feelings and attitudes toward brands. In the majority of cases, these tools are used to detect positive, negative, and neutral sentiments within text. However, the technology is capable of far more granular emotional detection, too.
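A minimal sketch of what a sentiment-analysis prompt might look like appears below. The `ask_llm` helper is a hypothetical stand-in for a real LLM call and returns a canned label so the example runs; the key idea is constraining the model to a fixed label set that downstream tools can rely on.

```python
# A sketch of sentiment classification via prompt engineering: the prompt
# forces the model to pick from a fixed label set, making the output
# machine-readable for whatever tool consumes it next.

def ask_llm(prompt: str) -> str:
    return "negative"  # canned response so the sketch runs; replace with a real LLM call

def classify_sentiment(review: str) -> str:
    prompt = (
        "Classify the sentiment of the customer review below.\n"
        "Respond with exactly one word: positive, negative, or neutral.\n\n"
        f"Review: {review}"
    )
    label = ask_llm(prompt).strip().lower()
    if label not in {"positive", "negative", "neutral"}:
        raise ValueError(f"Unexpected label from model: {label!r}")
    return label

print(classify_sentiment("The update broke the export feature and support never replied."))
```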
While there are several tools on the market that offer this functionality, LLMs provide a pathway to more versatile uses beyond understanding how people feel about a product or service. For example, data analytics has exploded in popularity in recent years. Big Data gives companies an edge by allowing them to derive insights that help with data-driven decision-making.
Robotic Process Automation tools can help with gathering data. However, as we mentioned above, they struggle with certain types of information. When paired with AI tools that use Large Language Models, though, RPA can gather large amounts of data and use it to generate the information required for Business Intelligence (BI) tools.
One of the more exciting aspects of Generative AI is its ability to make sense of data inputs. With the right prompt engineering, teams can turn this data into a format that works for their RPA tools.
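For instance, a prompt can pin down an exact JSON schema so that the model’s output drops straight into an automated workflow. The sketch below is illustrative: the field names, the sample email, and the `ask_llm` helper (which returns a canned response so the code runs) are all assumptions rather than a prescribed format.

```python
# A sketch of converting unstructured text (an email) into structured data
# an RPA workflow could consume. The prompt specifies the JSON schema; the
# model's reply is parsed with the standard library.
import json

def ask_llm(prompt: str) -> str:
    # Canned response so the sketch runs end to end; in practice this would
    # be a call to your chosen LLM API.
    return '{"customer": "Acme Ltd", "invoice_number": "INV-1042", "amount": 1870.00, "due_date": "2023-11-30"}'

EMAIL = """Hi, just checking on invoice INV-1042 for Acme Ltd.
The amount was 1,870.00 and it was due on 30 November 2023. Thanks!"""

prompt = f"""Extract the following fields from the email and return valid JSON only,
with keys: customer, invoice_number, amount (number), due_date (YYYY-MM-DD).
If a field is missing, use null.

Email:
{EMAIL}"""

record = json.loads(ask_llm(prompt))   # structured output an RPA tool can ingest
print(record["invoice_number"], record["amount"])
```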
RPA can help make Big Data workflows more efficient. For starters, you can use it to help with both data entry and extraction. However, perhaps the most valuable and intriguing use cases involve using RPA tools for transforming, cleaning, and loading data or ensuring data migration runs quickly, efficiently, and accurately.
Another important point of note is data governance. Automating data requests helps organizations stay compliant and reduces the amount of sensitive data that manual workers need to handle.
#3. Test Automation
Test Automation has taken off in software development circles because it provides a quicker way to verify software. Testing and quality assurance have traditionally been expensive and time-consuming processes; test automation provides a solution to both these challenges.
One of the first things that prompt engineering can do is to improve the quality of test cases. With the right prompts, these machines can analyze test cases and identify issues and remedies. This process can enhance the scope of test cases and lead to more comprehensive tests.
For example, you can feed code to a large language model in much the same way you might pass it to a human reviewer. These machines can quickly run through the code, spot errors and bugs, and even identify performance issues. Perhaps more intriguing, LLMs also offer the possibility of completing test case code from mere snippets, accelerating the creation of test cases.
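As a sketch of what that might look like in practice, the example below builds a review-and-test prompt around a small Python function. The `ask_llm` helper, the prompt wording, and the pytest framing are illustrative assumptions; any tests the model drafts would still need human review before they join a suite.

```python
# A sketch of asking an LLM to review a code snippet and draft unit tests.
# The prompt asks for both an issue list and pytest cases covering edge cases.

SNIPPET = '''
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)
'''

def ask_llm(prompt: str) -> str:
    return "# model-generated review notes and pytest cases would appear here"  # canned placeholder

prompt = f"""You are reviewing the Python function below.
1. List any bugs or edge cases it misses.
2. Write pytest test cases covering normal use, boundary values (0 and 100),
   and invalid input.

Code:
{SNIPPET}"""

print(ask_llm(prompt))  # the drafted tests still need human review before merging
```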
Prompt engineering aims to tackle many of the issues that have driven the emergence of the Agile/DevOps approach to software development. Engineers want efficient, easily repeatable tests that can spot issues before applications are deployed. The idea here is that by freeing up time, software developers can concentrate on more creative and value-driven tasks.
As outlined in a classic paper, Technical Debt in Test Automation (K. Wiklund, 2012), software development teams can run into problems if they spend too much time on manual testing and verification of their software. Initial costs of test automation solutions, a lack of automation experience, and even a preference for older methods can contribute to these slowdowns.
One of the most interesting aspects of Agile software development involves Behavior-Driven Development (BDD). The concept refers to developing software around expected user behaviors. While implementing this approach can clearly save time, many teams struggle to bring this automation to life. However, LLMs can provide a solution.
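One hedged sketch of how that might work: feed a plain-English description of the expected behavior to an LLM and ask for Given/When/Then scenarios that can seed automated tests. The `ask_llm` helper below returns a canned response so the example runs; in practice it would call your chosen LLM API, and the wording of the behavior and scenario is purely illustrative.

```python
# A sketch of turning a plain-English user behavior into Gherkin-style
# Given/When/Then scenarios that could seed BDD-style automated tests.

BEHAVIOR = "A registered user resets a forgotten password via an emailed link."

def ask_llm(prompt: str) -> str:
    # Canned response so the sketch runs; replace with a real LLM call.
    return (
        "Scenario: Password reset via email\n"
        "  Given a registered user who has forgotten their password\n"
        "  When they request a reset and follow the emailed link\n"
        "  Then they can set a new password and sign in with it"
    )

prompt = (
    "Convert the following user behavior into Gherkin-style "
    "Given/When/Then scenarios, including one negative case:\n"
    f"{BEHAVIOR}"
)

print(ask_llm(prompt))
```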
Some of the most common symptoms of technical debt include poor documentation and a lack of robust testing. These are problems that today’s LLMs can help resolve. However, other remedies, such as refactoring, are still too complex for current Generative AI and may not result in time savings.
Final thoughts
Generative AI applications have immense potential. However, the user-friendly, conversational interface can be misleading. Many people believe that it’s straightforward to generate quality outputs from these machines. However, excellent prompt engineering is more complicated than you might expect.
Effective prompt engineering requires a lot of trial and error. It also needs a lot of forethought on behalf of the engineer to ensure the answers are useful. Finally, checking and rechecking the work is important due to the well-publicized potential for errors.
While prompt engineering jobs might be on the rise, not everyone is convinced. Writing in the Harvard Business Review, Oguz A. Acar makes a fascinating argument that “future generations of AI systems will get more intuitive and adept at understanding natural language, reducing the need for meticulously engineered prompts.”
Whatever the future holds, Generative AI will be there in the mix. While prompt engineering has lots of promise, it’s hard to tell for sure what precise role it will play.
Interestingly, test automation software is already packed with use cases and success stories demonstrating its suitability for speeding up software development without compromising on accuracy or comprehensive verification of applications.
Tools like ZAPTEST already allow developers to address issues like inadequate time and resources, technical debt, and documentation, while combining comprehensive testing with RPA. What’s more, these tools are more user-friendly than prompt engineering, making them far more suitable options for nontechnical teams. As ever, the real potential lies at the intersection of these exciting automation technologies.