Artificial intelligence will inevitably transform legal practice, but it will take time to realise its full potential. Enormous benefits, and dangerous pitfalls, await the tentative legal user. And careful regulation is also required, writes prominent in-house lawyer and legal transformation advisor, Sharyn Ch’ang
Real-world advances in AI are already embedded in twenty-first century consumer and business applications, including those from the relatively new field of generative AI.
OpenAI’s publicly available ChatGPT broke records, attracting more than 100 million users within two months of its November 2022 release. Industry heavyweights like Amazon, Alibaba, Baidu, Google, Meta, Microsoft and Tencent have their own generative AI products.
We’re witnessing a democratisation of AI in a manner not previously seen – you don’t need a degree in any subject to use these tools. You just need to type a question.
If you need any convincing, just listen to today’s big tech gurus like Google CEO Sundar Pichai, who described AI generally as “the most profound technology humanity is working on. More profound than fire, electricity or anything that we have done in the past,” in an interview with The Verge in May. Pichai is not alone.
There are different types of generative AI models, each quite complex to explain. The generative pre-trained transformer (GPT) model is commonly known due to the popularity of ChatGPT.
However, as ChatGPT is a general-use AI tool, this legal industry-focused article instead references Harvey. This is a generative AI platform built on OpenAI’s GPT-4, and is the best example of a multi-functional, generative AI platform designed specifically for lawyers.
Similar legal industry generative AI tools include CoCounsel, LawDroid Copilot and Lexis+AI, and there are other tools with more limited functionality.
Brief history of AI
The foundations of AI can be traced back to the 1940s and computing pioneers like Alan Turing. Turing is credited with early visions of modern computing, including the hypothetical Turing machine, which could simulate any computer algorithm. In 1950, Turing’s seminal article titled Computing Machinery and Intelligence described how to create intelligent machines and, in particular, how to test their intelligence.
This Turing test is still considered a benchmark for identifying the intelligence of an artificial system: if a human is interacting with another human and a machine, and is unable to distinguish the machine from the human, then the machine is said to be intelligent.
“Artificial intelligence” was coined as a term in 1956, at the Rockefeller Foundation-funded Dartmouth Summer Research Project on Artificial Intelligence workshop hosted by computer scientists John McCarthy and Marvin Minsky. McCarthy and Minsky, together with computer scientist Nathaniel Rochester (who later designed the first commercial scientific computer, the IBM 701) and mathematician Claude Shannon (who founded information theory) are considered the four founding fathers of AI.
The 1960s saw advancements in machine learning and expert systems, and substantial US and UK government funding for AI research. Outputs like Arthur Samuel’s checkers playing program were among the world’s first successful self-learning programs and a very early demonstration of AI fundamentals.
Expert systems were a prominent area of research for the following 20 years, culminating in the likes of IBM’s Deep Blue chess-playing program, which beat world chess champion Garry Kasparov in 1997. However, expert systems are arguably not true AI because they require a formal set of rules, reconstructed in a top-down approach as a series of “if-then” statements.
Parts of the 1970s, 1980s and 1990s are often referred to as AI winters, when the high expectations of early AI researchers were not matched by the actual capabilities of the AI models being developed. Nevertheless, this period brought significant developments in subfields like neural networks – important because generative AI models use neural networks and deep learning algorithms to identify patterns and generate new outcomes.
AI research saw a resurgence from the turn of the millennium, driven by advancements in computing power and the availability of vast amounts of data. In 2011, IBM Watson, a computer system capable of answering questions posed in natural language, won popular attention by defeating two human champions on the quiz show, Jeopardy!
IBM’s broader goal for Watson was to create a new generation of deep learning technology that could find answers in unstructured data more effectively than standard search technology. Deep learning, a subset of machine learning focused on artificial neural networks with multiple layers, is also a key technology underpinning generative AI.
AI has boomed since 2011, spurred by breakthroughs in computing power, algorithmic advancement and more systematic access to massive volumes of data. AI applications have flourished across various domains including natural language processing, robotics and autonomous vehicles.
AI continues to evolve rapidly, with ongoing research in explainable AI, ethics and the development of generative AI systems that are more transparent, accurate and accountable. While there is immense potential for further innovation, there are also serious industry concerns about the unexpected acceleration in AI systems development and its direction.
This is best exemplified by comments from computer scientist Geoffrey Hinton, widely recognised as the “godfather of AI”, who quit his long-time AI role at Google to speak freely without his views reflecting on the company.
His apocalyptic warning is that while machines are not yet as intelligent as humans, in perhaps five to 20 years they may become super intelligent, and become a threat to humanity.
His overarching concern is the lack of global and national legal and regulatory guardrails to control the misuse and abuse of AI, especially from bad actors proliferating deep fakes, harmful content, and mis- and disinformation. He believes it’s unrealistic to halt or pause AI innovation, so governments worldwide must take control of how AI can be used.
Generative AI is a subfield of AI where the system or algorithmic model is trained with human help to produce original content like text, images, videos, audio, voice and software code.
One form of generative AI is a large language model (LLM), a neural network trained on large amounts of the internet and other data. It generates outputs in response to inputs (prompts), based on inferences over the statistical patterns it has learned through training.
ChatGPT and OpenAI’s GPT-4, on which Harvey is built, are popular examples of LLMs. Such models can process prompts at a speed and volume that far exceed average human capability.
Unlike conventional AI systems that primarily classify or predict data (think Google search), generative models learn the patterns and structure of the input training data, then generate new content that has similar characteristics to the training data. However, responses are based on inferences about language patterns rather than what is known to be true or arithmetically correct.
Put another way, LLMs have two central abilities:
- Taking a question and working out what patterns need to be matched to answer the question from a vast sea of data; and
- Taking a vast sea of data and reversing the pattern-matching process into a pattern-creation process.
Both functions are statistical, so there is some chance the engine will not correctly understand a question. There is a separate probability that its response will be fictitious – a hallucination.
Given the scale of data on which the LLM has been trained, and the fine-tuning it receives, it can seem it knows a lot. However, the reality is, it is not truly “intelligent”. It is only processing patterns to produce coherent and contextually relevant text. There is no thinking or reasoning.
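A toy sketch can make these two abilities concrete. The miniature “corpus” below is invented for illustration only; real LLMs learn statistical patterns over billions of documents and predict sub-word tokens, not whole words, but the principle – matching patterns in data, then reversing the process to create new text – is the same:

```python
import random
from collections import defaultdict

# Invented miniature corpus for demonstration purposes only.
corpus = (
    "the court held that the contract was void "
    "the court found that the clause was valid"
).split()

# Ability 1 - pattern matching: record which words follow which.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

# Ability 2 - pattern creation: generate new text by repeatedly
# sampling a statistically likely successor. The output is fluent-
# looking but has no grounding in truth - a miniature "hallucination".
random.seed(0)
word, output = "the", ["the"]
for _ in range(6):
    options = following.get(word)
    if not options:          # dead end: no observed successor
        break
    word = random.choice(options)
    output.append(word)

print(" ".join(output))
```

The generated sentence is grammatical because it reuses observed word pairs, yet nothing checks whether it is true – which is exactly why LLM output must be verified.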
The accuracy and relevance of an output often depend on the quality of the prompt engineering: how the question is asked and contextualised. A prompt might include:
- Particular instructions, like asking the LLM to adopt a particular role, or requesting a particular course of action if it does not know the answer;
- Context like demonstration examples of answers;
- Instructions about the response format; and
- An actual question.
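As an illustration only, those four components can be assembled into a single structured prompt. The role, example clause and question below are invented, and real platforms may structure prompts differently behind the scenes:

```python
def build_legal_prompt(role, examples, response_format, question):
    """Assemble a prompt from the four components described above:
    a role instruction (with a fallback if the model does not know),
    demonstration examples, a format instruction, and the question."""
    parts = [
        f"You are {role}. If you do not know the answer, say so.",
        "Examples of the style expected:",
        *[f"- {ex}" for ex in examples],
        f"Format your response as {response_format}.",
        f"Question: {question}",
    ]
    return "\n".join(parts)

# Hypothetical usage - every value here is illustrative, not real data.
prompt = build_legal_prompt(
    role="a commercial contracts lawyer",
    examples=["Clause 4.2 limits liability to fees paid in the prior 12 months."],
    response_format="a numbered list of issues",
    question="What risks does an uncapped indemnity clause pose to a supplier?",
)
print(prompt)
```

Spelling out the role, examples and format in this way tends to narrow the model’s answer space, which is the practical point of prompt engineering.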
Even with well-crafted prompts, answers can be wrong, biased and include completely fictitious information and data, with sometimes harmful or offensive content.
However, the potential benefits of generative AI easily outweigh such shortcomings, which major AI market makers like OpenAI are consciously working to improve.
Who and what is Harvey?
Harvey, the generative AI technology built on OpenAI’s GPT-4, is a multi-tool software-as-a-service AI platform that is specifically designed to assist lawyers in their day-to-day work.
It comes from a start-up company founded by two roommates – Winston Weinberg, a former securities lawyer and antitrust litigator from O’Melveny & Myers, and Gabriel Pereyra, previously a research scientist at DeepMind, Google Brain (one of Google’s AI groups) and Meta AI.
The story goes that Pereyra showed Weinberg OpenAI’s GPT-3 text-generating system and Weinberg quickly realised its potential to improve legal process workflows.
In late 2022, Harvey emerged, with USD5 million in funding led by OpenAI’s startup fund. In April 2023, Harvey successfully secured USD21 million series A funding led by Sequoia, with participation again from OpenAI together with Conviction, SV Angel and Elad Gil and Mixer Labs.
Typical of generative AI models, Harvey is an LLM that uses natural language processing, machine learning and data analytics to automate and enhance legal work, from research to analysing and producing legal documents like contracts. Sequoia states: “Legal work is the ultimate text-in, text-out business – a bull’s-eye for language models.”
As at May 2023, only two organisations were licensed to use Harvey: PwC, with exclusive access among the Big Four, and Allen & Overy, the first law firm user. More than 15,000 law firms are on the waiting list.
Harvey is similar to ChatGPT but with more functions, specifically for lawyers. Like ChatGPT, users simply type instructions about the task they wish to accomplish and Harvey generates a text-based result. The prompt’s degree of detail is user-defined.
However, unlike ChatGPT, Harvey includes multiple tools specifically for lawyers, where users can ask:
- Free-form questions that are legal or legal-adjacent, including research, summarisation, clause editing and strategic ideation;
- For a detailed legal research memorandum on any aspect of law;
- For a detailed outline of a document to be drafted, including suggested content for each section; and
- Complex, free-form questions about, or requests for summaries of, any uploaded document, without any pre-training or labelling.
Because Harvey, like all other generative AI systems, can also very convincingly make things up, including case law references and legislation, any prudent lawyer must properly review Harvey’s output before providing any legal advice based on it.
As this article goes to press, there are at least two precedent-setting examples in the US of judges requiring lawyers to take precautions if using generative AI to prepare for court.
One lawyer’s brief to the New York District Court in his client’s personal injury case against an airline cited six non-existent judicial decisions produced by ChatGPT. Naively, the lawyer said he was “unaware of the possibility that the content could be false”. The court has sanctioned two lawyers involved, fining each USD5000. In addition to basic rules of legal and ethical professional conduct, it’s simply common sense that a lawyer needs to be responsible for any representations or legal advice. That standard does not change whether the assistance is from peers, junior lawyers, paralegals or a machine.
In the US Court of International Trade, Judge Stephen Vaden issued an order requiring lawyers to file notices when appearing before him that disclose which AI program was used and “the specific portions of text that have been so drafted”, and that use of the technology “has not resulted in the disclosure of any confidential or business proprietary information to any unauthorised party”.
A Colombian judge also caused a stir in January by disclosing that he used ChatGPT to research certain questions on which he had to decide. While Judge Juan Manuel Padilla Garcia, of the First Circuit Court in the city of Cartagena, made it clear that he fact-checked ChatGPT’s output before using it, and also used precedent from previous rulings to support his decision, the inclusion of Garcia’s exchange with ChatGPT in the ruling has been contentious.
However, Judge Garcia’s appropriately cautious use and transparent disclosure, together with his stated reasons (including to “optimise the time spent writing judgments”), are a compelling example of how generative AI can accelerate the delivery of access to justice.
Effective lawyers possess a combination of hard (technical) and soft skills. Table A provides a non-exhaustive list.
So, what is generative AI good at now, and how can it be used by lawyers?
Using Harvey as a baseline, generative AI can turbocharge the drafting of written material from scratch, and make edits and recommendations for replacement text. It can analyse, extract, review and summarise faster and at scale beyond human capabilities.
The practical consequence is that properly trained legal generative AI will enable lawyers to do many of their more routine, sometimes mundane tasks, faster, cheaper and more efficiently while improving the quality of work. The next two pages include examples of typical legal work that can readily be augmented with generative AI, liberating a lawyer’s time.
President and co-founder of OpenAI, Greg Brockman, said GPT-4 (on which Harvey is built) works best when used in tandem with people who check its work – it is “an amplifying tool” that allows us to “reach new heights,” but it “is not perfect” and neither are humans.
Will I lose my job?
Goldman Sachs, in its March report titled The Potentially Large Effects of Artificial Intelligence on Economic Growth, estimated 44% of current legal work tasks could be automated by AI in the US and Europe.
There’s no comparable Asia data yet, but it’s reasonable to assume the percentage may be similar. The legal category takes second place only to office and administrative support (46%). Third place goes to architecture and engineering (37%).
But to answer the question: no, AI will not displace lawyers as a profession. That is evident from the skills listed in Table A, many of which generative AI cannot replicate of its own accord. The future of legal practice is a world where generative AI is an indispensable productivity tool, augmenting lawyers.
While AI will automate routine tasks and assist research, analysis, drafting and similar work, the nuanced and complex aspects of legal practice require human expertise, empathy and judgment.
Harvey’s Pereyra agrees. “Our [Harvey’s] goal is not to compete with existing legal tech … or to replace lawyers, but instead to make them work together more seamlessly. We want Harvey to serve as an intermediary between tech and lawyer, as a natural language interface to the law. We see a future where every lawyer is a partner – able to delegate the busy work of the legal profession to Harvey and focus on the creative aspects of the job and what matters most, their clients,” he told lawyer Robert Ambrogi, author of legal technology blog LawSites, in November 2022.
Of course, as generative AI progresses to being the dominant doer of more routine legal workflows, many in the legal profession are anxious about their employment security. Some tasks will inevitably change because AI will do them better, faster and cheaper.
For example, junior lawyers will find their roles evolving. If their usual work is now done by AI, freeing up time in their day, they can focus on higher-value, more engaging work, develop specialised expertise and participate in the strategic aspects of legal practice – all earlier in their career than traditionally done.
On jobs in general, the Goldman Sachs report states: “Jobs displaced by automation have historically been offset by the creation of new jobs and the emergence of new occupations following technological innovations accounts for the vast majority of long-run employment growth.”
Of the legal profession specifically, Google’s Pichai has an interesting prediction: he’s “willing to almost bet” there will be more lawyers a decade from now because “the underlying reasons why law exists and legal systems exist aren’t going to go away, because those are humanity’s problems”.
Risks of using generative AI
While there are substantial potential benefits of generative AI to the legal profession, there are also inherent risks and limitations. Careful consideration and appropriate safeguards at private and governmental levels will be essential to effectively mitigate these risks and promote trustworthy deployment and use in the legal domain.
The many compliance-related and legal issues include monitoring new AI-specific legislation, like the EU’s proposed AI Act. Some of the other more commonly identified legal and practical issues include:
Accuracy and reliability of AI-generated legal documents. Lawyers must exercise judgment and validate the content, date and source of data. Unreliable content, undetected errors and hallucinations will undermine trust in AI systems and have legal implications like professional negligence and liability for incorrect advice. Over-reliance on generative AI systems without critical evaluation will also lead to errors and oversights.
Amplification of biases present in legal data. Careful curation of training data is essential to avoid biases. If training data reflect historical biases, generative AI systems may inadvertently produce discriminatory outputs that perpetuate disparities in legal outcomes.
Cybersecurity. Any networked computer system is a cybersecurity risk. To some extent, a generative AI system’s human-like conversations and known hallucinations flaws make it an even more attractive target for social engineering or phishing scams. The general warning is to be alert to potential malicious activity.
Data privacy and confidentiality. Data is now a highly regulated asset. Generative AI relies on vast amounts of it – hundreds of billions or even trillions of data points, which will inevitably include personally identifiable and sensitive legal information. Any personal data shared with a generative AI tool is likely to be protected by privacy laws. It’s also possible the content of a prompt may contain confidential or sensitive information. With the openness of ChatGPT and other public domain models, employees entering confidential, personal and sensitive data into public generative AI models is already a significant problem. Generative AI, which may store input information indefinitely and use it to train other models, could also contravene privacy regulations that restrict secondary uses of personal data. Like all areas of data protection, this has the potential to be a legal minefield.
Intellectual property. While lawyers do not generally encounter IP including copyright or trademark issues while producing their own legal documents (because we all draft from scratch or use our own precedents, right?), having AI draft templates or legal research papers may inadvertently infringe third-party IP rights if the data sources are unknown. On the flip side, there’s the issue of who owns the IP rights of the AI-generated content.
Technical opacity. Generative AI runs on foundation LLMs, which use some of the most advanced AI techniques in existence. Given the billions of dollars invested in developing specific generative AI models, the technology that underpins the AI is also proprietary. However, this adds opacity to the inner workings of AI models: if an AI black box is not sufficiently transparent, the model will not be understood; if the data sources are not disclosed and it is not understood how the model works or responds to any particular prompt, it will be harder for lawyers to trust the output. Policymakers must encourage the development of rigorous quality control methodologies and data set disclosure standards for AI systems, appropriate to the context of the application.
Other limitations. There are two other commonly known technical limitations of LLMs. First, the currency of the underlying training data. It is well understood, because OpenAI disclosed it, that GPT-3.5 was trained only on internet data up to September 2021, but the proprietary nature of other AI models may make it difficult or impossible to establish their training data sources and currency. Second, every generative AI model has a context window limit – in other words, the length of your prompt is capped. GPT-4’s limit is about 40 pages of double-spaced text, or 12,288 words. While LLMs’ context capacity will improve over time, the current limits make them unsuitable for legal matter analysis, such as complex litigation or major contract reviews, where that word limit is exceeded.
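Whether a matter fits within a model’s context window can be estimated before submitting it. The sketch below assumes an illustrative 32,768-token limit and the common rule of thumb of roughly 1.3 tokens per English word; actual limits and tokenisation vary by model, so treat the result as a rough screen only:

```python
TOKENS_PER_WORD = 1.33          # rough heuristic; real tokenisers vary
CONTEXT_LIMIT_TOKENS = 32_768   # illustrative limit only; model-specific

def fits_in_context(text: str, limit: int = CONTEXT_LIMIT_TOKENS) -> bool:
    """Estimate whether a document fits a model's context window.
    Word counting is a crude proxy for real tokenisation."""
    estimated_tokens = int(len(text.split()) * TOKENS_PER_WORD)
    return estimated_tokens <= limit

# Invented sample texts: a short memo (~600 words) and a large
# litigation bundle (~40,000 words).
short_memo = "The indemnity clause survives termination. " * 100
long_bundle = "word " * 40_000

print(fits_in_context(short_memo))   # short memo fits
print(fits_in_context(long_bundle))  # large bundle does not
```

A check like this is why long matters are typically split into chunks (per clause, per exhibit) before being fed to an LLM.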
To effectively leverage generative AI, lawyers need to develop additional skills. Alongside legal expertise and the Table A lawyering skills, new proficiencies for lawyers of all generations include:
Identification and assessment of which generative AI to use and for what purposes. With more generative AI platforms emerging and many already existing legal tools with some form of generative AI, lawyers will need to make case-specific selections based on the required legal work, process and support.
Familiarity with AI concepts and models. Lawyers will need a foundational understanding of AI concepts and models (machine learning, deep learning, neural networks, natural language processing and LLMs) and appreciation of the risks, limitations and benefits of AI. This understanding will have to be continually updated to accommodate rapid advances in AI innovation.
Legal prompt engineering. Knowing how to formulate precise and contextually appropriate legal queries or instructions to AI models, minimising unnecessary jargon and avoiding ambiguity, to obtain accurate and relevant outputs.
Data literacy and ethics. Strong data literacy skills are essential to assess the quality and reliability of training data used by AI models. Lawyers must also navigate ethical considerations like data privacy, bias mitigation and transparency, together with methodologies to enable AI’s trusted and ethical use.
Critical evaluation of AI outputs. There will be no substitute for a lawyer’s final review of AI-generated legal documents. Lawyers must possess the ability to critically evaluate and validate AI outputs to ensure their accuracy, relevance and compliance.
The way ahead
Given the transformative potential of generative AI in the legal profession, and likewise in our daily lives, lawyers must adapt to effectively incorporate this technology into legal practice. Familiarisation and adaptation will take time, so for a profession notoriously slow to adopt any innovation, there’s sound justification for beginning to embrace it now.
Start with upskilling on digital and AI literacy; actively experiment and get familiar with the various AI platforms and products in the market. If you head a team, lead by example and get others involved on the learning journey.
Just as legal research today is done almost entirely online using search engines, there will be a paradigm shift in the delivery of some legal services, and the use of AI will become second nature. While there are benefits and challenges to the use of generative AI, failure to act risks falling behind peers, competitors and other professions and industries that do harness the advantages of AI-powered tools. It’s also foreseeable that clients will increasingly expect lawyers to leverage AI tools, as clients themselves begin using AI in their businesses.
Contrary to concerns of widespread job displacement, generative AI will not replace lawyers. Rather, AI tools will augment human capabilities, drive efficiency and productivity, and deliver greater value to clients.
With proper regulation, governance and ethical guidelines, generative AI will revolutionise aspects of legal practice and shape the future of a more modern legal profession.
Sharyn Ch’ang is PwC’s Asia Pacific NewLaw director based in Hong Kong. She is also former president and board member of the Association of Corporate Counsel Hong Kong. Having practised in top-tier law firms and in-house focusing on corporate, IP and technology, she became a legal transformation consultant because she is passionate and adamant that current legal practice needs to evolve to be more cost and time efficient for the benefit of lawyers and clients alike.
DELTA Data Protection & Compliance Academy & Consulting