A GenAI case study
How GenAI is used in the legal sector
Chris Price
RESULTS
Problem
Evaluate an LLM (Large Language Model) based QA (Question Answering) system focussed on reviewing legal documents (e.g. contracts)
Results
Baseline
Intentionally naive implementation to set a performance baseline
50%
Target
Performance comparable to existing manual techniques
>95%
Achieved
Picking the best performing technique per sample
85%
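The "Achieved" figure picks the best-performing technique for each sample, so it is an upper bound on what any single technique reaches on its own. A minimal sketch of that aggregation follows. It assumes each technique yields a pass/fail result per sample; the technique names and scores are made up for illustration and are not data from the engagement.

```python
from typing import Dict, List


def per_technique_accuracy(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of samples each technique answered correctly on its own."""
    return {name: sum(passed) / len(passed) for name, passed in results.items()}


def best_per_sample_accuracy(results: Dict[str, List[bool]]) -> float:
    """Fraction of samples answered correctly by at least one technique."""
    # zip(*...) turns per-technique rows into per-sample columns.
    columns = list(zip(*results.values()))
    return sum(any(column) for column in columns) / len(columns)


# Hypothetical pass/fail results for three techniques over four samples.
results = {
    "naive": [True, False, False, True],
    "rag": [True, True, False, False],
    "hyde": [False, True, False, True],
}

individual = per_technique_accuracy(results)
combined = best_per_sample_accuracy(results)
```

In this toy data every technique scores 0.5 individually while the best-per-sample figure is 0.75, which shows why the combined number should not be read as the accuracy of one deployable configuration.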
Background
The Engagement
- A Legal Technology Company
- Offering technology solutions to improve document creation, analysis, and management for legal teams
- 6 months

"Who are the parties to these contracts?"
– System User
1. Upload documents of interest
2. Specify extraction models of interest based upon the question
3. System extracts matching passages for each extraction model, for each document
4. Review extractions to evidence an answer to the question
5. Apply legal knowledge to refine the answer based on evidence
1. Upload documents of interest
2-4. Ask a question of the LLM-based QA system to produce an initial evidenced answer
5. Apply legal knowledge to refine the answer based on evidence
Challenges
Expectations
- Linear progression
- Inaccessible tools
- Rapid initial progress
- Engaged stakeholders
Positives
- Cherry-picked results are not consistent
- Slow subsequent progress
Negatives
Cycle Times
- Instant feedback
- Purposeful iteration
- Slow models
- QA assessment is hard
Negatives
- Slower humans
More negatives
Relative Value
- Well-tested system
- Even split
- Undemanding code
- Prompt decomposition
Positives
- Creating/finding, transforming, refining, etc. test data
Negatives
Ecosystem Maturity
- Best practices
- De-facto tools
- Little churn
- New techniques (RAG > HyDE > RRR)
- New capabilities (gpt-4 8k > gpt-4 128k)
Positives
- Evolving best practices
- Immature tools
Negatives
Lessons
Expectations
- Expect rapid initial progress followed by prolonged iteration
- Expect significant investment in problem definition and testing
Cycle Times
- Minimise human assessment and ensure active engagement
- Automated assessment is necessary for rapid iteration
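One way automated assessment could look is sketched below. It is an illustration only, not the method used in the engagement: it scores an answer against an expected answer by token-overlap F1 (a common metric for QA evaluation) and reports the fraction of test cases above a threshold. The `answer_fn` callable, the 0.8 threshold, the questions, and the party names are all hypothetical.

```python
import re
from collections import Counter
from typing import Callable, List, Tuple


def normalise(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(predicted: str, expected: str) -> float:
    """Token-overlap F1 between a predicted and an expected answer."""
    pred, gold = normalise(predicted), normalise(expected)
    if not pred or not gold:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def assess(
    test_cases: List[Tuple[str, str]],
    answer_fn: Callable[[str], str],
    threshold: float = 0.8,  # assumed pass mark, tune per dataset
) -> float:
    """Fraction of test cases whose answer scores at or above the threshold."""
    passed = sum(
        token_f1(answer_fn(question), expected) >= threshold
        for question, expected in test_cases
    )
    return passed / len(test_cases)


# Hypothetical test data and a stub standing in for the QA system.
test_cases = [
    ("Who are the parties to these contracts?", "Acme Ltd and Beta Inc"),
    ("What is the contract term?", "Two years"),
]
stub_answers = {
    "Who are the parties to these contracts?": "Acme Ltd and Beta Inc.",
    "What is the contract term?": "Five years",
}
pass_rate = assess(test_cases, lambda question: stub_answers[question])
```

A lexical metric like this is cheap enough to run on every change, but it will miss answers that are correct yet worded differently, so some human review of borderline cases remains necessary.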
Relative Value
- Code development activities will be insignificant alongside testing
- The system's test data will be its most valuable commodity
Ecosystem Maturity
We've been here before...
Does your problem require a bespoke AI solution now?
Conclusion
Can you directly empower the individuals within your team with this raw capability and observe the emergent behaviour?