Project Atticus Graph: LLM-Based Knowledge Graph Extraction from Legal Contracts
Abstract
Manual review of legal contracts is time-consuming, error-prone, and fails to capture the intricate relationships between parties, clauses, and obligations. This paper presents a system that uses Large Language Models (LLMs) to automatically extract structured knowledge graphs from legal contracts. We compare LLM-based extraction against a traditional regex-based NLP baseline on the Contract Understanding Atticus Dataset (CUAD). To keep the scope tractable, we focus on 9 Intellectual Property (IP) clause types across 17 dedicated IP agreement contracts. The LLM-based approach achieves a Macro F1 of 0.734 versus 0.392 for the traditional baseline, an 87% relative improvement, with 91% precision and 85% recall, outperforming the baseline on 7 of 9 clause types. Whereas the traditional methods cannot extract relationships at all, the LLM approach extracts 90 IP relationships (e.g., who grants or assigns IP to whom), enabling graph-based reasoning and queries in Neo4j. These capabilities are directly applicable to Graph RAG systems in the legal domain.
1. Introduction
Legal contracts are ubiquitous in business, yet their analysis remains largely manual. Contract review presents several challenges:
- Volume: Organizations manage thousands of contracts with complex interdependencies
- Complexity: Legal language has nuanced phrasing that varies across jurisdictions and industries
- Evolution: As new contract types and clauses emerge, manual rule systems need constant updates
- Relationships: Contracts contain rich relationships between parties that are difficult to query systematically
Traditional NLP approaches like regular expressions and keyword matching struggle with the variety of legal language. For example, a clause granting a “perpetual, irrevocable license” could be paraphrased as a “license that survives termination” or “permanent rights”. Regex-based systems easily miss such variants.
Research Question: Can Large Language Models outperform traditional NLP methods in extracting structured knowledge from legal contracts?
2. Methods
Dataset
We use the Contract Understanding Atticus Dataset (CUAD), containing 510 commercial legal contracts with over 13,000 expert annotations across 41 clause types. We focus on 9 Intellectual Property clause types across 17 dedicated IP Agreement contracts.
LLM-Based Extraction
We use GPT-OSS-120b accessed through the OpenRouter API with structured prompting. The prompt follows a structured template with four key techniques: role specification, clause definitions with example phrases, a JSON output schema, and relationship extraction instructions asking for WHO-WHAT-WHOM triples.
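The prompt-construction and response-handling steps above can be sketched as follows. The clause names, template wording, and JSON schema fields here are illustrative placeholders, not the exact prompt used in the study:

```python
import json

# Hypothetical subset of the 9 IP clause types (illustrative names only).
CLAUSE_TYPES = [
    "IP Ownership Assignment", "License Grant",
    "Non-Transferable License", "Irrevocable/Perpetual License",
]

# Sketch of the four techniques: role specification, clause definitions,
# JSON output schema, and WHO-WHAT-WHOM relationship instructions.
PROMPT_TEMPLATE = """You are a legal analyst reviewing an IP agreement.
For each clause type below, state whether it is present and quote the
supporting text. Then list WHO-WHAT-WHOM relationship triples.

Clause types: {clauses}

Return ONLY JSON matching this schema:
{{"clauses": [{{"type": "...", "present": true, "evidence": "..."}}],
 "relationships": [{{"subject": "...", "predicate": "...", "object": "..."}}]}}

Contract text:
{contract}"""

def build_prompt(contract_text: str) -> str:
    return PROMPT_TEMPLATE.format(clauses=", ".join(CLAUSE_TYPES),
                                  contract=contract_text)

def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply, tolerating markdown code fences."""
    raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(raw)
```

The resulting prompt string would be sent to GPT-OSS-120b via the OpenRouter chat-completions endpoint; the API call itself is omitted here.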
Traditional NLP Baseline
For the baseline, we use regex pattern matching along with spaCy NER for entity extraction. Hand-crafted patterns were developed for each clause type based on common legal phrasing.
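A minimal sketch of the regex side of the baseline, with illustrative patterns for two clause types (the actual baseline used a hand-crafted pattern set per clause type plus spaCy NER, which is omitted here):

```python
import re

# Illustrative hand-crafted patterns keyed by clause type.
PATTERNS = {
    "Irrevocable/Perpetual License": [
        r"\b(?:perpetual|irrevocable)\b.{0,40}\blicen[cs]e\b",
        r"\blicen[cs]e\b.{0,40}\b(?:perpetual|irrevocable)\b",
    ],
    "Non-Transferable License": [
        r"\bnon-?transferable\b",
        r"\bmay not (?:be )?assign",
    ],
}

def detect_clauses(text: str) -> set:
    """Return the clause types whose patterns match the text."""
    found = set()
    for clause, pats in PATTERNS.items():
        if any(re.search(p, text, re.IGNORECASE) for p in pats):
            found.add(clause)
    return found
```

Note the brittleness this illustrates: “the license shall survive any termination” expresses perpetuity but matches none of the patterns above.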
Knowledge Graph Schema
We define a schema with three entity types (Contract, Party, Clause) and five relationship types (HAS_PARTY, HAS_CLAUSE, GRANTS_LICENSE_TO, ASSIGNS_IP_TO, SHARES_IP_WITH). Extracted knowledge is stored in Neo4j for Cypher-based graph queries.
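Loading an extracted WHO-WHAT-WHOM triple into this schema might look like the sketch below, which builds a parameterized Cypher `MERGE` statement (the query text and helper names are assumptions, not the system's actual loader; in practice the query would be executed through the official `neo4j` Python driver):

```python
# Relationship names follow the schema above.
ALLOWED = {"GRANTS_LICENSE_TO", "ASSIGNS_IP_TO", "SHARES_IP_WITH"}

LOAD_RELATIONSHIP = """
MERGE (a:Party {name: $subject})
MERGE (b:Party {name: $object})
MERGE (a)-[:%s]->(b)
"""

def cypher_for(triple: dict) -> tuple:
    """Return (query, params) for one extracted triple."""
    rel = triple["predicate"]
    # Cypher cannot parameterize relationship types, so a whitelist
    # check guards the string interpolation below.
    if rel not in ALLOWED:
        raise ValueError(f"unknown relationship type: {rel}")
    return LOAD_RELATIONSHIP % rel, {"subject": triple["subject"],
                                     "object": triple["object"]}
```

Using `MERGE` rather than `CREATE` keeps parties unique across contracts, which is what makes cross-contract graph queries possible later.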
3. Results
| Method | Macro F1 | Micro F1 | Precision | Recall |
|---|---|---|---|---|
| LLM | 0.734 | 0.878 | 0.910 | 0.847 |
| Traditional | 0.392 | 0.532 | 0.635 | 0.458 |
| Improvement | +87% | +65% | +43% | +85% |
Key Findings
- The LLM achieves a 1.87x higher Macro F1 than the baseline
- The LLM outperforms the baseline on 7 of 9 clause types (77.8%)
- 91% precision: predictions are highly accurate
- 85% recall: most ground-truth clauses are found
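For clarity on how the two averages in the table differ, the sketch below computes Macro F1 (average of per-clause-type F1 scores) and Micro F1 (F1 over pooled counts); the function is generic and the counts in the test are toy values, not the study's data:

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def macro_micro_f1(counts):
    """counts: list of (tp, fp, fn) tuples, one per clause type."""
    # Macro: average the per-type F1 scores (each type weighted equally).
    macro = sum(f1(*c) for c in counts) / len(counts)
    # Micro: pool the counts, then compute a single F1.
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return macro, micro
```

Macro F1 is the harder metric here because rare clause types count as much as common ones, which is why it is lower than Micro F1 for both methods.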
Relationship Extraction
Traditional regex methods cannot extract relationships at all. The LLM extracted 90 IP relationships across 17 contracts: 72 GRANTS_LICENSE_TO, 15 ASSIGNS_IP_TO, and 3 SHARES_IP_WITH. These enable powerful graph traversal queries, network analysis of licensing hubs, and cross-contract reasoning.
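As one example of the network analysis these triples enable, the sketch below finds “licensing hubs” (parties that grant the most licenses) directly from a triple list; the party names are hypothetical, not drawn from the CUAD contracts:

```python
from collections import Counter

# Hypothetical extracted triples (subject, predicate, object).
triples = [
    ("AcmeCo", "GRANTS_LICENSE_TO", "BetaCorp"),
    ("AcmeCo", "GRANTS_LICENSE_TO", "GammaLLC"),
    ("BetaCorp", "ASSIGNS_IP_TO", "AcmeCo"),
]

def licensing_hubs(triples, top: int = 3):
    """Rank parties by the number of licenses they grant."""
    grants = Counter(s for s, p, o in triples if p == "GRANTS_LICENSE_TO")
    return grants.most_common(top)
```

Once loaded into Neo4j, the same question can be asked in Cypher, e.g. `MATCH (p:Party)-[:GRANTS_LICENSE_TO]->() RETURN p.name, count(*) ORDER BY count(*) DESC`.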
4. Why LLMs Outperform Traditional Methods
- Semantic Understanding: LLMs grasp meaning beyond surface patterns. When a contract states “the license shall survive any termination,” the LLM correctly identifies this as “Irrevocable/Perpetual,” while regex patterns miss it.
- Context Awareness: LLMs consider surrounding context. “May not assign” might indicate Non-Transferable in a license section but something different in an assignment section.
- Generalization: LLMs handle paraphrased language without explicit patterns. Traditional methods require new rules for each variation.
5. Conclusion
This paper presents Project Atticus Graph, a system for extracting structured knowledge graphs from legal contracts using Large Language Models. Our evaluation on 17 IP Agreement contracts from the CUAD dataset demonstrates: 1.87x better Macro F1 (0.734 vs. 0.392), 91% precision with 85% recall, 7/9 clause types won by the LLM, and 90 relationships extracted (vs. 0 for traditional). The key differentiator is not just higher accuracy on clause detection, but the fundamental capability to extract structured relationships (WHO grants/assigns IP to WHOM), transforming contract analysis from isolated document review to connected knowledge graph reasoning.