Technology Law

AI Development Agreements

When a vendor builds a model or AI feature for your business, the ordinary software development agreement breaks in predictable places: probabilistic deliverables, a layered IP stack, training-data provenance, and retraining that never really ends.

Written by

Martin Kotze

Attorney, Conveyancer & Notary Public

Last reviewed: 1 July 2026

Quick answer

A custom AI development agreement adapts the software development agreement to machine-learning realities. Deliverables are probabilistic, so acceptance criteria must be statistical (accuracy or precision thresholds measured on an agreed, frozen test set), not “conforms to spec”. The IP stack has more layers: model architecture, trained weights (the learned parameters that are the actual trained model), training pipeline, training data, and outputs each need an identified owner. Training data needs provenance warranties and, where it contains personal information, a POPIA lawful ground under section 11. Ongoing drift monitoring and retraining blur delivery into operations, so they need their own schedule. Under the Copyright Act, authorship of the code follows the section 1(1) control test (Haupt t/a Soft Copy v Brewers Marketing Intelligence (Pty) Ltd 2006 (4) SA 458 (SCA); [2006] ZASCA 40), and South Africa’s computer-generated-works provisions mean model outputs usually have an identifiable author, though originality for purely machine-generated artefacts is untested. Bespoke drafting from R18,000.

How AI development differs from ordinary software development

This page covers building a model or AI feature — a vendor developing something custom for your business. If you are subscribing to an existing AI product instead, the contract is an AI-SaaS agreement, which raises a related but different set of questions. A custom build starts from the software development agreement chassis, then changes three things fundamentally:

Dimension	Ordinary software development	AI / ML development
Specification + acceptance	Deterministic functional spec — "the system shall do X". Acceptance is pass/fail against the spec.	Deliverable is probabilistic. Acceptance must be statistical: accuracy / precision / recall thresholds measured on an agreed, frozen evaluation set.
Delivery model	One-off delivery, acceptance, warranty period, then support. The build phase has a clear end.	Models drift as real-world data shifts. Monitoring, retraining and re-validation blur delivery into ongoing operations — and need their own schedule.
The IP stack	Source code + documentation. One assignment clause usually covers it.	Layered stack: model architecture, trained weights, training pipeline, training data, fine-tune deltas, and outputs — each layer needs an identified owner.

Each shift demands a corresponding contractual mechanism. A deterministic acceptance clause applied to a probabilistic deliverable produces a dispute, not a delivery; a single “IP assignment” clause applied to a layered model stack leaves the most valuable layer — the trained weights — unallocated.

The nine clauses that matter

Problem definition + success metrics

What the model must predict or generate, the metric that defines success (accuracy, precision, recall, F1), the threshold, and — critically — the evaluation dataset, frozen and agreed before the build starts. Without a frozen test set, "95% accuracy" is unenforceable.
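One way to make "frozen" verifiable is to record a cryptographic fingerprint of the evaluation set in the contract schedule, so either party can later confirm that a test run used the agreed data. The following is a minimal sketch under assumptions: the record fields, the function names `fingerprint_eval_set` and `is_same_eval_set`, and the choice of canonical JSON plus SHA-256 are illustrative, not a standard the agreement must adopt.

```python
import hashlib
import json


def fingerprint_eval_set(records):
    """Return a SHA-256 hex digest of the evaluation records.

    Records are serialised as canonical JSON (sorted keys, fixed separators)
    so the same data always produces the same digest. Record order is part
    of the fingerprint, so the parties should also agree the ordering.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_same_eval_set(records, agreed_digest):
    """Check a candidate evaluation set against the digest in the schedule."""
    return fingerprint_eval_set(records) == agreed_digest


# Hypothetical evaluation records, for illustration only.
eval_set = [
    {"id": 1, "text": "invoice overdue", "label": 1},
    {"id": 2, "text": "payment received", "label": 0},
]
agreed = fingerprint_eval_set(eval_set)  # digest recorded at signature

# Adding even one record later changes the digest.
tampered = eval_set + [{"id": 3, "text": "easy case", "label": 1}]
print(is_same_eval_set(eval_set, agreed))
print(is_same_eval_set(tampered, agreed))
```

A digest only proves that the data is unchanged; whether the set is representative of the real use case remains a matter for negotiation before it is frozen.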

Training-data sourcing + provenance

Who supplies the training data, warranties that it was lawfully obtained and free of third-party IP claims, a POPIA lawful ground under section 11 where it contains personal information, and the de-identification standard applied before it reaches the vendor.

Data-use boundaries

An express prohibition on the vendor using the customer's data — inputs, training data, or outputs — to train the vendor's other models or improve its general product, unless the customer gives separate written consent.

IP allocation across the stack

Weights, training pipeline, fine-tune deltas, evaluation harnesses, and outputs are each allocated to an owner. Under section 22(3) of the Copyright Act, an assignment must be in writing and signed by or on behalf of the assignor, and a clause saying the customer "owns the software" does not reach the trained weights unless it says so.
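The allocation exercise can be treated as a completeness check: list every layer of the stack and flag any layer the draft leaves without a named owner. The sketch below is illustrative only. The layer names in `STACK_LAYERS`, the function `unallocated_layers`, and the sample allocation are assumptions, not a prescribed taxonomy.

```python
# Illustrative layer list drawn from the stack described in this guide.
STACK_LAYERS = (
    "model_architecture",
    "trained_weights",
    "training_pipeline",
    "training_data",
    "fine_tune_deltas",
    "evaluation_harness",
    "outputs",
)


def unallocated_layers(allocation):
    """Return the layers that have no named owner in the draft allocation."""
    return [layer for layer in STACK_LAYERS if not allocation.get(layer)]


# Hypothetical draft: "the software" was assigned, but nothing mentions weights.
draft_allocation = {
    "model_architecture": "upstream licensor",
    "training_pipeline": "vendor",
    "training_data": "customer",
    "outputs": "customer",
}

print(unallocated_layers(draft_allocation))
```

Any layer the check returns is one where ownership would be left to the statutory default rather than to the contract.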

Statistical acceptance testing + re-test protocol

How the model is scored against the agreed metrics, on what data, by whom, and what happens on failure: remediation window, re-test rounds, threshold renegotiation triggers, and the point at which the customer may terminate and recover pre-payments.
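The re-test protocol amounts to a small decision rule: accept on the first round that meets the threshold, allow remediation while rounds remain, and open the exit right once they are exhausted. A minimal sketch follows, assuming a single headline metric and a fixed number of rounds. `RetestProtocol`, `outcome`, and the figures used are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetestProtocol:
    threshold: float  # agreed minimum score on the frozen evaluation set
    max_rounds: int   # initial test plus the permitted re-test rounds


def outcome(protocol, round_scores):
    """Classify the state of acceptance testing after the rounds run so far.

    Returns ("accepted", round_number), ("remediate", rounds_remaining),
    or ("terminate", rounds_used).
    """
    if len(round_scores) > protocol.max_rounds:
        raise ValueError("more rounds recorded than the protocol permits")
    for round_number, score in enumerate(round_scores, start=1):
        if score >= protocol.threshold:
            return ("accepted", round_number)
    rounds_used = len(round_scores)
    if rounds_used < protocol.max_rounds:
        return ("remediate", protocol.max_rounds - rounds_used)
    return ("terminate", rounds_used)


# Hypothetical terms: 0.90 threshold, one initial test plus two re-tests.
protocol = RetestProtocol(threshold=0.90, max_rounds=3)
print(outcome(protocol, [0.85, 0.92]))
print(outcome(protocol, [0.85]))
print(outcome(protocol, [0.85, 0.88, 0.89]))
```

A real clause would usually add the remediation window between rounds and any threshold renegotiation trigger, which this sketch leaves out.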

Drift monitoring + retraining as a support schedule

Performance degradation thresholds that trigger retraining, who supplies fresh data, retraining cadence and cost, and re-validation against the original (or updated) evaluation set — structured as an operations schedule, not an afterthought.
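A degradation trigger can be stated precisely enough to compute. The sketch below assumes one possible formulation: retraining is triggered when the average score over the most recent monitoring periods falls more than an agreed margin below the accepted baseline. The function name `needs_retraining`, the window size, and all figures are illustrative assumptions.

```python
from statistics import mean


def needs_retraining(baseline, recent_scores, max_drop, window):
    """Return True when the rolling average has fallen too far below baseline.

    baseline: score achieved at acceptance on the agreed evaluation set
    recent_scores: periodic monitoring scores, oldest first
    max_drop: agreed tolerated degradation before retraining is triggered
    window: number of most recent periods averaged, to ignore one-off dips
    """
    if len(recent_scores) < window:
        return False  # not enough monitoring history to judge yet
    rolling_average = mean(recent_scores[-window:])
    return (baseline - rolling_average) > max_drop


# Hypothetical schedule: accepted at 0.92, tolerate a 0.03 drop, 3-period window.
print(needs_retraining(0.92, [0.92, 0.91, 0.90], max_drop=0.03, window=3))
print(needs_retraining(0.92, [0.91, 0.89, 0.88, 0.87], max_drop=0.03, window=3))
```

Averaging over a window is a design choice: it avoids triggering a paid retraining cycle on a single bad period, at the cost of reacting more slowly to a genuine shift.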

Explainability + documentation deliverables

Model cards, training-data summaries, known-limitation statements and decision-logic documentation as contractual deliverables. These double as the customer's defensibility file for POPIA section 71 automated-decision compliance.

Third-party + foundation-model dependencies

Where the vendor builds on a foundation model (GPT, Claude, Llama) or open-source components, the upstream licence and API terms constrain what the vendor can promise. Dependencies must be disclosed and upstream restrictions flowed down.
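Disclosure is easier to police when the dependencies are kept as a structured register that can be checked against the customer's intended uses. The following is a sketch under stated assumptions: the component names, the `prohibited_uses` labels, and the `Dependency` and `blocked_by` names are all hypothetical and do not describe the terms of any real foundation model or licence.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dependency:
    name: str
    kind: str  # e.g. "foundation_model" or "open_source"
    prohibited_uses: frozenset = field(default_factory=frozenset)


def blocked_by(dependencies, intended_uses):
    """Map each dependency to the intended uses its upstream terms prohibit."""
    intended = set(intended_uses)
    conflicts = {}
    for dependency in dependencies:
        clash = sorted(intended & dependency.prohibited_uses)
        if clash:
            conflicts[dependency.name] = clash
    return conflicts


# Hypothetical register disclosed by the vendor at signature.
register = [
    Dependency("example-base-model", "foundation_model", frozenset({"credit_scoring"})),
    Dependency("example-tokenizer", "open_source"),
]

print(blocked_by(register, ["credit_scoring", "document_summarisation"]))
```

An empty result would support the vendor's warranty that the intended use is permitted upstream; a non-empty one identifies the restrictions to resolve before signature.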

Liability for model behaviour

Hallucination, bias and misclassification need express treatment: which behaviours are carved INTO liability (e.g. failure to meet agreed metrics, breach of data-use boundaries) and which are carved OUT (statistically inevitable error within agreed tolerances, customer misuse outside the defined use case).

POPIA section 71: when the model makes decisions about people

In plain English: a person may not be subjected to a decision based solely on the automated processing of their personal information, where that decision has legal consequences for them or affects them to a substantial degree — credit scoring, hiring filters, insurance pricing, fraud flags. There are exceptions, principally where the decision is taken in connection with concluding or performing a contract, but the exceptions come with safeguards: the person must be given an opportunity to make representations, and sufficient information about the underlying logic of the processing must be available.

For a custom-built model, the development agreement is where section 71 compliance gets allocated. The contract should state whether the model will be used for solely automated decisions at all. If it might, the vendor's explainability deliverables (model cards, decision-logic documentation, known-limitation statements) become the customer's evidence that safeguards exist. The agreement should also say who handles data-subject representations and who can interrogate or override the model's output: a designed-in human review point takes the decision outside “solely automated” territory.

As at July 2026, the Information Regulator has not issued guidance on automated decision-making — so well-drafted agreements include a regulatory-change mechanism allowing the compliance allocation to be adjusted when guidance lands, rather than locking in assumptions that may not survive it.

Common failure modes

Accuracy promised without an agreed test set — vendor and customer each measure "95%" on different data, and the dispute is unresolvable because the contract contains no benchmark.
Data rights discovered missing mid-project — the customer never had the rights or POPIA lawful ground to use the training data it supplied, and the project stalls while the lawyers work out who carries the loss.
The vendor trains its next product on your data — no express data-use boundary was negotiated, and the vendor's standard terms reserved broad "service improvement" rights.
Nobody owns the weights — the contract assigns "the software" but says nothing about trained weights or fine-tune deltas, so ownership falls back to default Copyright Act rules that usually favour the vendor.

Frequently asked

Who owns a trained AI model in South Africa?

The answer is layered. The code (training scripts, pipeline, inference service) is a computer program, and under section 1(1) of the Copyright Act its author is the person who exercised control over its making — the Haupt v Brewers Marketing Intelligence control test — which in a development engagement usually means the vendor, unless ownership is assigned. The trained weights are best treated the same way and assigned expressly. The customer should take a written, signed assignment (section 22(3)) of the fine-tuned weights and pipeline configuration even where the base model is only licensed — the vendor cannot assign the foundation model it does not own, but it can assign the fine-tune delta and everything it built on top.

Can the vendor reuse our training data for other customers or products?

Only if the contract lets it. Without an express data-use boundary, vendors often rely on broad "service improvement" language in their standard terms to justify training their general models on customer data. The agreement should prohibit any use of customer data — training data, prompts, or outputs — beyond delivering the customer's project, with reuse permitted only on separate written consent. Where the data contains personal information, POPIA's purpose-limitation principle reinforces the contractual position: data collected for one purpose cannot simply be repurposed for the vendor's product roadmap.

How do you write acceptance criteria for an AI system?

Statistically, not functionally. Agree four things before the build: (i) the metric — accuracy, precision, recall, F1, or a domain-specific measure; (ii) the threshold the model must meet; (iii) the evaluation dataset — frozen, representative, and agreed in advance so neither party can cherry-pick test data later; and (iv) the re-test protocol — remediation windows, number of re-test rounds, and the exit right if the threshold is never met. Payment milestones should key off these statistical gates, not "delivery".
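To make the metric and threshold concrete, the sketch below scores a binary classifier against agreed minimums using the standard definitions of accuracy, precision, recall and F1. The function names (`score`, `acceptance_gate`), the labels, and the thresholds are illustrative assumptions; a real schedule would name the exact metric definitions and the frozen evaluation set.

```python
def score(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def acceptance_gate(metrics, thresholds):
    """Return (passed, failures), where failures maps metric -> (actual, minimum)."""
    failures = {
        name: (metrics[name], minimum)
        for name, minimum in thresholds.items()
        if metrics[name] < minimum
    }
    return (not failures, failures)


# Hypothetical frozen evaluation labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = score(y_true, y_pred)

# Hypothetical contractual thresholds.
passed, failures = acceptance_gate(metrics, {"precision": 0.70, "recall": 0.80})
print(metrics)
print(passed, failures)
```

In this example the model reaches 80% accuracy yet fails the gate on recall, which illustrates why the contract should name the metric that matters for the use case rather than a headline accuracy figure.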

Does POPIA apply to training an AI model?

Yes, whenever the training data contains personal information. The responsible party needs a lawful ground under section 11 (consent is one; legitimate interests is another, but it requires a documented balancing assessment), and the minimality principle means you train on the personal information you need, not everything you have. De-identification takes data outside POPIA — but only if it is de-identified to the point that it cannot reasonably be re-identified, which is a high bar for rich behavioural datasets.

What about the foundation model's terms — do they flow down to me?

In practice, yes. If your vendor fine-tunes or wraps a third-party foundation model, the upstream licence or API terms constrain what the vendor can lawfully promise you — on IP ownership, permitted use cases, data handling, and sometimes on competing-model development. The development agreement should require the vendor to disclose every foundation-model and open-source dependency, warrant that your intended use is permitted upstream, and flow down any restrictions you must observe — so you discover them at signature, not at deployment.

Who owns copyright in the model's outputs?

South Africa's Copyright Act has expressly catered for computer-generated works since 1992: the author of a computer-generated literary or artistic work is the person who undertook the arrangements necessary for its creation. So an identifiable author usually exists for model outputs. The untested question is originality — no SA court has decided whether a purely machine-generated artefact with no meaningful human input meets the originality requirement. Prudent drafting allocates output rights expressly in the contract rather than relying on the statutory default.

Full guide: AI-generated software & copyright

What does a bespoke AI development agreement cost?

From R18,000 for a single-model development agreement with statistical acceptance criteria, layered IP allocation and a POPIA-grounded data schedule. R25,000–R35,000 where the engagement spans multiple models, ongoing retraining operations, or complex foundation-model flow-downs. Reviewing a vendor's paper instead: R12,300 for a review with redlined recommendations, delivered within 48 hours.

Why you can trust this: Martin Kotze has been an admitted Attorney of the High Court of South Africa, registered Conveyancer, and Notary Public since 2014, practising from Pretoria. The firm is regulated by the Legal Practice Council under firm registration 17444.

This guide is general information, not legal advice for your specific matter.