The Proposed Guidelines clarify how the Personal Data Protection Act (“PDPA”) applies to AI and address the key data protection issues relating to generative AI throughout its lifecycle, including the development, deployment, and post-deployment stages.
While not legally binding, the Proposed Guidelines underscore the PDPC’s increasing focus on the use of personal data in generative AI systems and provide early insight into how existing data protection obligations may apply in this context.
Highlights of the Proposed Guidelines
AI Lifecycle
Summary of Key Points
Development
Where personal data is publicly available, organisations may rely on the "Publicly Available Exception" to collect, use, or disclose such data without consent.
However, where online data is placed behind a digital barrier, organisations must consider whether the Exception would still apply before collecting and incorporating such data into the development of the AI system. Factors to consider include the purpose and effect of the barrier and whether the data can be accessed without restrictions from other sources.
Where an organisation intends to scrape personal data stored behind digital barriers, it is best practice for the collecting organisation to notify the other organisation of its intention to do so.
Where user data (i.e. data provided by an individual to an organisation, or created in the course of or as a result of the individual’s use of the organisation’s products or services) is used to develop generative AI models, organisations must obtain express consent from the individuals concerned unless deemed consent or an exception applies.
Organisations should, where practicable, provide clearly visible AI-specific notifications including:
what functions of the generative AI model use personal data;
the types of personal data used to develop the generative AI model;
how that data is used to train or fine-tune the generative AI model (e.g. recognising speech patterns); and
how individuals can decline or withdraw consent to the use of personal data for AI training (e.g. an easily accessible opt-out mechanism).
General notifications describing only the broad purpose of processing are insufficient.
Deployment
Generative AI deployment involves model providers, system providers, and system deployers. Each of these players has their own distinct data protection obligations:
Model Providers: must comply with all PDPA obligations, including:
as organisations, comply with the retention limitation obligation. If the training data retained by model providers includes personal data, they should develop and make available a data retention policy that includes the rationale for retaining data for a longer period.
as data intermediaries, implement safeguards to protect personal data. In particular, it is good practice to document and make available the measures they have taken to safeguard personal data from downstream sources (e.g. data access controls, data residency and retention policies).
System Providers: may act as organisations or data intermediaries depending on their role. As data intermediaries, they are expected to manage evolving data protection and security risks (e.g. prompt injection). They should also share information on system-level safeguards with downstream deployers.
System Deployers: bear primary responsibility for PDPA compliance. Their responsibilities include:
assessing upstream safeguards to conduct a holistic assessment;
limiting data use to appropriate purposes;
implementing and maintaining safeguards for new data sources (e.g. prompts, outputs, agent activity data);
educating end users on types of data that should be input; and
establishing clear written governance policies and making them pre-emptively available, and regularly reviewing risks.
Post-Deployment
In the context of generative AI, the PDPC recognised that it may be challenging for organisations to comply with their Access and Correction Obligations under the PDPA (i.e. granting individuals the right to request access to and correction of their personal data) due to the massive amounts of data used and the nature of generative AI models (e.g. training data stored as embeddings, user data temporarily held in context windows).
Notwithstanding these difficulties, organisations are expected to adopt best practices in processing access and correction requests. Some examples of best practices include:
adopting upstream data handling measures (e.g. verifying data accuracy at collection and maintaining data provenance records); and
tracking the maturity of, and adopting, technical measures to remove inaccurate personal data (e.g. machine unlearning).
What should businesses do?
While the Proposed Guidelines remain subject to public consultation, they signal the PDPC’s likely approach to regulating the use of personal data in generative AI.
Organisations are encouraged to review their current AI use cases, assess whether existing consent and notification practices are sufficient, and begin identifying any potential gaps against the proposed framework.
Should you wish to discuss how the Proposed Guidelines may impact your organisation, please do not hesitate to reach out to us.
The materials on the Eversheds Sutherland website are for general information purposes only and do not constitute legal advice. While reasonable care is taken to ensure accuracy, the materials may not reflect the most current legal developments. Eversheds Sutherland disclaims liability for actions taken based on the materials. Always consult a qualified lawyer for specific legal matters.