Working with sensitive data: Why publicly hosted LLMs are risky

AI ban ordered after child protection worker used ChatGPT in Victorian court case.
Laura Mathews

Over the last 24 months, we’ve seen numerous cases of Large Language Model (LLM) misuse in the workplace.

We had the case where Air Canada lost a legal battle after its chatbot gave incorrect information about its bereavement fare policy. We had the lawyer sanctioned for citing legal precedent from a prior case. Turns out that prior case was pure fiction, an LLM hallucination.

The latest case is even more troubling. It comes from the Australian state of Victoria.

According to Josh Taylor, Victoria’s child protection agency has been ordered to ban staff from using generative AI services. The order followed an incident where a worker entered significant amounts of personal information, including the name of an at-risk child, into ChatGPT.

See full story here

“The Department of Families, Fairness and Housing reported a matter to the Office of the Victorian Information Commissioner after a worker was suspected of using ChatGPT when drafting a protection application report”

This incident serves as a stark reminder that public or insecure LLMs pose substantial risks when handling sensitive data. Here are three key reasons why:

  1. Data Leakage: Prompts sent to public LLMs like ChatGPT leave your network and may be retained by the provider or used to train future models, creating a significant risk of data leakage, whether intentional or unintentional. In the case reported, personal information including a child’s name was entered into ChatGPT, potentially exposing it to unauthorized access.
  2. Model Limitations: LLMs are trained on vast amounts of text from the internet, including inaccurate, biased, or inappropriate content. Using such models for critical tasks like drafting protection application reports can lead to misleading or incorrect outputs, as seen in this case where a doll used for sexual purposes was described as an ‘age-appropriate toy’.
  3. Lack of Control: When using public LLMs, you have little control over how your data is processed and stored. This lack of transparency and control makes it difficult to ensure compliance with data protection regulations or ethical guidelines.

You Can Use GenAI – Just Make Sure It’s Secure

We’re not in any way arguing that you should avoid Generative AI (GenAI) in the workplace; it has huge value. But we are saying: use secure GenAI. That means AI running completely isolated behind your firewall, either in a private cloud or on-premises.

For instance, you can run VisibleThread’s solutions (both VT Docs and VT Writer) 100% securely behind your firewall. This means that sensitive data never leaves your network. By using on-premises or privately hosted LLMs, you maintain full control over your data and can mitigate the risks associated with public models.
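To make the architectural difference concrete, here is a minimal sketch of what “data never leaves your network” looks like in practice: the application calls an LLM endpoint hosted inside the corporate network instead of a public cloud API. This is a generic illustration, not VisibleThread’s product API; the hostname, path, and response shape are assumptions for the sake of the example.

```python
import requests

# Hypothetical endpoint for an LLM hosted inside the corporate network.
# Nothing in this request crosses the public internet, so prompts containing
# sensitive data stay under the organization's own access controls.
INTERNAL_LLM_URL = "https://llm.internal.example.org/v1/generate"

def summarise_report(report_text: str) -> str:
    """Send a drafting prompt to the privately hosted model and return its reply."""
    response = requests.post(
        INTERNAL_LLM_URL,
        json={
            "prompt": f"Summarise the following report:\n{report_text}",
            "max_tokens": 500,
        },
        timeout=60,
    )
    response.raise_for_status()
    # The response shape is an assumption for this sketch; adjust to your model server.
    return response.json()["text"]
```

Because the endpoint, the model weights, and the request logs all sit behind the firewall, the organization can apply the same retention, audit, and access policies it already uses for other internal systems.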

In light of this incident, we urge organizations handling sensitive information to:

  • Use secure, on-premises or privately hosted LLMs for critical tasks.
  • Implement strict access controls and monitoring for any AI tools used within your organization (a rough illustration follows this list).
  • Train staff on data protection guidelines and the risks associated with using public LLMs.
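
As a rough illustration of the second point above, the sketch below shows one way a team might wrap an internal LLM call with a simple pre-prompt check and an audit log. The pattern matching is deliberately naive and the function names are hypothetical; a real deployment would rely on dedicated PII-detection tooling and centralized logging rather than a couple of regular expressions.

```python
import logging
import re

audit_log = logging.getLogger("llm_audit")

# Deliberately naive patterns for illustration only; real deployments should use
# dedicated PII-detection tooling.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),     # US SSN-style identifiers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
]

def guarded_llm_call(user: str, prompt: str, send_to_llm) -> str:
    """Refuse prompts containing obvious identifiers and record every request for audit."""
    if any(pattern.search(prompt) for pattern in PII_PATTERNS):
        audit_log.warning("Blocked prompt from %s: possible personal data detected", user)
        raise ValueError("Prompt appears to contain personal data; please redact it first.")
    audit_log.info("LLM request from %s (%d characters)", user, len(prompt))
    return send_to_llm(prompt)
```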

Let’s learn from this incident and ensure that sensitive data is protected when interacting with LLMs. While AI can greatly enhance our capabilities, it’s crucial to use it responsibly and securely. As a society, we must safeguard the most sensitive information.

For more information on this topic, take a look at our CEO and Founder’s newsletter article on LinkedIn: Why CIOs are repatriating to On-prem/Private Clouds?
