ChatGPT has definitely raised the public consciousness around generative AI and large language models (LLMs). This piece explains the top risks of these tools and recommends eight methods to mitigate them prior to allowing their usage within your organization.
We see four categories of business risk inherent in using generative AI, all of which must be considered and mitigated when using the tools.
The first significant risk is that users will provide sensitive data to AI tools like ChatGPT, which, in turn, will store and use that data in ways that may be unacceptable to the organization. For example, tools like GitHub Copilot must analyze your (potentially sensitive) source code to provide programming suggestions. To generate a response, the generative AI tool must receive and process data. But how is that data stored? Can it be deleted by the user on demand, possibly to comply with regulations like GDPR? Will data submitted to the model within prompts be used to retrain later iterations of the model?
As of the writing of this piece, OpenAI does not use data submitted through ChatGPT’s enterprise plans to improve its models. But data submitted through OpenAI’s consumer plans, including the paid ChatGPT Plus offering, is available for use in retraining the model. This means sensitive data sent to the consumer versions could be regurgitated to users outside the organization at some point. This isn’t happening yet with ChatGPT, because the models are not being updated in real time. However, nothing legally prevents OpenAI from using the data in this manner.
A common use of generative AI tools is to help create content, such as code, images or written copy. However, there are significant unresolved legal questions about the ownership and use of the output:
Several IP creators have sued generative AI companies, alleging inappropriate use of creator data to train the AI models. For example, some developers sued OpenAI and GitHub over the use of their data in coding models (including GitHub Copilot and OpenAI Codex). In addition, Getty Images sued Stability AI, the company behind the Stable Diffusion image generation tool, for alleged license violations, claiming Stability used Getty’s image set to train its model.
While OpenAI had not been sued over ChatGPT as of this article’s original publication, a suit remains possible: OpenAI has not revealed the sources of its training data, and it is unclear which parties might have standing to sue. The only major public generative AI tool whose vendor asserts ownership of, or a license to, its full training data is Adobe’s Firefly image generation product.
The U.S. Copyright Office issued guidance in March 2023 saying works generated by AI cannot be granted copyright. The guidance indicates that while a prompt (the data submitted as an input) could be granted copyright, the output could not be.
AI “hallucinations” are defined as generative AI outputs that are mathematically sound according to their model but provide factually inaccurate data. There are numerous public examples of LLM hallucinations that would create liability for any organization relying on them. Hallucinations in AI models are not going away. Because the models generate statistically likely text rather than look up facts, hallucinations are a feature, not a bug.
Bias in the training data of generative AI and LLMs may also affect their output. For instance, consider the prompt “to boldly go.” If you’re a Star Trek fan, you “know” what should come next (“where no man has gone before”). But if your model wasn’t trained on any Star Trek data, you would likely receive a much different answer. Conversely, suppose you aren’t looking for anything to do with Star Trek. Because of the TV show’s popularity, it may still account for the majority of entries in the training data set that contain the trigram “to boldly go.” You’ll end up getting responses drawn from science fiction, potentially without knowing it. Most users don’t understand what’s in the training data for the LLMs they use.
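The “to boldly go” effect can be illustrated with a toy model. The sketch below is not how a production LLM works; it is a simple counting model, written for illustration, that predicts the word most often seen after a three-word context. The corpus, the function names (`train_trigram_model`, `predict_next`) and the 3-to-1 ratio of Star Trek sentences are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train_trigram_model(corpus):
    """Count which word follows each three-word context in the corpus."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for i in range(len(words) - 3):
            context = tuple(words[i:i + 3])
            model[context][words[i + 3]] += 1
    return model


def predict_next(model, prompt):
    """Return the most frequent word seen after the prompt's last three words."""
    context = tuple(prompt.lower().split()[-3:])
    followers = model.get(context)
    if not followers:
        return None  # this context never appeared in the training data
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus: the Star Trek phrasing appears three times,
# an unrelated business use of "to boldly go" appears once.
corpus = [
    "to boldly go where no man has gone before",
    "to boldly go where no man has gone before",
    "to boldly go where no one has gone before",
    "we chose to boldly go ahead with the product launch",
]

# Trained on everything, the popular phrasing wins.
full_model = train_trigram_model(corpus)
print(predict_next(full_model, "Our team decided to boldly go"))  # where

# Trained without any Star Trek data, the same prompt gets a different answer.
no_trek_model = train_trigram_model(corpus[3:])
print(predict_next(no_trek_model, "Our team decided to boldly go"))  # ahead
```

The user’s prompt is about a business decision in both cases. Only the makeup of the training data changes, and the user sees the output, not the counts behind it.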
To reduce the likelihood of generative AI misuse and security risks adversely impacting your organization:
Although reasonable efforts will be made to ensure the completeness and accuracy of the information contained in our blog posts, no liability can be accepted by IANS or our Faculty members for the results of any actions taken by individuals or firms in
connection with such information, opinions, or advice.
September 26, 2023
By IANS Faculty