GitHub - NVIDIA/garak: the LLM vulnerability scanner

The article introduces `garak`, a free tool developed by NVIDIA for scanning vulnerabilities in Large Language Models (LLMs). `garak` checks if an LLM can fail in ways that are not desired, probing for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and other weaknesses. It combines static, dynamic, and adaptive probes to explore these vulnerabilities. The tool is designed to work similarly to `nmap` for LLMs, focusing on ways to make an LLM or dialog system fail. It supports a wide range of models and can be installed using pip or cloned from source.

The article also provides detailed instructions on how to use `garak`, including how to specify a generator, run probes, and read the results. It also provides examples of probing different models for vulnerabilities. Additionally, it explains how to develop your own plugin for `garak` and how to cite the tool in research. The tool is open-source and the developers encourage contributions to add functionality and support applications.

Key takeaways:

`garak` is a vulnerability scanner for Large Language Models (LLMs) developed by NVIDIA. It checks for potential weaknesses such as hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and more.
The tool uses a combination of static, dynamic, and adaptive probes to identify potential vulnerabilities in an LLM or dialog system. It is free to use and the developers are open to adding more functionality to support applications.
`garak` supports a wide range of models and can be installed via pip or cloned from the source. It also provides detailed logging, including a log file, a report of the current run, and a hit log detailing attempts that yielded a vulnerability.
Developers can create their own plugins for `garak` by inheriting from one of the base classes and overriding as little as possible. The tool also provides a list of all the plugins of the type you're writing, with `--list_probes`, `--list_detectors`, or `--list_generators`.

GitHub - NVIDIA/garak: the LLM vulnerability scanner

Key takeaways:

Comments (0)

Newsletter