
A framework to enhance the safety of text-to-image generation networks

Overview of Latent Guard. First, the team compiled a dataset of safe and unsafe prompts centered around blacklisted concepts (left). They then leveraged pre-trained textual encoders to extract features and map them to a learned latent space with their Embedding Mapping Layer (center). Only the Embedding Mapping Layer is trained, while all other parameters are kept frozen. The team trained it by imposing a contrastive loss on the extracted embeddings, bringing the embeddings of unsafe prompts/concepts closer together while separating them from safe ones (right). Credit: Liu et al.

The emergence of machine learning algorithms that can generate text and images following human users' instructions has opened new possibilities for the low-cost creation of specific content. One class of these algorithms that is radically transforming creative processes worldwide is so-called text-to-image (T2I) generative networks.

T2I artificial intelligence (AI) tools, such as DALL-E 3 and Stable Diffusion, are deep learning-based models that can generate realistic images aligned with textual descriptions or user prompts. While these AI tools have become increasingly widespread, their misuse poses significant risks, ranging from privacy breaches to fueling misinformation or image manipulation.

Researchers at the Hong Kong University of Science and Technology and the University of Oxford recently developed Latent Guard, a framework designed to improve the safety of T2I generative networks. Their framework, outlined in a paper pre-published on arXiv, can prevent the generation of undesirable or unethical content by processing user prompts and detecting the presence of any concepts included in an updatable blacklist.

“With the ability to generate high-quality images, T2I models can be exploited for creating inappropriate content,” Runtao Liu, Ashkan Khakzar and their colleagues wrote in their paper.

“To prevent misuse, existing safety measures are either based on text blacklists, which can be easily circumvented, or harmful content classification, requiring large datasets for training and offering low flexibility. Hence, we propose Latent Guard, a framework designed to improve safety measures in T2I generation.”

Latent Guard, the framework developed by Liu, Khakzar and their colleagues, draws inspiration from earlier blacklist-based approaches to boosting the safety of T2I generative networks. These approaches essentially consist of creating lists of 'forbidden' words that cannot be included in user prompts, thus limiting the unethical use of these networks.

The limitation of most existing blacklist-based methods is that malicious users can circumvent them by re-phrasing their prompts to avoid blacklisted words. This means they may ultimately still be able to produce the offensive or unethical content they wish to create and potentially disseminate.

To overcome this limitation, the Latent Guard framework looks beyond the exact wording of input texts or user prompts, extracting features from the text and mapping them onto a previously learned latent space. This strengthens its ability to detect undesirable prompts and block image generation for them.

“Inspired by blacklist-based approaches, Latent Guard learns a latent space on top of the T2I model’s text encoder, where it is possible to check the presence of harmful concepts in the input text embeddings,” Liu, Khakzar and their colleagues wrote.
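To make this concrete, the check the researchers describe can be pictured as a small trainable projection head (their Embedding Mapping Layer) sitting on top of the T2I model's frozen text encoder, with a prompt flagged when its mapped embedding lands close to any blacklisted concept. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, not the authors' implementation: the layer sizes, the network architecture and the decision threshold are all assumptions made for demonstration.

```python
# Minimal sketch (not the authors' code) of a latent-space blacklist check.
# Feature vectors are assumed to come from the T2I model's frozen text encoder.
import torch
import torch.nn as nn


class EmbeddingMappingLayer(nn.Module):
    """Trainable projection from text-encoder features to the learned latent space."""

    def __init__(self, enc_dim: int = 768, latent_dim: int = 256):  # assumed sizes
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(enc_dim, latent_dim),
            nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Unit-normalize so dot products below behave as cosine similarities.
        return nn.functional.normalize(self.proj(feats), dim=-1)


@torch.no_grad()
def is_unsafe(prompt_feats: torch.Tensor, concept_feats: torch.Tensor,
              mapper: EmbeddingMappingLayer, threshold: float = 0.5) -> bool:
    """Flag a prompt whose latent embedding is close to any blacklisted concept."""
    z_prompt = mapper(prompt_feats)      # shape (1, latent_dim)
    z_concepts = mapper(concept_feats)   # shape (K, latent_dim), K blacklist entries
    sims = z_prompt @ z_concepts.T      # cosine similarity to each concept
    return bool((sims > threshold).any())
```

Because the comparison happens on embeddings rather than surface wording, a rephrased prompt that still expresses a blacklisted concept can land near that concept in the latent space and be caught, which is precisely what word-level blacklists miss.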

“Our proposed framework is composed of a data generation pipeline specific to the task using large language models, ad-hoc architectural components, and a contrastive learning strategy to benefit from the generated data.”
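The contrastive learning strategy mentioned in the quote can be illustrated with a standard InfoNCE-style objective that pulls each unsafe prompt's embedding toward the embedding of the blacklisted concept it contains, while pushing it away from safe prompts. The snippet below is a hedged stand-in under assumed batch shapes and a made-up temperature value, not necessarily the exact loss used in the paper.

```python
# Hedged sketch of a contrastive objective in the spirit described above
# (a generic InfoNCE-style stand-in, not necessarily the paper's exact loss).
import torch
import torch.nn.functional as F


def contrastive_loss(z_unsafe: torch.Tensor, z_concept: torch.Tensor,
                     z_safe: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """
    z_unsafe:  (B, D) latent embeddings of unsafe prompts
    z_concept: (B, D) embeddings of the blacklisted concept each prompt contains
    z_safe:    (B, D) embeddings of safe prompts, used as negatives
    All inputs are assumed L2-normalized by the mapping layer.
    """
    pos = (z_unsafe * z_concept).sum(dim=-1, keepdim=True)  # (B, 1) positive pairs
    neg = z_unsafe @ z_safe.T                               # (B, B) negative pairs
    logits = torch.cat([pos, neg], dim=1) / temperature     # positive sits in column 0
    labels = torch.zeros(z_unsafe.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```

Minimizing a loss of this kind draws unsafe prompts and their underlying concepts together in the latent space while separating them from safe prompts, which is what lets the similarity check generalize beyond exact blacklisted wording.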

Liu, Khakzar and their collaborators evaluated their approach in a series of experiments, using three different datasets and comparing its performance to that of four other baseline T2I safety methods. One of the datasets they used, namely the CoPro dataset, was developed by their team specifically for this study and contained a total of 176,516 safe and unsafe/unethical textual prompts.

“Our experiments demonstrate that our approach allows for a robust detection of unsafe prompts in many scenarios and offers good generalization performance across different datasets and concepts,” the researchers wrote.

Preliminary results gathered by Liu, Khakzar and their colleagues suggest that Latent Guard is a very promising approach to boosting the safety of T2I generation networks, reducing the risk that these networks will be used inappropriately. The team plans to soon publish both their framework's underlying code and the CoPro dataset on GitHub, allowing other developers and research teams to experiment with their approach.

More information:
Runtao Liu et al, Latent Guard: a Safety Framework for Text-to-image Generation, arXiv (2024). DOI: 10.48550/arxiv.2404.08031

Journal information:
arXiv


© 2024 Science X Network

Citation:
A framework to enhance the safety of text-to-image generation networks (2024, April 30)
retrieved 30 April 2024
from https://techxplore.com/news/2024-04-framework-safety-text-image-generation.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


