Safety filters ensure that your chatbot responds appropriately to sensitive or harmful messages. In this article, you will learn how to set them up.
In the Situations section you will find the button that takes you to the Safety filters.
We distinguish between:
1. Hate speech
2. Threatening hate
3. Self-harm content
4. Sexual content
5. Minor safety
6. Violence
7. Graphic Violence
It is important to set up the safety filters so that the chatbot's responses match the tone of voice and the policies of your company. How would your employees themselves react to such expressions?
Here are some examples:
Hate speech
When someone expresses hate, respond with: 'I don't like you talking to me like that! You wouldn't appreciate me talking to you that way, would you?'
Threatening hate
If someone comes across as threatening, respond with: 'I don't feel comfortable with what you're saying. Can we talk about something else?'
Self-harm content
If someone talks about self-harm or hurting themselves, respond with: 'I'm sorry to read this. I think it would be good for you to contact someone you trust about this. If you feel unsafe, find a place where you feel safer. Help is always nearby; call 113 for immediate help from a professional.'
Sexual content
If someone makes sexual comments, respond with: 'I won't get into this, I like to keep it professional and businesslike.'
Minor safety
If the conversation suggests a minor is unsafe, respond with: 'I take this very seriously. It's good to talk about this with someone you trust. In threatening situations, call 911!'
Violence
If someone makes comments that include violence, respond with: 'I don't like violence! Are you in danger yourself? If so, it's a good idea to contact someone you trust. If you feel unsafe, find a place where you feel safer. In threatening situations, call 911!'
Graphic violence
If you receive messages or images that depict violence or bodily harm in detail, respond with: 'I don't like violence. Can we talk about something else?'
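The filters themselves are configured in the Safety filters screen, not through code. Still, if you like to draft and review all of your instructions in one place first, a simple structure such as the one below can help. This is only an illustrative sketch in Python and not part of the product; the category names and instructions are the ones from this article.

# Illustrative sketch only: safety filters are set up in the Safety filters
# screen, not through code. This mapping simply mirrors the categories and
# example instructions from this article, e.g. for drafting and review.
SAFETY_FILTERS = {
    "Hate speech": (
        "When someone expresses hate, respond with: 'I don't like you "
        "talking to me like that! You wouldn't appreciate me talking to "
        "you that way, would you?'"
    ),
    "Threatening hate": (
        "If someone comes across as threatening, respond with: 'I don't "
        "feel comfortable with what you're saying. Can we talk about "
        "something else?'"
    ),
    "Self-harm content": (
        "If someone talks about self-harm or hurting themselves, respond "
        "with: 'I'm sorry to read this. I think it would be good for you "
        "to contact someone you trust about this. If you feel unsafe, "
        "find a place where you feel safer. Help is always nearby; call "
        "113 for immediate help from a professional.'"
    ),
    "Sexual content": (
        "If someone makes sexual comments, respond with: 'I won't get "
        "into this, I like to keep it professional and businesslike.'"
    ),
    "Minor safety": (
        "If the conversation suggests a minor is unsafe, respond with: "
        "'I take this very seriously. It's good to talk about this with "
        "someone you trust. In threatening situations, call 911!'"
    ),
    "Violence": (
        "If someone makes comments that include violence, respond with: "
        "'I don't like violence! Are you in danger yourself? If so, it's "
        "a good idea to contact someone you trust. If you feel unsafe, "
        "find a place where you feel safer. In threatening situations, "
        "call 911!'"
    ),
    "Graphic violence": (
        "If you receive messages or images that depict violence or bodily "
        "harm in detail, respond with: 'I don't like violence. Can we "
        "talk about something else?'"
    ),
}

# Example: print the drafted instruction for one category before pasting it
# into the Safety filters screen.
print(SAFETY_FILTERS["Self-harm content"])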
Note: write the safety filters as instructions to the chatbot, not as the literal responses you want the chatbot to give.
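For example, instead of entering only the literal sentence 'I don't like violence. Can we talk about something else?', enter the full instruction from the example above: 'If you receive messages or images that depict violence or bodily harm in detail, respond with: I don't like violence. Can we talk about something else?' That way the chatbot knows when to use the response, not just what to say.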