Fiddler Guardrails Integration#
Fiddler Guardrails uses Fiddler Trust Models in a specialized low-latency, high-throughput configuration. Guardrails can protect Large Language Model (LLM) applications against user threats, such as prompt injection or harmful and inappropriate content, as well as against LLM hallucinations.
Currently, only the Fiddler Trust Models (Faithfulness and Safety) - Fiddler’s in-house, purpose-built small language models (SLMs) - are available for guardrail use. Future model releases, updates, and improvements will also be made available for guardrail use.
Setup#
Ensure that you have access to a valid Fiddler environment. To obtain one, please contact us.
Create a new Fiddler environment key and set the `FIDDLER_API_KEY` environment variable to this key to authenticate with the Fiddler service.
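For example, assuming you have already created an environment key in your Fiddler deployment, you can set the variable from Python before loading your rails (the value below is a placeholder):

```python
import os

# Placeholder; replace with the environment key created in your Fiddler environment.
os.environ["FIDDLER_API_KEY"] = "<your-fiddler-environment-key>"
```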
Update your `config.yml` file to include the following settings:
rails:
  config:
    fiddler:
      fiddler_endpoint: https://testfiddler.ai # Replace this with your fiddler environment
      safety_threshold: 0.2 # Any value greater than this threshold will trigger a violation
      faithfulness_threshold: 0.3 # Any value less than this threshold will trigger a violation
  input:
    flows:
      - fiddler user safety
  output:
    flows:
      - fiddler bot safety
      - fiddler bot faithfulness
Usage#
Once configured, the Fiddler Guardrails integration will automatically:
- Detect unsafe, offensive, harmful, or jailbreaking inputs into the LLM
- Detect unsafe, offensive, or harmful outputs from the LLM
- Detect potential hallucinated outputs from the LLM
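As a minimal sketch of how the guardrails are exercised once configured: the snippet below assumes a `./config` directory containing the `config.yml` shown above (plus whatever `models` section your application already uses) and that `FIDDLER_API_KEY` is set; the directory path and prompt are illustrative only.

```python
import os

from nemoguardrails import LLMRails, RailsConfig

# Assumes FIDDLER_API_KEY is already set and ./config holds the config.yml shown above.
assert os.environ.get("FIDDLER_API_KEY"), "Set FIDDLER_API_KEY before running the rails."

# Load the rails configuration; the Fiddler input and output flows run automatically
# around each generation call.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

response = rails.generate(messages=[{"role": "user", "content": "Hello there!"}])
print(response["content"])
```

Typically, when one of the Fiddler flows flags a violation, the rails return a refusal message instead of the raw LLM response.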
Customization#
You can configure the thresholds for both safety and hallucination detection using the `safety_threshold` and `faithfulness_threshold` parameters: safety scores above `safety_threshold` trigger a violation, while faithfulness scores below `faithfulness_threshold` are flagged as potential hallucinations.
Error Handling#
Fiddler Guardrails will not block inputs in the event of any API failure.
Notes#
For more information about Fiddler Guardrails, please visit the Fiddler Guardrails documentation.