Content Safety

Hosted Llama Guard classifier — flags text containing violence, sexual content, hate speech, self-harm, harassment, and 9 other categories. Useful for moderators, parents, and anyone shipping user-generated content. Uses 1 credit per run.

About Content Safety

Content Safety runs text through a hosted Llama Guard classifier that flags violence, sexual content, hate speech, self-harm, harassment, and nine other categories. Reach for it when you moderate user-generated content, screen submissions, or want a programmatic safety check before publishing. It's a Pro tool using 1 credit per run.

Category: privacy
Input: text/plain
Output: application/json
Cost: Credit-metered
Memory: low

Privacy: Content Safety does not run on your device. Classification happens on a hosted model, so the text you submit is sent to a server for the request. It is processed to return the verdict and not used for anything else.

Common uses

Screen forum posts or comments for hate speech and harassment before they go live
Flag self-harm or violence references in support-chat transcripts so a human can follow up
Pre-filter user-submitted reviews for sexual or abusive content on a marketplace
Gate a community submission form, blocking entries the classifier marks unsafe
Help a parent quickly assess whether a block of text a child encountered is concerning
Add an automated moderation step to a content pipeline that returns a machine-readable verdict

Frequently asked questions

What does it return?

JSON: whether the text is flagged and which of the safety categories it matched (violence, sexual content, hate, self-harm, harassment, and others).
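As a rough illustration of consuming that verdict, the sketch below parses a sample response and turns it into a one-line summary. The field names `flagged` and `categories` are assumptions made for this example, not the tool's documented schema, so check a real response before relying on them.

```python
import json

# Hypothetical response body. The keys "flagged" and "categories" are
# assumptions for illustration, not the tool's documented schema.
SAMPLE_RESPONSE = '{"flagged": true, "categories": ["violence", "harassment"]}'


def summarize_verdict(raw: str) -> str:
    """Turn a JSON verdict into a one-line summary for a moderation log."""
    verdict = json.loads(raw)
    if not verdict.get("flagged", False):
        return "clean"
    categories = verdict.get("categories", [])
    if not categories:
        return "flagged: (no categories listed)"
    return "flagged: " + ", ".join(categories)


print(summarize_verdict(SAMPLE_RESPONSE))
```

A summary like this can feed a review queue, while the raw category list stays available for per-category rules.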

Which model powers it?

A hosted Llama Guard classifier covering 14 safety categories. It's a Pro tool that uses 1 credit per run.

Is my text sent to a server?

Yes. Classification happens on a hosted model, so the text is sent for the request. It's processed to return the verdict and not used for anything else.

Does it edit or redact the content?

No. It only classifies and reports categories. To remove personal data or cover content, use a dedicated redaction tool.

Can it handle longer passages?

Yes, it accepts plain text and classifies the passage as a whole. For very long documents, splitting into sections gives more localized flags.
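One possible way to split a long document is to group paragraphs into sections under a size limit and classify each section separately. In this sketch the `classify` callable is a hypothetical stand-in for however you call the tool, and the 2000-character default is an arbitrary choice, not a documented limit.

```python
from typing import Callable, List, Tuple


def split_into_sections(text: str, max_chars: int = 2000) -> List[str]:
    """Group blank-line-separated paragraphs into sections.

    A section is closed once adding the next paragraph would exceed
    max_chars. A single paragraph longer than max_chars is kept whole.
    """
    sections: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if current and len(candidate) > max_chars:
            sections.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections


def flag_sections(
    text: str,
    classify: Callable[[str], bool],  # hypothetical: returns True if flagged
    max_chars: int = 2000,
) -> List[Tuple[int, str]]:
    """Return (index, section) pairs for the sections the classifier flags."""
    sections = split_into_sections(text, max_chars)
    return [(i, s) for i, s in enumerate(sections) if classify(s)]
```

Each section is a separate run, so a document split into five sections would use five credits at 1 credit per run.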

Keywords

safety
moderation
classify
guard
pii
pro
llm
