Document Sanitizer
Clean and optimize text for token-efficient AI prompts
Sanitization Options
Whitespace
Multiple spaces to single
3+ newlines to 2
Multiple newlines to single
Remove leading/trailing spaces
Delete blank lines
Tabs to spaces, fix line endings
Content
Strip all HTML elements
Strip http/https links
Strip email addresses
Keep only alphanumeric
Strip .,!? and similar
Strip all digits
Convert to lowercase
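As a rough illustration of how several of the options above could combine, here is a minimal Python sketch. It is not the tool's actual implementation: the function name `sanitize`, its keyword flags, and the regular expressions are all assumptions made for this example. The regex-based HTML stripping in particular is a simplification that will not handle every markup edge case.

```python
import re

def sanitize(text: str, *, strip_html: bool = True, strip_urls: bool = True,
             strip_emails: bool = True, lowercase: bool = False) -> str:
    """Hypothetical sanitizer covering a subset of the listed options."""
    # Fix line endings and convert tabs to spaces.
    text = text.replace("\r\n", "\n").replace("\r", "\n").replace("\t", " ")

    if strip_html:
        # Crude tag removal; an assumption, not a full HTML parser.
        text = re.sub(r"<[^>]+>", "", text)
    if strip_urls:
        # Remove http/https links up to the next whitespace.
        text = re.sub(r"https?://\S+", "", text)
    if strip_emails:
        # Simplified email pattern, good enough for a sketch.
        text = re.sub(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b", "", text)
    if lowercase:
        text = text.lower()

    # Multiple spaces to single.
    text = re.sub(r" {2,}", " ", text)
    # Remove leading/trailing spaces on every line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # 3+ newlines to 2.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

URLs are stripped before email addresses so that a link containing an `@` is removed as a whole rather than partially matched by the email pattern.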
Input Text
Sanitized Output
Document Sanitizer cleans and optimizes text for token-efficient AI prompts. It removes extra whitespace, special characters, HTML tags, URLs, and other elements that consume tokens without adding value. Token estimates are approximate and assume roughly 4 characters per token for English text.
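The token estimate can be sketched with the 4-characters-per-token rule of thumb mentioned above. The function names and the choice to round up are assumptions for illustration. Real tokenizers vary by model and language, so treat the result as a ballpark figure only.

```python
import math

# Rule of thumb from the description: about 4 characters per English token.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Approximate token count; rounding up is an assumption of this sketch."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def tokens_saved(original: str, sanitized: str) -> int:
    """Estimated number of tokens removed by sanitization."""
    return estimate_tokens(original) - estimate_tokens(sanitized)
```

For example, shrinking a 40-character input to 20 characters would be reported as an estimated saving of 5 tokens under this heuristic.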