Document Sanitizer

Clean and optimize text for token-efficient AI prompts

Sanitization Options

Whitespace

Multiple spaces to single

3+ newlines to 2

Multiple newlines to single

Remove leading/trailing spaces

Delete blank lines

Tabs to spaces, fix line endings
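
As a rough illustration, the whitespace options above could be sketched in Python as follows. This is not the tool's actual code: the function name sanitize_whitespace, its keyword flags, the 4-space tab width, and the order in which the rules run are all assumptions made for the example.

```python
import re


def sanitize_whitespace(
    text: str,
    *,
    normalize: bool = True,         # tabs to spaces, fix line endings
    trim_lines: bool = True,        # remove leading/trailing spaces
    collapse_spaces: bool = True,   # multiple spaces to single
    drop_blank_lines: bool = False, # delete blank lines
    single_newlines: bool = False,  # multiple newlines to single
    limit_newlines: bool = True,    # 3+ newlines to 2
    tab_width: int = 4,             # assumed width; the page does not state one
) -> str:
    """Hypothetical sketch of the whitespace options (not the tool's code)."""
    if normalize:
        # Normalize line endings first so later rules only deal with "\n".
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        text = text.replace("\t", " " * tab_width)
    if trim_lines:
        text = "\n".join(line.strip(" ") for line in text.split("\n"))
    if collapse_spaces:
        text = re.sub(r" {2,}", " ", text)
    if drop_blank_lines:
        text = "\n".join(line for line in text.split("\n") if line.strip())
    if single_newlines:
        text = re.sub(r"\n{2,}", "\n", text)
    elif limit_newlines:
        text = re.sub(r"\n{3,}", "\n\n", text)
    return text


if __name__ == "__main__":
    print(repr(sanitize_whitespace("a  b\t\r\n\r\n\r\n\r\nc  ")))
```

In this sketch the "multiple newlines to single" option takes precedence over "3+ newlines to 2" when both are enabled, since the stricter rule subsumes the looser one. How the real tool resolves that overlap is not stated.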

Content

Strip all HTML elements

Strip http/https links

Strip email addresses

Keep only alphanumeric characters

Strip punctuation (. , ! ? and similar)

Strip all digits

Convert to lowercase
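
The content options above can be approximated with a few regular expressions. The sketch below is illustrative only: sanitize_content and its flags are hypothetical names, and the patterns are deliberately simple. The tag pattern, for instance, is not a real HTML parser and would also remove text such as "a <b and c> d".

```python
import re
import string


def sanitize_content(
    text: str,
    *,
    strip_html: bool = True,     # strip all HTML elements
    strip_urls: bool = True,     # strip http/https links
    strip_emails: bool = True,   # strip email addresses
    alnum_only: bool = False,    # keep only alphanumeric (whitespace kept here)
    strip_punct: bool = False,   # strip . , ! ? and similar
    strip_digits: bool = False,  # strip all digits
    lowercase: bool = False,     # convert to lowercase
) -> str:
    """Hypothetical sketch of the content options (not the tool's code)."""
    if strip_html:
        # Naive tag removal: anything between "<" and the next ">".
        text = re.sub(r"<[^>]+>", "", text)
    if strip_urls:
        text = re.sub(r"https?://\S+", "", text)
    if strip_emails:
        text = re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "", text)
    if strip_punct:
        # ASCII punctuation only; other scripts would need a broader rule.
        text = text.translate(str.maketrans("", "", string.punctuation))
    if strip_digits:
        text = re.sub(r"\d", "", text)
    if alnum_only:
        # Assumption: whitespace is preserved so that words stay separated.
        text = re.sub(r"[^A-Za-z0-9\s]", "", text)
    if lowercase:
        text = text.lower()
    return text


if __name__ == "__main__":
    sample = "<p>Hi Bob</p> see https://example.com/x or mail bob@example.com now"
    print(repr(sanitize_content(sample)))
```

Removing links and addresses leaves gaps behind (double spaces in the sample), so a whitespace pass would normally follow the content pass.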


Document Sanitizer cleans and optimizes text for token-efficient AI prompts. It removes unnecessary whitespace, special characters, HTML tags, URLs, and other elements that consume tokens without adding value. Token estimates are approximate and assume roughly 4 characters per token for English text.
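
The 4-characters-per-token heuristic is simple to reproduce. The sketch below uses hypothetical helper names (estimate_tokens, tokens_saved) and rounds up, which is an assumption. A real tokenizer will give different counts, especially for code or non-English text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: character count divided by ~4, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def tokens_saved(original: str, sanitized: str) -> int:
    """Estimated tokens removed by sanitizing (same heuristic on both sides)."""
    return estimate_tokens(original) - estimate_tokens(sanitized)


if __name__ == "__main__":
    before = "Hello,    world!\n\n\n\nThis   is   padded."
    after = "Hello, world!\n\nThis is padded."
    print(estimate_tokens(before), estimate_tokens(after), tokens_saved(before, after))
```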