UtilsDaily

HTML Encoder / Decoder

Escape HTML special characters to HTML entities, or decode entities back to plain text. Instant, browser-side processing.


What is HTML Encoding?

HTML encoding (also called HTML escaping) converts characters that have special meaning in HTML into their safe equivalent representations called HTML entities. For example, the less-than sign < has special meaning in HTML because it begins a tag. If you want to display a literal < character on a web page, you must write it as &lt; in your HTML.

Without HTML encoding, any user-supplied content containing HTML characters can accidentally (or maliciously) break your page structure or execute scripts. The latter is a serious security vulnerability known as Cross-Site Scripting (XSS).

The Five Essential HTML Characters to Encode

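These five characters must always be encoded when inserting dynamic content into HTML. The named entity &apos; is not defined in HTML 4, so &#39; is the more portable choice for the apostrophe.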

| Character | Named Entity | Numeric (Decimal) | Why It Matters |
| --- | --- | --- | --- |
| < | &lt; | &#60; | Opens HTML tags; can inject arbitrary markup |
| > | &gt; | &#62; | Closes HTML tags; breaks structure |
| & | &amp; | &#38; | Starts entity references; causes rendering errors |
| " | &quot; | &#34; | Breaks attribute values; can inject attributes |
| ' | &apos; | &#39; | Breaks single-quoted attribute values |
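
A minimal sketch of an encoder for these five characters, in JavaScript. The name escapeHtml and the lookup table are illustrative assumptions, not a standard API; the sketch emits the numeric form &#39; for the apostrophe.

```javascript
// Hypothetical lookup table: one entity per special character.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric form of the apostrophe
};

// Hypothetical helper: escapes the five characters in a single pass, so an
// ampersand produced by one replacement is never escaped a second time.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml(`<a href="x">Tom & Jerry's</a>`));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

The encoded result is suitable for element content and quoted attribute values; other contexts (URLs, inline scripts, CSS) need their own escaping rules.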

HTML Encoding vs URL Encoding vs Base64

These are three different encoding schemes used in different contexts:

HTML encoding is for inserting text safely into HTML documents. It replaces characters that would be interpreted as HTML markup. Use it when putting user data into HTML.

URL encoding (percent encoding) is for encoding characters in URLs. Spaces become %20, & becomes %26. Use it when building query strings or URL parameters.

Base64 encoding is for representing binary data as ASCII text. It makes data safe to transmit in text-based protocols. Use it for embedding images in HTML/CSS, encoding file attachments in emails, or transmitting binary data in JSON.

A common mistake is applying the wrong encoding for the context: for example, HTML-encoding a URL parameter instead of URL-encoding it, which breaks the URL.
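
The difference between the three schemes can be sketched by encoding one string three ways. The escapeHtml helper and the example.com URL below are assumptions made for illustration; encodeURIComponent, btoa, and URL are standard JavaScript globals.

```javascript
// Compact HTML escaper for this demo (hypothetical name).
const escapeHtml = (s) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;").replace(/'/g, "&#39;");

const raw = "Tom & Jerry";

const forHtml = escapeHtml(raw);        // "Tom &amp; Jerry"
const forUrl = encodeURIComponent(raw); // "Tom%20%26%20Jerry"
const forBase64 = btoa(raw);            // btoa only accepts characters up to U+00FF

// Wrong encoding for the context: the "&" inside "&amp;" still acts as a
// query-string separator, so the parameter value is cut short.
const broken = "https://example.com/search?q=" + forHtml;
const correct = "https://example.com/search?q=" + forUrl;

console.log(new URL(broken).searchParams.get("q") === raw);  // false
console.log(new URL(correct).searchParams.get("q") === raw); // true
```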

XSS Prevention: Why HTML Encoding Matters for Security

Cross-Site Scripting (XSS) is a long-standing entry in the OWASP Top 10 list of web application security risks: it had its own category through the 2017 edition and was folded into the Injection category in 2021. It occurs when attacker-controlled data is inserted into a web page without proper encoding, allowing malicious scripts to execute in a victim's browser.

Example of a vulnerable code pattern:

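// UNSAFE: never do this.
// 'greeting' is a placeholder element id.
document.getElementById('greeting').innerHTML = 'Hello, ' + userInput;

// If userInput = '<img src=x onerror="stealCookies()">'
// the onerror handler runs in the browser of anyone who views the page.
// (A <script> tag inserted via innerHTML is not executed, but event-handler
// attributes such as onerror are.)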

The safe pattern: Always HTML-encode user-supplied data before inserting it into HTML. Most server-side frameworks (Django, Rails, Laravel, etc.) do this automatically by default, but custom string concatenation, innerHTML assignments, and template literals bypass this protection.
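
One way to keep template literals from bypassing encoding is a tagged template that escapes every interpolated value. This is a sketch under assumptions: the names escapeHtml and html are hypothetical, and the approach only covers values placed in element content or quoted attribute values.

```javascript
// Hypothetical helper: escapes the five special HTML characters.
function escapeHtml(value) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(value).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical tag function: the literal parts are markup written by the
// developer; every interpolated value is treated as untrusted text.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

const userInput = '<img src=x onerror="stealCookies()">';
const markup = html`<p>Hello, ${userInput}</p>`;
console.log(markup);
// <p>Hello, &lt;img src=x onerror=&quot;stealCookies()&quot;&gt;</p>
```

In a browser, a string built this way could be assigned to innerHTML. When only plain text is being inserted, assigning to textContent avoids the need for encoding altogether.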

Common HTML Entities Reference

| Character | Entity | Character | Entity |
| --- | --- | --- | --- |
| © | &copy; | ® | &reg; |
| ™ | &trade; | € | &euro; |
| £ | &pound; | ¥ | &yen; |
| → | &rarr; | ← | &larr; |
| — | &mdash; | – | &ndash; |
| (non-break space) | &nbsp; | ° | &deg; |
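
Named entities exist only for a fixed set of characters, whereas any character can be written as a numeric reference. The sketch below converts non-ASCII characters to decimal references; the function name toNumericEntities is a hypothetical choice, and the input uses \u escapes so the example does not depend on the file's encoding.

```javascript
// Hypothetical helper: replaces every non-ASCII character with a decimal
// numeric character reference, leaving ASCII characters unchanged.
function toNumericEntities(text) {
  // Array.from iterates by code point, so characters outside the Basic
  // Multilingual Plane are handled as single units.
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 127 ? `&#${codePoint};` : ch;
  }).join("");
}

// Copyright sign, right arrow, and euro sign.
console.log(toNumericEntities("\u00A9 2024 \u2192 \u20AC5"));
// &#169; 2024 &#8594; &#8364;5
```

This does not escape the five special characters from the earlier section; it only addresses characters outside ASCII.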

Frequently Asked Questions

Do I need to HTML-encode inside JavaScript strings?

No. HTML encoding is for HTML context only. If you are building a JavaScript string (inside a script tag or .js file), you need JavaScript string escaping (backslash escapes like \n, \", \'). However, if you are inserting a JavaScript variable into HTML via DOM manipulation, always HTML-encode the value when using innerHTML. Better yet, use textContent, which never interprets HTML at all.
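
For the JavaScript-string case, one common approach is to let JSON.stringify produce the quotes and backslash escapes. The helper name toScriptLiteral is hypothetical, and the extra replacement of "<" is an added precaution for inline script blocks rather than something JSON.stringify does itself. The sketch assumes the value is a string or another JSON-serialisable value.

```javascript
// Hypothetical helper: serialises a value as a JavaScript literal intended
// for use inside an inline <script> block.
function toScriptLiteral(value) {
  // JSON.stringify adds the surrounding quotes and escapes ", \ and
  // control characters such as newlines.
  // Escaping "<" as \u003c keeps a "</script>" inside the data from
  // closing the script element early.
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const userInput = 'He said "hi"\n</script>';
console.log(toScriptLiteral(userInput));
// "He said \"hi\"\n\u003c/script>"
```

The result is still valid JSON, so JSON.parse recovers the original string unchanged.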

Will HTML encoding affect how my page looks to users?

No. Browsers decode HTML entities before rendering, so users see the original characters. A page containing &lt; displays as < to the reader. HTML encoding affects the source code only, not the visible output.
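
The decoding step can be approximated in a few lines. This is a simplified sketch, not how browsers implement it: the decodeHtml name is hypothetical, and it recognises only the five named entities covered on this page plus decimal and hexadecimal numeric references, whereas browsers know many more named entities.

```javascript
// Hypothetical table of the five named entities handled by this sketch.
const NAMED = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

// Hypothetical decoder. A single regex pass means text produced by one
// replacement is never decoded again ("&amp;lt;" becomes "&lt;", not "<").
function decodeHtml(text) {
  return text.replace(
    /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|(lt|gt|amp|quot|apos));/g,
    (match, dec, hex, name) => {
      if (name) return NAMED[name];
      const codePoint = dec ? parseInt(dec, 10) : parseInt(hex, 16);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
  );
}

console.log(decodeHtml("1 &lt; 2 &amp;&amp; 3 &#62; 2"));
// 1 < 2 && 3 > 2
```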

Do modern frameworks handle HTML encoding automatically?

Most do, yes. React, Vue, Angular, Django templates, Rails ERB, and Laravel Blade all HTML-encode output by default. The dangerous scenario is opting out of this protection (React's dangerouslySetInnerHTML, Django's |safe filter, Rails' html_safe) without careful validation. When you bypass automatic encoding, you must manually ensure the data is safe.

What is double encoding and why is it a problem?

Double encoding happens when you HTML-encode a string that is already HTML-encoded. The ampersand in &lt; gets encoded to &amp;lt;, which displays as the literal text "&lt;" instead of "<". Always encode raw text exactly once. If you receive HTML from an API or CMS, check whether it is already encoded before encoding it again.
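
Double encoding is easy to reproduce, and one way to guard against it is to mark already-encoded values with a wrapper type, similar in spirit to the "safe string" types some frameworks use. Everything named below (escapeHtml, SafeHtml, encodeOnce) is a hypothetical illustration rather than an existing API.

```javascript
// Hypothetical escaper for the five special HTML characters.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical wrapper that marks a string as already HTML-encoded.
class SafeHtml {
  constructor(markup) {
    this.markup = markup;
  }
  toString() {
    return this.markup;
  }
}

// Encodes raw text, but returns an already-wrapped value unchanged.
function encodeOnce(value) {
  return value instanceof SafeHtml ? value : new SafeHtml(escapeHtml(value));
}

const raw = "1 < 2";
console.log(escapeHtml(escapeHtml(raw)));         // 1 &amp;lt; 2  (double-encoded)
console.log(String(encodeOnce(encodeOnce(raw)))); // 1 &lt; 2      (encoded once)
```

The wrapper only helps for values that pass through encodeOnce; a pre-encoded string arriving from an API or CMS as a plain string would still need to be identified by the caller.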
