Tips, News and Comments on HTML Guard


September 11th, 2009, by Andreas

About HTML Source Code Encryption

What Is HTML Encryption?

One major reason for the success of the World Wide Web is undeniably the openness of HTML. HTML files are basically plain text documents, meaning software applications and human users can easily create, read, and update web pages. The open nature of HTML not only allows users to edit websites with nothing more than a simple text editor, it also enables search engines to spider the web and forms the basis for a wide range of web-related applications for any platform you can imagine.

However, as a web designer or website owner, you may encounter situations in which you need to protect your HTML, CSS, or JavaScript code from being viewed and reused – for example, when you want to:

  • keep spam robots from harvesting email addresses from your pages.
  • safely email a website design preview to a customer before payment receipt.
  • stop competitors from studying and borrowing your fancy, hand-crafted JavaScript code.
  • prevent search engines from indexing and caching the phone number(s) listed on your contact page.
  • keep automated downloaders like WebZip or Teleport from copying your entire website.
  • bypass content filters imposed by companies, service providers, or governments.

In these situations HTML encryption is a viable option.

How It Works

HTML encryption/decryption techniques are based on JavaScript. The encrypted HTML code, which is saved inside the HTML document, is decrypted at runtime through JavaScript and written directly into the browser window using the document.write(…) function. This ensures that any JavaScript-enabled web browser can load and display the pages without additional components or plugins.
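
The mechanism described above can be sketched as follows. This is only an illustration: the helper name buildSelfDecodingPage, the key parameter, and the simple character-code shift are assumptions made for this example, not the scheme used by any particular product.

```javascript
// Hypothetical helper: wraps HTML in a script that decodes itself at load time.
// The "encryption" is a plain character-code shift, chosen only for brevity.
function buildSelfDecodingPage(html, key) {
  var codes = [];
  for (var i = 0; i < html.length; i++) {
    // Shift every character code by the key and store it as a number.
    codes.push(html.charCodeAt(i) + key);
  }
  // The stub reverses the shift in the browser and writes the result.
  var stub =
    "var c=[" + codes.join(",") + "],s='';" +
    "for(var i=0;i<c.length;i++)s+=String.fromCharCode(c[i]-" + key + ");" +
    "document.write(s);";
  return { script: stub, page: "<script>" + stub + "<\/script>" };
}

var result = buildSelfDecodingPage("<p>Hi</p>", 7);
// result.page contains only a <script> block with numbers; the original
// markup appears once a browser runs the script.
```

Saving result.page to an .html file and opening it in a JavaScript-enabled browser should display the original paragraph, while the file itself contains no readable markup.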

Any Drawbacks? How Secure Is This?

The following drawbacks should be considered before applying HTML encryption to your code:

  • Only users with JavaScript enabled in their browser will be able to view the pages.
  • Encrypted code cannot be analyzed by Google, which may hurt your search engine rankings.
  • The necessary decryption processing may slow down the page loading (although with modern high-speed computers, this is not really an issue).

In regard to security, you shouldn’t expect too much of HTML encryption. As the decryption algorithm is embedded into the pages, anyone with enough JavaScript knowledge and a bit of time can reverse engineer the code. Additionally, there are browser extensions available that can display the rendered (= decrypted) source code of the currently loaded web page. Nevertheless, the obfuscated code should be “Greek” to most users, therefore providing a certain degree of protection against illegal copying.
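
To illustrate how little effort reverse engineering can take, here is a rough sketch that runs a page's script with a stand-in document object and collects whatever the script would have written. The function name revealWrittenHtml is made up for this example, and the sketch assumes the script only calls document.write(); real protected pages may need more work.

```javascript
// Hypothetical helper: executes script source with a fake "document"
// and returns everything passed to document.write().
function revealWrittenHtml(scriptSource) {
  var captured = [];
  var fakeDocument = {
    write: function (s) { captured.push(String(s)); }
  };
  // The parameter named "document" shadows the real document inside the script.
  new Function("document", scriptSource)(fakeDocument);
  return captured.join("");
}

var hidden = 'document.write(unescape("%3Cb%3EHi%3C%2Fb%3E"));';
var revealed = revealWrittenHtml(hidden); // "<b>Hi</b>"
```

Only run code from sources you trust this way, since the script is actually executed.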

Encryption Methods

There are several different ways of encrypting/decrypting HTML pages via JavaScript, some of which I will describe here.

Escape Method

An easy way to obscure HTML code is to replace its characters with escape codes. Escape codes consist of a % sign followed by a hexadecimal representation of the character. For example, the letter A becomes code %41, where 41 is the hexadecimal form of the decimal number 65, which in turn is the ASCII representation of a capital A.

As the JavaScript language includes an unescape() function for decoding escape sequences, no special decryption algorithm is required. So, if you want the browser to output your escaped code, you could do something like:

document.write(unescape("%48%65%6C%6C%6F%20%57%6F%72%6C%64%21"));

(Output: “Hello World!”)

To escape your own code you may use this page.
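
If you prefer to do the escaping yourself, a small encoder could look like the sketch below. The function name escapeAll is an assumption for this example. It escapes every character, including letters and digits, and falls back to the %uXXXX form (which unescape() also understands) for characters above code 255. Note that unescape() is considered deprecated in current JavaScript, although browsers still support it.

```javascript
// Hypothetical helper: turns every character into an escape code.
function escapeAll(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    var hex = code.toString(16).toUpperCase();
    if (code < 256) {
      // Two hex digits, e.g. "A" (65) becomes %41.
      out += "%" + (hex.length < 2 ? "0" + hex : hex);
    } else {
      // Four hex digits in the %uXXXX form for characters above 255.
      out += "%u" + ("0000" + hex).slice(-4);
    }
  }
  return out;
}

var escaped = escapeAll("Hello World!");
// escaped is "%48%65%6C%6C%6F%20%57%6F%72%6C%64%21"
```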

Rot13 Method

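Rot13 is a method of text obfuscation in which every letter is replaced with the letter 13 places further along in the Latin alphabet, wrapping around from Z to A. Because the alphabet has 26 letters, applying Rot13 twice restores the original text. Rot13 is a special case of the Caesar cipher, which Julius Caesar is said to have used (with a shift of three letters) to send coded messages to his generals.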

The following script provides a compact JavaScript implementation of the Rot13 algorithm:

function rot13(input) {
  return input.replace(/[a-zA-Z]/g,
    function(ch) {
      return String.fromCharCode((ch <= "Z" ? 90 : 122) >=
        (ch = ch.charCodeAt(0) + 13) ? ch : ch - 26);
  });
}

Using the above rot13() function the following code outputs “Hello World!” to the browser window:

document.write(rot13("Uryyb Jbeyq!"));

To obfuscate your own code with Rot13 please visit this page.
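
Since Rot13 shifts by exactly half of the 26-letter alphabet, one function can serve as both encoder and decoder. The sketch below shows this with modular arithmetic. The name rot13Encode is chosen for this example only.

```javascript
// Hypothetical helper: Rot13 via modular arithmetic.
function rot13Encode(text) {
  return text.replace(/[a-zA-Z]/g, function (ch) {
    // 65 is the code of "A", 97 the code of "a".
    var base = ch <= "Z" ? 65 : 97;
    // Move 13 places forward and wrap around after 26 letters.
    return String.fromCharCode((ch.charCodeAt(0) - base + 13) % 26 + base);
  });
}

var obfuscated = rot13Encode("Hello World!"); // "Uryyb Jbeyq!"
var restored = rot13Encode(obfuscated);       // "Hello World!"
```

Digits, punctuation, and HTML brackets are left untouched, so tags remain recognizable in Rot13 output.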

Base64 Method

Base64 is a way of encoding arbitrary data, including text, into a sequence of printable ASCII characters. The 64 characters of the Base64 alphabet are A-Z, a-z, 0-9, + and /; the = sign is additionally used as padding at the end of the encoded data. More details on the functions and uses of Base64 can be found in the corresponding Wikipedia article.

A simple JavaScript function for Base64 decoding can be found below:

function decode64(input) {
  // Base64 alphabet; index 64 ("=") marks padding.
  var base64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef" +
    "ghijklmnopqrstuvwxyz0123456789+/=";
  var output = "";
  var ch1, ch2, ch3, enc1, enc2, enc3, enc4;
  var i = 0;

  // Remove everything that is not a Base64 character (e.g. line breaks).
  input = input.replace(/[^A-Za-z0-9+\/=]/g, "");
  // Restore missing padding so that every group has four characters.
  while (input.length % 4) input += "=";

  // A while loop (instead of do...while) returns "" for empty input.
  while (i < input.length) {
    // Read four 6-bit values ...
    enc1 = base64.indexOf(input.charAt(i++));
    enc2 = base64.indexOf(input.charAt(i++));
    enc3 = base64.indexOf(input.charAt(i++));
    enc4 = base64.indexOf(input.charAt(i++));

    // ... and regroup them into three 8-bit character codes.
    ch1 = (enc1 << 2) | (enc2 >> 4);
    ch2 = ((enc2 & 15) << 4) | (enc3 >> 2);
    ch3 = ((enc3 & 3) << 6) | enc4;

    output = output + String.fromCharCode(ch1);

    // Skip characters that only exist because of "=" padding.
    if (enc3 != 64) output = output + String.fromCharCode(ch2);
    if (enc4 != 64) output = output + String.fromCharCode(ch3);
  }

  return output;
}

The following line of code uses the decode64() function to write “Hello World!” to the browser:

document.write(decode64("SGVsbG8gV29ybGQh"));

A form for encoding your own code with Base64 can be found here.
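
For completeness, here is a sketch of a matching encoder. The name encode64 is an assumption for this example, and the function assumes that every character code in the input is 255 or lower (plain ASCII or Latin-1 text). Browsers also ship built-in btoa() and atob() functions for Base64, which can be used instead of hand-written code.

```javascript
// Hypothetical helper: Base64-encodes a string of 8-bit characters.
function encode64(input) {
  var base64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef" +
    "ghijklmnopqrstuvwxyz0123456789+/=";
  var output = "";
  var i = 0;

  while (i < input.length) {
    // Read up to three characters; charCodeAt() yields NaN past the end.
    var ch1 = input.charCodeAt(i++);
    var ch2 = input.charCodeAt(i++);
    var ch3 = input.charCodeAt(i++);

    // Regroup 3 x 8 bits into 4 x 6 bits (NaN behaves like 0 in bit operations).
    var enc1 = ch1 >> 2;
    var enc2 = ((ch1 & 3) << 4) | (ch2 >> 4);
    var enc3 = ((ch2 & 15) << 2) | (ch3 >> 6);
    var enc4 = ch3 & 63;

    // Index 64 is "=", used as padding for incomplete groups.
    if (isNaN(ch2)) {
      enc3 = enc4 = 64;
    } else if (isNaN(ch3)) {
      enc4 = 64;
    }

    output += base64.charAt(enc1) + base64.charAt(enc2) +
      base64.charAt(enc3) + base64.charAt(enc4);
  }
  return output;
}

var encoded = encode64("Hello World!"); // "SGVsbG8gV29ybGQh"
```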

Encrypting Web Pages with HTML Guard

A very easy and painless way to encrypt HTML documents is to use HTML Guard, our software solution for protecting web pages from code and content theft. HTML Guard allows you to choose the parts of your web pages you want to encrypt and lets you process hundreds or thousands of files at the touch of a button. Furthermore, the obfuscation and protection techniques used by HTML Guard are much more sophisticated than the ones presented here. For instance, each processed HTML page is encrypted with a different key and uses different variable names in the JavaScript code, which makes the encryption more difficult to break.