Portable browser - PAC

You can create independent websites and serve them from the same Apache server using Apache’s virtual hosting. Local testing requires resolving their domain names to the local PC. This is achieved by inserting an IP/domain-name pair in the local machine's hosts file.

Unfortunately, resolving domain names with this method is not portable, because you must edit the hosts file on each new target PC. This page looks at resolving this issue using a PAC file and a portable browser. This combination makes the server completely portable.

Background

Testing multiple websites is relatively easy. First create a virtual host section for each site in the Apache configuration file. Then use the hosts file on your machine to resolve the website names to an IP address. That’s it; finished. But suppose you do not have access to the hosts file. What alternatives are there?

If your sites use only relative links, you can re-map them with Apache's Alias directive. However, if your sites use absolute or root-relative links, this method will fail.

So what other methods are there for resolving domain names to IP addresses? A local DNS server would do nicely and would even allow you to enter MX records. That seems like overkill, since all we want to do is convince our browser to pick up pages from a different server masquerading as the real server. Enter the world of proxy servers.

PAC

Web browsers can be configured to use a proxy server, allowing the use of files or other resources available on a different server.

This process is automated with a Proxy Auto-Configuration (PAC) file. Sounds complicated! Not really. The PAC file is a simple text file containing a few instructions. Just tell your browser where to find it by setting the appropriate options. Your browser reads this file when it is restarted and consults it whenever it needs to resolve a host name.

Any special instructions are executed before your browser attempts to resolve a host name. This is why PAC is so powerful. PAC allows you to simulate a DNS server locally, with one line per CNAME entry. Unlike the hosts file, it is dynamic. PAC is a de facto standard that uses a single file and is supported by all modern browsers.

PAC file

The PAC file is a JavaScript file that defines a single function. This function receives two parameters (url and host), which are provided automatically by the browser.

Internally your browser calls this function using the following line: ret = FindProxyForURL(url, host)

PAC file:
function FindProxyForURL(url, host)
{
  ...
} 

url   The full URL being accessed, e.g. http://wiki.uniformserver.com/index.php/Main_Page
host   The hostname extracted from the URL. It is the string between :// and the first : or /, for example wiki.uniformserver.com in
http://wiki.uniformserver.com/index.php/Main_Page
Note: The port number is not included. If required, it can be extracted from the URL.
ret   The return value is a string describing the configuration.
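
Since host carries no port, a small helper can recover it from url when a rule needs it. The following is a minimal sketch only: portFromUrl is a hypothetical name (it is not one of the predefined PAC functions), and it assumes plain http/https URLs without user-info or IPv6 literals.

```javascript
// Hypothetical helper, not part of the predefined PAC function set.
// Assumes URLs of the form scheme://host[:port]/... (no user:password@, no IPv6 literal).
function portFromUrl(url) {
  // An explicit ":port" directly after the host part wins.
  var m = /^[a-z][a-z0-9+.-]*:\/\/[^\/:]+:(\d+)/i.exec(url);
  if (m) {
    return parseInt(m[1], 10);
  }
  // Otherwise fall back to the scheme's default port.
  if (/^https:/i.test(url)) return 443;
  if (/^http:/i.test(url)) return 80;
  return -1; // unknown scheme
}
```

Inside FindProxyForURL it could be called as portFromUrl(url) and the result compared against the port your local Apache listens on.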

Return Value:
The JavaScript function must return a single string. If the string is null, no proxies will be used. The string can contain any of the following:

DIRECT   Connections should be made directly, without any proxies.
PROXY host:port   The specified proxy should be used. This is the return value we are interested in.
SOCKS host:port   The specified SOCKS server should be used.
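
As a minimal sketch of these return values, the function below sends one host through a local proxy and lets everything else connect directly. It uses a plain string comparison rather than the predefined PAC functions, so it also runs outside a browser. The host name and proxy address are illustrative assumptions.

```javascript
function FindProxyForURL(url, host) {
  // Requests for this host go to the proxy listening on the local machine.
  if (host === "localhost") {
    return "PROXY 127.0.0.1:80";
  }
  // Everything else bypasses the proxy.
  return "DIRECT";
}

console.log(FindProxyForURL("http://localhost/index.html", "localhost"));
console.log(FindProxyForURL("http://example.com/", "example.com"));
```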

Note:
In the FindProxyForURL function you can use a number of predefined functions. We are interested in only two: shExpMatch and dnsDomainIs.
If these do not suit your needs, you can easily find more information on the Internet.

Compare function - shExpMatch

The function shExpMatch(str, shexp) compares two strings:

str   Any string to be compared, for example the url or the host.
shexp   The expression to compare against. This expression can contain wildcard characters:
* matches anything
? matches one character
\ will escape a special character
$ matches the end of the string
[abc] matches one occurrence of a, b, or c. The only character that needs to be escaped in this is ]; all others are not special.
[a-z] matches any character between a and z
[^az] matches any character except a or z
~ followed by another shell expression will remove any pattern matching the shell expression from the match list
(foo|bar) will match either the substring foo or the substring bar. These can be shell expressions as well.

Note: Only * and ? can be relied on everywhere; support for the other forms varies between browsers.

The function returns true if the string matches the specified expression.

Examples:

  • Example 1: shExpMatch("http://home.unicenter.com/site1/index.html", "*site1*") returns true.
  • Example 2: shExpMatch("http://home.unicenter.com/site2/index.html", "*site1*") returns false.
  • Example 3: shExpMatch("http://home.unicenter.com/site1/index.html", "*unicenter.com*") returns true.
  • Example 4: shExpMatch("http://home.unicenter.com/site2/index.html", "*unicenter.com*") returns true.

The examples above use fixed strings. In a PAC file, replace the fixed string with a variable such as url or host (unquoted, otherwise the literal text is compared), for example:

  • Example 5: shExpMatch(url, "*site1*") Any url passed to the function containing the string site1 (anywhere) returns true.
  • Example 6: shExpMatch(url, "*unicenter.com*") Any url passed to the function containing the string unicenter.com (anywhere) returns true.
  • Example 7: shExpMatch(host, "unicenter.com") Only a host that is exactly unicenter.com returns true, because the expression contains no wildcard.
  • Example 8: shExpMatch(host, "*unicenter.com") Any host ending with the string unicenter.com returns true. This includes a sub-host such as wiki.unicenter.com
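
To try patterns like these outside a browser, a rough stand-in for shExpMatch can be written in a few lines. This is only a sketch, assuming the * and ? wildcards are all you need. shExpMatchSketch is a hypothetical name; the real function is supplied by the browser.

```javascript
// Hypothetical stand-in for the browser's shExpMatch; handles only * and ?.
function shExpMatchSketch(str, shexp) {
  var pattern = shexp
    // Escape regular-expression metacharacters so they match literally.
    .replace(/[.+^${}()|[\]\\]/g, "\\$&")
    // Translate the two shell wildcards into regular-expression syntax.
    .replace(/\*/g, ".*")
    .replace(/\?/g, ".");
  // Anchor both ends: the whole string must match the pattern.
  return new RegExp("^" + pattern + "$").test(str);
}

var url = "http://home.unicenter.com/site1/index.html";
console.log(shExpMatchSketch(url, "*site1*"));                         // true
console.log(shExpMatchSketch("wiki.unicenter.com", "unicenter.com"));  // false
console.log(shExpMatchSketch("wiki.unicenter.com", "*unicenter.com")); // true
```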

Complete PAC file

Add the above filter to the PAC function and we have our very own DNS resolver, with the ability to define any CNAME that we wish.

The PAC file shown below resolves localhost; like the hosts file, it maps localhost to IP address 127.0.0.1.

Note: If none of the shExpMatch calls finds a match, FindProxyForURL returns an empty string. In this situation the browser needs to do a little more work to resolve the host name. It checks the hosts file, then any local DNS server, and as a last resort sends a request to a DNS server on the Internet.

function FindProxyForURL(url, host){
 if (shExpMatch(host, "*localhost*")) return "PROXY 127.0.0.1:80";
 // ... other shExpMatch statements ...
 // ... other shExpMatch statements ...
return "";}

Hosts file emulation with a PAC file

Local testing requires an entry in the Windows hosts file. Typical entries in this hosts file are shown below:

Windows host:

For local testing the hosts file maps a domain name to an IP address (127.0.0.1).
If the domain includes sub-domains, for example www.fred.com and wiki.fred.com, each requires a separate entry in the hosts file.

This is one limitation of the hosts file.

Hosts file:
 127.0.0.1 localhost
 127.0.0.1 www.ric.com
 127.0.0.1 fred.com
 127.0.0.1 www.fred.com
 127.0.0.1 wiki.fred.com

With a PAC file, specific strings can be targeted. We are interested in resolving domain names (host) and not what is typed into a browser (url). There are only three domains to resolve: localhost, www.ric.com and fred.com. Hence our PAC file requires only three entries, as follows:

PAC:

The domain fred.com is resolved using the wildcard star (*) operator. This also picks out fred.com's sub-domains www and wiki.

proxy.pac file:
function FindProxyForURL(url, host){
 if (shExpMatch(host, "*localhost")) return "PROXY 127.0.0.1:80";
 if (shExpMatch(host, "www.ric.com")) return "PROXY 127.0.0.1:80";
 if (shExpMatch(host, "*fred.com")) return "PROXY 127.0.0.1:80";
return "DIRECT";}
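
One caveat: the pattern *fred.com matches any host that ends in fred.com, so an unrelated name such as alfred.com would be proxied too. If that matters, a stricter variant can use the predefined dnsDomainIs function. The sketch below is one way you might write it, not the Uniform Server file. The dnsDomainIs definition at the top is only a stand-in (assumed to behave as a plain suffix comparison) so that the example runs outside a browser; it would be left out of a real proxy.pac.

```javascript
// Stand-in for the browser's dnsDomainIs (assumed to be a plain suffix
// comparison). Omit this definition in a real proxy.pac file.
function dnsDomainIs(host, domain) {
  return host.length >= domain.length &&
         host.substring(host.length - domain.length) === domain;
}

function FindProxyForURL(url, host) {
  if (host === "localhost") return "PROXY 127.0.0.1:80";
  if (host === "www.ric.com") return "PROXY 127.0.0.1:80";
  // Bare domain, or any sub-domain such as www.fred.com and wiki.fred.com.
  if (host === "fred.com" || dnsDomainIs(host, ".fred.com")) return "PROXY 127.0.0.1:80";
  return "DIRECT";
}
```

The leading dot in ".fred.com" is what keeps alfred.com out while still accepting every sub-domain.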

Summary

Uniform Server PAC:

The above was just a quick introduction to the PAC file; it is much more powerful and worth further investigation. Search the Internet for more information. I have only covered what is needed for the Uniform Server proxy.pac file.

proxy.pac file:
function FindProxyForURL(url, host){
 if (shExpMatch(host, "*localhost")) return "PROXY 127.0.0.1:80";
 ... Other shExpMatch statements automatically added and deleted ....
return "DIRECT";}

The basic structure is shown above and is about as complex as it gets.

Note: Creating an Apache Vhost automatically adds its domain name to the proxy.pac file. Deleting a Vhost also deletes its corresponding domain name entry in the PAC file.
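
The add-and-delete behaviour can be pictured as regenerating the file from the current list of Vhost domain names. The sketch below is purely illustrative and is not Uniform Server's actual implementation; buildPac and its parameters are hypothetical names.

```javascript
// Hypothetical generator: builds proxy.pac text from a list of Vhost domains.
function buildPac(domains, proxy) {
  var lines = [
    "function FindProxyForURL(url, host){",
    ' if (shExpMatch(host, "*localhost")) return "PROXY ' + proxy + '";'
  ];
  for (var i = 0; i < domains.length; i++) {
    // One shExpMatch line per Vhost domain name.
    lines.push(' if (shExpMatch(host, "' + domains[i] + '")) return "PROXY ' + proxy + '";');
  }
  lines.push('return "DIRECT";}');
  return lines.join("\n");
}

// Adding or deleting a Vhost then amounts to rebuilding with an updated list.
console.log(buildPac(["www.ric.com", "*fred.com"], "127.0.0.1:80"));
```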


--oOo--