10.7. Access Control

The access control functionality of Squid is perhaps its most complex set of features, but also among its most powerful. In fact, many use Squid primarily for these features. Because of its complexity, we will address it in steps, breaking down the process of creating and implementing an access control list.

The term access control lists has two meanings within the Squid configuration file and within the Webmin interface. First, it signifies the whole concept of access control lists and all of the logic that can be applied to those lists. Second, it applies to the lists themselves, which are simply lists of some type of data to be matched against when some type of access rule is in place. For example, forcing a particular site or set of sites to not be cached requires a list of sites to not cache, and then a separate rule to define what to do with that list (in this case, don't cache them). There is also a third type of option for configuring ICP access control.

These three types of definition are separated in the Webmin panel into three sections. The first is labeled Access control lists, which lists existing ACLs and provides a simple interface for generating and editing lists of match criteria. The second is labeled Proxy restrictions and lists the current restrictions in place and the ACLs they affect. Finally, the ICP restrictions section lists the existing access rules regarding ICP messages from other web caches.
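To make the split between a list and the rule that uses it concrete, here is a minimal sketch of how the pieces might look in squid.conf. The ACL names, domains, and network are hypothetical placeholders, and the directive names follow the Squid 2.x releases this chapter describes (later releases use cache in place of no_cache and predefine the all ACL), so treat it as an illustration rather than a drop-in configuration.

```
# A catch-all list; older Squid releases expect this to be defined by hand.
acl all src 0.0.0.0/0.0.0.0

# The list: a hypothetical set of sites that should never be cached.
acl NoCacheSites dstdomain .example.com .example.org

# The rule that uses the list: do not cache matching requests.
no_cache deny NoCacheSites

# An ICP restriction: answer ICP queries only from a hypothetical peer network.
acl PeerCaches src 192.168.10.0/255.255.255.0
icp_access allow PeerCaches
icp_access deny all
```

The acl lines correspond to the Access control lists section of the Webmin page, while rules such as icp_access correspond to the restriction sections that reference those lists by name.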

Figure 10-12. Access Control Lists

Squid access control lists

10.7.1. Access Control Lists

This section lists the existing ACLs and provides a means to create new ones. The first field in the table is the name of the ACL, which is simply an assigned name that can be just about anything the user chooses. The second field is the type of the ACL, which can be one of a number of choices and indicates to Squid what part of a request should be matched against for this ACL. The possible types include the requesting client's address, the web server address or hostname, a regular expression matching the URL, and many more. The final field is the actual string to match. Depending on the ACL type, this may be an IP address, a series of IP addresses, a URL, a hostname, etc.

Figure 10-13. ACL section

Access control list section

To edit an existing ACL, simply click on the highlighted name. You will then be presented with a screen containing all relevant information about the ACL. Depending on the type of the ACL, you will be shown different data entry fields. The operation of each type is very similar, so for this example, we'll step through editing of the localhost ACL. Clicking the localhost button presents the page below.

Figure 10-14. Edit an ACL

Editing a Squid ACL

The title of the table is Client Address ACL, which means that the ACL is of the Client Address type and tells Squid to compare the incoming IP address with the IP address in the ACL. It is possible to select an IP based on the originating IP or the destination IP. The netmask can also be used to indicate whether the ACL matches a whole network of addresses, or only a single IP. It is possible to include a number of IPs, or ranges of IPs, in these fields. Finally, the Failure URL is the address to send clients to if they have been denied access due to matching this particular ACL. Note that the ACL by itself does nothing. There must also be a proxy restriction or ICP restriction rule that uses the ACL before Squid will act on it.
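As a rough illustration of what such an ACL looks like in the configuration file, the sketch below shows a localhost definition alongside a second, hypothetical ACL with a failure URL. It assumes that the Failure URL field corresponds to Squid's deny_info directive and that your Squid version accepts a URL there; the blocked_clients name, its address, and the example.com URL are placeholders.

```
# A single address: the netmask 255.255.255.255 matches exactly one host.
acl localhost src 127.0.0.1/255.255.255.255

# Hypothetical ACL for one client that should be turned away.
acl blocked_clients src 192.168.1.50/255.255.255.255

# Assumed equivalent of the Failure URL field: where denied clients are sent.
deny_info http://www.example.com/denied.html blocked_clients

# The ACL does nothing until a restriction refers to it.
http_access deny blocked_clients
```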

Figure 10-15. Creating an ACL

Creating a Squid ACL

Creating a new ACL is equally simple. From the ACL page, in the Access control lists section, select the type of ACL you'd like to create. Then click Create new ACL. From there, as shown, you can enter any number of entries for the list. In my case, I've created a list called SitesIHate, which contains the web sites of the Recording Industry Association of America and the Motion Picture Association of America. From there, I can add a proxy restriction to deny all accesses through my proxy to those two web sites.
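In configuration-file terms, the result might resemble the following sketch. It assumes the list was created with the Web Server Hostname type (dstdomain) and uses the domain names riaa.com and mpaa.org for the two organizations; neither detail is spelled out above, so adjust to match what you actually entered.

```
# The list itself (assumed type: Web Server Hostname, i.e. dstdomain).
acl SitesIHate dstdomain .riaa.com .mpaa.org

# The proxy restriction that puts the list to work.
http_access deny SitesIHate
```

Because proxy restrictions are checked in order and the first match wins, a deny rule like this one generally needs to appear before any broader allow rule.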

10.7.1.1. Squid ACL Types

Browser Regexp

A regular expression that matches the client's browser type based on the user agent header. This allows ACLs to operate based on the browser type in use. For example, using this ACL type, one could create an ACL for Netscape users and another for Internet Explorer users. This could then be used to redirect Netscape users to a Navigator enhanced page, and IE users to an Explorer enhanced page. This is probably not the wisest use of an administrator's time, but it does indicate the unmatched flexibility of Squid. This ACL type correlates to the browser ACL type.
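A minimal sketch of a browser ACL follows. The ACL name is hypothetical, and the pattern relies on Internet Explorer including the string MSIE in its user agent header.

```
# Match clients whose User-Agent header contains "MSIE" (case-insensitive).
acl ie_users browser -i MSIE
```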

Client IP Address

The IP address of the requesting client. This option refers to the src ACL in the Squid configuration file. An IP address and netmask are expected. Address ranges are also accepted.

See Also: Client Hostname, Client Hostname Regexp.
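The three forms a Client IP Address entry can take might be written as follows in squid.conf. The names and addresses are placeholders chosen for illustration.

```
# A single host.
acl single_host src 192.168.1.10/255.255.255.255

# A whole network, selected by the netmask.
acl office_net src 192.168.1.0/255.255.255.0

# A range of addresses, written as first-last/netmask.
acl lab_range src 192.168.2.1-192.168.2.50/255.255.255.255
```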

Client Hostname

Matches against the client domain name. This option correlates to the srcdomain ACL, and can be either a single domain name, a list of domain names, or a filename that contains a list of domain names. This ACL type can significantly increase latency and decrease throughput on a loaded cache, as it must perform an address-to-name lookup for each request, so it is usually preferable to use the Client IP Address type.

See Also: Client Hostname Regexp, Client IP Address.

Client Hostname Regexp

Matches the client domain name against a regular expression. This option correlates to the srcdom_regex ACL, and can be either a single pattern, a list of patterns, or a filename that contains a list of patterns.

See Also: Client Hostname, Client IP Address.

Date and Time

This type is just what it sounds like, providing a means to create ACLs that are active during certain times of the day or certain days of the week. This feature is often used to block some types of content or some sections of the internet during business or class hours. Many companies block pornography, entertainment, sports, and other clearly non-work related sites during business hours, but then unblock them after hours. This might improve workplace efficiency in some situations (or it might just offend the employees). This ACL type allows you to enter days of the week and a time range, or select all hours of the selected days. This ACL type is the same as the time ACL type directive.
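As a sketch of the business-hours scenario, the fragment below combines a time ACL with a hypothetical list of sites. The names and domain are placeholders. In the time ACL, days are given as single letters (M, T, W, H, and F for Monday through Friday; S and A for Sunday and Saturday).

```
# Monday through Friday, 9am to 5pm.
acl business_hours time MTWHF 09:00-17:00

# Hypothetical list of sites considered non-work related.
acl leisure_sites dstdomain .example.com

# ACLs named on the same line must all match, so this denies
# the listed sites only during business hours.
http_access deny leisure_sites business_hours
```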

Dest AS Number

The Destination Autonomous System Number is the AS number of the server being queried. The autonomous system number ACL types are generally only used in Cache Peer, or ICP, access restrictions. Autonomous system numbers are used in organizations that have multiple internet links and routers operating under a single administrative authority using the same gateway protocol. Routing decisions are then based on knowledge of the AS in addition to other possible data. If you are unfamiliar with the term autonomous system, it is usually safe to say you don't need to use ACLs based on AS. Even if you are familiar with the term, and have a local AS, you still probably have little use for the AS Number ACL types, unless you have cache peers in other autonomous systems and need to regulate access based on that information. This type correlates to the dst_as ACL type.

See Also: Source AS Number.

Source AS Number

The Source Autonomous System Number is another AS-related ACL type, and matches on the AS number of the source of the request. It equates to the src_as ACL type directive.

See Also: Dest AS Number.

Ethernet Address

The Ethernet or MAC address of the requesting client. This option only works for clients on the same local subnet, and only on certain platforms. Linux, Solaris, and some BSD variants are the "supported" operating systems for this type of ACL. This ACL can provide a somewhat secure method of access control, because MAC addresses are usually harder to spoof than IP addresses, and you can guarantee that your clients are on the local network (otherwise no ARP resolution can take place).
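In squid.conf this type is written with the arp ACL keyword. The sketch below uses a made-up name and MAC address, and assumes a Squid binary built with ARP ACL support, which is not always enabled by default.

```
# Hypothetical workstation identified by its Ethernet (MAC) address.
acl front_desk arp 00:11:22:33:44:55
http_access allow front_desk
```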

External Auth

This ACL type calls an external authenticator process to decide whether the request will be allowed. Many authenticator helper programs are available for Squid, including PAM, NCSA, Unix passwd, SMB, NTLM (only in Squid 2.4), etc. Note that authentication cannot work on a transparent proxy or HTTP accelerator. The HTTP protocol does not provide for two authentication stages (one local and one on remote websites). So in order to use an authenticator, your proxy must operate as a traditional proxy. This correlates to the proxy_auth directive.

See Also: External Auth Regex.
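A minimal sketch of an authentication requirement follows. The ACL name is hypothetical, and the lines that tell Squid which helper program to run are deliberately left out because they differ between Squid versions.

```
# Older Squid releases expect the catch-all ACL to be defined by hand.
acl all src 0.0.0.0/0.0.0.0

# REQUIRED accepts any user the external authenticator approves.
acl password proxy_auth REQUIRED

# Allow authenticated users and refuse everyone else.
http_access allow password
http_access deny all
```

Specific usernames can be listed in place of REQUIRED to restrict the ACL to those users.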

External Auth Regex

As above, this ACL calls an external authenticator process, but allows regular expression patterns or case-insensitive matches. This option correlates to the proxy_auth_regex directive.

See Also: External Auth.

Proxy IP Address

The local IP address on which the client connection exists. This allows ACLs to be constructed that only match one physical network, if multiple interfaces are present on the proxy, among other things. This option configures the myip directive.

RFC931 User

The username as given by an ident daemon running on the client machine. This requires that ident be running on any client machines to be authenticated in this way. Ident should not be considered secure except on private networks where security doesn't matter much. You can find free ident servers for the following operating systems: Win NT, Win95/Win98, and Unix. Most Unix systems, including Linux and BSD distributions, include an ident server.

Request Method

This ACL type matches on the HTTP method in the request headers. This includes the methods GET, PUT, etc. This corresponds to the method ACL type directive.
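A common use of the method type, sketched here along the lines of the example in the Squid FAQ, is to limit the PURGE method (which removes objects from the cache) to requests coming from the proxy machine itself.

```
acl PURGE method PURGE
acl localhost src 127.0.0.1/255.255.255.255

# Permit PURGE only from localhost, and deny it from anywhere else.
http_access allow PURGE localhost
http_access deny PURGE
```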

URL Path Regex

This ACL matches on the URL path minus any protocol, port, and hostname information. It does not include, for example, the "http://www.swelltech.com" portion of a request, leaving only the actual path to the object. This option correlates to the urlpath_regex directive.

See Also: URL Port, URL Protocol, URL Regexp.
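As an illustration of matching on the path alone, the following sketch blocks requests whose path ends in one of two file extensions. The ACL name and the choice of extensions are arbitrary.

```
# -i makes the match case-insensitive; $ anchors each pattern to the end of the path.
acl media_files urlpath_regex -i \.mp3$ \.avi$
http_access deny media_files
```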

URL Port

This ACL matches on the destination port for the request, and configures the port ACL directive.

See Also: URL Path Regex, URL Protocol, URL Regexp.
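The port type is typically used the way the stock Squid configuration uses it: define the ports considered safe and refuse everything else. The port list below is an abbreviated illustration, not a recommendation.

```
# Individual ports and a range of ports can be mixed in one list.
acl Safe_ports port 80 21 443 1025-65535

# The ! negates the ACL: deny any request to a port not in the list.
http_access deny !Safe_ports
```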

URL Protocol

This ACL matches on the protocol of the request, such as FTP, HTTP, ICP, etc.

See Also: URL Path Regex, URL Port, URL Regexp.

URL Regexp

Matches using a regular expression on the complete URL. This ACL can be used to provide access control based on parts of the URL or a case-insensitive match of the URL, and much more. The regular expressions used in Squid are provided by the GNU Regex library, which has its own documentation. Regular expressions are also discussed briefly in a nice article by Guido Socher at LinuxFocus. Most Unix systems will also have a man page or info page for regex. This option is equivalent to the url_regex ACL type directive.

See Also: URL Port, URL Protocol, URL Path Regex.
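Because url_regex sees the complete URL, a pattern can anchor on the protocol and hostname as well as the path. The sketch below uses a hypothetical ACL name and pattern.

```
# Match any URL that begins with http://ads. regardless of case.
acl ad_banners url_regex -i ^http://ads\.
http_access deny ad_banners
```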

Web Server Address

This ACL matches based on the destination web server's IP address. Squid accepts a single IP, a network IP with netmask, as well as a range of addresses in the form "192.168.1.1-192.168.1.25". This option correlates to the dst ACL type directive.

See Also: Web Server Hostname, Web Server Regexp.

Web Server Hostname

This ACL matches on the hostname of the destination web server.

See Also: Web Server Address, Web Server Regexp.

Web Server Regexp

Matches using a regular expression on the hostname of the destination web server.

See Also: Web Server Address, Web Server Hostname.

More information on Access Control Lists in Squid can be found in [Chapter 10] of the Squid FAQ, and in [Chapter 7] of the Online Squid User's Guide. Authentication information can be found in [Chapter 23] of the Squid FAQ.