The internet was conceived by its pioneers as a place where information could be shared quickly and freely. Today, however, this philosophy doesn’t apply to everything on the internet. Some of the content is disturbing, offensive, harmful, or outright illegal.
Content filtering is a way of keeping such content at bay. This article delves into what content filtering is, what types there are, and what it is used for.
What is content filtering?
Content filtering is the use of software to prevent access to undesirable web pages (or parts of them) or email messages.
These web pages or emails contain material that is either harmful (e.g. malware, spoofed sites) or in some way objectionable (adult content, social networking sites).
A content filter sits between the user and the source of the content, detecting and blocking anything that matches its filtering rules. Older content filters reside on the firewall, whereas newer solutions are SaaS-delivered.
Content filters can use a variety of techniques, such as keyword lists to identify harmful content, domain and IP blacklists to filter out dangerous sites, or object detection that flags executable files and active elements that may install malware on the user’s device.
Modern content filters also use AI, i.e. machine learning and pattern recognition, to reduce the number of false positives that keyword-based filters suffer from (for example, a website offering medical information being blocked as sexually explicit content).
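To make the rule-based approach concrete, here is a minimal sketch in Python of a filter that combines a domain blacklist with keyword matching. The domains, keywords, and function name are purely illustrative and not taken from any particular product:

```python
from urllib.parse import urlparse

# Illustrative rule sets; real filters ship with large, curated lists.
BLOCKED_DOMAINS = {"malware-site.example", "phishing-site.example"}
BLOCKED_KEYWORDS = {"casino", "adult"}

def is_blocked(url: str, page_text: str) -> bool:
    """Return True if the URL or page content matches a filtering rule."""
    host = urlparse(url).hostname or ""
    # Domain blacklist check: block the domain and any of its subdomains.
    if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
        return True
    # Keyword check: naive matching like this is what causes false positives,
    # e.g. a medical page that merely mentions a flagged word.
    text = page_text.lower()
    return any(keyword in text for keyword in BLOCKED_KEYWORDS)

print(is_blocked("https://phishing-site.example/login", ""))           # True
print(is_blocked("https://clinic.example/info", "adult vaccination"))  # True (false positive)
```

The second call shows why purely keyword-based filtering is prone to false positives, which is the weakness the AI-assisted approaches mentioned above try to address.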
This variety of techniques allows content filters to analyze multiple content types, notably:
- Websites—Filtering web traffic is an integral part of protection for businesses and individuals alike.
- Email—Email service providers usually offer a spam filter with their service, and organizations often deploy additional email filters to boost phishing and email spoofing protection.
- Files—Content filters can also block the download of potentially harmful files, like suspicious email attachments or downloads from disreputable sources.
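As a simple illustration of the file case, a filter might quarantine downloads or attachments whose extensions are commonly abused to deliver malware. The extension list and function name below are illustrative assumptions, not a real product’s logic:

```python
# Illustrative set of extensions often abused to deliver malware.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".js", ".vbs", ".bat"}

def should_quarantine(filename: str) -> bool:
    """Flag a downloaded file or email attachment by its extension."""
    name = filename.lower()
    return any(name.endswith(ext) for ext in BLOCKED_EXTENSIONS)

print(should_quarantine("invoice.pdf.exe"))  # True
print(should_quarantine("report.pdf"))       # False
```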
Content filtering has a broad scope, encompassing a large number of technologies, but the primary purpose is always to block content that is harmful or otherwise undesirable.
When is content filtering used?
Content filtering is a widespread practice exercised by businesses, individuals, ISPs, and even governments. Depending on the use case, content filtering may go under a different name, e.g. secure web gateway, parental controls, ad blocking, or censorship.
Below are some of the main use cases of content filtering:
- Malware protection—Content filtering protects against harmful websites or spoofed emails. A web/DNS/email filter is often the first line of defense against malware like trojans, ransomware, or worms.
- Limiting distractions—Companies sometimes block social media sites and streaming services to cut off productivity sinks that distract their employees. Individual users also often install ad blockers to prevent intrusive advertisements from distracting them and consuming bandwidth.
- Parental control—Parents sometimes enable content filtering on devices used by their children. This prevents the child from visiting sites with content inappropriate for under-18s (adult content, violent content, content on guns, alcohol, and other substances).
- Censorship—National governments sometimes require ISPs to bar access to parts of the internet to curb dissent, prevent radicalization, or contain the spread of disinformation.
What types of content filtering are there?
There are several types of content filters, each with different uses:
Client-side filters
Client-side content filters are applications installed on the user’s device. These are most common in home use, e.g. as parental controls. Parental controls are typically password-protected to prevent children from disabling the filter.
Client-side content filters are sometimes used by businesses, but in large, diverse deployments that encompass remote workers and unsecured networks they can be difficult to manage, which is why companies sometimes turn to SaaS solutions that cope better with heterogeneous networks.
Some client-side content filters are internet browser extensions that block online advertisements—ad blockers. Ad blockers use a set of rules (users can add their own custom rules) to block communication with advertising servers and blot out ad elements on web pages.
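As a rough sketch of how such rules might work, an ad blocker needs two things: a decision on whether to drop an outgoing request, and a stylesheet that hides leftover ad elements. The hostnames, selectors, and rule format below are made up and far simpler than real filter lists such as EasyList:

```python
from urllib.parse import urlparse

# Made-up rules; real ad blockers consume much richer filter-list syntax.
NETWORK_RULES = ["ads.example", "tracker.example"]   # hosts whose requests are dropped
COSMETIC_RULES = [".ad-banner", "#sponsored"]        # CSS selectors hidden on the page

def block_request(url: str) -> bool:
    """Decide whether an outgoing request targets a known ad/tracking host."""
    host = urlparse(url).hostname or ""
    return any(host == rule or host.endswith("." + rule) for rule in NETWORK_RULES)

def hiding_stylesheet() -> str:
    """Build CSS that blots out ad elements that still made it into the page."""
    return ", ".join(COSMETIC_RULES) + " { display: none !important; }"

print(block_request("https://ads.example/banner.js"))  # True
print(hiding_stylesheet())
```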
Many internet browsers also have a built-in content filter as part of malware protection.
Client-side internet filters use various techniques for content filtering, such as URL-based filtering or keyword filtering.
Server-side filters
Server-side content filters block harmful content before it reaches the user’s device.
They are installed on a key network device (firewall, load balancer, VPN gateway, email server), where they filter traffic based on a configurable set of rules.
The upside of server-side content filters is ease of management. They allow administrators to manage the filters centrally and apply them to all the company traffic passing through the server. But there is also the challenge of enforcing a content policy on remote workers, whose traffic may not be routed through that server at all.
Server-side content filters can filter content using several techniques. Some block pages and websites based on their URL, while others block access on the DNS level based on domain names and IP addresses.
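A DNS-level filter can be pictured as a resolver that simply refuses to answer for blocked domains, so the connection never starts. The sketch below is a simplification with an invented blocklist and is not how any particular product is implemented:

```python
import socket
from typing import Optional

# Invented blocklist; commercial DNS filters rely on large threat-intelligence feeds.
BLOCKED_DOMAINS = {"malware-site.example", "gambling-site.example"}

def filtered_resolve(domain: str) -> Optional[str]:
    """Resolve a domain unless it, or a parent domain, is on the blocklist."""
    name = domain.lower().rstrip(".")
    if any(name == d or name.endswith("." + d) for d in BLOCKED_DOMAINS):
        return None  # a real DNS filter would answer NXDOMAIN or point to a block page
    return socket.gethostbyname(name)

print(filtered_resolve("malware-site.example"))  # None -> the request is stopped before any traffic flows
```

Because the block happens before any connection is made, the user’s device never exchanges data with the harmful site at all.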
GoodAccess Threat Blocker is an example of a DNS filter used for automated threat protection and custom content policy enforcement. It is SaaS-based and applies the content policy to on-premises and remote users alike, regardless of what network they connect from.
ISP-level filters
Internet service providers (ISPs) sometimes impose access policies that prevent users from accessing certain sites. This decision can stem from a legal obligation, a client’s request (e.g. a copyright infringement claim), or the ISP’s own policy.
Content filtering at the ISP level means the filter’s management is offloaded onto the service provider, which makes it an awkward solution for businesses due to the limited control they have over the filtering rules. ISP-level content filters are mostly used for government-mandated blocking, as the filter affects every user within the ISP’s scope.
But ISP-level filters aren’t infallible, and users can bypass them by using VPN tunnels or altering their DNS configuration. That’s why some governments regulate the use of VPNs by their citizens.
Search-engine filters
Search engines also filter search results to protect users from inappropriate and malicious content. Google SafeSearch is the best-known example.
From the home user’s perspective, search-engine content filters are very general and don’t offer much granularity of configuration beyond “strict”, “moderate”, and “off” settings.
However, business solutions like Google Workspace or Microsoft 365 contain tools that allow much more customization, enabling businesses to block search results that violate company content policy with far greater granularity.
Criticism
Because content filtering can be used both to protect users from threats and objectionable content and to restrain free speech, the ethics of its use have been called into question.
This mainly concerns content filters deployed and managed by ISPs or public libraries (so-called censorware), and it becomes an issue when the use of the filter is mandatory and the user has no way to switch it off, which most Western societies would consider an unacceptable level of censorship.
There have been cases where the use of a content filter was examined in court, notably Mainstream Loudoun v. Board of Trustees of the Loudoun County Library, in which the court ruled that the county’s content restriction policy violated the free speech provisions of the First Amendment.
Users employ various techniques to bypass content filtering imposed by ISPs and public bodies, most notably personal VPNs, which allow them to alter their online identity and remove themselves from the filter’s scope. Specifically, the VPN changes the user’s IP address, so other machines on the internet perceive them to be in a different location, where different rules apply.
Summary
Content filtering has a firm place in today’s digital landscape. It employs software to shield users from harmful or objectionable online content.
Its uses range from malware protection to limiting distractions and parental controls, and it is widely deployed by companies and organizations as a protective layer against online threats.