There are numerous low-cost, commodity DNS and URL filtering products on the market today—all positioning themselves as “the answer to your IT and security needs.” While these “checkbox” web filtering options provide some value to non-profits, startups, and SMBs—most are woefully under-equipped to provide any meaningful coverage, protection, or customization for the majority of larger companies. For companies safeguarding sensitive data or in the cybersecurity space—these offer little to no defense against malicious threats online such as malware, phishing, botnets, and more.
For service providers and communications companies (ISPs, telcos, etc.), device manufacturers, digital advertising platforms, and any security-focused business (anti-virus, CASBs, MSSPs, etc.)—web filtering simply cannot be viewed as a “checkbox requirement”. For these companies, protecting hardware AND users (both employees and customers) is central to the success of the business—ultimately impacting the value of products and services, as well as customer trust.
If you work in IT, particularly for an enterprise or network security company, you may find yourself responsible for testing and evaluating web filtering technologies—and/or the URL databases that contribute to their efficacy. How do you determine the real value of a premium DNS, URL, or other web content filtering technology? What is the right balance of coverage, performance, protection, and control for your needs?
The goal of this blog is to provide you with important considerations and criteria for performing an evaluation. This will walk you through the most important criteria for grading a web filtering technology and how to prepare and test the following:
- Speed & Performance
- Protection / Malicious Detection
- Ease of Integration
We hope this will save you time and help you perform a more effective evaluation, giving you confidence that the web filtering database you select will provide the appropriate protection, coverage, and accuracy.
What are your Business Goals and Technical Requirements For Web and URL Filtering?
First step? Define your business goals and requirements.
Before getting underway with an evaluation, we recommend clearly defining your goals, expectations, and requirements specific to your web filtering needs. This includes how and where it will be implemented, general performance goals (such as queries per second), and hardware requirements (such as storage space).
Outlining your goals and requirements up front can significantly improve communication and understanding of all needs between your executive, technical, and business personnel involved in the evaluation.
Below, we’ve outlined some common questions that will help you along the path to identifying the best technology for you:
What are your primary business goals for web filtering?
- Provide custom Parental Control categories by profile
- Real-Time Identification/blocking of Malicious Threats
- Block Unwanted/Objectionable sites
- Support for # of unique categories
- Restricting Network Access to Specific Sites (social media)
- Data Collection & Analytics
- Ability to classify a URL/Domain on the fly
What are your technical requirements for web filtering?
- Are you looking for a cloud API, thin lightweight, on-premises database, etc.?
- If applicable, what are your device or app storage limitations?
- How many requests/second do you need to support?
- What operating systems do you need to support? Other environment needs?
- Specific taxonomies? (IAB, Objectionable, etc.)
- Do you need real-time updates? When will other regular updates be made?
- Level of maintenance/support? Ease of integration?
By gathering and determining these details up front, your team will be prepared to answer questions specific to their contributing role in your evaluation and implementation needs.
What Criteria Should Be Used To Measure a Web or URL Filtering Database?
Now that you’ve defined requirements and expectations for an evaluation, let’s cover the important criteria to grade by during a web filtering database evaluation.
Coverage is defined by the total number of URLs queried that return a category as compared to the total number of URLs tested.
Coverage % = (# of Categorized URLs) / (Total URLs Tested)
When evaluating a URL category database or web filtering technology, Coverage is one of the most critical quality indicators. A high coverage rate indicates that the technology or service provider has and maintains systems that continuously monitor, analyze, and categorize new sites and pages. For protection against malicious threats, a high coverage rate is paramount. Web filtering and categorization services aren’t working to actively fight viruses or quarantine malicious code; they identify and block threats before you connect to them. So, if a threat hasn’t been found and identified as malicious, the filtering solution itself won’t offer any protection against it.
GOAL: A Coverage rate (percentage) in the upper 90s is a sign of a high-quality web filtering solution. zvelo proudly maintains over 99.9% Coverage of the ActiveWeb.
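As a rough illustration of the Coverage formula above, here is a minimal Python sketch. The function name, the shape of the results dictionary, and the sample URLs are all assumptions made for this example; they do not reflect any particular vendor's API.

```python
from typing import Mapping, Optional


def coverage_rate(results: Mapping[str, Optional[str]]) -> float:
    """Return the share of tested URLs that came back with a category.

    `results` maps each tested URL to the category returned by the
    filtering service, or None / "" when no category was returned.
    """
    if not results:
        raise ValueError("no URLs were tested")
    categorized = sum(1 for category in results.values() if category)
    return categorized / len(results)


# Hypothetical lookup results, for illustration only.
sample_results = {
    "example.com": "News",
    "example.org/shop/cart": "Shopping",
    "example.net": None,  # no category returned
    "example.edu/courses": "Education",
}

print(f"Coverage: {coverage_rate(sample_results):.1%}")
```

Running the same calculation against each vendor's results for an identical test corpus keeps the comparison like-for-like.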
Accuracy is defined as the percentage of tested URLs that are verified as being correctly classified.
Accuracy % = (# of Accurately Categorized URLs) / (Total URLs Tested)
This indicator, above all others, is what separates the great web filtering technologies from the rest of the pack. Accuracy should be measured using human verification to qualify the categories returned for your test corpus of URLs. Uncategorized URLs and miscategorizations should be considered inaccurate. Accuracy may vary based on the source language of web content, as well as other factors.
GOAL: Similar to Coverage, an Accuracy percentage in the upper 90s indicates both quality and protection. For example, an Accuracy of 99% demonstrates that the web filtering technology’s systems and processes are finely tuned and managed to return correct classifications.
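The Accuracy formula can be sketched the same way. In this hypothetical example, `verified` holds the category a human reviewer assigned to each tested URL and `returned` holds what the filtering service reported; both names and all sample data are assumptions. Uncategorized URLs count as inaccurate, as described above.

```python
from typing import Mapping, Optional


def accuracy_rate(
    returned: Mapping[str, Optional[str]],
    verified: Mapping[str, str],
) -> float:
    """Share of all tested URLs whose returned category matches the
    human-verified category. Missing or empty results count as wrong."""
    if not verified:
        raise ValueError("no verified URLs to score against")
    correct = sum(
        1 for url, expected in verified.items()
        if returned.get(url) == expected
    )
    return correct / len(verified)


# Hypothetical data, for illustration only.
verified_labels = {
    "example.com": "News",
    "example.org/shop/cart": "Shopping",
    "example.net": "Sports",
    "example.edu/courses": "Education",
}
service_results = {
    "example.com": "News",
    "example.org/shop/cart": "Shopping",
    "example.net": None,             # uncategorized, counted as inaccurate
    "example.edu/courses": "Games",  # miscategorized
}

print(f"Accuracy: {accuracy_rate(service_results, verified_labels):.1%}")
```

This sketch compares category names as exact strings, so it assumes the vendor's categories have already been mapped to the names your reviewers use.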
Speed & Performance
The speed and performance of a web filtering technology is critical to the user experience and must meet the demands of any sized network—making it another of the most critical evaluation criteria.
In many cases, it is prudent to perform shorter, focused tests up front to determine the overall viability of a web filtering technology (i.e. for Coverage and Accuracy). Once complete, we recommend running longer tests with an API, local SDK, or other implementation on a network with real-world traffic in order to measure performance. Some important test metrics and things to think about include:
- Identify peak resource usage
- Identify maximum number of queries per second
- Identify any blockages
- Measure latency and calculate the time to return a URL category
- Measure CPU and disk usage
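For the latency and queries-per-second metrics listed above, a simple timing harness along the following lines may be a useful starting point. It is only a sketch: `fake_lookup` is a stand-in that you would replace with a call to the SDK or API under test, and the harness runs lookups one at a time, so it reports sequential throughput and not the maximum a concurrent client could reach. CPU and disk usage are better captured with operating system monitoring tools.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_lookups(lookup: Callable[[str], object], urls: Iterable[str]) -> dict:
    """Time each lookup call and summarize latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for url in urls:
        t0 = time.perf_counter()
        lookup(url)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    if not latencies:
        raise ValueError("no URLs were queried")

    ordered = sorted(latencies)
    # Nearest-rank style 95th percentile, clamped to the last element.
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    return {
        "queries": len(latencies),
        "queries_per_second": len(latencies) / elapsed,
        "mean_latency_ms": statistics.mean(latencies) * 1000,
        "p95_latency_ms": p95 * 1000,
        "max_latency_ms": ordered[-1] * 1000,
    }


def fake_lookup(url: str) -> str:
    """Hypothetical stand-in for a real category lookup."""
    time.sleep(0.001)  # simulate a small amount of work
    return "Uncategorized"


stats = measure_lookups(fake_lookup, ["https://example.com/"] * 20)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```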
Coverage, Accuracy, and Performance are among the most important aspects to evaluate—but depending on your application or use case, you may wish to take a closer look at the following:
# of Categories Supported: A greater number of unique categories supports increased precision and filtering capabilities based on domain/URL classifications.
Full Path URL Support: Full path refers to the complete URL (Uniform Resource Locator), indicating the individual and specific page, article, or file on the site. This includes the base domain as well as the protocol, subdomain, path, file, and any parameters included in the URL. Full path URL support is particularly critical for malicious sources, which can reside in just one file or on a single page of a website. If you’d like to know more about the difference between full path and base domain, check out our blog here.
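To make the parts of a full path URL concrete, the sketch below splits a made-up URL with Python's standard `urllib.parse` module. The URL and the `describe_url` helper are illustrative assumptions. Note that separating the subdomain from the base domain reliably requires a public suffix list, which the standard library does not include, so this example reports only the full host name.

```python
from urllib.parse import parse_qsl, urlsplit


def describe_url(url: str) -> dict:
    """Break a URL into the components a full path lookup can use."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,  # subdomain plus base domain
        "path": parts.path,      # directories and file name
        "parameters": dict(parse_qsl(parts.query)),
    }


# Hypothetical URL, for illustration only.
full = describe_url("https://blog.example.com/downloads/tool.exe?ref=mail")
for name, value in full.items():
    print(f"{name}: {value}")
```

A domain-only lookup would see just `example.com` here, while a full path lookup can also consider the specific file at `/downloads/tool.exe`.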
Malicious Detection: High Coverage and Accuracy marks generally indicate that a filtering technology has malicious detection capabilities. The lifespan of online threats varies significantly—requiring continuous analysis and re-evaluation of compromised threats to keep up with status changes.
Language Support: The internet is global, so effective web filtering technologies must support categorization and filtering of all websites and pages, regardless of language. zvelo’s categorization services support nearly 200 languages worldwide, providing the highest level of coverage regardless of native language.
How to Prepare URLs For Testing a Web Filtering Technology/Database
Understanding how to build a corpus of URLs for testing can save you a significant amount of time and energy. It will also ensure you are prepared for questions that will arise and allow you to better compare results between multiple vendors/solutions.
Here are test-specific recommendations:
Prepare a Testing Set for Coverage
Be sure to gather/include URLs that are good examples of your actual traffic and needs. This may mean working with your network administrator to pull recent traffic logs and working together to define all of the various expected traffic. By doing this, the Coverage Rate(s) you see during an evaluation will closely match what you can expect once it’s fully implemented.
- Pull URLs that are representative of your network’s traffic, but also from popular “known” sites.
- Include a combination of domain-only and full path URLs.
- Remove any duplicates from your test corpus.
- Remove any unwanted parameters or Personally Identifiable Information (PII) from URLs.
Additionally, you’ll want to run a Coverage test of this same corpus of URLs on several occasions to see how coverage changes/improves over time. Be sure to track your test results.
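The cleanup steps in the list above (removing duplicates, unwanted parameters, and PII) can be sketched as follows. This is a minimal example under stated assumptions: `KEEP_PARAMS` is a hypothetical allowlist of query parameters you consider safe to keep, entries are expected to include a scheme such as `https://`, and the sample URLs are invented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: query parameters considered safe to keep.
KEEP_PARAMS = {"id", "page"}


def clean_url(url: str) -> str:
    """Drop credentials, fragments, and non-allowlisted query parameters."""
    url = url.strip()
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        return url  # not a scheme-qualified URL; leave it for manual review
    # Rebuild the host without any "user:password@" prefix.
    # IPv6 literal hosts are not handled in this sketch.
    netloc = parts.hostname
    if parts.port:
        netloc = f"{netloc}:{parts.port}"
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in KEEP_PARAMS]
    return urlunsplit((parts.scheme, netloc, parts.path, urlencode(kept), ""))


def build_corpus(urls):
    """Clean every URL and remove duplicates while preserving order."""
    seen = set()
    corpus = []
    for raw in urls:
        cleaned = clean_url(raw)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            corpus.append(cleaned)
    return corpus


# Hypothetical log entries, for illustration only.
raw_urls = [
    "https://Example.com/news?id=7&email=jane%40example.org",
    "https://example.com/news?id=7",
    "https://user:secret@example.org/",
    "https://example.org/",
]
corpus = build_corpus(raw_urls)
print(corpus)
```

Using an allowlist instead of a blocklist means unknown parameters are dropped by default, which is the safer choice when the logs may contain PII.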
Prepare a Testing Set For Accuracy
For measuring accuracy, you can use the same corpus of URLs from your coverage testing to verify categories. You may wish to define what categories are considered accurate and have the same team perform accuracy verification across your tests—since different vendors have different names for the same category.
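Because vendors name the same category differently, one way to keep accuracy verification consistent is to map each vendor's category names onto a shared evaluation taxonomy before scoring. The vendor names, category names, and shared labels below are all invented for illustration.

```python
from typing import Optional

# Hypothetical vendor category names mapped to a shared evaluation taxonomy.
CATEGORY_MAP = {
    "vendor_a": {
        "Adult Content": "adult",
        "Malware Sites": "malicious",
        "News & Media": "news",
    },
    "vendor_b": {
        "Pornography": "adult",
        "Malicious": "malicious",
        "News": "news",
    },
}


def normalize(vendor: str, category: Optional[str]) -> Optional[str]:
    """Translate a vendor category into the shared taxonomy.

    Returns None for uncategorized results and "unmapped" for any
    category the mapping does not yet cover, so it can be reviewed.
    """
    if category is None:
        return None
    return CATEGORY_MAP.get(vendor, {}).get(category, "unmapped")


print(normalize("vendor_a", "Adult Content"))
print(normalize("vendor_b", "Pornography"))
print(normalize("vendor_b", "Gambling"))
```

Flagging unknown names as "unmapped" instead of silently scoring them as wrong makes gaps in the mapping visible to the review team.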
Remember to pay careful attention to malicious and objectionable categories. You may wish to build a secondary test corpus of URLs for this. Because of the nature of malicious and objectionable content—it is critical to test URLs that are very recent and up to date. You may wish to pull a list of malicious/objectionable URLs from email quarantine or an up-to-date malicious feed.
NOTE: If you test objectionable content (e.g. child pornography, terrorist related, etc.), be aware of any legal or compliance requirements—and follow appropriate and applicable local laws. In many countries, it is illegal to even possess a list of these URLs.
Testing For Performance
When testing for performance you may wish to use network activity monitors, “sniffers”, and other diagnostic tools to determine overall performance. You should run a variety of tests on relevant networks and devices—and for a variety of durations to determine any limitations or shortcomings of the technology.
Web Filtering Value Breakdown and Feature Comparison
Taking all of this into account, we can compare the overall value and features of web filtering technologies. Most web filtering technologies fit into one of the following four (4) tiers:
- Commodity DNS
- Basic DNS
- Enhanced DNS
- Premium Web Filtering
Commodity DNS offerings are often selected when low/no cost is the main driver. These “checkbox” filtering solutions often use static feeds and unmanaged databases that provide minimum coverage, accuracy, and protection. Unfortunately, they offer little to no protection against malicious and objectionable sources, lack customization or flexibility, and updates are often delayed.
Basic DNS solutions are often cloud-based, offering easy implementation and fast performance. Similar to commodity offerings, they tend to be more affordable because the underlying database or data sets are not actively managed for high protection or efficiency. Additionally, they do not provide coverage beyond the domain level—offering limited protection against malicious and objectionable content. They are also insufficient for discerning audience, customers, or anything at the full path level.
Enhanced DNS offerings provide improved flexibility, coverage, and accuracy while powered by a managed filtering database for improved protection—achieving a passing web filtering grade. These solutions are still budget-conscious and are often provided via cloud services that keep cost down and remove the requirement for an on-premise UTM/Gateway. However, this also limits protection from malicious and objectionable content. Enhanced DNS is easy to implement but lacks the benefits of full path URL support.
Premium Web Filtering technologies and databases, like zveloDB™ URL Database, provide exceedingly high coverage and accuracy rates. They are built to support a variety of implementation options and to scale for enterprises and security-conscious businesses. This, combined with a high number of categories and full path support, results in outstanding identification of and protection from objectionable content (e.g. pornography, terrorism, violence/hate speech) as well as malicious content such as phishing, botnets, malware, and more. The market’s leading network security and anti-virus vendors, device manufacturers, and communications companies all require premium offerings to provide meaningful, up-to-the-minute protection for their employees and users.
In the table below, we’ve compared the features offered by filtering solutions in each of these tiers. From a standpoint of capabilities and protection—the results are clear.
Ultimately, with web filtering technologies and databases, you get what you pay for. Enhanced and Premium offerings provide the highest levels of coverage, accuracy, and protection from online threats. They also provide a higher level of flexibility and customization for IT teams to control and tailor the web browsing experiences of users and customers, depending on a company’s model. But rest assured, paying a higher price for Premium Web Filtering means you are supported by teams of security professionals, scalable systems and infrastructure, and responsive customer support that is singularly focused on the quality and protection of their offering.
Final Thoughts & Considerations
At the time of writing this blog, there are nearly 2 billion websites on the internet and counting. It’s important to remember that no web filtering technology will achieve 100% accuracy—AND that content and malicious threats are constantly changing. Today, premium web filtering and categorization services, like zveloDB, use machine learning along with scalable systems and processes to perform continuous monitoring and analysis of content on the internet. The result? Peace of mind.
For more information about the zveloDB URL Database and our other data solutions, contact our support team. If you’re ready to schedule an evaluation, click here.