AutoCat (Cloud) Website, URL, Web Content Categorization Engine
The AutoCat automated classification engine utilizes proprietary, AI-based technologies to categorize new URLs across dozens of languages.
The processes and systems within the AutoCat engine (short for automated categorization) combine URL analysis, taxonomic content categorization and zero-hour malicious or compromised website detection to provide highly accurate categorization of URLs and dynamic web content.
Categorization Engine Features
The following features position the AutoCat engine as an ideal solution for web content filtering applications developed by service providers, endpoint security and anti-virus software makers, UTM/gateway appliance vendors, online advertising and brand safety technology providers, and other high-growth market segments where accuracy, coverage, malicious website detection and fast URL query performance are required.
The ActiveWeb is defined as the websites visited by actual users. These highly engaged and active users are those of zvelo’s technology partners and make up the global community that continually feeds web queries into the AutoCat engine hosted on the zveloNET® cloud network.
Websites, URLs and web content submitted to and categorized by AutoCat are immediately stored in the zveloDB® URL database. zveloDB provides in-depth contextual content categorization support at the domain, sub-domain, sub-path and page level. This is particularly important for social networking and blogging websites, where diverse and ever-changing content exists, often hidden behind a login.
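As an illustration of categorization at the domain, sub-domain, sub-path and page level, the sketch below resolves a URL against the most specific entry available. It is not zveloDB's actual lookup logic: the in-memory table `SAMPLE_DB`, the category names and the functions `candidate_keys` and `lookup` are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for a URL database; keys are "host/path" prefixes.
SAMPLE_DB = {
    "example.com": ["News"],
    "blog.example.com": ["Blogs"],
    "blog.example.com/cooking": ["Food"],
    "blog.example.com/cooking/knives.html": ["Shopping"],
}


def candidate_keys(url):
    """Yield lookup keys from most specific (page) to least specific (domain)."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]
    labels = host.split(".")
    # Page level first, then progressively shorter sub-paths.
    for i in range(len(segments), 0, -1):
        yield host + "/" + "/".join(segments[:i])
    # Full host, then parent domains down to two labels.
    # Simplification: this ignores multi-label public suffixes such as "co.uk".
    for i in range(0, max(len(labels) - 1, 1)):
        yield ".".join(labels[i:])


def lookup(url, db=SAMPLE_DB):
    """Return (matched_key, categories) for the most specific match, if any."""
    for key in candidate_keys(url):
        if key in db:
            return key, db[key]
    return None, []
```

With this table, a page under `blog.example.com/cooking` that has no entry of its own inherits the sub-path's categories, while an unlisted host such as `www.example.com` falls back to the domain entry.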
Nearly 500 Categories
With the most granular category set in the industry, the nearly 500 categories of the zveloDB URL database allow for precise policy management and controls. Up to five categories can be assigned to any given URL. These can be mapped to a virtually unlimited number of optional category sets, including legacy vendors' category sets, parental control category sets, reputation filtering category sets, the IAB’s contextual taxonomy tiers and targeting levels, and others.
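A minimal sketch of how a URL's categories might be mapped onto an optional category set follows. The category names, the `PARENTAL_CONTROL_MAP` table and the `map_categories` function are hypothetical; only the limit of five categories per URL comes from the description above.

```python
MAX_CATEGORIES = 5  # up to five categories can be assigned to a URL

# Hypothetical mapping from native categories to a coarser parental-control set.
PARENTAL_CONTROL_MAP = {
    "Pornography": "Adult",
    "Gambling": "Adult",
    "Weapons": "Violence",
    "Social Networking": "Social",
}


def map_categories(native, mapping, default="Uncategorized"):
    """Translate native categories into a target set, keeping order, dropping duplicates."""
    if len(native) > MAX_CATEGORIES:
        raise ValueError("a URL carries at most five categories")
    mapped = []
    for name in native:
        target = mapping.get(name, default)
        if target not in mapped:
            mapped.append(target)
    return mapped
```

Keeping the mapping in a plain table means additional category sets can be supported by adding tables rather than changing the categorization itself.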
AutoCat analyzes a number of factors when categorizing URLs, such as history, links, patterns, HTTP status codes and many more.
Taxonomic Content Categorization
The taxonomic content categorization of AutoCat accounts for language, content type, text and metadata analysis, and other important factors.
Malicious Website Detection
AutoCat provides detection and protection against malicious, compromised and infected websites. AutoCat systems are continuously enhanced by zveloLABS®, a team of anti-virus and anti-malware engineers and researchers who aim to close the zero-hour coverage gap of most commercial anti-malware software. In addition to developing its own proprietary malicious website detection capabilities, zveloLABS maintains malicious site and signature sharing relationships with a wide range of third-party security and anti-malware organizations to ensure the highest level of detection for threats ranging from botnets, malware distribution points and compromised sites to sites used for phishing, fraud and spam.
zvelo fully supports IPv6 for both URL queries and the categorization of IPv6 addresses. Supporting IPv6 is necessary because IPv4 address blocks are increasingly scarce.
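For a client that submits URLs for categorization, handling IPv6 mostly means recognizing bracketed address literals in the host part. The sketch below uses only the Python standard library; the function name `host_kind` is an assumption for the example and is not part of any zvelo API.

```python
import ipaddress
from urllib.parse import urlsplit


def host_kind(url):
    """Return 'ipv6', 'ipv4' or 'name' for the host part of a URL."""
    # urlsplit strips the square brackets around an IPv6 literal.
    host = urlsplit(url).hostname or ""
    try:
        return "ipv%d" % ipaddress.ip_address(host).version
    except ValueError:
        # Not an IP literal, so treat it as a host name.
        return "name"
```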
Spam Web Page Detection
The enhanced spam web page detection feature of AutoCat combines extensive link and content analysis that takes into account a number of factors, such as the number of external links and page reputation. Content category variance is also measured: the more unrelated content a page contains, the more likely it is spam. Pages are also compared against a list of commonly spam-targeted URL categories such as finance, porn and blogs. A point system determines whether a site is spam based on these and other criteria. The results are dynamically stored within the zveloDB URL database so that any subsequent hits to these spam web pages can be accounted for and blocked by zvelo’s technology partners.
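The point system described above can be pictured with a toy scorer. The criteria (external links, page reputation, category variance, spam-targeted categories) come from the text, but every weight, cutoff and name in this sketch (`PageSignals`, `spam_points`, `THRESHOLD`, the 0.0 to 1.0 reputation scale) is an assumption chosen for illustration, not AutoCat's actual scoring.

```python
from dataclasses import dataclass, field

# Spam-targeted categories named in the text; weights and threshold are hypothetical.
SPAM_TARGETED = {"finance", "porn", "blogs"}
THRESHOLD = 5


@dataclass
class PageSignals:
    external_links: int
    reputation: float  # assumed scale: 0.0 (bad) to 1.0 (good)
    categories: list = field(default_factory=list)


def spam_points(page):
    """Add up points for each spam indicator present on the page."""
    points = 0
    if page.external_links > 50:
        points += 2
    if page.reputation < 0.3:
        points += 2
    # Category variance: each distinct category beyond the first adds a point.
    points += max(len(set(page.categories)) - 1, 0)
    if SPAM_TARGETED & set(page.categories):
        points += 1
    return points


def is_spam(page):
    return spam_points(page) >= THRESHOLD
```

A page with many external links, poor reputation and a mix of unrelated categories accumulates points quickly, while a well-regarded single-topic page scores zero.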
Embedded URLs Categorization
Embedded URLs can exploit vulnerabilities in common web filters to allow access to prohibited websites or inappropriate web content. Inadequate blocking of embedded URLs poses significant risks to web content filtering OEMs. AutoCat can detect and categorize embedded URL content, with both manual and automated options for decoding and categorizing embedded URLs within query strings. The manual option, which is an API function, searches for embedded URLs in either plain text or obfuscated formats (including Base64 and rot13). The automated option identifies embedded URLs in anonymizer and translator websites and returns merged category sets for the full URL queried.
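To make the idea concrete, here is a minimal sketch of finding URLs embedded in a query string as plain text, rot13 or Base64. It is not the AutoCat API function; `find_embedded_urls`, the helper `_decodings` and the simple `URL_RE` pattern are assumptions for the example, and the Base64 step assumes the URL-safe alphabet.

```python
import base64
import binascii
import codecs
import re
from urllib.parse import parse_qsl, urlsplit

# Simplified pattern: a value counts as a URL if it starts with http(s):// or www.
URL_RE = re.compile(r"^(?:https?://|www\.)[^\s]+$", re.IGNORECASE)


def _decodings(value):
    """Yield the value as plain text, as rot13, and as Base64 when it decodes cleanly."""
    yield value
    yield codecs.decode(value, "rot13")
    try:
        padded = value + "=" * (-len(value) % 4)
        yield base64.urlsafe_b64decode(padded).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        pass  # not valid Base64 text, so skip this interpretation


def find_embedded_urls(url):
    """Return URLs found in the query-string values of `url`, in order of discovery."""
    found = []
    for _, value in parse_qsl(urlsplit(url).query):
        for text in _decodings(value):
            if URL_RE.match(text) and text not in found:
                found.append(text)
    return found
```

Each URL found this way could then be categorized on its own and its categories merged with those of the outer URL, which is the behavior described for the automated option.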
Quality Assurance Process
URLs and web content categorized by AutoCat undergo strict monitoring and a continuous double-blind Quality Assurance verification process conducted by a team of multilingual web analysts, linguists and supervisors. This process is vital for critical “objectionable” categories such as porn, hate, violence, profanity, weapons and other inappropriate content. In addition, miscategorized URLs (miscats) receive an immediate response, providing a constant feedback loop to the zveloNET categorization systems and processes.
To learn more about zvelo in action, see the OEM Partnership case studies.