A web content filtering vendor wanted to incorporate a URL database and dynamic web content categorization into their web content filtering application and embed it into a child-friendly Android tablet to ensure safe…
ISPs, telcos, device manufacturers, and security vendors go to great lengths to provide their customers with online security from malicious and objectionable content (adult, pornography, hate speech, terrorism, cryptocurrency mining, etc.). The industry’s best web filtering (and DNS filtering) and parental controls are powered by a global network of over 600 million end users, providing unmatched coverage and accuracy of active web traffic and websites. zvelo provides 99.9% coverage and over 99% accuracy for the ActiveWeb. That’s a best-in-class website categorization database for OEMs and device manufacturers.
Advertisements are everywhere, from print publications to road-side billboards, and of course TV and on the Web. The intent of advertising is no different regardless of the medium. Advertisers are constantly feuding to win over consumer sentiment. On the Internet, ad-serving technologies have become so advanced that ads can now be targeted based on one’s individual web browsing history and behaviors, likes, shares, location, device type and other factors. From time to time, however, ad placements land severely out-of-context, and here is one such example of online advertising gone bad.
Our willingness to surrender personal privacy in exchange for services that we now consider essential, as discussed in a previous article, has made it much easier for large governments and private individuals alike to collect information.
zvelo once offered 53 categories that were used to classify content on websites about Businesses & Services, Politics & Law, Portal Sites and others. This was later raised to 141 categories to help cover even more topics. The latest version boasts nearly 500 categories, making it one of the most granular categorization sets in the industry. We’ve upgraded our categorization systems to better serve the needs of our existing and future technology partners, and the following is one example of why this matters.
Given the dynamic nature of the majority of today’s websites, categorization at the full path URL versus the base domain is superior and now required. Parts of a website include the top-level domain (.com, .org, etc.), the base domain (example.com), sub-domain (subdomain.example.com) or sub-path (example.com/page). When categorizing content, it is highly important to recognize exactly what is being classified within a website because content can differ dramatically across full path URLs.
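To make the URL parts above concrete, here is a minimal Python sketch that splits a URL into those pieces and then looks up a category from most specific (full path) to least specific (base domain). The function names, the sample category table, and the "last two labels" rule for the base domain are assumptions made for illustration only. They do not describe how zvelo's systems work, and the two-label rule is wrong for suffixes such as .co.uk.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into TLD, base domain, sub-domain, and path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Assumption: the base domain is the last two labels of the host.
    # A real system would need a public-suffix list (e.g. for .co.uk).
    base_domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return {
        "host": host,
        "tld": "." + labels[-1] if host else "",
        "base_domain": base_domain,
        "subdomain": ".".join(labels[:-2]),
        "path": parts.path,
    }


def lookup_category(url: str, db: dict):
    """Return the category for the most specific key found in db.

    db is a hypothetical mapping such as {"example.com/forum": "..."}.
    """
    p = url_parts(url)
    candidates = (p["host"] + p["path"], p["host"], p["base_domain"])
    for key in candidates:
        if key in db:
            return db[key]
    return None


if __name__ == "__main__":
    sample_db = {
        "example.com": "News",
        "example.com/forum": "Community Forums",
    }
    print(url_parts("http://subdomain.example.com/page"))
    print(lookup_category("http://example.com/forum", sample_db))
    print(lookup_category("http://example.com/other", sample_db))
```

In this sketch, a page under example.com/forum gets its own category, while any other path falls back to the category stored for the base domain. That fallback is the difference between full path and base domain categorization.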
What is a URL parameter? Quite simply, it is a string of characters, or a query string, that is appended to a URL and contains data. This data is passed to predefined web applications, which find the appropriate content and return it to the user’s web browser, which then generates the entire web page. The query string can also be used for other purposes, such as identifying a user’s session or looking up information about your online bank account after you have logged in. URLs with parameters are used by many types of websites; however, online shopping, auction, and banking sites are probably the most prevalent.
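As a small illustration of a query string, the following Python sketch uses the standard library to pull the parameters out of a URL. The shop URL and the parameter names (item, size, sessionid) are made up for this example.

```python
from urllib.parse import urlsplit, parse_qs


def query_params(url: str) -> dict:
    """Return the URL's query-string parameters as {name: [values]}."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    # Hypothetical shopping URL; everything after "?" is the query string.
    url = "https://shop.example.com/search?item=shoes&size=10&sessionid=abc123"
    params = query_params(url)
    print(params["item"])       # ['shoes']
    print(params["sessionid"])  # ['abc123']
```

Each value comes back as a list because a parameter name may appear more than once in the same query string.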
Manually classifying the content on a single web page takes but a few seconds to accomplish. Analyzing the keywords – words or phrases – used and the number of instances of each – keyword density – is one way to go about it. When needing to classify the content on billions of web pages at a time, however, the task becomes overwhelmingly daunting for any human eye to handle. In this scenario, only an automated content classification engine can succeed.
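Keyword density, as described above, can be sketched in a few lines of Python. This is only an illustration of the counting step, not a content classification engine: it handles single words rather than phrases, it does not filter common words such as "and", and the function name and sample text are assumptions.

```python
import re
from collections import Counter


def keyword_density(text: str, top_n: int = 5) -> list:
    """Return (word, count, density) for the top_n most frequent words.

    Density is the word's count divided by the total number of words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(word, n, n / total) for word, n in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "poker chips and poker tables for poker night"
    for word, n, density in keyword_density(sample, top_n=3):
        print(f"{word}: {n} occurrences, density {density:.3f}")
```

For one page this runs instantly. The challenge described in the paragraph above is repeating this kind of analysis, and far more sophisticated ones, across billions of pages.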
zvelo has received many requests from its technology partners who are in the web filtering and parental control sectors to institute and support a new category that can be used to identify websites that promote self-harm behaviors. As a result of such demand, a new “Self Harm” category has been added to the zveloDB® URL database.
Anatomy of a Dynamic Website
Of the hundreds of billions of URL queries zvelo received for website categorization in 2013, an estimated 27% were classified as dynamic (see image 1). Dynamic categories in this data sample included Social Networking, News, Search Engines, Personal Pages & Blogs, Community Forums, Technology (General), and Chat.…
In mid-2013, British Prime Minister David Cameron began a push to block pornographic material on the Web in UK households. Under the new legislation, porn would be filtered by default and citizens would have to opt in to view such adult content. Enforcing such an ambitious initiative comes with many content categorization and technical challenges, not just in the UK, but within any internet service provider’s infrastructure.