Our willingness to surrender personal privacy in exchange for services that we now consider essential, as discussed in a previous article, has made it much easier for large governments and private individuals alike to collect information.
ISPs, Telcos, device manufacturers, and security vendors go to great lengths to provide their customers’ with online security from malicious and objectionable content (adult, pornography, hate speech, terrorism, cryptocurrency mining, etc.). The industry’s best web filtering (and dns filtering) and parental controls are powered by a global network of over 600 million end users providing unmatched coverage and accuracy of active web traffic and websites. zvelo provides 99.9% coverage and over 99% accuracy for the ActiveWeb. That’s best-in-class website categorization database for OEMs and device manufacturers.
zvelo once offered 53 categories that were used to classify content on websites about Businesses & Services, Politics & Law, Portal Sites and others. This was later raised to 141 categories to help cover even more topics. The latest version boasts nearly 500 categories, making it one of the most granular categorization sets in the industry. We’ve managed to upgrade our categorization systems to better serve the needs of our existing and future technology partners and following is one example why this matters.
Given the dynamic nature of the majority of today’s websites, categorization at the full path URL versus the base domain is superior and now required. Parts of a website include the top-level domain (.com, .org, etc.), the base domain (example.com), sub-domain (subdomain.example.com) or sub-path (example.com/page). When categorizing content, it is highly important to recognize exactly what is being classified within a website because content can differ dramatically across full path URLs.
What is a URL parameter? Quite simply it is a string of characters, or a query string, that is appended to a URL that contains data. This data is passed to predefined web applications to find the appropriate content and return it back to the user’s web browser which then generates the entire web page. The query string can also be used for various other methods such as identifying a user’s session or using it as a way to look up information about your online bank account after you have logged in. URLs with parameters are used by various types of web sites however online shopping, auction, and banking type sites are probably the most prevalent.
Manually classifying the content on a single web page takes but a few seconds to accomplish. Analyzing the keywords – words or phrases – used and the number of instances of each – keyword density – is one way to go about it. When needing to classify the content on billions of web pages at a time, however, the task becomes overwhelmingly daunting for any human eye to handle. In this scenario, only an automated content classification engine can succeed.
zvelo has received many requests from its technology partners who are in the web filtering and parental control sectors to institute and support a new category that can be used to identify websites that promote self-harm behaviors. As a result of such demand, a new “Self Harm” category has been added to the zveloDB® URL database.
Anatomy of a Dynamic Website Of the hundreds of billions of URL queries zvelo has received for website categorization in 2013, an estimated 27% have been classified as being dynamic (see image 1). Dynamic categories in this data sample included Social Networking, News, Search Engines, Personal Pages & Blogs, Community Forums, Technology (General), and Chat.…
In mid-2013, British Prime Minister, David Cameron, began a push to block pornographic material on the Web in UK households. Under the new legislation, porn would be filtered by default and citizens would have to opt-in to view such adult content. Enforcement of such an ambitious initiative comes with many content categorization and technical challenges, not just in the UK, but within any internet service provider infrastructure.
Wi-Fi hotspots commonly found in many American coffee shops, restaurants and other popular after-school hang outs are providing kids with what they demand – free Internet access. This may help keep them connected with family or friends, in addition to sparing parents from costly data plan overages, but the complimentary Web access was proven to come with a twist in an Adaptive Mobile independent study. The adult, dating, extremist, drug, gambling and other similarly objectionable content typically blocked at home by some type of parental controls solution is easily accessible by kids at these Wi-Fi locations.
I got my hands on a copy of a Northwestern University research paper titled “Evaluating Android Anti-malware against Transformation Attacks.” After digging into it, my zveloLABS colleagues and I decided to conduct an experiment of our own based on the information provided in the research paper.