If one performs the search “use www or not,” well over a billion results in many of the most popular search engines are returned. The focus of each result may differ. For zvelo, the usage is irrelevant because its contextual categorization processes are designed to identify and handle each component of a URL. At a simplistic view, the basic components of a URL are the following:
Cybercrime against high-profile entities like eBay and Target is on the rise, and the media has conjured up nightmarish scenarios of cyber-criminals going on shopping sprees with our well-earned cash – easily obtained through stolen credit card information. The risks that the general public faces vary and should not be applied equally.
zvelo once offered 53 categories that were used to classify content on websites about Businesses & Services, Politics & Law, Portal Sites and others. This was later raised to 141 categories to help cover even more topics. The latest version boasts nearly 500 categories, making it one of the most granular categorization sets in the industry. We’ve managed to upgrade our categorization systems to better serve the needs of our existing and future technology partners and following is one example why this matters.
The importance of the Alexa top websites can never be discounted in zvelo’s day-to-day operations. Providing contextual data sets about the Alexa top sites is a vital element for the online advertising market because it can assist in determining the most ideal and brand-safe placement of online ads and other promotional materials.
Recent events serve as the best example of how the context of security has shifted from the once server-centric model to that of a decentralized threat landscape. From the Heartbleed attacks to the widespread Internet Explorer vulnerabilities and finally the sensationalized OAuth issues, it appears that even organizations with a hardened perimeter infrastructure are just as vulnerable as an end-user at home.
Given the dynamic nature of the majority of today’s websites, categorization at the full path URL versus the base domain is superior and now required. Parts of a website include the top-level domain (.com, .org, etc.), the base domain (example.com), sub-domain (subdomain.example.com) or sub-path (example.com/page). When categorizing content, it is highly important to recognize exactly what is being classified within a website because content can differ dramatically across full path URLs.
The Internet Watch Foundation works to remove online videos and images of child sexual abuse and its 2013 Annual & Charity Report highlighted significant milestones achieved and a big year of change.
What is a URL parameter? Quite simply it is a string of characters, or a query string, that is appended to a URL that contains data. This data is passed to predefined web applications to find the appropriate content and return it back to the user’s web browser which then generates the entire web page. The query string can also be used for various other methods such as identifying a user’s session or using it as a way to look up information about your online bank account after you have logged in. URLs with parameters are used by various types of web sites however online shopping, auction, and banking type sites are probably the most prevalent.
Manually classifying the content on a single web page takes but a few seconds to accomplish. Analyzing the keywords – words or phrases – used and the number of instances of each – keyword density – is one way to go about it. When needing to classify the content on billions of web pages at a time, however, the task becomes overwhelmingly daunting for any human eye to handle. In this scenario, only an automated content classification engine can succeed.
Prior to this blog post, zveloLABS published a phishing URL alert about fake Apple account verification websites. Now, zvelo’s team of engineers and researchers has unearthed a new phishing attack campaign using fraudulent Facebook log-in sites.