Browser Add-ons Widely Used as Distributed Scraping Network for AI Model Training

Security researchers from Secure Annex have identified a growing trend in browser add-on monetization. Developers are embedding a JavaScript library named mellowtel.js into extensions to form a distributed network of user systems for web content scraping. These systems act as proxies and bots that download website content upon receiving instructions from external servers.


Upon receiving a task, the add-on loads the requested content into a hidden iframe and sends the data to an external service. Researchers have identified this behavior in 245 browser extensions across Chrome, Firefox, and Edge stores, collectively installed by over 909,000 users.

The issue in detail

The Mellowtel platform, which develops the library, positions itself as an ethical alternative to traditional monetization methods such as intrusive advertising or user data exploitation. Developers earn income by allowing their extensions to perform indexing tasks for AI companies. Mellowtel distributes 55% of the generated revenue to extension developers.

The platform says that participation is voluntary and requires user consent. It requires developers to disable scraping by default and to provide a visible option for users to activate or deactivate the feature. However, not all developers adhere to these guidelines.

Some add-ons activate the scraping functionality without user knowledge, violating the platform’s policies. Of the 45 Chrome extensions that use Mellowtel, Google has removed 12 for malicious behavior. Microsoft has blocked 8 of 129 extensions in its store, and Mozilla has removed 2 of 71. The specific reasons for removal remain undisclosed, though unauthorized scraping is a likely cause.

To address misuse, Mellowtel has announced mandatory verification of all extensions using its service. Add-ons must ensure that scraping is disabled by default and that users can easily opt out. Requests from non-compliant extensions will be quarantined and ignored.

What the add-ons do

When active, these extensions establish a WebSocket connection to servers hosted on AWS. They transmit metadata including the user's availability, country, bandwidth, and whether the user has enabled scraping. The add-ons send periodic heartbeat signals to keep the connection alive.

To load external content via hidden iframes, the Mellowtel library strips Content-Security-Policy and X-Frame-Options HTTP headers using the declarativeNetRequest API. It restores these headers after content retrieval. This temporary bypass weakens browser security and exposes users to potential cross-site scripting (XSS) attacks.
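To make the header stripping more concrete, here is a minimal Python sketch that inspects rules written in the documented declarativeNetRequest rule format and reports which ones remove Content-Security-Policy or X-Frame-Options from responses. The example rules, the `stripped_headers` and `audit` helpers, and the `ads.example` filter are assumptions made for illustration and are not taken from mellowtel.js. An extension can also register such rules at runtime, in which case they never appear in a static rule file, so treat this as an illustration of the rule shape rather than a reliable detector.

```python
import json

# Response headers whose removal allows framing or relaxes script restrictions.
SENSITIVE_HEADERS = {"content-security-policy", "x-frame-options"}


def stripped_headers(rule):
    """Return the sensitive response headers that a single rule removes."""
    action = rule.get("action", {})
    if action.get("type") != "modifyHeaders":
        return set()
    removed = set()
    for entry in action.get("responseHeaders", []):
        name = entry.get("header", "").lower()
        if entry.get("operation") == "remove" and name in SENSITIVE_HEADERS:
            removed.add(name)
    return removed


def audit(rules):
    """Map rule id -> sorted list of sensitive headers that the rule removes."""
    findings = {}
    for rule in rules:
        removed = stripped_headers(rule)
        if removed:
            findings[rule["id"]] = sorted(removed)
    return findings


# Hypothetical rules for illustration only (not copied from any real extension).
EXAMPLE_RULES = json.loads("""
[
  {
    "id": 1,
    "priority": 1,
    "action": {
      "type": "modifyHeaders",
      "responseHeaders": [
        {"header": "Content-Security-Policy", "operation": "remove"},
        {"header": "X-Frame-Options", "operation": "remove"}
      ]
    },
    "condition": {"urlFilter": "*", "resourceTypes": ["sub_frame"]}
  },
  {
    "id": 2,
    "priority": 1,
    "action": {"type": "block"},
    "condition": {"urlFilter": "ads.example", "resourceTypes": ["script"]}
  }
]
""")
```

Calling `audit(EXAMPLE_RULES)` flags only the first rule, since the second one blocks a request and does not touch response headers.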

Mellowtel is affiliated with Olostep, a service that offers a high-speed scraping API capable of sending up to 100,000 requests per minute. This level of traffic, when directed at a single target, can resemble a distributed denial-of-service (DDoS) attack. Additionally, the system may exploit user devices as proxies to access private websites available only within specific network environments, such as those behind a corporate VPN.

The cause of the issue

AI companies require large volumes of data to train machine learning models. This demand has led to the proliferation of automated bots that scrape and index websites aggressively, without adhering to standard limitations or respecting robots.txt directives. These bots generate excessive traffic that imposes a significant burden on servers, interferes with system performance, and consumes administrative resources. As a result, organizations have increasingly adopted countermeasures to block such activity.

For instance, Cloudflare, a major content delivery network, now includes default bot-blocking capabilities in its services. Additionally, many websites have deployed the Anubis bot protection mechanism, which grants access only after the client solves a JavaScript proof-of-work challenge. The client must find a value that, when combined with a server-generated string and hashed with SHA-256, yields a hash with a specified number of leading zeros. Solving the challenge costs the client computational effort, while verification remains lightweight for the server.
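The proof-of-work idea behind such a challenge can be sketched in a few lines of Python. This is a simplified illustration, not Anubis's actual implementation: it assumes the client appends a decimal counter to the server's string and that leading zeros are counted in hexadecimal digits of the hash. The function names and the challenge string are made up for the example.

```python
import hashlib
from itertools import count


def digest_for(challenge: str, nonce: int) -> str:
    """SHA-256 of the server string followed by the candidate value, as hex."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve_challenge(challenge: str, difficulty: int) -> int:
    """Client side: try counters until the hash starts with enough zeros."""
    prefix = "0" * difficulty
    for nonce in count():
        if digest_for(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's answer."""
    return digest_for(challenge, nonce).startswith("0" * difficulty)
```

With a difficulty of 3 hexadecimal zeros, the client needs about 16^3 = 4,096 hash attempts on average, while the server checks the answer with one. Each additional zero multiplies the client's expected work by 16, which is what makes large-scale scraping expensive while leaving a single visitor with a short delay.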

As AI companies continue to demand large volumes of training data, more monetization models like Mellowtel will likely appear in the near future. End users should be careful with the add-ons they install, as such 'features' may suddenly appear even in trusted extensions.

Author: Sergey Tkachenko

Sergey Tkachenko is a software developer who started Winaero in 2011. On this blog, Sergey writes about everything connected to Microsoft, Windows, and popular software.
