In this paper, we present the results of a large-scale analysis of open HTTP proxies, focusing on determining the extent to which user traffic is manipulated while being relayed. We have designed a methodology for detecting proxies that, instead of passively relaying traffic, actively modify the relayed content. Beyond simple detection, our framework is capable of macroscopically attributing certain traffic modifications at the network level to well-defined malicious actions, such as ad injection, user fingerprinting, and redirection to malware landing pages.

Our study reveals the true incentives of many of the publicly available web proxies. Our findings raise several concerns, as we uncover multiple cases where users can be severely affected by connecting to an open proxy. As a step towards protecting users against unwanted content modification, we built a service that leverages our methodology to collect and probe public proxies automatically, and generates a list of safe proxies that do not perform any content modification, on a daily basis.