Since a report by the WODC (see Filtering The Internet) stated that it is unclear whether filtering is effective, we have been in contact with the Ministry of Justice and Meldpunt Kinderporno (Hotline combating Child Pornography) to discuss ways of reducing child pornography on the Internet. Although we, as a service provider, are not responsible for the content our customers put online, we consider it good corporate governance to do everything we can to reduce this problem. As a result, we will soon start a pilot with filtering images.
Leaseweb hosts many public upload sites (image and file hosting sites). For some of those sites, it is impossible to check every file uploaded by unknown users (we have customers with over 1,000,000 uploads per day). With the help of our customers, we have created an online service where hashes can be checked against a database automatically. We have received databases containing hashes of known child pornography images from several organizations (to learn more about hashes, see Cryptographic Hashes). During this pilot, we will use a database from Netclean and databases from the Dutch police.
How it works
When a user uploads a new image, the file is hashed with MD5 into a short, fixed-length string that acts as its fingerprint. This string is matched against our databases of known illegal material. When a match is found, the uploaded image is treated as known illegal material and is not published online. To prevent privacy issues, we will not record IP addresses for matched images.
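The lookup described above can be sketched in a few lines of Python. This is only an illustration and not the actual service: the function names `md5_hex` and `is_blocked`, the in-memory set, and the placeholder bytes are all assumptions made for the example. A real deployment would query the hash databases supplied by the partner organizations.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


def is_blocked(upload: bytes, known_hashes: set) -> bool:
    """Return True if the upload's MD5 digest is in the set of known hashes."""
    return md5_hex(upload) in known_hashes


# Hypothetical stand-in for a hash database: a set of hex digests.
# The bytes below are harmless placeholder data used only for this sketch.
listed_file = b"placeholder bytes standing in for a listed file"
known_hashes = {md5_hex(listed_file)}

print(is_blocked(listed_file, known_hashes))           # exact copy: match
print(is_blocked(b"some other upload", known_hashes))  # different bytes: no match
```

Only the digest is compared, so the check itself never needs to inspect or store the content of the file.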
Caveats
Of course, there are some caveats to this project. If someone changes even a small part of an image, its hash changes and the image will no longer match. The number of hashes in the databases is also limited and small compared to the number of illegal images being distributed. Moreover, a large part of the distribution of child pornography still happens under the surface and not via public websites.
During this pilot we will closely monitor the number of hits (both true and false positives) to see whether it is effective. But when can we call it effective? With 1 blocked image? Or 10? Or 1000? We will improve the techniques further and are already looking into smarter fingerprinting solutions (where images can still be matched after small changes). The future will have to show whether this is the way to go, along with other developments in the industry.
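The first caveat is easy to demonstrate. The sketch below, using harmless placeholder bytes and a hypothetical helper `flip_one_bit`, changes a single bit of the input and compares the two MD5 digests. It is meant only to show why exact-hash matching misses altered files.

```python
import hashlib


def flip_one_bit(data: bytes, index: int = 0) -> bytes:
    """Return a copy of data with the lowest bit of one byte inverted."""
    changed = bytearray(data)
    changed[index] ^= 0x01
    return bytes(changed)


# Placeholder data standing in for the bytes of an uploaded file.
original = b"example file contents used only for this illustration"
modified = flip_one_bit(original)

original_hash = hashlib.md5(original).hexdigest()
modified_hash = hashlib.md5(modified).hexdigest()

print(original_hash)
print(modified_hash)
print("same hash:", original_hash == modified_hash)
```

Although the two inputs differ in a single bit, the digests are unrelated, so a database lookup on the second one finds nothing.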
Julian
April 25, 2009 at 1:42
The initiative is very good, but this filtering could slow down legitimate web sites. Is your filtering service fast enough to deal with this issue?
R. Haspers • Post Author •
April 25, 2009 at 8:25
Yes, it is. We (or the customer) only check hashes, which are very small compared to the uploaded file. The upload itself takes much longer than checking the hash against our database.