Supervisors info:
Lazaros Merakos, Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens (NKUA)
Summary:
Tor is a worldwide network that allows people to browse the Internet anonymously. It also provides hidden services, which allow users to publish websites and other services without revealing their location or identity. It was designed to give millions of people around the world freedom of speech and a way to avoid Internet censorship. On the other hand, Tor's anonymity attracts suspicious and illegal activities, such as black-market drugs, illegal firearms and extreme pornography.
To clean up Tor's content and apply countermeasures, illegal services must first be discovered. Search engines can be a valuable tool for enumerating them. However, traditional search engines rely only on text, and their results are limited by keyword similarity. Document search engines are not effective enough in Tor, because a special set of non-standard words is used there to obscure the meaning of texts. We therefore introduce a multimedia search engine that aids the discovery of illegal services based on the multimedia they host.
Keywords:
image, multimedia, hashing, crawl, spider