Fast Approximate Matching of Programs for Protecting Libre/Open Source Software by Using Spatial Indexes
Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007), 2007, pp. 111-122.
Abstract
To encourage open source/libre software development, it is desirable to have tools that can help identify open source license violations. This paper describes the implementation of a tool that matches open source programs embedded inside pirate programs. The problem of binary program matching can be approximated by analyzing the similarity of program fragments generated from low-level instructions. These fragments are syntax trees that can be compared by using a tree distance function. Tree distance functions are generally very costly, so sequentially calculating fragment similarities with them becomes prohibitively expensive. In this paper we experimentally demonstrate how a spatial index can be used to substantially increase matching performance. These techniques allowed us to run exhaustive experiments that confirmed previous results on the subject. The paper also introduces the novel idea of using information retrieval techniques to calculate the similarity of bags of program fragments. With the approach described here, programs can be identified even when they are heavily obfuscated.
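
As a rough illustration of the information retrieval idea mentioned in the abstract, the sketch below scores two bags of program fragments with TF-IDF weighting and cosine similarity. It is not the paper's implementation: the abstract does not specify a weighting scheme, so TF-IDF and cosine are assumptions, as are the names `tf_idf_vectors` and `cosine`. Fragments are represented here by hypothetical string keys that must match exactly, whereas the paper compares syntax trees approximately with a tree distance function.

```python
import math
from collections import Counter


def tf_idf_vectors(bags):
    """Turn each bag (a list of fragment keys) into a dict of TF-IDF weights.

    Assumption: fragments are reduced to hashable keys (e.g. a canonical
    string for a syntax tree), so matching is exact rather than approximate.
    """
    n = len(bags)
    # Document frequency: in how many bags does each fragment appear?
    df = Counter()
    for bag in bags:
        df.update(set(bag))

    vectors = []
    for bag in bags:
        tf = Counter(bag)
        # Smoothed IDF so fragments present in every bag keep a small weight.
        vectors.append(
            {frag: count * (math.log((1 + n) / (1 + df[frag])) + 1.0)
             for frag, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(w * v[frag] for frag, w in u.items() if frag in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical fragment keys standing in for canonicalized syntax trees.
    library = ["add(r,r)", "load(r,m)", "load(r,m)", "cmp(r,c)"]
    suspect = ["load(r,m)", "cmp(r,c)", "add(r,r)", "xor(r,r)"]
    unrelated = ["call(f)", "ret()"]

    lib_vec, sus_vec, unr_vec = tf_idf_vectors([library, suspect, unrelated])
    print("library vs suspect:  ", cosine(lib_vec, sus_vec))
    print("library vs unrelated:", cosine(lib_vec, unr_vec))
```

Because a bag ignores fragment order, reordering code does not change the score, which is one reason a bag-based measure can tolerate some obfuscation. Handling rewritten fragments would additionally need the approximate tree matching that the paper accelerates with a spatial index.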