
Taint Inference for Cross‐Site Scripting in Context of URL Rewriting and HTML Sanitization
Author(s) - Pan Jinkun, Mao Xiaoguang, Li Weishi
Publication year - 2016
Publication title - ETRI Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.295
H-Index - 46
eISSN - 2233-7326
pISSN - 1225-6463
DOI - 10.4218/etrij.16.0115.0570
Subject(s) - cross site scripting , computer science , scripting language , taint checking , rewriting , context (archaeology) , inference , web application , the internet , world wide web , information retrieval , web application security , data mining , computer security , programming language , artificial intelligence , web development , software , paleontology , biology
Web applications are increasingly prevalent. When a web application fails to validate its input properly, it becomes susceptible to cross‐site scripting (XSS), which poses serious security problems both for Internet users and for the websites to which the trusted web pages belong. Taint inference is an information flow analysis technique that is useful for detecting XSS on the client side. However, existing techniques do not properly handle two practical issues. The first is URL rewriting, which transforms a standard URL into a clearer and more manageable form. The second is HTML sanitization, which filters an input against blacklists or whitelists of HTML tags or attributes. In this paper, we draw an analogy between the taint inference problem and the molecular sequence alignment problem in bioinformatics, and transfer two techniques from the latter to the former to address these two issues. In particular, our method addresses URL rewriting using local sequence alignment and models HTML sanitization by introducing a removal gap penalty. Empirical results demonstrate the effectiveness and efficiency of our method.
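As a rough illustration of the idea in the abstract, the sketch below runs a Smith-Waterman style local alignment between an untrusted input (for example, a rewritten URL path) and the page output, using a separate, cheaper penalty for input characters that are missing from the output, as a sanitizer might remove them. This is not the authors' implementation: the function name `infer_taint`, the linear gap model, and all score values are assumptions chosen for the example, since the record does not state the paper's scoring scheme.

```python
def infer_taint(source, output, match=2, mismatch=-1,
                insert_gap=-2, removal_gap=-1):
    """Locally align `source` (untrusted input) against `output` (page text).

    Returns (score, (start, end)), where output[start:end] is the region
    of the output that aligns best with some part of the input.
    All score parameters are illustrative assumptions.
    """
    rows, cols = len(source) + 1, len(output) + 1
    score = [[0] * cols for _ in range(rows)]
    # origin[i][j]: output index where the alignment ending at (i, j) starts
    origin = [list(range(cols)) for _ in range(rows)]
    best, best_span = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == output[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            # input character absent from the output (e.g. stripped tag)
            removed = score[i - 1][j] + removal_gap
            # output character that does not come from the input
            inserted = score[i][j - 1] + insert_gap

            top = max(0, diag, removed, inserted)
            score[i][j] = top
            if top == 0:
                origin[i][j] = j            # local alignment restarts here
            elif top == diag:
                origin[i][j] = origin[i - 1][j - 1]
            elif top == removed:
                origin[i][j] = origin[i - 1][j]
            else:
                origin[i][j] = origin[i][j - 1]

            if top > best:
                best, best_span = top, (origin[i][j], j)

    return best, best_span


# Hypothetical example: the payload sits inside a rewritten URL path and
# the sanitizer has stripped the <script> tags before echoing it.
url_path = "/search/<script>alert(1)</script>"
page = "Results: alert(1) found"
score, (start, end) = infer_taint(url_path, page)
print(score, repr(page[start:end]))
```

Because the alignment is local, the input does not need to be split into query parameters first; any substring of the URL can match. Making `removal_gap` less negative than `insert_gap` lets one alignment bridge characters that were filtered out of the input, whereas a single symmetric gap penalty would tend to split the match at each removed tag. A real detector would also need a score threshold to decide when a match counts as tainted, which this sketch leaves out.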