Open Access
HTTP-Request Classification in Automatic Web Application Crawling
Author(s) -
Анна Вадимовна Лапкина,
Андрей Александрович Петухов
Publication year - 2021
Publication title -
trudy instituta sistemnogo programmirovaniâ ran/trudy instituta sistemnogo programmirovaniâ
Language(s) - English
Resource type - Journals
eISSN - 2220-6426
pISSN - 2079-8156
DOI - 10.15514/ispras-2021-33(3)-6
Subject(s) - computer science , javascript , web page , world wide web , context (archaeology) , web service , document object model , static web page , trace (psycholinguistics) , web api , web server , database , web navigation , the internet , paleontology , linguistics , philosophy , biology
The problem of automatic requests classification, as well as the problem of determining the routing rules for the requests on the server side, is directly connected with analysis of the user interface of dynamic web pages. This problem can be solved at the browser level, since it contains complete information about possible requests arising from interaction interaction between the user and the web application. In this paper, in order to extract the classification features, using data from the request execution context in the web client is suggested. A request context or a request trace is a collection of additional identification data that can be obtained by observing the web page JavaScript code execution or the user interface elements changes as a result of the interface elements activation. Such data, for example, include the position and the style of the element that caused the client request, the JavaScript function call stack, and the changes in the page's DOM tree after the request was initialized. In this study the implementation of the Chrome Developer Tools Protocol is used to solve the problem at the browser level and to automate the request trace selection.