Open Access
A Model for Estimating the Posting Frequency in an Online Social Media with Incomplete Data Using Objective Determinants of Users’ Behaviour
Author(s) -
Валерия Фуатовна Столярова,
Aleksandra V. Toropova,
Александр Львович Тулупьев
Publication year - 2021
Publication title -
nečetkie sistemy i mâgkie vyčisleniâ
Language(s) - English
Resource type - Journals
ISSN - 1819-4362
DOI - 10.26456/fssc81
Subject(s) - social media, computer science, naive bayes classifier, akaike information criterion, bayes' theorem, bayesian network, profiling (computer programming), machine learning, data mining, bayesian probability, artificial intelligence, support vector machine, world wide web, operating system
Profiling a user of an online social network includes the task of estimating the frequency (intensity) of various actions, in particular the publication of posts. However, because of resource constraints, only incomplete information about the publication times of the last few posts may be available, obtained, for example, in an interview. Estimating posting intensity from such data is in demand when analyzing individual risk associated with the use of online social networks. The article proposes an extended Bayesian belief network that uses not only information about the publication times of the last posts but also objective data from the user's profile: sex, age, and number of friends. To train the model and demonstrate how it works, data on posts published by random users of the online social network VKontakte were collected. The extended structure has a higher value of the Akaike information criterion than the reduced one.

User profiling involves estimating the frequency of certain user actions in online social media, such as posting. Because of limited resources, however, the only information available may be imprecise information about the last several posting episodes, which can be gathered in an interview. Posting frequency estimates based on such limited data may be used in assessing individual risk connected with the use of online social media, for example in medicine or cybersecurity. This paper constructs a Bayesian belief network (BBN) for this problem that incorporates not only the limited data on the times of the last several posts in an online social media platform but also objective data from the user's profile: age, sex, and friends count.
Using a training dataset gathered via the VKontakte API, we estimated conditional probability tables for two expert BBN structures (the existing reduced structure, based only on the dates of the last several posts, and a novel extended structure that incorporates objective behavior determinants) and automatically learned the optimal structure for the training data. Both extended models (expert and learned) showed lower values of the information criteria (the Akaike information criterion and the Bayesian information criterion). We then used a test dataset to assess how well the models classify the true frequency value. All three models showed similar results in terms of accuracy, kappa, and average accuracy. This result reflects the weak strength of the arcs between the frequency variable and the objective behavior determinants. Nevertheless, including such variables is important in applications in order to build a comprehensive structure of knowledge in the area of interest. The practical significance of the work lies in the possibility of applying the proposed model to estimate posting frequency in an online social network, in particular in modeling risk in public health and socio-cybersecurity.
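
The model comparison described in the abstract relies on standard quantities: information criteria for the fitted structures, then accuracy, kappa, and average accuracy on the test data. The following is a minimal Python sketch of those textbook definitions, not the authors' code. It assumes the common "lower is better" form of AIC and BIC, reads "kappa" as Cohen's kappa, and reads "average accuracy" as the mean of per-class recalls. The class labels `low`, `mid`, and `high` are hypothetical and used only for illustration.

```python
import math
from collections import Counter


def aic(log_likelihood, n_params):
    """Akaike information criterion, textbook form: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion, textbook form: k ln n - 2 ln L."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


def accuracy(y_true, y_pred):
    """Share of test cases whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        return 0.0  # degenerate case: a single class everywhere
    return (p_observed - p_expected) / (1.0 - p_expected)


def average_accuracy(y_true, y_pred):
    """Mean per-class recall (one assumed reading of 'average accuracy')."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical frequency classes for six test users.
    y_true = ["low", "low", "mid", "mid", "high", "high"]
    y_pred = ["low", "mid", "mid", "mid", "high", "low"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("kappa:", cohen_kappa(y_true, y_pred))
    print("average accuracy:", average_accuracy(y_true, y_pred))
    print("AIC:", aic(log_likelihood=-100.0, n_params=5))
    print("BIC:", bic(log_likelihood=-100.0, n_params=5, n_samples=100))
```

Sign conventions for information criteria differ between tools, so it is worth checking which direction counts as better before comparing structures.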
