Open Access
Conversational Intelligence Challenge: Accelerating Research with Crowd Science and Open Source
Author(s) - Burtsev Mikhail, Logacheva Varvara
Publication year - 2020
Publication title - AI Magazine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.597
H-Index - 79
eISSN - 2371-9621
pISSN - 0738-4602
DOI - 10.1609/aimag.v41i3.5324
Subject(s) - computer science, open domain, quality (philosophy), domain (mathematical analysis), data science, publication, open source, scale (ratio), human-computer interaction, world wide web, artificial intelligence, software, question answering, mathematical analysis, philosophy, physics, mathematics, epistemology, quantum mechanics, advertising, business, programming language
Development of conversational systems is one of the most challenging tasks in natural language processing, and it is especially hard in the case of open-domain dialogue. The main factors that hinder progress in this area are a lack of training data and the difficulty of automatic evaluation. Thus, to reliably evaluate the quality of such models, one needs to resort to time-consuming and expensive human evaluation. We tackle these problems by organizing the Conversational Intelligence Challenge (ConvAI), an open competition of dialogue systems. Our goals are threefold: to work out a good design for human evaluation of open-domain dialogue, to grow an open-source code base for conversational systems, and to harvest and publish new datasets. Over the course of the ConvAI1 and ConvAI2 competitions, we developed a framework for evaluating chatbots in messaging platforms and used it to evaluate over 30 dialogue systems on two conversational tasks: discussion of short text snippets from Wikipedia and personalized small talk. These large-scale evaluation experiments were performed by recruiting volunteers as well as paid workers. As a result, we collected a dataset of around 5,000 long, meaningful human-to-bot dialogues and gained many insights into the organization of human evaluation. This dataset can be used to train an automatic evaluation model or to improve the quality of dialogue systems. Our analysis of the ConvAI1 and ConvAI2 competitions shows that future work in this area should center on more active participation of volunteers in the assessment of dialogue systems. To achieve that, we plan to make the evaluation setup more engaging.
