One of the main challenges in the field of embodied artificial intelligence is the open-ended autonomous learning of complex behaviours. Our approach is to use task-independent, information-driven intrinsic motivation(s) to support task-dependent learning. The work presented here is a preliminary step in which we investigate the predictive information (the mutual information of the past and future of the sensor stream) as an intrinsic drive, ideally supporting any kind of task acquisition. Previous experiments have shown that the predictive information (PI) is a good candidate to support autonomous, open-ended learning of complex behaviours, because a maximisation of the PI corresponds to an exploration of morphology- and environment-dependent behavioural regularities. The idea is that these regularities can then be exploited in order to solve any given task. Three different experiments are presented, and their results lead to the conclusion that the linear combination of the one-step PI with an external reward function is not generally recommended in an episodic policy gradient setting. Only for hard tasks can a great speed-up be achieved, and then at the cost of a loss in asymptotic performance.
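To make the quantities in the abstract concrete, the following is a minimal sketch of a one-step PI estimate and its linear combination with an external reward. It is not the authors' implementation. It assumes the sensor stream has already been discretised into hashable symbols, uses a simple plug-in (frequency count) estimator of the mutual information between consecutive sensor states, and introduces a hypothetical weighting factor `beta`; the function names are likewise illustrative.

```python
import math
from collections import Counter


def one_step_predictive_information(states):
    """Plug-in estimate of I(S_t; S_{t+1}) in bits.

    Assumption: `states` is a sequence of discretised sensor values
    (any hashable symbols) recorded during one episode.
    """
    pairs = list(zip(states[:-1], states[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                  # counts of (past, future)
    past = Counter(p for p, _ in pairs)     # marginal counts of the past
    future = Counter(f for _, f in pairs)   # marginal counts of the future
    pi = 0.0
    for (p, f), c in joint.items():
        # p(p, f) * log2( p(p, f) / (p(p) * p(f)) ), written with counts
        pi += (c / n) * math.log2(c * n / (past[p] * future[f]))
    return pi


def combined_episode_return(external_return, states, beta):
    """Linear combination of the external return with the one-step PI.

    `beta` is a hypothetical trade-off weight, not a value from the paper.
    """
    return external_return + beta * one_step_predictive_information(states)
```

In an episodic policy gradient setting, a combined value of this kind would stand in for the plain episode return when evaluating a policy; the abstract's conclusion is that this linear mix is not generally recommended.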

Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis