Towards Hypothetical Reasoning Using Distributed Provenance

Open Access

Author(s) - Daniel Deutch, Yuval Moskovitch, Itay Polak, Noam Rinetzky

Publication year - 2018

Language(s) - English

DOI - 10.5441/002/edbt.2018.47

Hypothetical reasoning is the iterative examination of the effect of modifications to the data on the result of some computation or data analysis query. This kind of reasoning is commonly performed by data scientists to gain insights. Previous work has indicated that fine-grained data provenance can be instrumental for the efficient performance of hypothetical reasoning: instead of a costly re-execution of the underlying application, one may assign values to a pre-computed provenance expression. However, current techniques for fine-grained provenance tracking are ill-suited for large-scale data due to the overhead they entail on both execution time and memory consumption. We outline an approach for hypothetical reasoning for large-scale data. Our key insights are: (i) tracking only relevant parts of the provenance based on an a priori specification of classes of hypothetical scenarios that are of interest and (ii) the distributed tracking of provenance tailored to fit distributed data processing frameworks such as Apache Spark. We also discuss the challenges in both respects and our initial directions for addressing them.
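
To make the idea of "assigning values to a pre-computed provenance expression" concrete, here is a minimal Python sketch. It is not the authors' system: the toy `SALES` data, the choice of one symbolic variable per product category, and the function names are all assumptions made for illustration. The sketch assumes the class of scenarios declared in advance is "scale all prices in a category by some factor", so only per-category coefficients are tracked rather than per-tuple provenance.

```python
from collections import defaultdict

# Hypothetical toy data: (product, category, price, quantity).
SALES = [
    ("laptop", "electronics", 1000.0, 2),
    ("phone", "electronics", 500.0, 4),
    ("desk", "furniture", 200.0, 5),
]


def total_revenue(rows, scale=None):
    """Baseline: re-run the query over all rows under a price-scaling scenario."""
    scale = scale or {}
    return sum(price * scale.get(cat, 1.0) * qty for _, cat, price, qty in rows)


def build_provenance(rows):
    """Single pass over the data that keeps only what the declared scenarios need.

    The returned mapping {category: coefficient} stands for the expression
    sum over categories c of coefficient[c] * x_c, where x_c is a symbolic
    multiplier attached to category c.
    """
    coeff = defaultdict(float)
    for _, cat, price, qty in rows:
        coeff[cat] += price * qty
    return dict(coeff)


def merge(left, right):
    """Combine provenance built on two data partitions (coefficients add up)."""
    merged = dict(left)
    for var, c in right.items():
        merged[var] = merged.get(var, 0.0) + c
    return merged


def evaluate(provenance, valuation):
    """Assign values to the variables; unassigned variables default to 1.0."""
    return sum(c * valuation.get(var, 1.0) for var, c in provenance.items())


if __name__ == "__main__":
    prov = build_provenance(SALES)
    # Scenario: what if electronics prices dropped by 10%?
    scenario = {"electronics": 0.9}
    print(evaluate(prov, scenario), total_revenue(SALES, scenario))
```

Each new scenario is answered by `evaluate` over a handful of coefficients instead of a pass over the data. Because the coefficients are plain sums, per-partition results can be combined with `merge`, which is the kind of property a distributed framework could exploit; how the paper actually distributes provenance tracking is not specified in this abstract.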
