Brainstorming on Foundations of Web Data Management
The Webdam Workshop on Brainstorming on Foundations of Web Data Management took place on August 28th, 2009 at Télécom ParisTech. It was an occasion to present Webdam's first achievements to a panel of researchers who are each a leading force in their respective fields. It was also an occasion to share and compare different visions of how data management on the web should be founded theoretically.
In particular, we heard the following presentations:
- Serge Abiteboul, Webdam in brief: Serge presented the main motivations and goals of Webdam. Noting that the management of distributed data on the web is not supported by robust models and theory, he proposed to focus on information residing in autonomous systems, following the direction of Axml. This talk raised an interesting debate with the audience on concurrency control and, more generally, on expectations about Webdam.
- Marie-Christine Rousset, Representing and Reasoning on Web Data Semantics, Survey and Challenges: Marie-Christine presented the importance of data semantics for constraining metadata in web data management, which in turn allows reasoning on knowledge using logic. This talk raised questions on the best kind of logic to use, the limitations of RDF and extensions to numeric properties.
- Stefano Ceri, Search Computing: Stefano presented his work on the ERC project Search Computing, mostly focused on data management and query optimization. This talk built natural bridges with Webdam, since the use of rich data will deeply improve the quality of search and process modeling; social networks are also a natural path for promising interaction. This research also raises questions about the link between search and probabilistic databases.
- Georg Gottlob, Web Data Extraction — Present and Future: Georg’s talk argued for the need for tools that bridge the gap between unstructured and structured information in order to feed the data management system. He proposed a language for expressing such extraction methods, along with tools to support it. It raised questions about creating new annotations on Datalog and managing duplicates.
- Tova Milo, Querying Past and Future in Web Applications: Tova presented applications which would more naturally grow on top of a rich distributed data management system. In particular, she focused on the need to understand and optimize the interaction with the user, taking past interactions into account. The main challenge that animated the debate was the generalization of the application to a more generic scenario, using in particular a representation of workflows.
- Peter Buneman, Provenance in databases and workflow: Peter’s talk demonstrated the importance of where-, how- and why-provenance. It provided some tools and models to use in the presence of complex workflows. This topic is of direct interest for Webdam, since keeping track of provenance is fundamental in such distributed environments.
- Dan Suciu, Belief Databases: Dan demonstrated the importance of managing belief in distributed data management systems, where each user has a consistent view of the database even if inconsistencies may appear across views. This talk raised an interesting discussion on the representation of belief and the kind of logic to use in such a system.
- Val Tannen, Provenance Propagation: Val built on the previous speakers' analysis of the need for provenance in update and feedback propagation in a web data management system. He proposed an algebraic view of provenance in order to better understand it and obtain general results. The debate focused on summarization of why-provenance and levels of abstraction.
- Victor Vianu, Static Analysis of Active XML Systems: Axml is a first model of a web data management system which may support tasks, controlled by guards. Victor presented how properties of the system could be expressed in tree-LTL and verified, providing theoretical assurance on the behavior of the system.
- Pierre Senellart, Probabilistic XML: Survey and Challenges: Pierre presented how probabilistic XML databases could leverage uncertainty to better represent the knowledge in a distributed data management system. He also explained how to reason on such a database. It raised exciting challenges like continuous probability distributions and dependency tractability.
- Luc Segoufin, Links with FoX Project: FoX is a European project which focuses on safe processing of dynamic data over the Internet. It deals with problems similar to those of Webdam: data modeling and specification, querying, extracting and exchanging XML data, modeling and verification of temporal behavior and handling incomplete information. The two projects have already produced fruitful collaboration.
- Balder ten Cate, Structural Characterizations of Schema Mapping Languages: Balder presented how important schema mappings are for data integration in a distributed data management system. He proposed a study of schema mapping languages. It raised interesting issues on adapting this model to XML data and on schema mapping optimization.
- Serge Abiteboul, Recent works around AXML: In this presentation, Serge introduced an existing application of Axml: the business artifact. This allows representing a workflow in a data-centric way, well suited for highly distributed applications. This raised a large number of questions about interaction between autonomous systems, synchronization, movement of artifacts, monitoring, quality of service, access control…