A collection of data provenance papers & talks. Papers are classified by research area.
- Provenance in middleware
- H5Prov: I/O Performance Analysis of Science Applications Using HDF5 File-level Provenance (CUG'19)
- Scientific data exchange: a schema for HDF5-based storage of raw and analyzed data (Journal of Synchrotron Radiation 2014)
- A Generic Provenance Middleware for Queries, Updates, and Transactions (TaPP'14)
- Study in Usefulness of Middleware-Only Provenance (eScience'14)
- Provenance in e-Science
- Provenance in Collaborative in Silico Scientific Research: a Survey (SIGMOD Record'20)
- ProvONE+: A Provenance Model for Scientific Workflows (WISE'20)
- Workflow Provenance in the Lifecycle of Scientific Machine Learning (arXiv, Sept. 2020)
- Efficient Runtime Capture of Multi-workflow Data Using Provenance (eScience'19)
- Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering (WORKS'19)
- The role of machine learning in scientific workflows (Int. J. HPC, 2019)
- Blockchain Based Provenance Sharing of Scientific Workflows (Big Data'18)
- Big Provenance Stream Processing for Data Intensive Computations (eScience'18)
- DAC-MAN: Data Change Management for Scientific Datasets on HPC systems (SC'18)
- Capturing and Querying Workflow Runtime Provenance with PROV: a Practical Approach (EDBT'13)
- End-to-End eScience: Integrating Workflow, Query, Visualization, and Provenance at an Ocean Observatory (eScience'08)
- Provenance and Scientific Workflows: Challenges and Opportunities (SIGMOD'08)
- A survey of data provenance in e-science (SIGMOD'05)
- Lineage Retrieval for Scientific Data Processing: A Survey (ACM Computing Surveys, Mar. 2005)
- Provenance in databases
- A Provenance Storage Method Based On Parallel Database (ICISCE'20)
- Going Beyond Provenance: Explaining Query Answers with Pattern-based Counterbalances (SIGMOD'19)
- Provenance and Probabilities in Relational Databases (SIGMOD Record'19)
- Interoperability for Provenance-aware Databases using PROV and JSON (TaPP'15)
- Why and Where: A Characterization of Data Provenance (ICDT'01)
- Provenance in scripts
- Provenance in machine learning
- Provenance query
- Provenance visualization
- Provenance optimization
- Provenance standard
- Provenance systems
- Improving Data Scientist Efficiency with Provenance (ICSE'20)
- Runtime Analysis of Whole-System Provenance (CCS'18)
- Provenance Integration Requires Reconciliation (TaPP'11)
- Taverna, reloaded (SSDBM’10)
- Layering in Provenance Systems (ATC'09)
- Provenance-Aware Storage Systems (ATC'06)
- Trio: A System for Integrated Management of Data, Accuracy, and Lineage (CIDR'05)
- Multi-area
- Data provenance: What’s next? (SIGMOD Record'18)
- A survey on provenance: What for? What form? What from? (VLDB Journal'17)
- A Primer on Provenance (ACM Queue'14)
- The Foundations for Provenance on the Web (Foundations and Trends in Web Science 2010)
- Special Issue on Data Provenance (Data Engineering Bulletin Issues, Dec. 2007)
- Provenance in Databases: Past, Current, and Future
- Provenance and Data Synchronization
- Program Slicing and Data Provenance
- Recording Provenance for SQL Queries and Updates
- Issues in Building Practical Provenance Systems
- Provenance in Scientific Workflow Systems
- Copyright and Provenance: Some Practical Problems
- RDF storage & hardware acceleration
- FAIR principles: maturity evaluations and implementation considerations
- FAIR Principles: Interpretations and Implementation Considerations (Data Intelligence, 2020)
- Evaluating FAIR maturity through a scalable, automated, community-governed framework (Scientific Data, Nature, 2019)
- Results of an Analysis of Existing FAIR Assessment Tools (2019)
- The FAIR Guiding Principles for scientific data management and stewardship (Scientific Data, Nature, 2016)
- ProvTalk: Towards Interpretable Multi-level Provenance Analysis in Networking Functions Virtualization (NFV) (NDSS'21)
- On Optimizing the Trade-off between Privacy and Utility in Data Provenance (SIGMOD'21)
- Lineage Stash: Fault Tolerance Off the Critical Path (SOSP'19)
- Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs (NDSS'18)
- Practical Whole-System Provenance Capture (SoCC'17)
- FRAPpuccino: Fault-detection through Runtime Analysis of Provenance (HotCloud'17)
- Lineage-driven Fault Injection (SIGMOD'15)
- Trustworthy Whole-System Provenance for the Linux Kernel (Security'15)
- Diagnosing missing events in distributed systems with negative provenance (SIGCOMM'14)
- Hi-Fi: collecting high-fidelity whole-system provenance (ACSAC'12)
- Secure network provenance (SOSP'11)
- Backtracking Intrusions (ACM Transactions on Computer Systems, 2005)
- Leveraging HDF5 infrastructure by ADF to build an interoperable package & contextualized data in Pharma using semantic technology
- HUG presentation about VOL connectors
- Documenting Scientific Workflows: The Metadata, Provenance & Ontology Project
- DacMan/DVC
- Versioned HDF5 library
- PROV-Overview
- The Importance of Data Set Provenance for Science
- FAIR Principles
- Open Access Datasets