Future Trends in Metadata Management
Dr. Saurabh Pramanick
Data Governance Officer|IEEE Ambassador|Data Scientist|IIM Calcutta|BITS|FRM|IFRS ACCA|CDMP|DCAM|CDO|Cybersecurity|Digital Data Transformation|Data Privacy|Cloud Data Architect|Community Speaker|Freelancer|Trainer|Blogger
As part of my PhD research, I am focusing on metadata management, and I found the following themes, which are likely to drive future metadata research:
1. FAIRification of Scanning Tunneling Microscopy focuses on data management practices and services for making a scientific archive of Scanning Tunneling Microscopy (STM) images FAIR compliant. The authors report on a metadata database that includes metadata extracted from the instruments and from each image, enriched via human annotation, machine learning techniques, and instrument metadata filtering. Additionally, the W3C PROV standard was explored for STM images.
2. FAIR Data and Metadata: GNSS Precise Positioning User Perspective presents an analysis of current GNSS users' requirements, across various application sectors, for the way data, metadata and services are provided. The authors engaged with GNSS stakeholders to validate their findings and to understand the stakeholders' perception of the FAIR principles. They indicate that the results confirm that FAIR GNSS data and services are important for this community and have had an impact on standard-compliant GNSS community metadata, enabling FAIR GNSS data and service delivery for both humans and machines.
3. Research on Intelligent Organization and Application of Multi-source Heterogeneous Knowledge Resources for Energy Internet focuses on improving the informatization and intelligence of the energy Internet industry in order to enhance its knowledge service capability. The authors propose methods to synthesize and transform the original multiple, heterogeneous knowledge resources of the State Grid into a unified and well-organized knowledge system. The effectiveness of the proposed methods is demonstrated with knowledge resources from the human resources field of the State Grid.
4. Extensive analysis of crosswalks: most descriptive metadata are interoperable among the schemas; the most inconsistent mapping is for rights metadata; and a large gap exists in structural metadata and in the controlled vocabularies used to specify various property values. The analysis and the collated crosswalks can serve as a reference for data repositories when they develop crosswalks from their own schemas, and they give the research data community a benchmark of structured metadata implementation.
5. Metadata as data intelligence, with attention to AI/ML methods. Automated metadata annotation: What is and is not possible with machine learning presents use cases showing how AI/ML models can improve the subject indexing of cultural or data catalogs. Doing so requires bringing process, technology and an interdisciplinary team together to achieve quality in automated subject indexing.
6. Provenance documentation to enable explainable and trustworthy AI: the importance of capturing and providing provenance information when running AI/ML models. Recording provenance metadata about each step of the AI process (e.g. data, AI models, software source code for data preparation and for executing models) makes AI/ML results explainable, trustworthy and reproducible.
7. Achieving Transparency: A Metadata Perspective discusses what information should be captured in metadata (the schema) and how to capture it consistently (the technical specification) to ensure metadata quality and data transparency. The aim is to communicate better what data mean and why they should be trusted, in the context of datasets provided by governments and government agencies.
8. Continuous Metadata in Continuous Integration, Stream Processing and Enterprise Data Ops argues that metadata is continuous in many real data contexts, so one-off metadata collection may be inadequate for future analysis. Based on a review of some current tools for specifying, capturing and consuming metadata, the author suggests features and design patterns for future cloud native software, which could enable streamed metadata to power real-time data fusion or fine-tune automated reasoning through real-time ontology updates.
Implementations of metadata tend to favor centralized, static metadata. This depiction is at variance with the past decade of focus on big data, cloud native architectures and streaming platforms. Big data velocity can demand a correspondingly dynamic view of metadata. These trends, which include DevOps, CI/CD, DataOps and data fabric, are surveyed. Several specific cloud native tools are reviewed and weaknesses in their current metadata use are identified. Implementations are suggested which better exploit capabilities of streaming platform paradigms, in which metadata is continuously collected in dynamic contexts. Future cloud native software features are identified which could enable streamed metadata to power real time data fusion or fine tune automated reasoning through real time ontology updates.
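To make the contrast between static and continuous metadata concrete, here is a minimal sketch that treats metadata as a stream of change events folded into a current view, while keeping the history so that an earlier state can be rebuilt. It is an illustration only: the names `MetadataEvent`, `MetadataView` and `consume`, and the sample fields, are hypothetical and are not taken from the paper or from any specific cloud native tool.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class MetadataEvent:
    """One observed metadata change (hypothetical event shape)."""
    timestamp: float
    dataset: str
    key: str
    value: Any


@dataclass
class MetadataView:
    """Latest metadata per dataset, plus the full change history."""
    current: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    history: List[MetadataEvent] = field(default_factory=list)

    def apply(self, event: MetadataEvent) -> None:
        # Update the live view and remember the event for later replay.
        self.current.setdefault(event.dataset, {})[event.key] = event.value
        self.history.append(event)

    def as_of(self, dataset: str, timestamp: float) -> Dict[str, Any]:
        # Replay events in time order to rebuild the state at `timestamp`.
        snapshot: Dict[str, Any] = {}
        for e in sorted(self.history, key=lambda e: e.timestamp):
            if e.dataset == dataset and e.timestamp <= timestamp:
                snapshot[e.key] = e.value
        return snapshot


def consume(events: Iterable[MetadataEvent]) -> MetadataView:
    """Fold a stream of metadata events into a view."""
    view = MetadataView()
    for event in events:
        view.apply(event)
    return view


# Sample stream: the schema version changes while data keeps flowing.
events = [
    MetadataEvent(1.0, "sensor_feed", "schema_version", 1),
    MetadataEvent(1.2, "sensor_feed", "unit", "celsius"),
    MetadataEvent(2.0, "sensor_feed", "schema_version", 2),
]
view = consume(events)
```

A one-off collection would keep only `view.current`; retaining the event history is what allows a later analysis to ask what the metadata looked like when a given record was produced.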
9. Metadata as a Methodological Commons: From Aboutness Description to Cognitive Modeling discusses the requirements and feasibility of semantic coding and cognitive metadata modeling, given the rise of huge volumes of labeled data and ChatGPT, as well as the availability of emerging technologies (e.g. Web 3.0, AI/ML, knowledge graphs). Manual description and coding clearly cannot adapt to content production based on UGC (User Generated Content) and AIGC (AI Generated Content) in the metaverse era. The automatic processing of semantic formalization must be considered a sure way to adapt the metadata methodological commons to the future needs of the AI era.
10. Google announces use of IPTC metadata for generative AI images
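As a rough illustration of how such labeling could be consumed, the sketch below checks whether an image's already-extracted metadata marks it as generated by a trained model. It assumes that the label is carried in the IPTC Digital Source Type property with the vocabulary term `trainedAlgorithmicMedia`. The dictionary shape, the `DigitalSourceType` key and the function name are hypothetical, and reading the metadata out of an image file is out of scope here.

```python
from typing import Mapping

# Assumed IPTC Digital Source Type term for media created by a trained model.
AI_GENERATED_TERM = "trainedAlgorithmicMedia"


def is_labeled_ai_generated(
    metadata: Mapping[str, str],
    field_name: str = "DigitalSourceType",  # hypothetical key name
) -> bool:
    """Return True if the metadata labels the image as AI generated."""
    value = metadata.get(field_name, "")
    # Values are often full URIs; compare only the final path segment.
    term = value.rstrip("/").rsplit("/", 1)[-1]
    return term == AI_GENERATED_TERM


sample = {
    "DigitalSourceType": "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
}
labeled = is_labeled_ai_generated(sample)
```

An image with no such field is simply reported as unlabeled, which is not the same as being known to be a photograph.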