Resource properties
In Pentaho Data Catalog, you can modify discovered resource properties to add user-defined metadata to a resource. This metadata can capture business value or communicate the data quality of the resource.
Properties
Pentaho Data Catalog discovers the resource metadata properties during the Data Profiling (for structured data) and Data Discovery (for unstructured data) processes. Data Catalog uses standard properties as the default, giving you general information about the resource based on metadata standards.
After you change any properties, save your changes and rerun the Data Profiling and Data Discovery processes.
Custom properties
User-defined metadata contributes to business value. On the Data Canvas, with a resource selected, you can see custom properties for a resource in the Custom Properties pane. Typically, the text that appears below the property field describes how the property is to be used or the values it can take.
To see additional properties, click the Properties tab. Custom properties that are visible but not available for editing appear grayed out.
Data labels
Data labels in Data Catalog are structured metadata elements defined as key-value pairs, designed to apply standardized, machine-readable labels to data assets such as tables, columns, datasets, and files. Although similar in format to custom properties, which also use key-value pairs, data labels are governed more strictly to ensure consistency and control. For example, data labels support predefined values under each label name, whereas custom properties allow users to enter any value without constraint.
This structured approach helps you to manage metadata more effectively, particularly for use cases involving AI, machine learning, and data governance. With data labels, you can classify data using consistent terminology, such as 'Sensitivity: Confidential' or 'Data Quality: High', which improves model training, search accuracy, and compliance.
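The difference between data labels and custom properties described above can be illustrated with a minimal Python sketch. This is not the Data Catalog API: the `Resource` class, its methods, and the `ALLOWED_LABEL_VALUES` table are hypothetical names introduced only for illustration, and the permitted values other than 'Confidential' and 'High' are assumed examples.

```python
from dataclasses import dataclass, field

# Hypothetical label definitions: each label name maps to its predefined values.
# Only "Confidential" and "High" come from the text; the rest are assumed examples.
ALLOWED_LABEL_VALUES = {
    "Sensitivity": {"Public", "Internal", "Confidential"},
    "Data Quality": {"Low", "Medium", "High"},
}


@dataclass
class Resource:
    """Illustrative stand-in for a cataloged data asset (not a Data Catalog class)."""
    name: str
    custom_properties: dict = field(default_factory=dict)
    data_labels: dict = field(default_factory=dict)

    def set_custom_property(self, key: str, value: str) -> None:
        # Custom properties accept any value without constraint.
        self.custom_properties[key] = value

    def set_data_label(self, key: str, value: str) -> None:
        # Data labels accept only values predefined under the label name.
        allowed = ALLOWED_LABEL_VALUES.get(key)
        if allowed is None:
            raise ValueError(f"Unknown data label: {key!r}")
        if value not in allowed:
            raise ValueError(
                f"{value!r} is not a predefined value for {key!r}; "
                f"expected one of {sorted(allowed)}"
            )
        self.data_labels[key] = value


resource = Resource("customers")
resource.set_custom_property("Steward notes", "Reviewed during quarterly audit")
resource.set_data_label("Sensitivity", "Confidential")
resource.set_data_label("Data Quality", "High")
```

In this sketch, assigning a value that is not predefined for a label, such as `resource.set_data_label("Sensitivity", "Secret")`, raises a `ValueError`, while any free-text value is accepted for a custom property. That constraint is what keeps label terminology consistent across resources.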