Calculate inter- and intra-cluster record uniqueness to measure the correctness of DME output
Spark 2.3.0 and later provide an API that can calculate a clustering prediction score. It would be nice to integrate this into our DME plugin as an out-of-the-box confidence scoring model. More details about the Spark API: https:...
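For context, Spark's `ClusteringEvaluator` (added in Spark 2.3.0) reports a silhouette score, which compares how close each record is to its own cluster with how close it is to the nearest other cluster. Whether this is the API the request links to is an assumption, since the link is truncated. Below is a minimal pure-Python sketch of the same idea, not Spark's implementation: the sample points are hypothetical, and it uses plain Euclidean distance, whereas Spark defaults to squared Euclidean distance.

```python
from collections import defaultdict
from math import dist  # Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette over all points; -1 is poor, 1 is well separated."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a silhouette of 0.
            continue
        # a: mean distance to the other members of the same cluster (intra-cluster).
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster (inter-cluster).
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical records: two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # sensible clustering
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters mix the two groups
```

A score near 1 for `good` and a negative score for `bad` is the kind of signal that could serve as a confidence measure for DME output.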
about 2 years ago
in Data Quality
On the Backlog
Hi, in the DB import action we have an option to import all tables in a DB. When this option is selected, ZDP creates a Sqoop command for every single table and executes an MR job for each table. This means that if there are 1000 tables in the DB, ZDP will fire 10...
Enable Data Quality on Hive Views from DQ of underlying Tables
Enable data quality on views in Zaloni by linking to the underlying table instead of recalculating DQ. From customer: Views should not have their own profiling/DQ; instead, profile/DQ information should come from the original profiled table from where these columns...
Ability to send values from pre-ingestion to post-ingestion
Request for the ability to transfer variables set in pre-ingestion to the post-ingestion workflow. In some scenarios we need to set a few variables in pre-ingestion that are then used in the post-ingestion workflow.
Support record-level insert, update, and delete on DFS using Apache Hudi
Apache Hudi (Hadoop Upserts Deletes and Incrementals) is an open-source data management framework used to simplify incremental data processing and data pipeline development. Apache Hudi enables you to manage data at the record level in DFS storage to simplify ...
Display ingestion history for entities created by the DB wizard and DB import in the entity view ingestion history tab when Display is 'Ingested File Size Per Day'
Display ingestion history for entities created by the DB wizard and DB import in the entity view ingestion history tab when Display is 'Ingested File Size Per Day'. Current behavior: The 'Ingested File Size Per Day' is shown only for entities associated with fil...
Currently, a trigger suffix is used to trigger an ingestion manually. In addition to the trigger suffix, it would help to have a REST endpoint that a user (or any third-party tool) can call to perform the ingestion for a particular file pattern.
over 2 years ago
in Data Ingestion
Provide ability to comment and crowdsource business information on ZDP entities/fields.
Business users/data stewards would like to collaborate and share their comments and findings on the data sets. They would like to review these comments before the underlying data can be provisioned to downstream systems. Provide the ability to capture user fe...
Ability to auto tag entities based on the vocabulary/business glossary
Adding labels to an entity is a tedious and time-consuming effort. Provide the capability to auto-tag based on the business glossary or by referring to the business vocabulary. Some competitors are leveraging a modified Maui - Multi-purpose automatic to...
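To make the request concrete, here is a minimal sketch of glossary-based auto-tagging. It assumes a hypothetical glossary that maps each business term to a list of synonyms and matches them against tokenized field names. The function names and sample data are illustrative only and are not part of ZDP, and this is plain token matching rather than the Maui approach mentioned above.

```python
import re


def tokenize(name):
    """Split a field name such as 'totalSales' or 'cust_email' into lowercase tokens."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return {t for t in re.split(r"[^A-Za-z0-9]+", spaced.lower()) if t}


def auto_tag(field_names, glossary):
    """Return {field_name: sorted glossary terms whose synonyms match a token}."""
    tags = {}
    for field in field_names:
        tokens = tokenize(field)
        matched = [
            term
            for term, synonyms in glossary.items()
            if tokens & {s.lower() for s in synonyms}
        ]
        tags[field] = sorted(matched)
    return tags


# Hypothetical glossary and entity fields, for illustration only.
glossary = {
    "Customer": ["customer", "cust", "client"],
    "Email": ["email", "mail"],
    "Revenue": ["revenue", "sales"],
}
fields = ["cust_email", "totalSales", "created_at"]
tags = auto_tag(fields, glossary)
```

Fields with no matching term get an empty list, so a data steward could review only the untagged remainder by hand.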