Troubleshooting Guide

Quickly resolve issues related to your active data pipelines (from ingestion to processing to destination delivery)

Written by Jon Tam
Updated over 3 months ago

Overview

Managing your active datasets can require some work because of the unpredictable nature of external data. After the hard exercise of onboarding a data product, you then need to manage it and deal with unforeseen issues on an ongoing basis. This page covers common dataset failure scenarios, helps you identify their root causes, and provides remediation options for your data flow.

Types of Issues

External data can be messy. A pipeline run and data delivery that succeed one day can fail the next due to changes within the data or at the data supplier. As a result, things can go wrong in myriad ways at the point of ingestion, processing, or delivery. The following are some common examples of these issues:

| Area | Failure Reason | Remediation Steps |
| --- | --- | --- |
| Ingestion | "There was an error accessing this connection." | Work with the data supplier to confirm that Crux has access to retrieve files from the source. When ready, initiate a rerun in the Health Dashboard. |
| Ingestion | "There was an error ingesting data from the source due to invalid credentials." | Confirm your source connection credentials on the Connections page and update them as needed. Once updated, initiate a rerun in the Health Dashboard. |
| Ingestion | "There was an error locating the file matching the expected file pattern at the source." | Work with the data supplier to confirm the file location at the source and have the supplier update it as needed. If the files have moved to a new location, you can create a new data product with the new file pattern. If the supplier has fixed the issue, initiate a rerun in the Health Dashboard. |
| Ingestion | "There was an error ingesting data from the source due to the remote path being unavailable on the remote server." | Work with the data supplier to confirm that the data file was uploaded to the expected remote file path. When ready, initiate a rerun in the Health Dashboard. |
| Ingestion | "There was an error downloading the file from the source." | The download failure may have been intermittent. Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Ingestion | "There was an interruption when downloading the file from the source." | The download interruption may have been intermittent. Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Processing | "There was an error processing the data due to an unexpected file format." | An unsupported file format was detected. Work with the data supplier to provide a supported file format; the logs contain more details. Once the file is reposted, initiate a rerun from the Health Dashboard. |
| Processing | "There was an error processing the data due to an unexpected file encoding for one or more files." | An unsupported file encoding was detected in at least one file. Work with the data supplier to provide new file(s); the logs contain more details. Once the files are reposted, initiate a rerun from the Health Dashboard. |
| Processing | "There was an error processing the data due to a detected schema change involving inconsistent data types." | A detected schema change caused the failure; the logs contain more details on the change. Create a new data product only for the relevant schema version. Note: schema evolution features are on the product roadmap. |
| Processing | "There was an error processing the data due to a detected schema change from prior schema versions." | A detected schema change caused the failure; the logs contain more details on the change. Create a new data product only for the relevant schema version. Note: schema evolution features are on the product roadmap. |
| Processing | "There was an error processing the data due to missing column headers in this schema version." | No column headers were detected in the file, causing a schema break; the logs contain more details. If the column headers are known, update them in the ODIN dataset spec directly via cruxctl (see the sketch after this table). Once updated, initiate a rerun from the Health Dashboard. |
| Processing | "There was an error processing the data due to a suspected data quality issue." | Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Delivery | "There was an error delivering your data due to a connectivity issue with your destination(s)." | Confirm your destination connection settings. Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Delivery | "There was an error delivering your data due to invalid credentials with your destination(s)." | Confirm your destination connection credentials on the Connections page and update them as needed. Once updated, initiate a rerun in the Health Dashboard. |
| Delivery | "There was an error accessing this connection." | Confirm that Crux has write access to the destination. Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Delivery | "There was an error delivering your data due to an internal issue." | Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
| Unknown | "Something went wrong." | Investigate the logs to identify and remediate the root cause, then initiate a rerun from the Health Dashboard. |
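
For the missing-column-headers case above, remediation involves editing the ODIN dataset spec and pushing it back via cruxctl. The following is a rough sketch only: the subcommand names, the dataset ID, and the schema field layout are assumptions for illustration, not confirmed cruxctl syntax or ODIN spec format.

```sh
# Hypothetical workflow -- the cruxctl subcommands below are illustrative
# assumptions, not documented syntax; run `cruxctl --help` to confirm the
# commands your version supports.

# 1. Pull the current ODIN dataset spec for the failing dataset
#    ("my-dataset-id" is a placeholder).
cruxctl dataset get my-dataset-id > dataset_spec.yaml

# 2. Edit the spec to declare the known column headers, for example in the
#    schema section (field names and types here are examples only):
#
#      schema:
#        fields:
#          - name: trade_date
#            type: DATE
#          - name: ticker
#            type: STRING
#          - name: close_price
#            type: FLOAT

# 3. Push the updated spec back, then initiate a rerun from the
#    Health Dashboard.
cruxctl dataset apply dataset_spec.yaml
```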

Identifying Root Causes

โš ๏ธ Coming soon: Ability to view logs within the Crux app

The Health Dashboard shows a generalized reason for each dataset failure, but you may want to dig deeper into exactly what occurred so you have enough information to remediate the issue in a timely fashion. More advanced users can view logs with the cruxctl CLI tool, as sketched below.
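
As a minimal sketch of what CLI log inspection might look like (the subcommands, flags, and IDs below are assumptions; consult `cruxctl --help` for the real syntax):

```sh
# Hypothetical log-inspection flow -- subcommands and flags are illustrative
# assumptions; check `cruxctl --help` for the actual syntax.

# List recent pipeline runs for a dataset to find the failing run.
cruxctl runs list --dataset my-dataset-id

# Fetch that run's logs and filter for errors.
cruxctl logs get --run-id <run-id> | grep -i error
```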

Remediation Options

After identifying the root cause of an issue with your data pipeline run, you can take the actions needed to correct what caused the failure in the first place. At that point, you can initiate a rerun of your data flow to your destination.

Initiating a pipeline rerun

Remediation actions may include working with the data supplier to get a new file uploaded, reconfiguring your source or destination access credentials, or updating the dataset spec itself. When these actions are done and you are confident in the health of your data pipeline, you can initiate a rerun rather than wait for the next scheduled delivery.

In the dataset details tray, you can initiate a rerun by clicking the Rerun Delivery button. This kicks off a pipeline run from ingestion to processing to delivery. Because pipeline runs are resource-intensive, you may attempt a rerun up to 5 times for a single dataset delivery. The number of reruns remaining for a delivery is also visible in this tray.
