Crux lets data engineers build data products and transport data from a source to a destination of their choice. The platform supports several data representations, including "raw" data. In Crux, "raw" data refers to datasets found at the source in any format: structured data such as Excel files or SQL databases, semi-structured files such as JSON or XML, or unstructured data such as HTML, image, or audio files.
When Crux delivers data in the "raw" format, it applies no normalization or processing, which makes delivery to a file-based destination fast and straightforward. We recommend the "raw" format when Crux's data modeling process does not support the file format, or when you simply want a pipeline that regularly extracts files from data sources.
Because the data is transported in its original format, data engineers can deliver diverse data representations to their chosen file-based destinations.
Data Profile Review
During the Data Profile Review, you can choose to deliver all data as raw. This is useful if you already know that the selected file needs no processing or normalization, or if you expect the dataset to be too large to process effectively in the data modeling step. Choosing raw delivery at this stage skips the additional processing steps and speeds up delivery.
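The decision described above can be sketched as a small helper. This is an illustration only, not part of Crux: the `SourceFile` type, the `MODELABLE_FORMATS` set, and the `MAX_MODELABLE_BYTES` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; these are NOT Crux's actual
# supported formats or size limits.
MODELABLE_FORMATS = {"csv", "json", "xml", "xlsx"}
MAX_MODELABLE_BYTES = 5 * 1024**3  # assumed 5 GiB cutoff


@dataclass
class SourceFile:
    name: str
    file_format: str
    size_bytes: int
    needs_processing: bool = True


def should_deliver_raw(f: SourceFile) -> bool:
    """Return True when raw delivery is the sensible choice for this file."""
    # No processing or normalization needed: deliver as-is.
    if not f.needs_processing:
        return True
    # Format that the (assumed) modeling step cannot handle.
    if f.file_format.lower() not in MODELABLE_FORMATS:
        return True
    # File likely too large to model effectively.
    return f.size_bytes > MAX_MODELABLE_BYTES
```

For example, under these assumptions an audio file would be flagged for raw delivery, while a small CSV that needs normalization would go through data modeling.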
Data Modeling
In some cases, Crux cannot model a file because of its format or complex content. When you receive an alert that the file cannot be fully modeled, you can mark individual tables within the dataset to be delivered as raw. The data product still reaches your destination, where you can process it further for your specific needs.
Closing Thoughts
The option to deliver data as raw lets you work around modeling obstacles and keep working with the data in its original format, customizing subsequent processing steps as required. Note that because "raw" formats may be unstructured, Crux does not support delivering them to database destinations such as Snowflake and BigQuery.
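If you configure pipelines programmatically, a pre-flight check can catch the unsupported combination early. The sketch below is a minimal illustration and does not use any Crux API: `validate_delivery` and the string labels for representations and destinations are hypothetical, and the database set lists only the two destinations named above.

```python
# Database destinations named in this article; the lowercase labels
# themselves are assumptions made for this example.
DATABASE_DESTINATIONS = {"snowflake", "bigquery"}


def validate_delivery(representation: str, destination: str) -> None:
    """Raise ValueError if raw data is routed to a database destination."""
    if (
        representation.lower() == "raw"
        and destination.lower() in DATABASE_DESTINATIONS
    ):
        raise ValueError(
            f"Raw delivery to database destination '{destination}' is not "
            "supported; choose a file-based destination instead."
        )
```

Calling `validate_delivery("raw", "snowflake")` raises a `ValueError`, while a modeled dataset bound for the same destination passes the check.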