File Formats
Spice currently supports the CSV, JSON, and Parquet file formats for data connectors that read files from a file system or cloud object storage (e.g. s3://, abfs://, file://). Support for Iceberg and other file formats is on the roadmap.
The parameters supported for each file format are detailed on this page.
Parquet
Spice automatically supports reading any Parquet file, regardless of the compression codec or data encoding used.
Compression codecs:
Data encodings:
- PLAIN
- PLAIN_DICTIONARY / RLE_DICTIONARY
- RLE
- BIT_PACKED (deprecated in favor of RLE)
- DELTA_BINARY_PACKED
- DELTA_LENGTH_BYTE_ARRAY
- DELTA_BYTE_ARRAY
- BYTE_STREAM_SPLIT
CSV
Parameters
- csv_has_header: Optional. Indicates whether the CSV file has a header row. Defaults to true
- csv_quote: Optional. A one-character string used to quote fields containing special characters. Defaults to " (double quote)
- csv_escape: Optional. A one-character string used to represent special characters, or to include characters within a field value that would normally be interpreted as delimiters or newline characters. Defaults to null
- csv_schema_infer_max_records: Optional. The maximum number of records to scan when inferring the schema. Defaults to 1000
- csv_delimiter: Optional. A one-character string used to separate individual fields. Defaults to , (comma)
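To make the effect of these parameters concrete, here is a minimal Python sketch that applies the same five settings with the standard library csv module. It is an illustration of the semantics only, not how Spice parses CSV internally. The helper names read_csv_text and infer_column_types are hypothetical, as are the column_N fallback names and the simple int/float/string type rules.

```python
import csv
import io


def read_csv_text(text, csv_has_header=True, csv_quote='"',
                  csv_escape=None, csv_delimiter=","):
    """Hypothetical helper: parse CSV text using Spice-style parameter names."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=csv_delimiter,   # separates individual fields
        quotechar=csv_quote,       # wraps fields containing special characters
        escapechar=csv_escape,     # None means no escape character
    )
    rows = list(reader)
    if csv_has_header and rows:
        return rows[0], rows[1:]
    # Assumption: without a header, synthesize names (a real reader may differ).
    width = len(rows[0]) if rows else 0
    return [f"column_{i + 1}" for i in range(width)], rows


def _kind(value):
    """Classify a single string value as int, float, or string."""
    for name, cast in (("int", int), ("float", float)):
        try:
            cast(value)
            return name
        except ValueError:
            pass
    return "string"


def infer_column_types(header, body, csv_schema_infer_max_records=1000):
    """Infer one type per column from at most the first N records."""
    sample = body[:csv_schema_infer_max_records]
    types = []
    for i in range(len(header)):
        kinds = {_kind(row[i]) for row in sample if i < len(row)}
        if kinds and kinds <= {"int"}:
            types.append("int")
        elif kinds and kinds <= {"int", "float"}:
            types.append("float")
        else:
            types.append("string")
    return types
```

For example, read_csv_text('id;note\n1;"a;b"\n', csv_delimiter=";") keeps a;b as a single field because it is quoted. The inference helper also shows the trade-off behind csv_schema_infer_max_records: a small limit scans less data but can miss a later value that does not fit the inferred type.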
JSON
Parameters
- json_format: Optional. Specifies the JSON format to parse. Valid values are array, ndjson, and jsonl. Defaults to jsonl
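The difference between the json_format values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Spice's parser: parse_json_text is a hypothetical helper, and treating ndjson and jsonl identically (one JSON value per line) is an assumption based on the common meaning of those names.

```python
import json


def parse_json_text(text, json_format="jsonl"):
    """Hypothetical helper: return a list of records for a given json_format."""
    if json_format == "array":
        # The whole document is a single top-level JSON array of records.
        records = json.loads(text)
        if not isinstance(records, list):
            raise ValueError("expected a top-level JSON array")
        return records
    if json_format in ("ndjson", "jsonl"):
        # One JSON value per line; blank lines are skipped.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported json_format: {json_format!r}")
```

With this sketch, '[{"a": 1}, {"a": 2}]' parsed as array and '{"a": 1}\n{"a": 2}\n' parsed as jsonl yield the same two records. Parsing an array file line by line will usually fail or yield a single nested list rather than individual records, so the format setting should match the file's layout.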
