jsonl - JSON Delimited
JSON Delimited is a file format that stores several JSON documents in one file. Each JSON document occupies a single line, and the documents are separated by a newline character.
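As a minimal sketch of what this layout means in practice, the snippet below writes and reads newline-delimited JSON using only the Python standard library. The helper names `dumps_jsonl` and `loads_jsonl` are hypothetical and are not part of dlt.

```python
import json


def dumps_jsonl(rows):
    # Hypothetical helper: one JSON document per line. json.dumps escapes
    # newlines inside string values, so each document stays on one line.
    return "".join(json.dumps(row) + "\n" for row in rows)


def loads_jsonl(text):
    # Hypothetical helper: parse every non-empty line as its own document.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


rows = [{"id": 1, "name": "alice"}, {"id": 2, "note": "line\nbreak"}]
text = dumps_jsonl(rows)
print(text, end="")
print(loads_jsonl(text) == rows)
```

Because every line is a complete document, a reader can process the file line by line without loading it fully into memory.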
Additional data types are stored as follows:
- datetime and date are stored as ISO strings;
- decimal is stored as a text representation of a decimal number;
- binary is stored as a base64 encoded string;
- HexBytes is stored as a hex encoded string;
- json is serialized as a string.
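The mapping above can be illustrated with a short standard-library sketch. This is not dlt's serializer: `encode_value` is a hypothetical helper, the exact ISO formatting dlt produces may differ, and HexBytes is left out because it comes from a third-party package.

```python
import base64
import json
from datetime import date, datetime, timezone
from decimal import Decimal


def encode_value(value):
    # Hypothetical helper, called by json.dumps for types it cannot encode.
    if isinstance(value, date):  # datetime is a subclass of date
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)  # text representation of the decimal number
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    raise TypeError(f"Cannot serialize {type(value).__name__}")


row = {
    "created_at": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "day": date(2024, 1, 2),
    "price": Decimal("10.50"),
    "payload": b"hi",
}
print(json.dumps(row, default=encode_value))
```

Note that these values come back as plain strings when the line is parsed, so the reader needs the schema to restore the original types.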
This file format is compressed by default.
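The text above does not name the compression codec. Assuming gzip (an assumption to verify against your own files), a compressed JSON Delimited file could be written and read back roughly as follows. The helper names and the file name are hypothetical.

```python
import gzip
import json
import os
import tempfile


def write_jsonl_gz(path, rows):
    # Write one JSON document per line into a gzip-compressed text file.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl_gz(path):
    # Assumes gzip compression; decompress on the fly and parse each line.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rows.jsonl.gz")  # hypothetical file name
    write_jsonl_gz(path, [{"id": 1}, {"id": 2}])
    print(read_jsonl_gz(path))
```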
Supported Destinations
The following destinations use this format by default: BigQuery, Snowflake, Filesystem.
How to configure
There are several ways of configuring dlt to use the jsonl file format for the normalization step and to store your data at the destination:
- You can set the loader_file_format argument to jsonl in the run command:
info = pipeline.run(some_source(), loader_file_format="jsonl")
- You can set the loader_file_format in config.toml or secrets.toml:
[normalize]
loader_file_format="jsonl"
- You can set the loader_file_format via an environment variable:
export NORMALIZE__LOADER_FILE_FORMAT="jsonl"
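- You can set the file type directly in the resource decorator:
import dlt

@dlt.resource(file_format="jsonl")
def generate_rows(nr):
    # Yield nr example rows; each row becomes one JSON document
    for i in range(nr):
        yield {"id": i}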
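The environment variable name in the list above mirrors the [normalize] section and the loader_file_format key from config.toml: the parts are joined with a double underscore and uppercased. The sketch below shows that naming rule and sets the variable from Python before a pipeline would run. `env_var_name` is a hypothetical helper for illustration, not a dlt function.

```python
import os


def env_var_name(*path):
    # Hypothetical helper: join the config section(s) and the key with a
    # double underscore and uppercase the result.
    return "__".join(path).upper()


name = env_var_name("normalize", "loader_file_format")
os.environ[name] = "jsonl"  # same effect as the shell export in the list above
print(name)
```

Setting the variable in the process environment this way only affects the current process and its children, which can be convenient in tests or notebooks.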