fastavro.read
class reader(fo: Union[IO, fastavro.io.json_decoder.AvroJSONDecoder], reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False, return_record_name_override: bool = False)
Iterator over records in an avro file.
Parameters:
- fo – File-like object to read from
- reader_schema – Reader schema
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
- return_record_name_override – If true, this will modify the behavior of return_record_name so that the record name is only returned for unions where there is more than one record. For unions that only have one record, this option will make it so that the record is returned by itself, not a tuple with the name.
Example:
    from fastavro import reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = reader(fo)
        for record in avro_reader:
            process_record(record)
The fo argument is a file-like object, so another common usage is to pass an io.BytesIO object:
    from io import BytesIO
    from fastavro import writer, reader

    fo = BytesIO()
    writer(fo, schema, records)
    fo.seek(0)
    for record in reader(fo):
        process_record(record)
Attributes:
- metadata – Key-value pairs in the header metadata
- codec – The codec used when writing
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
class block_reader(fo: IO, reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False, return_record_name_override: bool = False)
Iterator over the Block objects in an avro file.
Parameters:
- fo – Input stream
- reader_schema – Reader schema
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
- return_record_name_override – If true, this will modify the behavior of return_record_name so that the record name is only returned for unions where there is more than one record. For unions that only have one record, this option will make it so that the record is returned by itself, not a tuple with the name.
Example:
    from fastavro import block_reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = block_reader(fo)
        for block in avro_reader:
            process_block(block)
Attributes:
- metadata – Key-value pairs in the header metadata
- codec – The codec used when writing
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
class Block(bytes_, num_records, codec, reader_schema, writer_schema, named_schemas, offset, size, return_record_name=False, return_record_name_override=False)
An avro block. Yields records when iterated over.
Attributes:
- num_records – Number of records in the block
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
- offset – Offset of the block from the beginning of the avro file
- size – Size of the block in bytes
schemaless_reader(fo: IO, writer_schema: Union[str, List[T], Dict[KT, VT]], reader_schema: Union[str, List[T], Dict[KT, VT], None] = None, return_record_name: bool = False, return_record_name_override: bool = False) → Union[None, str, float, int, decimal.Decimal, bool, bytes, List[T], Dict[KT, VT]]
Reads a single record written using schemaless_writer().
Parameters:
- fo – Input stream
- writer_schema – Schema used when calling schemaless_writer
- reader_schema – If the schema has changed since being written then the new schema can be given to allow for schema migration
- return_record_name – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
- return_record_name_override – If true, this will modify the behavior of return_record_name so that the record name is only returned for unions where there is more than one record. For unions that only have one record, this option will make it so that the record is returned by itself, not a tuple with the name.
Example:
    import fastavro

    parsed_schema = fastavro.parse_schema(schema)
    with open('file', 'rb') as fp:
        record = fastavro.schemaless_reader(fp, parsed_schema)
Note: schemaless_reader can only read a single record per call.
is_avro(path_or_buffer: Union[str, IO]) → bool
Return True if path (or buffer) points to an Avro file. This only works for avro files that contain the normal avro schema header, like those created with writer(). This function is not intended to be used with binary data created by schemaless_writer(), since that does not include the avro header.
Parameters:
- path_or_buffer – Path to a file, or a file-like object