Merge branch 'master' into maint-docs

Jamie Hardt
2023-11-08 18:48:45 -08:00
committed by GitHub
3 changed files with 277 additions and 75 deletions

@@ -1,13 +1,15 @@
[![Documentation Status](https://readthedocs.org/projects/wavinfo/badge/?version=latest)](https://wavinfo.readthedocs.io/en/latest/?badge=latest) ![](https://img.shields.io/github/license/iluvcapra/wavinfo.svg) ![](https://img.shields.io/pypi/pyversions/wavinfo.svg) [![](https://img.shields.io/pypi/v/wavinfo.svg)](https://pypi.org/project/wavinfo/) ![](https://img.shields.io/pypi/wheel/wavinfo.svg) ![](https://img.shields.io/pypi/pyversions/wavinfo.svg) [![](https://img.shields.io/pypi/v/wavinfo.svg)](https://pypi.org/project/wavinfo/) ![](https://img.shields.io/pypi/wheel/wavinfo.svg)
[![Lint and Test](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml/badge.svg)](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml) [![Lint and Test](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml/badge.svg)](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml)
[![codecov](https://codecov.io/gh/iluvcapra/wavinfo/branch/master/graph/badge.svg?token=9DZQfZENYv)](https://codecov.io/gh/iluvcapra/wavinfo) [![codecov](https://codecov.io/gh/iluvcapra/wavinfo/branch/master/graph/badge.svg?token=9DZQfZENYv)](https://codecov.io/gh/iluvcapra/wavinfo)
![GitHub last commit](https://img.shields.io/github/last-commit/iluvcapra/pycmx) [![Documentation Status](https://readthedocs.org/projects/wavinfo/badge/?version=latest)](https://wavinfo.readthedocs.io/en/latest/?badge=latest) ![](https://img.shields.io/github/license/iluvcapra/wavinfo.svg)
# wavinfo # wavinfo
The `wavinfo` package allows you to probe WAVE and [RF64/WAVE files][eburf64] The `wavinfo` package allows you to probe WAVE and [RF64/WAVE files][eburf64]
and extract extended metadata, with an emphasis on film, video and and extract extended metadata. `wavinfo` has an emphasis on film, video and
professional music production. professional music production but aspires to be the encyclopedic and final
source for all WAVE file metadata.
## Metadata Support ## Metadata Support
@@ -27,8 +29,9 @@ professional music production.
* Most of the common [RIFF INFO][info-tags] metadata fields. * Most of the common [RIFF INFO][info-tags] metadata fields.
* The [wav format][format] is also parsed, so you can access the basic sample rate * The [wav format][format] is also parsed, so you can access the basic sample rate
and channel count information. and channel count information.
[format]:https://wavinfo.readthedocs.io/en/latest/classes.html#wavinfo.wave_reader.WavAudioFormat
[format]:https://wavinfo.readthedocs.io/en/latest/classes.html#wavinfo.wave_reader.WavAudioFormat
[cues]:https://wavinfo.readthedocs.io/en/latest/scopes/cue.html [cues]:https://wavinfo.readthedocs.io/en/latest/scopes/cue.html
[bext]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html [bext]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html
[smpte_330m2011]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html#wavinfo.wave_bext_reader.WavBextReader.umid [smpte_330m2011]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html#wavinfo.wave_bext_reader.WavBextReader.umid
@@ -60,6 +63,12 @@ The package also installs a shell command:
$ wavinfo test_files/A101_1.WAV $ wavinfo test_files/A101_1.WAV
``` ```
## Contributions!
If you find a new or different kind of metadata, or encounter a
new or different use of existing metadata, please submit
an Issue or Pull Request!
## Other Resources ## Other Resources
* For other file formats and ID3 decoding, * For other file formats and ID3 decoding,

@@ -1,19 +1,179 @@
.TH wavinfo 7 "2023-11-07" "Jamie Hardt" "Miscellaneous Information Manuals" .TH wavinfo 7 "2023-11-08" "Jamie Hardt" "Miscellaneous Information Manuals"
.SH NAME .SH NAME
wavinfo \- information about wave sound file metadata wavinfo \- WAVE file metadata
.\" .SH DESCRIPTION .SH SYNOPSIS
Everything you ever wanted to know about WAVE metadata but were afraid to ask.
.SH DESCRIPTION
.PP
The WAVE file format is forwards-compatible. Apart from audio data, it can
hold arbitrary blocks of bytes which clients will automatically ignore
unless they recognize them and know how to read them.
.PP
Without saying too much about the structure and parsing of WAVE files
themselves \- a subject beyond the scope of this document \- WAVE files are
divided into segments or
.BR chunks ,
which a client parser can either read or skip without reading. Chunks have
an identifier, or signature: a four-character-code that tells a client what
kind of chunk it is, and a length. Based on this information, a client can look
at the identifier and decide if it knows how to read that chunk and if it wants
to. If it doesn't, it can simply read the length and skip past it.
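The read-or-skip behavior described above can be sketched in Python. This is only an illustration, not the code used by `wavinfo`: the helper names `iter_chunks` and `make_chunk` are hypothetical, the tiny file is built in memory, and the pad byte after odd-length payloads is a detail of the RIFF container that this text does not otherwise cover.

```python
import io
import struct


def iter_chunks(stream, end):
    """Yield (identifier, payload_offset, length) for each chunk before `end`."""
    while stream.tell() + 8 <= end:
        # Every chunk starts with a four-character code and a 32-bit
        # little-endian payload length.
        ident, length = struct.unpack("<4sI", stream.read(8))
        start = stream.tell()
        yield ident, start, length
        # Skip the payload without reading it. RIFF pads odd-length
        # payloads with one byte so the next chunk starts on an even offset.
        stream.seek(start + length + (length & 1))


def make_chunk(ident, payload):
    """Build one chunk (hypothetical helper used only for this demo)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return ident + struct.pack("<I", len(payload)) + payload + pad


# A minimal PCM WAVE file: mono, 48 kHz, 16-bit, four silent samples.
fmt = make_chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 48000, 96000, 2, 16))
data = make_chunk(b"data", b"\x00\x00" * 4)
body = b"WAVE" + fmt + data
wave_bytes = b"RIFF" + struct.pack("<I", len(body)) + body

stream = io.BytesIO(wave_bytes)
stream.seek(12)  # skip "RIFF", the RIFF length and the "WAVE" form type
found = [(ident, length) for ident, _, length in iter_chunks(stream, len(wave_bytes))]
print(found)
```

A client that only cares about one identifier can compare `ident` inside the loop and read the payload at `payload_offset`; everything else is skipped by the seek.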
.PP
Some chunks are mandated by the Microsoft standard, specifically
.I fmt
and
.I data
in the case of PCM-encoded WAVE files. Other chunks, like
.I cue
or
.IR bext ,
are optional, and optional chunks usually hold metadata.
.PP
Chunks can also nest inside other chunks; a special identifier
.I LIST
is used to indicate these. A WAVE file is a recursive list: a top level
list of chunks, where chunks may contain a list of chunks themselves.
.SS Order of Metadata Chunks in a WAVE File
.PP
Chunks in a WAVE file can appear in any order, and a capable parser can
accept them appearing in any order. However, authorities give guidance on
where chunks should be placed when creating a new WAVE file.
.PP
.IP 1)
For all new WAVE files, clients should always place an empty chunk, a
so-called
.I JUNK
chunk, in the first position in the top-level list of a WAVE file, and
it should be sized large enough to hold a
.I ds64
chunk record. This will allow clients to upgrade the file to a RF64
WAVE file
.BR in-place ,
without having to re-write the file or audio data.
.IP 2)
Older authorities recommend placing metadata before the audio data, so clients
reading the file sequentially will hit it before having to seek through the
audio. This may improve metadata read performance on certain architectures.
.IP 3)
Older authorities also recommend inserting
.I JUNK
before the
.I data
chunk, sized so that the first byte of the
.I data
payload lands exactly at offset 0x1000 (4096), because this was a common
factor of the page sizes of many operating systems and architectures. This
may optimize the audio I/O performance in certain situations.
.IP 4)
Modern implementations (we're looking at
.B Pro Tools
here) tend to place the Broadcast-WAVE
.I bext
metadata before the audio data, followed by the audio data itself, and then
other chunks after that.
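The alignment advice in item 3 comes down to simple arithmetic, sketched here in Python. The function name `junk_payload_size` is hypothetical, and the 72-byte starting offset is an illustrative assumption (a 12-byte RIFF/WAVE header followed by 60 bytes of earlier chunks), not a figure from any standard.

```python
def junk_payload_size(junk_offset, target=0x1000):
    """Return the JUNK payload size that puts the data payload at `target`.

    `junk_offset` is the file offset where the JUNK chunk header begins.
    The data chunk is assumed to follow the JUNK chunk immediately.
    """
    header = 8  # four-character code plus 32-bit length
    # Layout: [JUNK header][JUNK payload][data header][data payload at target]
    size = target - junk_offset - header - header
    if size < 0:
        raise ValueError("the preceding chunks already extend past the target")
    if size % 2:
        # An odd payload would gain a pad byte and shift the data chunk.
        raise ValueError("junk_offset must be even for exact alignment")
    return size


# Assumed example: the JUNK chunk header starts 72 bytes into the file.
size = junk_payload_size(72)
data_payload_offset = 72 + 8 + size + 8
print(size, hex(data_payload_offset))
```

If the chunks before the padding already run past the target, the function raises rather than guessing; a writer would then pick the next page boundary as the target.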
.\" .PP
.\" Clients reading WAVE files should be tolerant and accept any configuration of
.\" chunks, and should accept any file as long as the obligatory
.\" .I fmt
.\" and
.\" .I data
.\" chunks
.\" are present.
.PP
It's not unheard-of to see a naive implementor expect
.B only
.I fmt
and
.I data
chunks, in this order, and to hard-code the offsets of the short
.I fmt
chunk and
.I data
chunk into their program. This is worth checking whenever you evaluate a new
tool. Many coding examples and WAVE file explainers from the 90s and early
aughts show only this basic layout of a WAVE file, and naive developers go
along with it.
.SS Encoding and Decoding Text Metadata
.\" .PP
.\" Modern metadata systems, anything developed since the late aughts, will defer
.\" encoding to an XML parser, so when dealing with
.\" .I ixml
.\" or
.\" .I axml
.\" so a client can mostly ignore this problem.
.\" .PP
.\" The most established metadata systems are older than this though, and so the
.\" entire weight of text encoding history falls upon the client.
.\" .PP
.\" The original WAVE specification, a part of the Microsoft/IBM Multimedia
.\" interface of 1991, was written at a time when Windows was an ascendant and
.\" soon-to-be dominant desktop environment. Audio files were almost
.\" never shared via LANs or the Internet or any other way. When audio files were
.\" shared, among the minuscule number of people who did this, it was via BBS or
.\" Usenet. Users at this time may have ripped them from CDs, but the cost of hard
.\" drives and low quality of compressed formats at the time made this little more
.\" than a curiosity. There was no CDBaby or CDDB to download and populate metadata
.\" from at this time.
.\" .PP
.\" So, the
.\" .I INFO
.\" and
.\" .I cue
.\" metadata systems, which are by far the most prevalent and supported, were
.\" published two years before the so-called "Endless September" of 1993 when the
.\" Internet became mainstream, when Unicode was still a twinkle in the eye, and
.\" two years before Ariana Grande was born.
.PP
The safest assumption, and the mandate of Microsoft, is that all text
metadata, by default, be encoded in Windows codepage 819, a.k.a. ISO Latin
alphabet 1, or ISO 8859-1. This covers most Western European scripts but
excludes all of Asia, Russia, most of the European Near East, and the Middle
East.
.PP
To account for this, Microsoft proposed a few conventions, none of which have
been adopted with any consistency among clients of the WAVE file standard.
.IP 1)
The RIFF standard defines a
.I cset
chunk which declares a Windows codepage for character encoding, along with a
native country code, language and dialect, which clients should use for
determining text information. We have never seen a WAVE
file with a
.I cset
chunk.
.IP 2)
Certain RIFF chunks allow the writing client to override the default encoding.
Relevant to audio files is the
.I ltxt
chunk, which encodes a country, language, dialect and codepage along with a
time-range text note. We have never seen the text field on one of these
filled out either.
.PP
Some clients in our experience simply write UTF-8 into
.IR cue ,
.IR labl ,
and
.I note
fields without any kind of framing.
.PP
The practical solution at this time is to assume either ISO Latin 1, Windows
CP 859 or Windows CP 1252, and allow the client or user to override this based
on its own inferences. The
.I chardet
Python package may provide usable guesses for text encoding; YMMV.
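One way to apply this advice is sketched below in Python. It is a hedged illustration rather than the behavior of `wavinfo`: the function name `decode_riff_text` is hypothetical, trying strict UTF-8 first is our own heuristic for the unframed UTF-8 case mentioned above, and trimming at the first null byte assumes the usual null-terminated RIFF strings.

```python
def decode_riff_text(raw, fallbacks=("cp1252", "latin-1")):
    """Decode a RIFF text field using a best-guess chain of encodings.

    `fallbacks` is the override point: a caller that knows better can pass
    its own list of codec names.
    """
    # Assumption: the field is null-terminated and may carry trailing padding.
    raw = raw.split(b"\x00", 1)[0]
    try:
        # Strict UTF-8 rejects most legacy 8-bit text, so a success here is a
        # reasonably strong hint that the writer really used UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: ISO 8859-1 maps every byte, so this never raises.
    return raw.decode("latin-1", errors="replace")


print(decode_riff_text("Café".encode("utf-8")))
print(decode_riff_text(b"Caf\xe9\x00\x00"))
```

The order of the chain matters: ISO 8859-1 accepts any byte sequence, so it has to come last or nothing after it is ever tried.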
.SH CHUNK MENAGERIE .SH CHUNK MENAGERIE
A list of chunks that you may find in a wave file from our experience. A list of chunks that you may find in a wave file from our experience.
.SS Essential WAV Chunks .SS Essential WAV Chunks
.IP fmt .IP fmt
Defines the format of the audio in the Defines the format of the audio in the
.I data .I data
chunk: the audio codec, the sample rate, bit depth, channel count, block chunk: the audio codec, the sample rate, bit depth, channel count, block
alignment and other data. May take an "extended" form, with additional data alignment and other data. May take an "extended" form, with additional data
(such as channel speaker assignments) if there are more than two channels in (such as channel speaker assignments) if there are more than two channels in
the file or if it is a compressed format. the file or if it is a compressed format.
.IP data .IP data
The audio data itself. PCM audio data is always stored as interleaved samples. The audio data itself. PCM audio data is always stored as interleaved samples.
.SS Optional WAVE Chunks
.IP JUNK .IP JUNK
A region of the file not currently in use. Clients sometimes add these before A region of the file not currently in use. Clients sometimes add these before
the the
@@ -42,10 +202,8 @@ very deep hierarchy of chunks, compared to AVI files.
The RIFF container format has a metadata system common to all RIFF files, WAVE The RIFF container format has a metadata system common to all RIFF files, WAVE
being the most common at present, AVI being another very common format being the most common at present, AVI being another very common format
historically. historically.
.IP INFO .IP "LIST form INFO"
A A flat list of chunks, each containing text metadata. The role
.I LIST
form containing a flat list of chunks, each containing text metadata. The role
of the string, like "Artist", "Composer", "Comment", "Engineer" etc. are given of the string, like "Artist", "Composer", "Comment", "Engineer" etc. are given
by the four-character code: "Artist" is by the four-character code: "Artist" is
.IR IART , .IR IART ,
@@ -58,10 +216,8 @@ Comment is
etc. etc.
.IP cue .IP cue
A binary list of cues, which are timed points within the audio data. A binary list of cues, which are timed points within the audio data.
.IP adtl .IP "LIST form adtl"
A Contains text labels
.I LIST
form containing text labels
.RI ( labl ) .RI ( labl )
for the cues in the for the cues in the
.I cue .I cue
@@ -73,17 +229,17 @@ but hosts tend to use notes for longer text), and "length text"
.I ltxt .I ltxt
metadata records, which can give a cue a length, making it a range, and a text metadata records, which can give a cue a length, making it a range, and a text
field that defines its own encoding. field that defines its own encoding.
.IP CSET .IP cset
Defines the character set for all text fields in Defines the character set for all text fields in
.IR INFO , .IR INFO ,
.I adtl .I adtl
and other RIFF-defined text fields. By default, all of the text in RIFF and other RIFF-defined text fields. By default, all of the text in RIFF
metadata fields is Windows Latin 1/ISO 8859-1, though as time passes many metadata fields is Windows Latin 1/ISO 8859-1, though as time passes many
clients have simply taken to sticking UTF-8 into these fields. The clients have simply taken to sticking UTF-8 into these fields. The
.I CSET .I cset
cannot represent UTF-8 as a valid option for text encoding, it only speaks cannot represent UTF-8 as a valid option for text encoding, it only speaks
Windows codepages, and we've never seen one in a WAVE file in any event and Windows codepages, and we've never seen one in a WAVE file in any event, and
it's vanishingly likely an audio app would recognize one if it saw it. it's unlikely an audio app would recognize one if it saw it.
.SS Broadcast-WAVE Metadata .SS Broadcast-WAVE Metadata
Broadcast-WAVE is a set of extensions to WAVE files to facilitate media Broadcast-WAVE is a set of extensions to WAVE files to facilitate media
production maintained by the EBU. production maintained by the EBU.
@@ -124,6 +280,7 @@ chunk.
This is a hybrid binary/gzip-compressed-XML chunk that associates ADM This is a hybrid binary/gzip-compressed-XML chunk that associates ADM
documents with timed ranges of a WAVE file. documents with timed ranges of a WAVE file.
.SS Dolby Metadata .SS Dolby Metadata
Dolby metadata is present in Dolby Atmos master ADM WAVE files.
.IP dbmd .IP dbmd
Records hints for Dolby playback applications for downmixing, level Records hints for Dolby playback applications for downmixing, level
normalization and other things. normalization and other things.
@@ -138,53 +295,86 @@ Region and cue point metadata.
.IP elm1 .IP elm1
.IP minf .IP minf
.IP umid .IP umid
.SH HISTORY .SH REFERENCES
The oldest document that defines the form of a Wave file is the (Note: We're not including URLs in this list; the title and standard number
.I Multimedia Programming Interface and Data Specifications 1.0 should be sufficient to find almost all of these documents. The ITU, EBU and
of August 1991. IETF standards documents are freely available.)
.\" .SH REFERENCES .SS Essential File Format
.\" .SS ESSENTIAL FILE FORMAT .TP
.\" .TP .B Multimedia Programming Interface and Data Specifications 1.0. Microsoft Corporation, 1991.
.\" .UR https://www.aelius.com/njh/wavemetatools/doc/riffmci.pdf The original definition of the
.\" Multimedia Programming Interface and Data Specifications 1.0 .I RIFF
.\" .UE container, the
.\" The original definition of the .I WAVE
.\" .I RIFF form, the original metadata facilities (like
.\" container, the .IR INFO " and " cue ),
.\" .I WAVE and things like language, country and
.\" form, the original metadata facilities, and things like language, country and dialect enumerations. This document also contains descriptions of certain
.\" dialect enumerations. variations on the WAVE, such as
.\" .TP .I LIST wavl
.\" .UR https://datatracker.ietf.org/doc/html/rfc2361 and compressed WAVE files that are so rare in practice as to be virtually
.\" RFC 2361 non-existent.
.\" .UE .TP
.\" A large RFC compilation of all of the known (in 1998) audio encoding formats .B ITU Recommendation BS.2088-1-2019 \- Long-form file format for the international exchange of audio programme mterials with metadata. ITU 2019.
.\" in use. 104 different codecs are documented with a name, the corresponding Formalized the RF64 file format, ADM carrier chunks like
.\" magic number, and a vendor contact name, phone number and address (no .IR axml
.\" emails, strangely). Almost all of these are of historical interest only. and
.\" .SS RF64/Extended WAVE Format .IR chna .
.\" Formally supersedes the previous standard for RF64,
.\" .TP .BR "EBU 3306 v1" .
.\" .UR https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.2088-1-201910-I!!PDF-E.pdf One oddity with this standard is it defines the file header for an extended
.\" ITU Recommendation BS.2088-1-2019 WAVE file to be
.\" .UE .IR BW64 ,
.\" BS.2088 gives a detailed description of the internals of an RF64 file, but this is never seen in practice.
.\" .I ds64 .TP
.\" structure and all formal requirements. It also defines the use of .B RFC 2361 \- WAVE and AVI Codec Registries. IETF Network Working Group, 1998.
.\" .IR <axml> , Gives an exhaustive list of all of the codecs that Microsoft had assigned to
.\" .IR <bxml> , vendor WAVE files as of 1998. At the time, numerous hardware vendors, sound
.\" .IR <sxml> , card and chip manufacturers, sound software developers and others all provided
.\" and their own slightly-different adaptive PCM codecs, linear predictive compression
.\" .I <chna> codes, DCTs and other things, and Microsoft would issue these vendors WAVE
.\" metadata chunks for the carriage of Audio Definition Model metadata. codec magic numbers. Almost all of these are no longer in use; the only ones
.\" .TP one ever encounters in the modern era are integer PCM (0x01), floating-point
.\" .UR https://tech.ebu.ch/docs/tech/tech3306.pdf PCM (0x03) and the extended format marker (0xFFFE). There are over a
.\" EBU Tech 3306 "RF64: An Extended File Format for Audio Data" hundred codecs assigned, however, a roll-call of failed software and hardware
.\" .UE brands.
.\" Version 1 of Tech 3306 laid out the .SS Broadcast WAVE Format
.\" .I RF64 .TP
.\" extended WAVE .B EBU Tech 3285 \- Specification of the Broadcast Wave Format (BWF). EBU, 2011.
.\" file format almost identically to Defines the elements of a Broadcast WAVE file, the
.\" .IR BS.2088 , .I bext
.\" Version 2 of the standard wholly adopted metadata chunk structure, allowed sample formats and other things. Over the
.\" .IR BS.2088 . years the EBU has published numerous supplements covering extensions to the
format, such as embedding SMPTE UMIDs, pre-calculated loudness data (EBU Tech
3285 v2),
.I peak
waveform overview data (Suppl. 3), ADM metadata (Suppl. 5 and 7), Dolby master
metadata (Suppl. 6), and other things.
.TP
.B SMPTE 330M-2011 \- Unique Material Identifier. SMPTE, 2011.
Describes the format of the SMPTE UMID field, a 32- or 64-byte UUID used to
identify media files. UMIDs are usually a dumb number in their 32-byte form,
but the extended form can encode a high-precision timestamp (with options for
epoch and timescale) and geolocation information. Broadcast-WAVE files
conforming to
.B "EBU 3285 v2"
have a SMPTE UMID embedded in the
.I bext
chunk.
.SS Audio Definition Model
.TP
.B ITU Recommendation BS.2076-2-2019 \- Audio definition model. ITU, 2019.
Defines the Audio Definition Model, entities, relationships and properties. If
you ever had any questions about how ADM works, this is where you would start.
.SS iXML Metadata
.TP
.B iXML Specification v3.01. Gallery Software, 2021.
iXML is a standard for embedding mostly human-created metadata into WAVE files,
and mostly with an emphasis on location sound recorders used on film and
television productions. Frustratingly the developer has never published a DTD
or schema validation or strict formal standard, and encourages vendors to just
do whatever, but most of the heavily-traveled metadata fields are standardized,
for recording information like a recording's scene, take, recording notes,
circled or alt status. iXML also has a system of
.B "families"
for associating several WAVE files together into one recording.

@@ -1,6 +1,9 @@
References References
========== ==========
A complete list of technical references and commentary is available as a man page
and is installed as wavinfo(7) when you install `wavinfo` via pip.
Wave File Format Wave File Format
---------------- ----------------