MediaFile: elegant audio file tagging

MediaFile is a simple interface to the metadata tags for many audio file formats. It wraps Mutagen, a high-quality library for low-level tag manipulation, with a high-level, format-independent interface for a common set of tags.

It currently supports MP3 files (ID3 tags), AAC files (as tagged by iTunes) as well as FLAC, Ogg, Monkey’s Audio, WavPack, and Musepack.

If a tag does not exist, MediaFile will return None instead of throwing an exception.

Supported Metadata Fields

The metadata schema is generally based on MusicBrainz’ schema with similar naming. MediaFile supports:

  • basic fields like title, album, artist and albumartist,
  • sorting variants like albumartist_sort and composer_sort,
  • plural/list variants like artists and albumartists,
  • identifiers like asin, isrc or mb_releasegroupid,
  • dates like the release year, month and day with convenience wrapper date,
  • detailed metadata like language or media,
  • lyrics, copyright and url,
  • calculated metadata like bpm (beats per minute) and r128_track_gain (ReplayGain),
  • embedded images (e.g. album art),
  • file metadata like samplerate, bitdepth, channels, bitrate, bitrate_mode, encoder_info, encoder_settings and length.

Compatibility

The ID3 and MPEG-4 test cases were created with iTunes and the FLAC and Ogg test cases were created (mostly) with MediaRage. The Monkey’s Audio tags were mainly fabricated using the open-source Tag. Thus, MediaFile’s tag support most closely aligns with those three applications. Some open questions remain about how to most compatibly tag files. In particular, some fields MediaFile supports don’t seem standardized among FLAC/Ogg taggers:

  • grouping and lyrics: couldn’t find anyone who supports these in a cursory search; MediaFile uses the keys grouping and lyrics
  • tracktotal and disctotal: we use the keys tracktotal, totaltracks, and trackc all to mean the same thing
  • year: this field appears both as a part of the date field and on its own using some taggers; both are supported

For fields that have multiple possible storage keys, MediaFile optimizes for interoperability: it accepts _any_ of the possible storage keys and writes _all_ of them. This may result in duplicated information in the tags, but it ensures that other players with slightly divergent opinions on tag names will all be able to interoperate with MediaFile.
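
The read-any, write-all policy can be illustrated with a small, self-contained sketch. It uses a plain dict as a stand-in for a file's tag container and does not touch MediaFile or Mutagen at all. The helper names read_first and write_all are hypothetical and are not part of MediaFile's API; only the three key names are taken from the list above.

```python
# Illustrative only: a plain dict stands in for a file's tag container.
# These are the three equivalent keys named in the compatibility list.
TRACKTOTAL_KEYS = ("tracktotal", "totaltracks", "trackc")


def read_first(tags, keys):
    """Return the value of the first key that is present, else None."""
    for key in keys:
        if key in tags:
            return tags[key]
    return None


def write_all(tags, keys, value):
    """Store the same value under every equivalent key."""
    for key in keys:
        tags[key] = value


# A file tagged by a program that only knows the "totaltracks" key.
tags = {"totaltracks": "12"}
total = read_first(tags, TRACKTOTAL_KEYS)

# Writing duplicates the value so that every naming convention finds it.
write_all(tags, TRACKTOTAL_KEYS, total)
```

The cost of this approach is the duplication mentioned above: after the write, the same value is stored three times.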

Images (album art) are stored in the standard ways for ID3 and MPEG-4. For all other formats, images are stored with the METADATA_BLOCK_PICTURE standard from Vorbis Comments. The older COVERART unofficial format is also read but is not written.

MediaFile Class

class mediafile.MediaFile(filething, id3v23=False)

Represents a multimedia file on disk and provides access to its metadata.

__init__(filething, id3v23=False)

Constructs a new MediaFile reflecting the provided file.

filething can be a path to a file (i.e., a string) or a file-like object.

May throw UnreadableFileError.

By default, MP3 files are saved with ID3v2.4 tags. You can use the older ID3v2.3 standard by specifying the id3v23 option.

classmethod fields()

Get the names of all writable properties that reflect metadata tags (i.e., those that are instances of MediaField).

classmethod readable_fields()

Get all metadata fields: the writable ones from fields() and also other audio properties.

save(**kwargs)

Write the object’s tags back to the file.

May throw UnreadableFileError. Accepts keyword arguments to be passed to Mutagen’s save function.

update(dict)

Set all field values from a dictionary.

For each key in dict that is also a tag field, the method takes the corresponding value from dict and assigns it to that field of the MediaFile. If a key has the value None, the corresponding property is deleted from the MediaFile.
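
The update semantics described above can be modelled in a few lines. This is a simplified sketch and not MediaFile's implementation: the TagHolder class, its FIELDS tuple and its values dict are hypothetical stand-ins for an object with a fixed set of tag fields.

```python
class TagHolder:
    """Hypothetical stand-in for an object with a fixed set of tag fields."""

    FIELDS = ("title", "artist", "album")

    def __init__(self):
        self.values = {}

    def update(self, new_values):
        for key, value in new_values.items():
            if key not in self.FIELDS:
                continue  # keys that are not tag fields are ignored
            if value is None:
                self.values.pop(key, None)  # None deletes the field
            else:
                self.values[key] = value


holder = TagHolder()
# "unknown_key" is not a field, so it is skipped.
holder.update({"title": "Song", "artist": "Band", "unknown_key": 1})
# Passing None removes the previously stored artist.
holder.update({"artist": None})
```

After both calls, only the title remains stored in this toy object.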

Exceptions

class mediafile.UnreadableFileError(filename, msg)

Mutagen is not able to extract information from the file.

class mediafile.FileTypeError(filename, mutagen_type=None)

Reading this type of file is not supported.

If the mutagen_type argument is passed, it indicates that the Mutagen type is not supported by MediaFile.

class mediafile.MutagenError(filename, mutagen_exc)

Raised when Mutagen fails unexpectedly—probably due to a bug.

Internals

class mediafile.MediaField(*styles, **kwargs)

A descriptor providing access to a particular (abstract) metadata field.

__init__(*styles, **kwargs)

Creates a new MediaField.

Parameters:
  • styles – StorageStyle instances that describe the strategy for reading and writing the field in particular formats. There must be at least one style for each possible file format.
  • out_type – the type of the value that should be returned when getting this property.

class mediafile.StorageStyle(key, as_type=<class 'str'>, suffix=None, float_places=2, read_only=False)

A strategy for storing a value for a certain tag format (or set of tag formats). This basic StorageStyle describes a simple 1:1 mapping from raw values to keys in a Mutagen file object; subclasses describe more sophisticated translations or format-specific access strategies.

MediaFile uses a StorageStyle via three methods: get(), set(), and delete(). It passes a Mutagen file object to each.

Internally, the StorageStyle implements get() and set() using two steps that may be overridden by subtypes. To get a value, the StorageStyle first calls fetch() to retrieve the value corresponding to a key and then deserialize() to convert the raw Mutagen value to a consumable Python value. Similarly, to set a field, we call serialize() to encode the value and then store() to assign the result into the Mutagen object.

Each StorageStyle type has a class-level formats attribute that is a list of strings indicating the formats that the style applies to. MediaFile only uses StorageStyles that apply to the correct type for a given audio file.
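
The two-step pipeline described above can be sketched without Mutagen. In this toy model a plain dict stands in for the Mutagen file object, and the classes SketchStyle and PaddedStyle are hypothetical: they mirror the method names listed in this section but are not the library's classes, and the real implementations handle more cases.

```python
class SketchStyle:
    """Hypothetical, simplified model of the two-step get/set pipeline."""

    formats = ["FLAC", "OggVorbis"]  # formats this style claims to handle

    def __init__(self, key, as_type=str):
        self.key = key
        self.as_type = as_type

    # Reading: fetch the raw value, then deserialize it.
    def fetch(self, tags):
        return tags.get(self.key)

    def deserialize(self, raw):
        return None if raw is None else self.as_type(raw)

    def get(self, tags):
        return self.deserialize(self.fetch(tags))

    # Writing: serialize the value, then store it.
    def serialize(self, value):
        return str(value)

    def store(self, tags, value):
        tags[self.key] = value

    def set(self, tags, value):
        self.store(tags, self.serialize(value))

    def delete(self, tags):
        tags.pop(self.key, None)


class PaddedStyle(SketchStyle):
    """Subtype that overrides only the serialization step."""

    def serialize(self, value):
        return f"{value:02d}"  # zero-pad integers to two digits


tags = {}  # stand-in for a Mutagen file object
plain = SketchStyle("tracknumber", as_type=int)
plain.set(tags, 7)
plain_raw = tags["tracknumber"]

padded = PaddedStyle("tracknumber", as_type=int)
padded.set(tags, 7)
padded_raw = tags["tracknumber"]
round_trip = padded.get(tags)
```

Because get() and set() are composed from the smaller steps, the subtype changes how values are encoded without reimplementing lookup or storage.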

delete(mutagen_file)

Remove the tag from the file.

deserialize(mutagen_value)

Given a raw value stored on a Mutagen object, decode and return the represented value.

fetch(mutagen_file)

Retrieve the raw value for this tag from the Mutagen file object.

formats = ['FLAC', 'OggOpus', 'OggTheora', 'OggSpeex', 'OggVorbis', 'OggFlac', 'APEv2File', 'WavPack', 'Musepack', 'MonkeysAudio']

List of Mutagen classes the StorageStyle can handle.

get(mutagen_file)

Get the value for the field using this style.

serialize(value)

Convert the external Python value to a type that is suitable for storing in a Mutagen file object.

set(mutagen_file, value)

Assign the value for the field using this style.

store(mutagen_file, value)

Store a serialized value in the Mutagen file object.

Examples

To add cover art to a MediaFile:

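from mediafile import MediaFile, Image, ImageType

# Read the raw image bytes from disk.
image_file = "cover.jpg"
with open(image_file, "rb") as f:
    cover_data = f.read()

# Wrap the bytes in an Image marked as the front cover.
cover = Image(data=cover_data, desc="album cover", type=ImageType.front)

# Attach the image to the file's tags and write them back.
f = MediaFile("file.mp3")
f.images = [cover]
f.save()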

To copy tags from one MediaFile to another:

from mediafile import MediaFile

f = MediaFile("file1.mp3")
g = MediaFile("file2.mp3")

# Copy every writable tag field; skip any field that cannot be copied.
for field in f.fields():
    try:
        setattr(g, field, getattr(f, field))
    except Exception:
        pass

g.save()

Changelog

v0.13.0

  • Add a mapping compatible with Plex and ffmpeg for the “original date” fields.
  • Remove an unnecessary dependency on six.

v0.12.0

  • Add the multiple-valued properties artists_credit, artists_sort, albumartists_credit, and albumartists_sort.

v0.11.0

  • List-valued properties now return None instead of an empty list when the underlying tags are missing altogether.

v0.10.1

  • Fix a test failure that arose with Mutagen 1.46.
  • Require Python 3.7 or later.

v0.10.0

  • Add the multiple-valued properties albumtypes, catalognums and languages.
  • The catalognum property now refers to additional file tags named CATALOGID and DISCOGS_CATALOG (but only for reading, not writing).
  • The multi-valued albumartists property now refers to additional file tags named ALBUM_ARTIST and ALBUM ARTISTS. (The latter is used only for reading.)
  • The ListMediaField class now doesn’t concatenate multiple lists if found. The first available tag is used instead, like with other kinds of fields.

v0.9.0

  • Add the properties bitrate_mode, encoder_info and encoder_settings.

v0.8.1

  • Fix a regression in v0.8.0 that caused a crash on Python versions below 3.8.

v0.8.0

  • MediaFile now requires Python 3.6 or later.
  • Added support for Wave (.wav) files.

v0.7.0

  • Mutagen 1.45.0 or later is now required.
  • MediaFile can now use file-like objects (instead of just the filesystem, via filenames).

v0.6.0

  • Enforce a minimum value for SoundCheck gain values.

v0.5.0

  • Refactored the distribution to use Flit.

v0.4.0

  • Added a barcode field.
  • Added new tag mappings for albumtype and albumstatus.

v0.3.0

  • Fixed tests for compatibility with Mutagen 1.43.
  • Fix the MPEG-4 tag mapping for the label field to use the right capitalization.

v0.2.0

  • R128 gain tags are now stored in Q7.8 integer format, as per the relevant standard.
  • Added an mb_workid field.
  • The Python source distribution now includes an __init__.py file that makes it easier to run the tests.

v0.1.0

This is the first independent release of MediaFile. It is now synchronised with the embedded version released with beets v1.4.8.