Media

Sessions in Phon usually consist of a media recording (audio or video) coupled with transcriptions of utterances from the recording, separated into records. Through a process called segmentation, each record can be associated with the time span in the recording during which its utterance occurred.

To facilitate the segmentation and transcription of media files, Phon has a built-in media player. This player is available in the Media Player view.

For accurate segment identification and playback, Phon requires a wav file with the same name as the original media file. This wav file is referred to as session audio, while the original media file is referred to as session media. When the session media file is already in wav format, the two terms refer to the same file.
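
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not Phon code: the function names are hypothetical, and the sketch assumes the wav file sits in the same folder as the session media, which the text does not state explicitly.

```python
from pathlib import Path


def expected_session_audio(session_media: str) -> Path:
    """Return the path where a matching wav file would be expected.

    Assumption (not confirmed by the manual): same folder, same base
    name, with a ".wav" extension.
    """
    return Path(session_media).with_suffix(".wav")


def has_session_audio(session_media: str) -> bool:
    """True if the expected wav file exists on disk."""
    return expected_session_audio(session_media).is_file()
```

For a session media file that is already a wav file, `expected_session_audio` returns the same path, which matches the statement that both terms then refer to one file.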

Note: Only 16-bit PCM wav files are supported for session audio.
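
To check a file against this requirement before assigning it to a session, something like the following sketch may help. It uses Python's standard `wave` module and is not part of Phon; the function name is hypothetical. The `wave` module only opens PCM-style wav files, so a file it rejects is treated here as unsupported.

```python
import wave


def is_16bit_pcm_wav(path: str) -> bool:
    """Return True if `path` looks like a 16-bit PCM wav file.

    wave.open() raises wave.Error for files that are not RIFF/WAVE or
    that use a format the module does not read (for example compressed
    audio), and EOFError for empty or truncated files. A missing file
    still raises FileNotFoundError.
    """
    try:
        with wave.open(path, "rb") as wav:
            # Sample width is reported in bytes: 2 bytes = 16 bits.
            return wav.getsampwidth() == 2
    except (wave.Error, EOFError):
        return False
```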

Segment playback and export

Segment playback is available when session audio is available, or when a session media file is available and the Media Player view is open. Segment playback actions are available from the session editor toolbar and the Media window menu.

The following segment playback actions are available:
  • Play segment CMD/CTRL+R - play the segment of the currently selected record.
  • Play custom segment... CMD/CTRL+Alt+R - play a custom segment.
  • Play current speech turn CMD/CTRL+L - play all consecutive segments for the current speaker, starting with the current record.
  • Play adjacency sequence CMD/CTRL+Shift+R - play all consecutive segments for the current speaker and then for the next speaker.
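
The difference between a speech turn and an adjacency sequence can be illustrated with a small sketch. This is one reading of the descriptions above, not Phon's implementation: the `Record` type and both function names are hypothetical, and details such as how unsegmented records are handled are not covered.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    speaker: str
    start_ms: int
    end_ms: int


def speech_turn(records: List[Record], index: int) -> List[Record]:
    """Consecutive records by the same speaker, starting at `index`."""
    speaker = records[index].speaker
    turn = []
    for record in records[index:]:
        if record.speaker != speaker:
            break
        turn.append(record)
    return turn


def adjacency_sequence(records: List[Record], index: int) -> List[Record]:
    """The current speaker's turn followed by the next speaker's turn."""
    first = speech_turn(records, index)
    next_index = index + len(first)
    if next_index >= len(records):
        # No following speaker: the sequence is just the current turn.
        return first
    return first + speech_turn(records, next_index)
```

With records spoken by CHI, CHI, MOT, MOT, CHI and the first record selected, the speech turn covers the first two records and the adjacency sequence covers the first four.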

Actions to export segments (audio only), which mirror the segment playback options, are also available in the Media menu.

Segment playback from the Timeline and Speech Analysis views is also possible. In these views, playback covers either the selected segment of the waveform or the current record segment, and is started by pressing space while the view is focused.

Media Actions

The following actions are available from the Media window menu when the Session Editor is focused.

Assign media

There are several methods for assigning media to the session.

Unassign media

This will clear media file assignment for the session. Deleting the value shown in the Media text field of the Session Information view will also unassign session media.

Generate/re-encode session audio

If session media is available but a matching wav file cannot be found, the following banner will be displayed in the Timeline and Speech Analysis views. Clicking the banner starts the encoding process, and progress is displayed in the banner. Once encoding is complete, the new wav file is opened. This command is also available in the Media window menu.

Figure 1. Generate Session Audio Banner
Note: Phon will require write access to the media folder to generate/re-encode session audio.
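
If the wav file needs to be prepared outside Phon, a command-line encoder can produce a 16-bit PCM wav file next to the session media. The sketch below is not how Phon encodes audio internally (this page does not describe that); it assumes the separate ffmpeg tool is installed, and the function names are hypothetical. It only builds the command and checks folder permissions, leaving the actual run to the caller.

```python
import os
from pathlib import Path
from typing import List


def can_write_media_folder(session_media: str) -> bool:
    """True if the folder containing the session media is writable."""
    return os.access(Path(session_media).resolve().parent, os.W_OK)


def build_encode_command(session_media: str) -> List[str]:
    """Build an ffmpeg command that writes a 16-bit PCM wav file.

    Assumption: the output uses the same folder and base name as the
    session media, with a ".wav" extension.
    """
    media = Path(session_media)
    if media.suffix.lower() == ".wav":
        raise ValueError("session media is already a wav file")
    target = media.with_suffix(".wav")
    # -vn drops any video stream; pcm_s16le is signed 16-bit PCM.
    return ["ffmpeg", "-i", str(media), "-vn",
            "-acodec", "pcm_s16le", str(target)]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `can_write_media_folder` returns True.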

If Phon detects that it is unable to read the session audio file, you will be prompted to re-encode the file. If you choose to re-encode the wav file, the original audio file will be renamed with a -orig suffix in the filename. If the prompt continues to be displayed after the re-encode process, you may choose Do nothing, which leaves session audio unloaded.

Select media folders

This is a direct link to the Media Folders tab in the application preferences dialog.