SDL Development Notes: Introduction to Audio Basics, Using SDL to Play Audio

If the article is an original article, it may not be reproduced without permission
Original blogger blog address: https://blog.csdn.net/qq21497936
Original blogger blog navigation: https://blog.csdn.net/qq21497936/article/details/102478062
The blog address of this article: https://blog.csdn.net/qq21497936/article/details/108596396

Previous: "SDL development notes (1): SDL introduction, compilation and use, and project templates"
Next: Stay tuned

 

Foreword

  In Qt applications, SDL can be used to play audio for better cross-platform portability, and it leaves room for further extensions at the same time.

 

Sound waves

  Sound is a continuous wave that propagates through the air, known as a sound wave. The loudness of a sound is reflected in the pressure amplitude of the wave, and its pitch is reflected in the frequency.
  A sound signal has two basic parameters: frequency and amplitude. The frequency of a signal is the number of times it oscillates per second, expressed in Hz.
  A signal in the frequency range 20 Hz to 20 kHz is called an audio signal. Sound in this range with a level between 0 and 120 dB can be perceived by humans.
  When sound is converted into an electrical signal, it becomes an audio signal.

 

Audio signal

  An audio signal (acoustic signal) is the information carrier for the frequency and amplitude variations of regular sound waves such as speech, music and sound effects. According to the characteristics of the sound wave, audio information can be classified into regular audio and irregular sound. Regular audio can be further divided into speech, music and sound effects. Regular audio is a continuously changing analog signal that can be represented by a continuous curve called a sound wave.
  The three elements of sound are pitch, intensity and timbre. A sound wave, or sine wave, has three important parameters: frequency ω0, amplitude An and phase ψn, which also determine the characteristics of the audio signal.
  Sampling the audio signal, that is, digitizing the analog signal, yields a digital audio signal.

 

Digital audio signal

  Computer data is stored as 0s and 1s. Digital audio works by first converting the sound into electrical level signals and then converting those level signals into binary data for storage. On playback, the data is converted back into analog level signals, which are sent to the speakers. Digital sound is fundamentally different from the sound on ordinary tape, radio and TV in how it is stored and played back. By comparison, it is convenient to store, cheap to store, free of distortion during storage and transmission, and very easy to edit and process.
  The digital audio signal is the audio data we ultimately process.
  A digital audio signal has several characteristics:

Quantization level

  Simply put, this is the number of binary bits used to represent each sample of the sound waveform, for example 16 bit or 24 bit. A 16-bit quantization level records each sample as a 16-bit binary number. The quantization level is therefore an important indicator of digital sound quality. The quality of digital sound is usually described in a form such as "24 bit (quantization level), 48 kHz sampling"; standard CD music, for example, is 16 bit with 44.1 kHz sampling.
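  To make the quantization level concrete, here is a minimal C++ sketch that computes how many distinct amplitude values a given bit depth can represent and how many bytes one sample occupies. The function names are made up for illustration and are not part of SDL.

```cpp
#include <cstdint>

// Number of distinct amplitude values representable with the given bit depth.
// Valid for bit depths below 64.
constexpr std::uint64_t quantizationLevels(unsigned bits) {
    return std::uint64_t{1} << bits;
}

// Bytes occupied by one sample of one channel, rounded up to whole bytes.
constexpr unsigned bytesPerSample(unsigned bits) {
    return (bits + 7) / 8;
}
```

  For example, quantizationLevels(16) gives 65536 levels for CD audio, and bytesPerSample(16) gives 2 bytes per sample.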

Channel

  Put simply, the audio data sampled by one diaphragm is one channel, two diaphragms give two channels, and so on. Diaphragms generally come in three sizes: large, medium and small. The larger the diaphragm, the more sensitive it is to sound waves and the higher the cost. Some microphones have one diaphragm and some have two. A one-diaphragm microphone records mono, and a two-diaphragm microphone records stereo. In five-channel surround sound recording, microphone 1 records sound from the northeast, microphone 2 from the northwest, microphone 3 from the southwest, microphone 4 from the southeast, and microphone 5 from straight ahead. There are also four-channel and seven-channel surround sound recordings.
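  In raw PCM data, multi-channel audio is normally stored interleaved: one sample per channel forms a frame, and frames follow each other (left, right, left, right for stereo). The sketch below assumes that layout, which is also what SDL expects for stereo as far as I know. The helper names are hypothetical.

```cpp
#include <cstddef>

// Bytes occupied by one frame (one sample for every channel).
constexpr std::size_t bytesPerFrame(unsigned channels, unsigned bitsPerSample) {
    return static_cast<std::size_t>(channels) * (bitsPerSample / 8);
}

// Index of a sample inside an interleaved sample buffer.
// For stereo data, channel 0 is left and channel 1 is right.
constexpr std::size_t sampleIndex(std::size_t frame, unsigned channel, unsigned channels) {
    return frame * channels + channel;
}
```

  With 16-bit stereo data, bytesPerFrame(2, 16) is 4 bytes, and the right-channel sample of frame 0 sits at index 1 of the sample array.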

Sampling rate

  Simply put, this is the number of samples used to record 1 second of sound. At a 44 kHz sampling rate, 44,000 samples are needed to describe 1 second of the sound waveform. In principle, the higher the sampling rate, the better the sound quality.
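  Since a raw PCM file has no header, its playing time has to be derived from the sampling rate together with the channel count and quantization level. A small sketch, with illustrative function names of my own:

```cpp
// Samples needed per channel to describe the given number of seconds.
constexpr long long samplesForDuration(int sampleRateHz, int seconds) {
    return static_cast<long long>(sampleRateHz) * seconds;
}

// Playing time in seconds of a headerless PCM file of the given size.
constexpr double pcmDurationSeconds(long long fileBytes, int sampleRateHz,
                                    int channels, int bitsPerSample) {
    return static_cast<double>(fileBytes) /
           (static_cast<double>(sampleRateHz) * channels * (bitsPerSample / 8));
}
```

  For example, 176400 bytes of 44100 Hz, 16-bit, two-channel PCM last exactly one second.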

Bit rate

  A reference measure of digital music compression efficiency: the average number of bits required to record one second of audio data (a bit is the smallest unit of data in a computer, a single 0 or 1). It is usually expressed in kbps (1 kbps is 1000 bits per second). The bit rate of CD digital music is 1411.2 kbps (that is, 1 second of CD music requires 1411.2×1000 bits of data), while MP3 digital music close to CD quality needs about 112 kbps to 128 kbps.
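  The CD figure follows directly from the three parameters introduced earlier: sampling rate × quantization level × channels. A minimal sketch of that arithmetic (the function name is illustrative):

```cpp
// Bit rate of uncompressed PCM in bits per second.
constexpr long long pcmBitRateBps(int sampleRateHz, int bitsPerSample, int channels) {
    return static_cast<long long>(sampleRateHz) * bitsPerSample * channels;
}
```

  For CD audio, pcmBitRateBps(44100, 16, 2) is 1411200 bits per second, which is the 1411.2 kbps quoted above.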

Compression ratio

  Usually refers to the ratio of the size of a music file before compression to its size after compression. It is a brief way to describe the compression efficiency of digital sound.
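  As a quick worked example, the ratio can be computed from file sizes or, equivalently, from bit rates when the duration is the same. The helper below is only a sketch with a made-up name.

```cpp
// Ratio of the size (or bit rate) before compression to the one after it.
constexpr double compressionRatio(double before, double after) {
    return before / after;
}
```

  Comparing CD audio at 1411.2 kbps with a 128 kbps MP3 gives a ratio of roughly 11 to 1.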

 

Analysis of SDL audio playback process

  The basic process is as follows:

Step 1: Initialize the subsystem

  Initialize only the audio subsystem; the other subsystems are not needed and do not have to be initialized.

Step 2: Open the audio device according to the audio information

  Fill in an SDL_AudioSpec with the desired audio parameters and open the audio device. The second parameter can receive the parameters actually obtained from the hardware, which may differ from the request. Here we simply pass 0 as the second parameter, so nothing needs to be returned.

Step 3: Start playing

  Call SDL_PauseAudio(0) to start playback.

Step 4: Feed data in a loop

  Supply data according to the buffer length and the remaining length of the file. When the data in the buffer has been consumed, refill it once. While the unplayed length of the current buffer is still greater than 0, wait by calling SDL_Delay with a delay of 1 ms.

Step 4 (additional): Callback function

  After playback starts, a separate audio thread calls the callback function to fetch more audio data. In testing, 4096 bytes were requested on each call.

Step 5: Close the audio device

Step 6: Exit the SDL system
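
  The interplay between Step 4 and the callback can be modelled without SDL. The sketch below is an assumption-laden stand-in, not SDL code: PcmCursor plays the role of a "current position / remaining length" pair, and fillAudio imitates what an audio callback does when asked for len bytes. All names here are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the position/length pair shared with the callback.
struct PcmCursor {
    const std::uint8_t* pos;  // next unplayed byte
    int remaining;            // unplayed bytes left in the buffer
};

// Bytes to hand over in one callback: never more than what is left.
constexpr int bytesToCopy(int requested, int remaining) {
    return requested > remaining ? remaining : requested;
}

// Number of callbacks needed to drain totalBytes in chunks of bufferBytes.
constexpr int callbacksNeeded(int totalBytes, int bufferBytes) {
    return (totalBytes + bufferBytes - 1) / bufferBytes;
}

// Model of the callback: write silence first, then copy what is available.
// Returns the number of bytes consumed from the cursor.
inline int fillAudio(PcmCursor& cursor, std::uint8_t* stream, int len) {
    std::memset(stream, 0, static_cast<std::size_t>(len));
    const int n = bytesToCopy(len, cursor.remaining);
    std::memcpy(stream, cursor.pos, static_cast<std::size_t>(n));
    cursor.pos += n;
    cursor.remaining -= n;
    return n;
}
```

  The feeding loop in Step 4 only has to wait until cursor.remaining reaches 0 before supplying the next buffer. In a real program the cursor is shared between two threads, so the access would need proper synchronization; the sketch leaves that out.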

 

SDL playback audio related variables

struct SDL_AudioSpec

  SDL_AudioSpec is a structure that contains the audio output format, and it also contains a callback function that is called when the audio device needs more data. This structure is the key.

typedef struct SDL_AudioSpec
{
    int freq;                   // DSP frequency, in samples per second
    SDL_AudioFormat format;     // Audio data format
    Uint8 channels;             // Number of channels: 1 mono, 2 stereo
    Uint8 silence;              // Audio buffer silence value (calculated)
    Uint16 samples;             // Audio buffer size in sample frames, typically 512 or 1024; an unsuitable value may cause stuttering
    Uint16 padding;             // Required for some build environments
    Uint32 size;                // Audio buffer size in bytes (calculated)
    SDL_AudioCallback callback; // Callback that feeds the audio device (NULL to use SDL_QueueAudio() instead)
    void *userdata;             // Userdata passed to the callback (ignored for NULL callbacks)
} SDL_AudioSpec;
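
  The calculated size field is, as far as I understand SDL's behavior, simply samples × channels × bytes per sample. That explains why a callback can be asked for 4096 bytes when samples is 1024 and the data is 16-bit stereo. A minimal sketch of the arithmetic, with helper names that are mine rather than SDL's:

```cpp
// Bytes requested per callback for the given buffer settings.
constexpr unsigned callbackBufferBytes(unsigned samples, unsigned channels,
                                       unsigned bitsPerSample) {
    return samples * channels * (bitsPerSample / 8);
}

// Playing time of one buffer in milliseconds.
constexpr double bufferDurationMs(unsigned samples, unsigned sampleRateHz) {
    return 1000.0 * samples / sampleRateHz;
}
```

  With samples = 1024 at 44100 Hz, one buffer covers a little over 23 ms of audio, so a smaller samples value lowers latency but leaves less time to refill the buffer.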

  Example: Play pcm audio "hurried year-44100-16-bit-dual-channel.pcm"

// Audio structure settings
SDL_AudioSpec sdlAudioSpec;
sdlAudioSpec.freq = 44100;
sdlAudioSpec.format = AUDIO_S16SYS;
sdlAudioSpec.channels = 2;   // the file is dual-channel (stereo)
sdlAudioSpec.silence = 0;
sdlAudioSpec.samples = 1024;
sdlAudioSpec.callback = callBack_fillAudioData;
sdlAudioSpec.userdata = 0;
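
  The format field tells SDL how to interpret the raw bytes. AUDIO_S16LSB (which SDL 2 also exposes under the name AUDIO_S16) means signed 16-bit samples stored low byte first. The sketch below shows that interpretation in plain C++; decodeS16LE is a hypothetical helper, not an SDL function.

```cpp
#include <cstdint>

// Combine two bytes (low byte first) into one signed 16-bit sample.
constexpr std::int16_t decodeS16LE(std::uint8_t lo, std::uint8_t hi) {
    // Build the unsigned value, then fold the sign bit without relying on
    // implementation-defined narrowing conversions.
    return static_cast<std::int16_t>(((lo | (hi << 8)) ^ 0x8000) - 0x8000);
}
```

  If a file plays as loud noise instead of music, a mismatch between the byte order or signedness of the data and the format field is a likely cause.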
 

SDL playback audio related prototype

SDL_Init()

int SDLCALL SDL_Init(Uint32 flags);

  Use this function to initialize the SDL library. It must be called before using most other SDL functions. Initialize only the subsystems you actually need rather than using SDL_INIT_EVERYTHING, which can lead to unpredictable problems.

  • Parameter 1: flags selecting the subsystems to initialize

SDL_OpenAudio()

int SDL_OpenAudio(SDL_AudioSpec * desired,
                  SDL_AudioSpec * obtained);

  This function opens the audio device with the desired parameters and returns 0 on success, placing the actual hardware parameters in the struct pointed to by obtained. If obtained is NULL, the audio data passed to the callback function is guaranteed to be in the requested format and is automatically converted to the hardware audio format if necessary. The function returns -1 on failure, which means that the audio device could not be opened or the audio thread could not be set up.

  • Parameter 1: Input, the desired parameters of the audio device to open;
  • Parameter 2: Output, the parameters of the audio device that was actually opened;

SDL_PauseAudio()

extern DECLSPEC void SDLCALL SDL_PauseAudio(int pause_on);

  This function pauses and unpauses audio callback processing.
  After the audio device has been opened, call this function with parameter 0 to start playing sound. This makes it possible to safely initialize the data for the callback function after opening the audio device.
  During a pause, silence is written to the audio device.

SDL_MixAudio: Mixing playback function

void SDL_MixAudio(Uint8 * dst,
                 const Uint8 * src,
                 Uint32 len,
                 int volume);

  This function takes two audio buffers in the playing audio format and mixes them, performing addition, volume adjustment and overflow clipping. The volume ranges from 0 to 128; set it to SDL_MIX_MAXVOLUME for full audio volume. Note that this does not change the hardware volume.
  It is provided for convenience so that audio data can be mixed.

  • Parameter 1: Target data. This is what the stream pointer in the callback function points to, so the callback's stream pointer can be used directly.
  • Parameter 2: Audio data. This is the data to be played that is mixed into the stream, that is, the playback data we need to supply.
  • Parameter 3: Length of the audio data, that is, the length we are supplying.
  • Parameter 4: Volume, ranging from 0 to 128. SDL_MIX_MAXVOLUME is 128. This sets the software volume, not the hardware volume.
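
  To illustrate what "addition, volume adjustment and overflow clipping" means for 16-bit samples, here is a plain C++ sketch. It is not SDL's actual implementation, only a model of the idea; kMixMaxVolume mirrors the value 128 of SDL_MIX_MAXVOLUME, and the function names are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kMixMaxVolume = 128;  // same value as SDL_MIX_MAXVOLUME

// Clip a value to the range of a signed 16-bit sample.
constexpr int clampToInt16(int v) {
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

// Scale src by volume/128, add it to dst, and clip on overflow.
constexpr std::int16_t mixSample(std::int16_t dst, std::int16_t src, int volume) {
    return static_cast<std::int16_t>(
        clampToInt16(dst + src * volume / kMixMaxVolume));
}

// Mix count samples from src into dst in place.
inline void mixBuffer(std::int16_t* dst, const std::int16_t* src,
                      std::size_t count, int volume) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = mixSample(dst[i], src[i], volume);
    }
}
```

  Mixing into a buffer that was first filled with silence, at volume 128, reproduces the source data unchanged, which is the typical pattern when playing a single stream.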

SDL_Delay()

void SDL_Delay(Uint32 ms);

  Wait the specified number of milliseconds before returning.

SDL_Quit()

void SDLCALL SDL_Quit(void);

  This function cleans up all initialized subsystems. It should be called upon all exit conditions.

 

Demo source code

void SDLManager::testPlayPCM()
{
    int ret = 0;
    // audio structure
    SDL_AudioSpec sdlAudioSpec;
//    sdlAudioSpec.freq = 44100;
    sdlAudioSpec.freq = 22050;
//    sdlAudioSpec.format = AUDIO_U8; // x
//    sdlAudioSpec.format = AUDIO_S8; // x
//    sdlAudioSpec.format = AUDIO_U16LSB; // x
//    sdlAudioSpec.format = AUDIO_S16LSB; // √
//    sdlAudioSpec.format = AUDIO_U16MSB; // x
//    sdlAudioSpec.format = AUDIO_U16LSB; // x
//    sdlAudioSpec.format = AUDIO_S16MSB; // x
//    sdlAudioSpec.format = AUDIO_U16; // x
    sdlAudioSpec.format = AUDIO_S16; // √
//    sdlAudioSpec.format = AUDIO_S16SYS; // x
//    sdlAudioSpec.format = AUDIO_S32SYS; // x
//    sdlAudioSpec.format = AUDIO_F32SYS; // x
//    sdlAudioSpec.format = AUDIO_F32MSB; // x
    sdlAudioSpec.channels = 1;
    sdlAudioSpec.silence = 0;
    sdlAudioSpec.samples = 1024;    // Causes errors between 512 and 1024
    sdlAudioSpec.callback = callBack_fillAudioData;
    sdlAudioSpec.userdata = 0;

    QString fileName;

#if 0
    fileName = "testPCM/princess-22050-16 bit-single channel.pcm";
    sdlAudioSpec.freq = 22050;
    sdlAudioSpec.channels = 1;
    sdlAudioSpec.format = AUDIO_S16;
#endif
#if 1
    fileName = "testPCM/that year in a hurry-44100-16 bit-dual channel.pcm";
    sdlAudioSpec.freq = 44100;
    sdlAudioSpec.channels = 2;
    sdlAudioSpec.format = AUDIO_S16;
#endif
#if 0
    fileName = "testPCM/Beijing Beijing 8 k16bits mono.pcm";
    sdlAudioSpec.freq = 8000;
    sdlAudioSpec.channels = 1;
    sdlAudioSpec.format = AUDIO_S16;
#endif
#if 0
    fileName = "testPCM/Ice Rain Fragment 48 k16bit mono.pcm";
    sdlAudioSpec.freq = 48000;
    sdlAudioSpec.channels = 1;
    sdlAudioSpec.format = AUDIO_S16;
#endif
#if 0
    fileName = "testPCM/The spray is blooming fragment 48 k16bit mono.pcm";
    sdlAudioSpec.freq = 48000;
    sdlAudioSpec.channels = 1;
    sdlAudioSpec.format = AUDIO_S16;
#endif

    QFile file(fileName);
    if(!file.open(QIODevice::ReadOnly))
    {
        LOG << "Failed" << file.exists();
        return;
    }


    // Step 1: Initialize the audio subsystem
    ret = SDL_Init(SDL_INIT_AUDIO);
    if(ret)
    {
        LOG << "Failed to initialize SDL audio:" << SDL_GetError();
        return;
    }

    // Step 2: Open the audio device
    ret = SDL_OpenAudio(&sdlAudioSpec, 0);
    if(ret)
    {
        LOG << "Failed to open audio device:" << SDL_GetError();
        SDL_Quit();
        return;
    }

    // Step 3: Start playing
    SDL_PauseAudio(0);

#if 1
    // Step 4: Read all the data at once
    QByteArray data = file.readAll();
    _audioPos = (uint8_t *)data.data();
    _audioLen = data.size();
    // Wait until the callback has consumed all of the data
    while(_audioLen > 0)
    {
        SDL_Delay(1);
    }
#else
    // Step 4: Read 4096 bytes at a time
    int readSize = 4096;
    QByteArray chunk;   // must stay alive while the callback reads from _audioPos
    while(!file.atEnd())
    {
        chunk = file.read(readSize);
        _audioPos = (uint8_t *)chunk.data();
        _audioLen = chunk.size();   // the last chunk may be shorter than readSize
        while(_audioLen > 0)
        {
            SDL_Delay(1);
        }
    }
#endif
    // Step 5: Close the audio device
    SDL_CloseAudio();

    // Step 6: Exit SDL
    SDL_Quit();

    file.close();
}

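void SDLManager::callBack_fillAudioData(void *userdata, uint8_t *stream, int len)
{
    // Start from silence so nothing stale is played when the data runs out
    SDL_memset(stream, 0, len);
    if(_audioLen == 0)
    {
        return;
    }
    // Never hand over more than what is left in the buffer
    len = (len > _audioLen ? _audioLen : len);

    // Mix the PCM data into the output stream at full software volume
    SDL_MixAudio(stream, _audioPos, len, SDL_MIX_MAXVOLUME);

    _audioPos += len;
    _audioLen -= len;

    // Log the number of bytes consumed per call (4096 in testing)
    LOG << len;
}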
 

Project template: corresponding version number v1.1.0

  Corresponding version number v1.1.0: Play raw PCM data.

 


Posted by dbemowsk on Tue, 17 May 2022 16:36:42 +0300