Using WaveIn Functions to Record Sound

The Unofficial Newsletter of Delphi Users - by Robert Vivrette

--------------------------------------------------------------------------------
Tidal Flows - Using WaveIn Functions to Record Sound
By Alan G Lloyd - AlanGLLoyd@aol.com
When I first started using Windows sound functions to record and play back wave sounds, I used MCI (Media Control Interface) functions, which were fairly easy to use. I saw some examples of using WaveIn and WaveOut functions at a lower level, but they always seemed incredibly complex and I had no concept of the overall system. Suddenly I saw the light! Wave functions were just A-to-D (analogue to digital) and D-to-A (digital to analogue) converters. For WaveIn you provided a buffer and started recording; WaveIn converted the incoming microphone signal to digital values at a predetermined sampling rate. When the buffer was full it told you and wanted another. You had to copy the digital data into a file to store it. WaveOut was similar, but in reverse. Of course it's not quite as simple as that, because there are a number of little nooks and crannies wrapped around that simple concept.
Firstly you must define how the analogue signal is to be sampled: the type of sampling, its sample rate, its sample size and mono/stereo together make up a format definition of the sampling. Secondly the buffers have to have an associated header structure to store details that WaveIn needs as it fills the buffer. Thirdly, because it takes some time to do this and a buffer changeover must take place in as little time as possible, you must have more than one buffer in a queue and prepare the headers beforehand. Fourthly the digital data must be saved in a wave file which includes a definition of the format in which it was sampled. If the format is a compressed one, it must also include a count of the samples.
Breaking this down into PC audio terminology (with references to the method names which are in the sample program) we must:
1. Fill a WaveFormatEx structure with a definition of the recording format (DefineWaveFormat()).
In the example this is a GSM 6.10 compressed format (I was interested in producing smaller dictation files when I developed it). To see all the codecs on your PC see my article on ACM format listing (coming). Because it is a compressed format we have extra bytes at the end of the standard WaveFormatEx structure (two for GSM), so I have defined a special record to hold the standard WaveFormatEx and these extra bytes, as follows:

PWaveFormatGSM = ^TWaveFormatGSM;
TWaveFormatGSM = packed record
 WFX : TWaveFormatEx;
 AddByte1 : byte;
 AddByte2 : byte;
end;

The standard WaveFormatEx structure is:

PWaveFormatEx = ^TWaveFormatEx;
TWaveFormatEx = packed record
 wFormatTag: Word;       { format tag }
 nChannels: Word;        { number of channels ( mono, stereo, etc.) }
 nSamplesPerSec: DWORD;  { sample rate }
 nAvgBytesPerSec: DWORD; { for buffer estimation }
 nBlockAlign: Word;      { block size of data }
 wBitsPerSample: Word;   { number of bits per sample of mono data }
 cbSize: Word;           { the count of the number of extra bytes }
end;

Note that the type of sampling is defined by the "Format Tag"; the remainder of the elements make up the "Format".
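As a rough sketch of what DefineWaveFormat() might put into the record: the article does not list the field values, so the numbers below are assumptions based on the usual GSM 6.10 settings (8000 samples per second, mono, 65-byte blocks of 320 samples each, with the two extra bytes holding 320 as a little-endian word). The records are redeclared locally only so that the sketch compiles on its own; in the real program TWaveFormatEx comes from the MMSystem unit.

```pascal
program DefineGsmFormat;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Local copies of the article's records so the sketch stands alone. }
  TWaveFormatEx = packed record
    wFormatTag: Word;
    nChannels: Word;
    nSamplesPerSec: Cardinal;
    nAvgBytesPerSec: Cardinal;
    nBlockAlign: Word;
    wBitsPerSample: Word;
    cbSize: Word;
  end;
  TWaveFormatGSM = packed record
    WFX: TWaveFormatEx;
    AddByte1: Byte;
    AddByte2: Byte;
  end;

const
  WAVE_FORMAT_GSM610 = $0031;  { format tag for GSM 6.10 }

{ Fills the record with assumed GSM 6.10 values (not taken from the article). }
procedure DefineWaveFormat(var Fmt: TWaveFormatGSM);
begin
  FillChar(Fmt, SizeOf(Fmt), 0);
  Fmt.WFX.wFormatTag := WAVE_FORMAT_GSM610;
  Fmt.WFX.nChannels := 1;            { mono }
  Fmt.WFX.nSamplesPerSec := 8000;
  Fmt.WFX.nBlockAlign := 65;         { one compressed block is 65 bytes }
  Fmt.WFX.nAvgBytesPerSec := 1625;   { 8000 / 320 blocks per second * 65 bytes }
  Fmt.WFX.wBitsPerSample := 0;       { not meaningful for this format }
  Fmt.WFX.cbSize := 2;               { two extra bytes follow }
  Fmt.AddByte1 := $40;               { low byte of 320 samples per block }
  Fmt.AddByte2 := $01;               { high byte of 320 }
end;

var
  GsmFormat: TWaveFormatGSM;
begin
  DefineWaveFormat(GsmFormat);
  WriteLn('Record size: ', SizeOf(GsmFormat));  { 18 + 2 = 20 bytes }
end.
```

The 20-byte record size is what later appears as the $14 data count of the "fmt " chunk in the wave file.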
2. Open a WaveIn device to get a handle to the device (PrepareToRecord())
I call WaveInOpen, passing a pointer to the format definition and a handle to the form's window (so that messages are returned to it). The flags specify that Windows will find a suitable codec (WAVE_MAPPED) and that the handle is a window handle to send messages to (CALLBACK_WINDOW), not the address of a callback function.
WaveInOpen(@hndWaveIn, DeviceID, PWaveFormatEx(PtrWaveFormatGSM),
integer(Self.Handle), 0, WAVE_MAPPED or CALLBACK_WINDOW)
Note that because I have used my own record format (to get the extra bytes in it) I have to typecast the pointer to a standard PWaveFormatEx. I obtained the number and value of the extra bytes from previous low-level work with the Audio Compression Manager.
3. Allocate memory for at least two buffers (CreateBuffersAndHeaders())
I calculate the buffer size to be approximately 0.3 seconds of recording, and to be an integral multiple of blocks (using nAvgBytesPerSec and nBlockAlign). Otherwise it's straightforward AllocMem stuff.
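The buffer-size arithmetic can be sketched as below. CalcBufferSize is a hypothetical helper name, and rounding down to whole blocks (with a minimum of one block) is an assumption; the article only says the size is about 0.3 seconds and a whole number of blocks.

```pascal
program BufferSizeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Returns roughly 0.3 s of audio, rounded down to a whole number of blocks,
  and never less than one block. }
function CalcBufferSize(AvgBytesPerSec, BlockAlign: Integer): Integer;
var
  Blocks: Integer;
begin
  { 3/10 of a second's worth of bytes, expressed in whole blocks }
  Blocks := (AvgBytesPerSec * 3 div 10) div BlockAlign;
  if Blocks < 1 then
    Blocks := 1;
  Result := Blocks * BlockAlign;
end;

begin
  WriteLn(CalcBufferSize(1625, 65));    { GSM 6.10, 8 kHz mono: prints 455 }
  WriteLn(CalcBufferSize(176400, 4));   { 44.1 kHz 16-bit stereo PCM: prints 52920 }
end.
```

For the GSM case that is seven 65-byte blocks, or 0.28 seconds of sound per buffer.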
4. Allocate memory for the Header records (one for each buffer) and fill their data elements (also in CreateBuffersAndHeaders())
TWaveHdr holds the pointer to the buffer and its size, and also holds the number of bytes recorded when the header is returned by the MM_WIM_DATA message.
5. Call WaveInPrepareHeader for all the headers (also in CreateBuffersAndHeaders())
6. Call WaveInAddBuffer to add the buffers to the queue for the WaveIn device to use (AddBuffersInitially())
This adds all the buffers to the queue to start with. Recording does not start until I call WaveInStart().
7. Prepare a file to receive the recorded data (PrepareFileWrite()).
A wave file is a RIFF (Resource Interchange File Format) file which has a pre-defined header format which I usually create before starting to record (I could copy all the data un-headed to a file and construct the headed wave file later, copying in the data). The wave file is made up of "chunk" elements. Each of these elements in the file has a four-character identifier, followed by a four-byte integer count of the data, followed by the data bytes themselves (the data count does not include the eight bytes of identifier and count).
The whole file is a chunk containing other chunks as the RIFF data:
  "RIFF"                          4 byte identifier
  <nnnn>                          4 byte integer count of data, i.e. the remainder of the file
  <data bytes>
The RIFF data bytes are made up of other chunks as follows:
  "WAVE"                          4 byte list identifier - no data count
  "fmt "                          4 byte chunk identifier
  $14 $00 $00 $00                 4 byte integer (data count = 20)
  <TWaveFormatGSM record>         20 bytes of data
  "fact"                          4 byte identifier, only in a compressed wave file
  $04 $00 $00 $00                 4 byte integer (data count = 4)
  <count of samples>              4 byte integer
  "data"                          4 byte chunk identifier
  <count of "data" data bytes>    4 byte integer
  <data bytes>                    the actual sampled sound
I use MMIO functions to create and write to the output file. As well as having built-in buffers, they also do quite a bit of the "grunt" work for you in creating chunks and inserting identifiers and data counts into the file. The mmioCreateChunk() function creates chunks using TMMCkInfo records, whose elements are filled with the chunk identifier(s) ("RIFF" first in our case). After the chunk has been created, the file write pointer is positioned following a four-byte space for the data count. The data is written and then mmioAscend is called. This writes the data length (worked out from how much we have written with mmioWrite since the chunk was created) into the space left for it (after the chunk identifier) and positions the file write pointer at the byte following the end of the data, ready for another mmioCreateChunk.
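To make the create-then-ascend idea concrete, here is a sketch that builds the same chunk layout by hand in a TMemoryStream. It is not the MMIO code from the sample program: BeginChunk, EndChunk and WriteId are hypothetical helpers that only imitate what mmioCreateChunk and mmioAscend do for us, and the format and sound data are zero-filled stand-ins. It assumes a little-endian machine, as all Windows PCs are.

```pascal
program RiffChunkSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Writes a four-character identifier. }
procedure WriteId(S: TStream; const Id: AnsiString);
begin
  S.WriteBuffer(Id[1], 4);
end;

{ Writes the identifier and a four-byte placeholder for the data count;
  returns the stream position of the placeholder. }
function BeginChunk(S: TStream; const Id: AnsiString): Integer;
var
  Zero: Integer;
begin
  WriteId(S, Id);
  Result := Integer(S.Position);
  Zero := 0;
  S.WriteBuffer(Zero, 4);
end;

{ Goes back and fills in the data count: everything written after the
  placeholder. Then restores the write position to the end of the data. }
procedure EndChunk(S: TStream; SizePos: Integer);
var
  EndPos, Count: Integer;
begin
  EndPos := Integer(S.Position);
  Count := EndPos - SizePos - 4;
  S.Position := SizePos;
  S.WriteBuffer(Count, 4);
  S.Position := EndPos;
end;

var
  S: TMemoryStream;
  RiffPos, ChunkPos, Samples: Integer;
  Fmt: array[0..19] of Byte;    { stands in for the 20-byte TWaveFormatGSM }
  Data: array[0..129] of Byte;  { stands in for two 65-byte GSM blocks }
begin
  FillChar(Fmt, SizeOf(Fmt), 0);
  FillChar(Data, SizeOf(Data), 0);
  Samples := 640;               { two blocks of 320 samples }
  S := TMemoryStream.Create;
  try
    RiffPos := BeginChunk(S, 'RIFF');
    WriteId(S, 'WAVE');         { list identifier, no data count }
    ChunkPos := BeginChunk(S, 'fmt ');
    S.WriteBuffer(Fmt, SizeOf(Fmt));
    EndChunk(S, ChunkPos);
    ChunkPos := BeginChunk(S, 'fact');
    S.WriteBuffer(Samples, 4);
    EndChunk(S, ChunkPos);
    ChunkPos := BeginChunk(S, 'data');
    S.WriteBuffer(Data, SizeOf(Data));  { 130 bytes, already an even length }
    EndChunk(S, ChunkPos);
    EndChunk(S, RiffPos);       { RIFF count = file size less 8 bytes }
    WriteLn('File size: ', S.Size);     { 12 + 28 + 12 + 138 = 190 }
  finally
    S.Free;
  end;
end.
```

A real RIFF writer also has to add a pad byte after any chunk whose data length is odd; the stand-in data here is even, so the sketch leaves that out.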
8. Call WaveInStart to start recording (StartRecording()).
When a buffer becomes full, the WaveIn device sends the MM_WIM_DATA message to the window whose handle was specified in WaveInOpen().
9. Receive MM_WIM_DATA message
The LParam element of the message is a pointer to the wave header which is associated with the buffer and contains a pointer (WaveHdr.lpData) to the buffer. The count of bytes recorded is in the dwBytesRecorded element of the wave header.
10. Message Handler deals with the MM_WIM_DATA message (MMWIMData())
The message handler must copy the buffer contents to the wave file and call WaveInAddBuffer to add it back to the queue of buffers. There is no need to re-prepare the header as we have not changed the buffer size or position.
11. Stop Recording (StopRecording())
When we want to stop the recording, we call WaveInReset. This causes the buffers to be returned immediately, with some having zero for the count of bytes recorded. We must call Application.ProcessMessages immediately after WaveInReset so that the buffers are returned, and recorded bytes written to the wave file before the next action.
12. Finish writing to the wave file and close it (FinaliseFileWrite()).
When all the digitised sound data has been written to the file, call mmioAscend on the "data" chunk header. This will write the count of data bytes to the chunk data count value.
After this, call mmioAscend with the mmCkInfo record for the overall "RIFF" chunk to write the count of the remaining file bytes into the count field following the "RIFF" identifier. Writing the chunk data counts at this late stage is possible because MMIO knows where each chunk started and how much has been written since (the identifiers, the count integers and the data of any chunks inside it), so the total length can be written for that chunk when we call mmioAscend.
I must now write the count of samples to the "fact" chunk. Note that I start again from the beginning of the file, searching for the "fact" chunk. The "fact" chunk and its contents are necessary only for compressed formats; for PCM formats (format tag WAVE_FORMAT_PCM = $1) you do not need it. If you do write unnecessary chunks to a RIFF file it does not matter. RIFF file handlers have to ignore chunks they do not recognise, and you could store all sorts of stuff in a wave file if you wanted to.
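The article does not show how the sample count is worked out, so the following is only one plausible way to do it. It assumes GSM 6.10 blocks of 65 bytes holding 320 samples each, and counts whole blocks only; SampleCount and the two constants are hypothetical names.

```pascal
program FactCountSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  GsmBlockBytes = 65;        { nBlockAlign assumed for GSM 6.10 }
  GsmSamplesPerBlock = 320;  { value assumed for the two extra format bytes }

{ Number of samples represented by DataBytes of block-compressed audio.
  Any trailing partial block is ignored. }
function SampleCount(DataBytes, BlockBytes, SamplesPerBlock: Integer): Integer;
begin
  Result := (DataBytes div BlockBytes) * SamplesPerBlock;
end;

begin
  { 16250 bytes = 250 blocks = 80000 samples, i.e. 10 s at 8000 samples/s }
  WriteLn(SampleCount(16250, GsmBlockBytes, GsmSamplesPerBlock));
end.
```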
13. Unprepare the buffer headers and free all used memory (ClearHeadersAndBuffers())
Straightforward FreeMem() stuff.
14. Indicate progress during recording (PositionTmrTimer())
During the recording we want to know how much we have recorded and that it is still recording. So I use a 500 ms timer which calls WaveInGetPosition, and calculate the elapsed time from the byte count returned by WaveInGetPosition and the average bytes per second. This timer also "flashes" a small shape as a "recording light".
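The elapsed-time arithmetic is a single division. In this sketch ElapsedSeconds is a hypothetical helper, and the byte position is assumed to have already been read from the position record that WaveInGetPosition fills in.

```pascal
program ElapsedTimeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Seconds of sound represented by BytePos bytes at the given average rate. }
function ElapsedSeconds(BytePos, AvgBytesPerSec: Integer): Double;
begin
  if AvgBytesPerSec > 0 then
    Result := BytePos / AvgBytesPerSec
  else
    Result := 0;   { guard against an unfilled format record }
end;

begin
  { GSM 6.10 at 1625 bytes per second: 16250 bytes is 10 seconds }
  WriteLn(ElapsedSeconds(16250, 1625):0:1);   { prints 10.0 }
end.
```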
15. Check for errors when calling MMIO functions (CheckOK())
I wrap any call to an MMIO function in a procedure call, to raise an exception and display a message if anything goes wrong. The returned result from the MMIO function becomes the parameter of the check function. This may be an unfamiliar technique to some, but it provides an easy way to "cover your a....". All the error strings are constructed here, and the description of "where the error occurred" is passed in the second parameter of the check function.
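The shape of such a check procedure might look like the sketch below. The real CheckOK() is in the sample program and is not reproduced here; EWaveError, FakeCall and the message text are hypothetical, and the sketch assumes that a zero result means success, which is not true of every MMIO function (some return handles or byte counts instead).

```pascal
program CheckOkSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  EWaveError = class(Exception);   { hypothetical exception class }

{ Raises an exception unless Res is zero; Where says which call failed. }
procedure CheckOK(Res: Cardinal; const Where: string);
begin
  if Res <> 0 then
    raise EWaveError.CreateFmt('Error %d in %s', [Res, Where]);
end;

{ Stand-in for a multimedia call; 11 is an arbitrary failure code. }
function FakeCall(Fail: Boolean): Cardinal;
begin
  if Fail then
    Result := 11
  else
    Result := 0;
end;

begin
  try
    CheckOK(FakeCall(False), 'first call');    { passes silently }
    CheckOK(FakeCall(True), 'second call');    { raises EWaveError }
    WriteLn('not reached');
  except
    on E: Exception do
      WriteLn(E.Message);                      { Error 11 in second call }
  end;
end.
```

Because the function result is passed straight in as the first parameter, every call site stays a single line and no error can be silently ignored.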
The sample program was written in Delphi 3, and I have included the form as both a .dfm file and as a .txt file, in case anyone is using a later version of Delphi.