In the first case, the data stream has only one speech region:
Figure 1: A data stream with only one speech region.
After filtering, the non-speech regions are removed, and the data stream becomes:
Figure 2: A data stream with only one speech region after filtering.
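
The single-region case can be illustrated with a minimal sketch. It assumes the stream is already split into frames and that some detector has produced one speech/non-speech flag per frame. The function name `remove_non_speech`, the frame labels, and the flags are hypothetical and chosen only for illustration; they are not taken from the filter described here.

```python
def remove_non_speech(frames, is_speech):
    """Keep only the frames flagged as speech, preserving their order."""
    if len(frames) != len(is_speech):
        raise ValueError("frames and is_speech must have the same length")
    return [frame for frame, flag in zip(frames, is_speech) if flag]


# Hypothetical stream: non-speech, one speech region, non-speech.
frames = ["n0", "n1", "s0", "s1", "s2", "n2"]
is_speech = [False, False, True, True, True, False]

filtered = remove_non_speech(frames, is_speech)
print(filtered)
```

Because there is only one speech region, the filtered result is simply that region with the leading and trailing non-speech frames dropped.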