Parsing AAC audio stream in RTSP

Posted May 27, 2020 · 1 min read

Generally speaking, one frame of audio data is very small. When it is sent over RTSP, a whole frame fits in a single RTP packet, so it does not need to be fragmented across packets with FU-A or similar schemes. An RTP packet carrying one AAC frame should therefore look like this:

12 bytes   | 2 bytes          | 2 bytes   | remaining bytes
RTP Header | AU Header Length | AU Header | AAC data
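
The layout above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the RTP header is exactly 12 bytes (no CSRC entries, no header extension), that the packet carries a single AU, and that the 2-byte AU Header uses the common AAC-hbr split of a 13-bit size followed by a 3-bit index. The function name `extract_aac_frame` is hypothetical.

```python
import struct

RTP_HEADER_LEN = 12  # assumption: fixed header, no CSRC list or extension
AU_SECTION_LEN = 4   # 2-byte AU Header Length + one 2-byte AU Header


def extract_aac_frame(packet: bytes) -> bytes:
    """Return the AAC data carried by an RTP packet holding one AU."""
    if len(packet) < RTP_HEADER_LEN + AU_SECTION_LEN:
        raise ValueError("packet too short for RTP header and AU header")

    # Both fields are big-endian 16-bit integers right after the RTP header.
    au_headers_length_bits, au_header = struct.unpack_from(
        "!HH", packet, RTP_HEADER_LEN
    )

    # AU Header Length is expressed in bits; one 2-byte AU Header is 16 bits.
    if au_headers_length_bits != 16:
        raise ValueError("expected exactly one 16-bit AU header")

    # Assumption (AAC-hbr): upper 13 bits are the AU size in bytes,
    # lower 3 bits are the AU index.
    au_size = au_header >> 3

    start = RTP_HEADER_LEN + AU_SECTION_LEN
    frame = packet[start:start + au_size]
    if len(frame) != au_size:
        raise ValueError("AU size does not match the remaining payload")
    return frame
```

If the stream's SDP declares different `sizelength` or `indexlength` values, the bit split in the sketch would have to change accordingly.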

If the RTSP server sends AAC data with ADTS headers and you only want the raw AAC audio content, skip the first 7 bytes of the AAC data, because the ADTS header takes up 7 bytes (9 bytes if the optional CRC is present).
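
As a rough illustration of that offset, the sketch below drops the ADTS header from a frame when one is detected. The function name `strip_adts_header` is hypothetical, and the detection is only a heuristic: it looks for the 12-bit ADTS syncword (0xFFF) with the layer bits set to zero. The 9-byte case for headers carrying a CRC goes beyond what the text above describes and is included as an assumption about streams with `protection_absent` set to 0.

```python
ADTS_HEADER_LEN = 7  # ADTS header without CRC
ADTS_CRC_LEN = 2     # extra bytes when protection_absent == 0


def strip_adts_header(frame: bytes) -> bytes:
    """Return the raw AAC payload, dropping a leading ADTS header if present."""
    if len(frame) < ADTS_HEADER_LEN:
        return frame

    # Syncword is 12 one-bits; the two layer bits must be zero.
    has_sync = frame[0] == 0xFF and (frame[1] & 0xF6) == 0xF0
    if not has_sync:
        return frame  # treated as raw AAC without ADTS

    # Lowest bit of the second byte is protection_absent (1 means no CRC).
    protection_absent = frame[1] & 0x01
    header_len = ADTS_HEADER_LEN
    if not protection_absent:
        header_len += ADTS_CRC_LEN
    return frame[header_len:]
```

Frames that do not start with the syncword are returned unchanged, so the same call can be applied whether or not the server includes ADTS.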

