Wednesday, July 2, 2008

RFC2833 DTMF packets

The observed behavior is that the Asterisk unit sends six RTP Event packets for each digit press. The first three packets are typical "start of event" packets. The first of these three has the RTP marker bit set to True (per RFC 2833). The other two have this marker set to False, which is typical. All three of these "start" packets have the event duration set to zero, which is also typical. In effect, these three packets serve as a preamble telling the terminating gateway that a DTMF event is coming. See frames 3633-3635 of the attached capture for an example.

The last three packets of the six are identical "end of event" packets. Each has the end-of-event bit set, which is a different flag from the marker bit used in the first start packet. Each of these "end" packets also provides the total duration of the tone event. In the observed Asterisk behavior, the total event duration is always 800 clock ticks. Clock ticks are in reference to an 8 kHz clock, so 8 clock ticks represent 1 millisecond and 800 clock ticks indicate a 100 ms tone duration. See frames 3636-3638 for examples.
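The marker bit and the end-of-event bit sit in different parts of the packet: the marker is in the RTP header, while the end bit, volume, and duration are in the 4-byte telephone-event payload. The following is a minimal sketch of pulling those fields out and converting the duration to milliseconds. The names `DtmfEvent` and `parse_rtp_event` are made up for illustration, payload type 101 in the example is an assumption (the real value is negotiated per call), and the parser assumes the RTP header carries no extension.

```python
import struct
from dataclasses import dataclass

CLOCK_HZ = 8000  # 8 kHz event clock, as described in the text


@dataclass
class DtmfEvent:
    seq: int        # RTP sequence number
    marker: bool    # RTP header marker bit (set on the first start packet)
    event: int      # event code: 0-9 are the digits, 10 is "*", 11 is "#"
    end: bool       # end-of-event (E) bit in the payload
    volume: int     # 6-bit volume field
    duration: int   # event duration in clock ticks

    @property
    def duration_ms(self) -> float:
        return self.duration * 1000 / CLOCK_HZ


def parse_rtp_event(packet: bytes) -> DtmfEvent:
    """Parse an RTP packet carrying an RFC 2833 telephone-event payload.

    Assumption: no RTP header extension is present (X bit clear).
    """
    b0, b1, seq = struct.unpack("!BBH", packet[:4])
    csrc_count = b0 & 0x0F
    payload = packet[12 + 4 * csrc_count:]
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return DtmfEvent(
        seq=seq,
        marker=bool(b1 & 0x80),
        event=event,
        end=bool(flags & 0x80),
        volume=flags & 0x3F,
        duration=duration,
    )


# A hand-built packet resembling frame 3636: digit "1", end bit set,
# volume 10, duration 800 ticks. Timestamp and SSRC are zeroed placeholders.
header = struct.pack("!BBHII", 0x80, 101, 56433, 0, 0)
payload = struct.pack("!BBH", 1, 0x80 | 10, 800)
ev = parse_rtp_event(header + payload)
print(ev.marker, ev.end, ev.volume, ev.duration_ms)  # False True 10 100.0
```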

The above behavior seems to indicate that when Asterisk detects a digit press it calls a routine that always generates the same six packets (three start, three end, volume always 10, and tone duration always 100 ms). Other originating devices (e.g. Cisco) actually sample the digit press for volume and duration and send event "update" packets (my terminology) between the start and end packets. These update packets increase the event duration throughout the length of the digit press. The updates are usually sent at an interval equal to the ptime of the codec (e.g. 20 ms for G.711). However, there's nothing inherently wrong with the 6-packet method used by Asterisk, and I believe it would be considered to conform to RFC 2833. There is, however, another problem with the way Asterisk delivers these six packets, and it is at the heart of the double-digit problem.
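To make the "update" packet idea concrete, here is a rough sketch of the duration values such a sender might put in successive packets for one key press. The function name `update_durations` is hypothetical, and the exact values a given vendor emits (for example, whether the first packet carries zero or one full interval) are an assumption, not something taken from the capture.

```python
from typing import List


def update_durations(press_ms: int, ptime_ms: int = 20,
                     clock_hz: int = 8000) -> List[int]:
    """Duration fields (in clock ticks) for successive event packets.

    One packet is produced per ptime interval; the last value is the total
    duration, which would be carried by the end-of-event packet.
    """
    ticks_per_packet = ptime_ms * clock_hz // 1000
    total_ticks = press_ms * clock_hz // 1000
    durations = list(range(ticks_per_packet, total_ticks, ticks_per_packet))
    durations.append(total_ticks)
    return durations


print(update_durations(100))  # [160, 320, 480, 640, 800]
```

With a 20 ms ptime on an 8 kHz clock, each update adds 160 ticks, so a 100 ms press ends on the same 800-tick total that Asterisk reports in its fixed end packets.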

Let's look at two digit press examples from the capture: one that was interpreted correctly (digit press of a "1") and another that experienced a triple bounce (digit press of a "2").

Frames 3633 through 3638 represent one digit press event of the number "1". This digit played out correctly. Note the sequence numbers of these six frames show they arrived in sequence.

Frame 3633 - seq 56430 - start marker true - end of event false - event duration 0
Frame 3634 - seq 56431 - start marker false - end of event false - event duration 0
Frame 3635 - seq 56432 - start marker false - end of event false - event duration 0
Frame 3636 - seq 56433 - start marker false - end of event true - event duration 800
Frame 3637 - seq 56433 - start marker false - end of event true - event duration 800
Frame 3638 - seq 56433 - start marker false - end of event true - event duration 800

Now look at frames 3741 through 3746 for a single press of the number "2". Note how the frames arrive out of sequence.

Frame 3741 - seq 56485 - start marker true - end of event false - event duration 0
Frame 3742 - seq 56488 - start marker false - end of event true - event duration 800
Frame 3743 - seq 56488 - start marker false - end of event true - event duration 800
Frame 3744 - seq 56486 - start marker false - end of event false - event duration 0
Frame 3745 - seq 56488 - start marker false - end of event true - event duration 800
Frame 3746 - seq 56487 - start marker false - end of event false - event duration 0

The above plays out as a triple press. The first three frames are played out as one tone. The fourth and fifth frames play out a second tone. And the sixth frame starts a third tone. Packets arriving out of sequence like this are the cause of the double and triple digits heard by IVRs on the terminating side.

Examination of the time delta between each of the above frames indicates that Asterisk is sending the 6 packets that represent an RTP event as a single burst. The time differential between the packets is extremely small. It is often under 10 microseconds (that's micro, not milli). Given the nature of latency on the public Internet, it's easy to see why packets sent so close together may arrive out of sequence at the other end.
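The single and triple press outcomes described above can be reproduced with a small simulation. This is only a sketch of a naive receiver that acts on packets in arrival order: any non-end packet received while idle starts a tone, and any end packet stops it. That receiver model and the names `count_tones` and `in_sequence` are assumptions for illustration, not the logic of any particular gateway. The packet lists are the (sequence number, end-of-event) pairs from the two frame listings.

```python
from typing import List, Tuple

Packet = Tuple[int, bool]  # (sequence number, end-of-event flag)

# Arrival order as seen in the capture.
DIGIT_1: List[Packet] = [(56430, False), (56431, False), (56432, False),
                         (56433, True), (56433, True), (56433, True)]
DIGIT_2: List[Packet] = [(56485, False), (56488, True), (56488, True),
                         (56486, False), (56488, True), (56487, False)]


def count_tones(packets: List[Packet]) -> int:
    """Count how many tones a naive, arrival-order receiver would start."""
    tones = 0
    playing = False
    for _seq, end in packets:
        if end:
            playing = False   # an end packet (or a duplicate) stops the tone
        elif not playing:
            playing = True    # a non-end packet while idle starts a new tone
            tones += 1
    return tones


def in_sequence(packets: List[Packet]) -> List[Packet]:
    """Reorder by sequence number (ignores 16-bit wraparound for brevity)."""
    return sorted(packets, key=lambda p: p[0])


print(count_tones(DIGIT_1))               # 1
print(count_tones(DIGIT_2))               # 3
print(count_tones(in_sequence(DIGIT_2)))  # 1
```

Putting the "2" packets back in sequence order yields a single tone, which supports the reading that reordering in transit, not the six-packet scheme itself, produces the extra digits.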

I suggest that you find the Asterisk routine responsible for generating the 6-packet RTP event and insert some "sleep" time between each of the six packets to space them out better. Or perhaps insert just one sleep cycle between the three "start" packets and the three "end" packets. I think the latter is the better suggestion. I believe a 20 ms sleep cycle (the typical ptime for G.711 and G.729) between the three start packets and the three end packets should be enough to compensate for IP jitter in transport. However, you can experiment with different values to see what makes DTMF transport most reliable.
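The second suggestion, a single pause between the start and end groups, might look roughly like the sketch below. It is not Asterisk code: `send_dtmf_event` and its `send` and `sleep` parameters are hypothetical stand-ins, and each packet is reduced to a (marker, end-of-event, duration) tuple rather than real RTP bytes.

```python
import time
from typing import Callable, Tuple

Packet = Tuple[bool, bool, int]  # (marker, end_of_event, duration_ticks)


def send_dtmf_event(send: Callable[[Packet], None],
                    gap_s: float = 0.020,
                    duration_ticks: int = 800,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Send three start packets, pause once, then send three end packets."""
    for i in range(3):
        # Only the first start packet carries the marker bit.
        send((i == 0, False, 0))
    sleep(gap_s)  # single gap between the start group and the end group
    for _ in range(3):
        send((False, True, duration_ticks))


# Example: collect the packets in a list instead of writing to a socket.
sent = []
send_dtmf_event(sent.append)
print(len(sent))  # 6
```

Passing `sleep` as a parameter makes it easy to try other gap values, or to swap in a scheduler-friendly delay, without touching the packet logic.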
