
How to Build a .NET Softphone in C# with SIP, SDP, RTP and RTCP

18 Mar 2011
Essay and tutorial about how to easily build a softphone application

Editorial Note

This article appears in the Third Party Products and Tools section. Articles in this section are for the members only and must not be used to promote or advertise products in any way, shape or form. Please report any spam or advertising.

Introduction

Voice over IP (VoIP) technology is spreading quickly, and VoIP solutions appear in more and more fields. One possible use of them is building a simple Internet telephony program. For these reasons, I decided to build my own VoIP telephone application, based on the following knowledge and requirements:

The code should target the latest stable .NET technologies, be written in C#, and be easy to obfuscate [3].

The two main signaling protocols for VoIP calls are SIP [1] and H.323 [2]. Both can set up audio-visual communication between participants with the help of other protocols.

I decided to use SIP because it is easier to implement and its communication flows are easier to understand. Moreover, SIP does not inherit anything from the PSTN network. Detailed comparisons of the two protocols are available elsewhere.

After this design decision, the protocol set was clear: SIP, SDP, RTP and RTCP. SIP (Session Initiation Protocol) is used for creating sessions between the parties. SDP (Session Description Protocol) is used for describing multimedia communication sessions. RTP (Real-time Transport Protocol) defines the delivery of media data, and RTCP (RTP Control Protocol) carries control and quality feedback for the RTP streams. A call is built up by using SIP first, then SDP, then RTP.
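To give a feel for what the RTP layer has to handle, here is a minimal sketch, written for this article and not taken from any SDK, of decoding the 12-byte fixed RTP header described in RFC 3550 [6]. The RtpHeader class and its members are hypothetical names; header extensions, CSRC lists and padding are deliberately ignored.

```csharp
using System;

// Hypothetical helper (not part of any SDK): decodes the 12-byte fixed
// RTP header laid out in RFC 3550, section 5.1.
public sealed class RtpHeader
{
    public int Version { get; private set; }
    public bool Marker { get; private set; }
    public int PayloadType { get; private set; }
    public ushort SequenceNumber { get; private set; }
    public uint Timestamp { get; private set; }
    public uint Ssrc { get; private set; }

    public static RtpHeader Parse(byte[] packet)
    {
        if (packet == null)
            throw new ArgumentNullException("packet");
        if (packet.Length < 12)
            throw new ArgumentException("An RTP packet carries at least 12 header bytes.", "packet");

        return new RtpHeader
        {
            Version = packet[0] >> 6,                 // top two bits of byte 0
            Marker = (packet[1] & 0x80) != 0,         // top bit of byte 1
            PayloadType = packet[1] & 0x7F,           // low seven bits of byte 1
            SequenceNumber = (ushort)((packet[2] << 8) | packet[3]),
            Timestamp = ReadUInt32(packet, 4),
            Ssrc = ReadUInt32(packet, 8)
        };
    }

    // RTP fields are transmitted in network (big-endian) byte order.
    private static uint ReadUInt32(byte[] buffer, int offset)
    {
        return ((uint)buffer[offset] << 24)
             | ((uint)buffer[offset + 1] << 16)
             | ((uint)buffer[offset + 2] << 8)
             | (uint)buffer[offset + 3];
    }
}
```

Calling RtpHeader.Parse on a received UDP payload would yield the sequence number and timestamp a receiver needs for reordering and playout; everything after the header is the encoded media.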

Experiments

In the first phase of my experiment, I decided to write my own softphone. I started with a minimal implementation of the SIP protocol, then developed a minimal representation of SIP messages (in other words, the SIP headers found in an average SIP message, such as Via, Contact, From, To and Call-ID). After that, I successfully established a call at the INVITE level.

Up to this point I could progress easily and quickly, but then I had to face two problems. First, INVITE messages that arrived in certain situations had no effect. Second, the SDP implementation needed for negotiating the media was still missing, and so was the RTP implementation responsible for media transport. As a consequence, I envisioned the following software architecture:
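As an illustration of what such a minimal message representation involves, the following sketch renders a bare INVITE request with the headers mentioned above plus the other mandatory ones from RFC 3261 [4]. It is not my original code; SipInviteSketch and all of its parameters are hypothetical names, and the message carries no SDP body.

```csharp
using System;
using System.Text;

// Hypothetical sketch: renders a bare-bones SIP INVITE (RFC 3261) without a body.
public static class SipInviteSketch
{
    public static string Build(string fromUser, string toUser, string domain,
                               string localAddress, string callId,
                               string branchId, string fromTag)
    {
        var sb = new StringBuilder();
        // Request line: method, request URI, protocol version.
        sb.Append("INVITE sip:" + toUser + "@" + domain + " SIP/2.0\r\n");
        // RFC 3261 requires branch values to start with the magic cookie "z9hG4bK".
        sb.Append("Via: SIP/2.0/UDP " + localAddress + ";branch=z9hG4bK" + branchId + "\r\n");
        sb.Append("Max-Forwards: 70\r\n");
        sb.Append("From: <sip:" + fromUser + "@" + domain + ">;tag=" + fromTag + "\r\n");
        sb.Append("To: <sip:" + toUser + "@" + domain + ">\r\n");
        sb.Append("Call-ID: " + callId + "\r\n");
        sb.Append("CSeq: 1 INVITE\r\n");
        sb.Append("Contact: <sip:" + fromUser + "@" + localAddress + ">\r\n");
        sb.Append("Content-Length: 0\r\n");
        // An empty line terminates the header section.
        sb.Append("\r\n");
        return sb.ToString();
    }
}
```

A call such as SipInviteSketch.Build("alice", "bob", "example.com", "192.0.2.10:5060", "abc123", "776asdhds", "1928301774") produces text that could be sent over UDP to a SIP server. A real user agent would still need response parsing, retransmission timers and dialog state, which is where the effort quickly grows.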

+-----------------------------+
|        UserAgent [4]        |
+--------------+--------------+
|SIP/SDP [4][5]|    RTP [6]   |
+--------------+--------------+
|        Network layer        |
+-----------------------------+ 

Since only two fifths of this software architecture was ready, I started to search for components on the Internet. Good SDP implementations are available, but the RTP implementations I found bundle their own network communication, which would make it hard to use them in a uniform way. I was working on the first problem when I found a SIP guide called "A Hitchhiker's Guide to the Session Initiation Protocol" [8]. That was the point when I decided to stop writing my own components for the application.

In the second phase of my experiment, I looked for third-party components that offer a complete solution for handling the SIP protocol. Most of the SDKs available on the Internet do not meet the above requirements: they are difficult to use, they require too much technical knowledge, or they are wrapped COM objects.

As the result of my search, I found the solution offered by Ozeki. The Ozeki VoIP SIP SDK provides an easy-to-use interface and, furthermore, supports testing with a mock softphone object. In the mock-up stage of the software development life cycle, this makes testing the relevant components and models much easier. The random events it generates resemble reality and create realistic situations without a real phone call being made.

In the third phase of my experiment, I got to know the selected component and started to use it. I would now like to summarize the results and experiences. Since the aim of this article is not to present Windows sound management, sound handling is shown in a deliberately simplified way. Based on the sample code on the Ozeki VoIP SIP SDK website [9][10], you can arrive at clear, simple, easy-to-use code built on a component that handles VoIP calls programmatically.

Ozeki VoIP SIP SDK

To keep things simple, I am going to show a program that leaves out both the GUI and the details of audio device handling. The GUI is avoided by demonstrating the SDK in a console application, and sound handling is avoided by immediately sending the received audio data back. Thus, the code focuses on event handling and on introducing the objects that have to be created.

To do so, we need to be familiar with the available tools. At the center of the abstraction is IPhoneCall. You can find more information about IPhoneCall in the documentation on the Ozeki website. A listener can be attached to a phone call object, in a way similar to the Observer pattern. However, attaching and detaching must be done by the programmer, using the AttachListener and DetachListener extension methods. Additionally, all event types are handled uniformly through VoIPEventArgs.

public interface IPhoneCallListener 
{   
    void CallErrorOccured(object sender, VoIPEventArgs<CallError> e);  
    void CallStateChanged(object sender, VoIPEventArgs<CallState> e);
    void DtmfReceived(object sender, VoIPEventArgs<DTMF> e); 
    void MediaDataReceived(object sender, VoIPEventArgs<VoIPMediaData> e);
    void PlainMediaDataReceived(object sender, VoIPEventArgs<EncodedMediaData> e);
}
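How attach and detach extension methods of this kind might work can be sketched with stand-in types. The code below does not show the SDK's implementation, which I have not seen; IDemoCall, IDemoCallListener, DemoCall, CountingListener and the Attach/Detach methods are all hypothetical names invented for the illustration.

```csharp
using System;

// Stand-in event payload for the sketch.
public sealed class StateEventArgs : EventArgs
{
    public string State { get; private set; }
    public StateEventArgs(string state) { State = state; }
}

// Stand-in for a call object that exposes an ordinary .NET event.
public interface IDemoCall
{
    event EventHandler<StateEventArgs> StateChanged;
}

// Stand-in for a listener that groups all handlers in one object.
public interface IDemoCallListener
{
    void OnStateChanged(object sender, StateEventArgs e);
}

public static class DemoCallExtensions
{
    // Attaching subscribes the listener's method to the call's event.
    public static void Attach(this IDemoCall call, IDemoCallListener listener)
    {
        call.StateChanged += listener.OnStateChanged;
    }

    // Detaching removes the same delegate (same target object and method).
    public static void Detach(this IDemoCall call, IDemoCallListener listener)
    {
        call.StateChanged -= listener.OnStateChanged;
    }
}

public sealed class DemoCall : IDemoCall
{
    public event EventHandler<StateEventArgs> StateChanged;

    public void Raise(string state)
    {
        var handler = StateChanged;   // copy to avoid a race with unsubscribers
        if (handler != null)
            handler(this, new StateEventArgs(state));
    }
}

public sealed class CountingListener : IDemoCallListener
{
    public int Count { get; private set; }
    public void OnStateChanged(object sender, StateEventArgs e) { Count++; }
}
```

The benefit of grouping handlers in one listener object is that a whole set of subscriptions can be added or removed with a single call, instead of one += or -= per event.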

During the lifetime of a call object, various events can occur; they are listed in IPhoneCallListener. The method names speak for themselves, so they are not discussed individually here. The example below gives some guidance.

class PongCallListener : IPhoneCallListener
{
    // Echo every received DTMF signal back to the other party.
    public void DtmfReceived(object sender, VoIPEventArgs<DTMF> e)
    {
        var dtmf = e.Item;
        var call = (PhoneCall)sender;
        Console.WriteLine("Dtmf received");
        call.SendDTMFSignal(VoIPMediaType.Audio, dtmf);
    }

In this example, we create a simple IPhoneCallListener implementation. It sends some of the received information back to the other party as soon as it arrives. Echoing the DTMF signal is one example of this.

public void CallErrorOccured(object sender, VoIPEventArgs<CallError> e)
{
    // Print the reported call error.
    Console.WriteLine("Call error occurred: " + e.Item);
}

If an error occurs during call setup, the cause of the error is written to the screen.

public void MediaDataReceived(object sender, VoIPEventArgs<VoIPMediaData> e)  
{                                                                             
    var call = (PhoneCall)sender;                                             
    call.SendMediaData(e.Item.MediaType, e.Item.PCMData);                     
}  

Media data has arrived in raw PCM format. The MediaType field indicates that the SDK can handle not just audio but other media types as well. Here, we simply send the received data back to the sender, which should come as a big surprise to them.

public void CallStateChanged(object sender, VoIPEventArgs<CallState> e)
{
    var call = (PhoneCall)sender;
    Console.WriteLine("Call state changed: " + e.Item);

    // States past InCall mean the call has ended, so stop listening.
    if (e.Item > CallState.InCall)
        call.DetachListener(this);
}

The state of the call may change. If the new state is past InCall, the call has ended somehow: either we hung up or the other party did. In that case, it is worth removing the listener from the PhoneCall object with the DetachListener method mentioned above.

    public void PlainMediaDataReceived(object sender, VoIPEventArgs<EncodedMediaData> e)
    {    
    }    
}  

This method receives the data in its still-encoded form, which happens when the IPhoneCall.PlainMediaData property is in use while the application is running. In this example, we do not need to do anything here.

Accordingly, what we mostly deal with is the object implementing the IPhoneCall interface and the IPhoneCallListener objects attached to it. This gives us plenty of creative freedom.

class Program
{ 

We also need to create telephone calls. In order to do this, we need a program.

static Dictionary<string,IPhoneCall> Calls;                     
static PongCallListener FunnyCallListener; 

The program contains the active calls in a Dictionary, and we only use one PongCallListener.

static void Main(string[] args)                                 
{                                                               
    ISoftPhone SoftPhone = new SoftPhone("", 5000, 8000, 5060); 
    //ISoftPhone SoftPhone = new VoIP.SDK.Mock.ArbSoftPhone();  
    SoftPhone.IncommingCall += (SoftPhone_IncommingCall);
    IPhoneLine PhoneLine = null;                                
    FunnyCallListener = new PongCallListener();                 
    Calls = new Dictionary<string, IPhoneCall>();

We instantiate a SoftPhone object that handles the calls. If we get tired of testing our application with real calls, we can use the ArbSoftPhone mock instead (commented out above), which creates random situations.
Once our SoftPhone is ready, we subscribe to the incoming call event, which is raised when there is an incoming call, just as its name suggests. Its event argument carries the call object.

We also need a telephone line, represented by the IPhoneLine interface. I would like to add here that the SDK can handle multiple parallel lines, with multiple parallel calls on each. Then we instantiate the listener object that we are going to attach to every call while the program runs.
The Calls dictionary maps dial strings to call objects.

Console.WriteLine("Be funny!");                             
Console.Write("Display name: "); string displayName = Console.ReadLine(); 
Console.Write("Username: "); string username = Console.ReadLine(); 
Console.Write("Register name: "); string registerName = Console.ReadLine(); 
Console.Write("Register password: "); string registerPassword = Console.ReadLine();
Console.Write("Domain server: "); string domainServer = Console.ReadLine();

We read the registration information from the user.

string[] domains = domainServer.Split(':');    
int port = 5060;                               
if (domains.Length == 2)                       
   port = Int32.Parse(domains[1]);            
SIPAccount account = new SIPAccount(true, displayName, username, registerName,
	registerPassword, domains[0], port); 
PhoneLine = SoftPhone.CreateAndRegisterPhoneLine(account); 

We check whether a port was given with the domain; if not, we use the default port 5060 for registration. We then create a SIPAccount object and use it to request an IPhoneLine object from the SoftPhone, which starts the registration procedure on that line.
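Note that Int32.Parse throws on input such as "host:abc". A more defensive variant of the host and port handling could look like the sketch below. SipServerAddress and TryParse are hypothetical names introduced for this illustration, and IPv6 literals (which contain colons themselves) are deliberately not handled.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: splits "host" or "host:port" without throwing on bad input.
public static class SipServerAddress
{
    public const int DefaultSipPort = 5060;

    public static bool TryParse(string input, out string host, out int port)
    {
        host = null;
        port = DefaultSipPort;

        if (string.IsNullOrWhiteSpace(input))
            return false;

        string[] parts = input.Trim().Split(':');
        if (parts.Length > 2 || parts[0].Length == 0)
            return false;

        if (parts.Length == 2)
        {
            // NumberStyles.None accepts digits only: no sign, no whitespace.
            bool parsed = int.TryParse(parts[1], NumberStyles.None,
                                       CultureInfo.InvariantCulture, out port);
            if (!parsed || port < 1 || port > 65535)
            {
                port = DefaultSipPort;
                return false;
            }
        }

        host = parts[0];
        return true;
    }
}
```

With this helper, the console program could ask for the domain again when TryParse returns false, instead of terminating with an unhandled FormatException.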

while (true)                                       
{  

Then, comes the fun part...

    string statement = Console.ReadLine().Trim();
    if (statement.StartsWith("exit"))
        break;

...until we are bored of it. We read a string from the keyboard. If it starts with "exit", we quit; if it is something else...

    if (!Calls.ContainsKey(statement))             
    {                                              
        if (PhoneLine.RegisteredInfo == PhoneLineInformation.RegistrationSucceded)
        {                                                         
            IPhoneCall Call = SoftPhone.CreateCallObject(
                PhoneLine, statement, FunnyCallListener
            );
            Calls.Add(statement, Call);
            Call.CallStateChanged += (Call_CallStateChanged);
            Call.Start();
        }                                                         
    }                                                             
}   

... we check whether a call object is already associated with the entered string. If so, nothing happens. If not, we check whether our telephone line has registered successfully. If it has, we ask the SoftPhone to create a call object on our single telephone line. The call is placed to the number that was typed in, which the call object exposes through its DialInfo property.
We store the call object in Calls under the typed number and subscribe to its state changes, so that when the other side hangs up we are notified and can remove the call from Calls.
Then we start the call. We keep repeating this until we are bored of it.

// Iterate over a copy: hanging up may trigger Call_CallStateChanged,
// which removes entries from Calls.
foreach (IPhoneCall call in new List<IPhoneCall>(Calls.Values))
    call.HangUp();
SoftPhone.Close();
}

On exit, we hang up every active call and close the softphone.

After that, we also need event handlers. They handle incoming calls and are also responsible for removing a call object from the dictionary when that call ends.

static void SoftPhone_IncommingCall(object sender, VoIPEventArgs<IPhoneCall> e)  
{                                                                 
    e.Item.AttachListener(FunnyCallListener);                     
    Calls.Add(e.Item.DialInfo, e.Item);                           
    e.Item.CallStateChanged += (Call_CallStateChanged);
    e.Item.Accept();                                              
} 

For an incoming call, we immediately attach our listener and add the call to Calls. Then we subscribe the state change handler as well, and finally we accept the incoming call automatically.
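Two details of the plain Dictionary are worth keeping in mind: Add throws an ArgumentException if the same dial string is already present, and Dictionary is not safe for concurrent writers. If the SDK raises its events on a thread other than the console loop (an assumption I have not verified), a small wrapper around ConcurrentDictionary would be a safer choice. CallRegistry and its members are hypothetical names for this sketch.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical wrapper: a thread-safe map from dial string to call object.
public sealed class CallRegistry<TCall>
{
    private readonly ConcurrentDictionary<string, TCall> calls =
        new ConcurrentDictionary<string, TCall>();

    public int Count
    {
        get { return calls.Count; }
    }

    // Returns false instead of throwing when the key is already registered.
    public bool TryRegister(string dialInfo, TCall call)
    {
        return calls.TryAdd(dialInfo, call);
    }

    // Returns false when the key was not present (for example, removed twice).
    public bool TryUnregister(string dialInfo)
    {
        TCall removed;
        return calls.TryRemove(dialInfo, out removed);
    }

    // Copies the current calls so that callers can iterate while entries change.
    public IList<TCall> Snapshot()
    {
        return new List<TCall>(calls.Values);
    }
}
```

In the program above, Calls could then become a CallRegistry<IPhoneCall>, with the incoming call handler rejecting or ignoring a call when TryRegister returns false.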

    static void Call_CallStateChanged(object sender, VoIPEventArgs<CallState> e)
    {                                                        
        if (e.Item > CallState.InCall)                       
        {                                                    
            IPhoneCall call = sender as IPhoneCall;          
            if (call == null)                                
                return;                                      
                                                                 
            Calls.Remove(call.DialInfo);                     
                                                                   
            call.DetachListener(FunnyCallListener);          
            call.CallStateChanged -= (Call_CallStateChanged);
        }                                                    
    }	                                                     
}  

If the received CallState is greater than InCall, the call has ended, and this is the event we are interested in. We remove the call object from Calls by its dial info, and then we detach the listener from it, as well as the Call_CallStateChanged event handler.
CallState is an enum whose values are ordered along the lifecycle of a call. For example, the values grow from Setup through InCall to Completed, which is why comparison operators can be used on it: everything that signals the end of the call is larger than InCall.

The example shows how simply and quickly a fairly complex application that handles phone calls can be developed. Extending it depends only on the IPhoneCallListener implementation.
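The ordering trick can be shown with a stand-in enum. The members and numeric values below are invented for the illustration and are not the SDK's actual CallState definition; only the idea that "ended" states sort after InCall is taken from the text above.

```csharp
// Hypothetical enum: values are ordered along the lifecycle of a call.
public enum DemoCallState
{
    Setup = 0,
    Ringing = 1,
    InCall = 2,
    // Everything below marks an ended call.
    Completed = 3,
    Busy = 4,
    Error = 5
}

public static class DemoCallStateExtensions
{
    // Enum values compare by their underlying integers, so one
    // comparison covers every terminal state.
    public static bool HasEnded(this DemoCallState state)
    {
        return state > DemoCallState.InCall;
    }
}
```

Wrapping the comparison in a helper such as HasEnded keeps the intent readable and leaves a single place to update if the ordering of the enum ever changes.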

Summary

To summarize, implementing your own VoIP softphone is time-consuming and requires a lot of energy. Therefore, it is more efficient to use existing components. After studying many SDKs, I found the solution offered by Ozeki the easiest to understand. As the examples showed, it is described through interfaces and implemented in classes, and it handles phone calls in a very simple way. As the saying attributed to Albert Einstein goes: "Everything should be made as simple as possible, but no simpler."

You do not need to worry about the implementation details. The telephone call stands at the center, everything the programmer can do with it is available on it, and its state is defined in one place. Therefore, your plans can be achieved easily, because you do not get lost in technical details. It is as easy as one, two, three: I ask for a telephone and for one or more lines, and then I call or I am called. The article relies heavily on the documentation on the SDK web page, where more information can be found. I can recommend this solution to everyone.

References

[1] http://en.wikipedia.org/wiki/Session_Initiation_Protocol
[2] http://en.wikipedia.org/wiki/H323
[3] http://en.wikipedia.org/wiki/Obfuscated_code
[4] http://tools.ietf.org/html/rfc3261
[5] http://tools.ietf.org/html/rfc4566
[6] http://tools.ietf.org/html/rfc3550
[7] http://tools.ietf.org/html/rfc3551
[8] http://tools.ietf.org/html/rfc5411
[9] http://www.voip-sip-sdk.com
[10] http://www.voip-sip-sdk.com/index.php?owpn=98

History

  • 18th March, 2011: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 18 Mar 2011
Article Copyright 2011 by Esteban Murandi