Can pure virtual functions be defined as well?

A class that contains pure virtual functions is called an abstract (or interface) class, and a concrete class derived from an abstract class is called an implementation class. An abstract class just defines the interface, i.e. what data a class contains and what operations can be performed on it. Pure virtual functions in an abstract class are usually declared but not defined; it is the responsibility of the derived class to define them.

Declaration of a pure virtual function in C++:

virtual <return type> <user defined function name>(<arguments>) = 0; 
e.g.
virtual void Speak(int a) =  0;

However, the compiler does not complain even if you provide a definition for a pure virtual function, and in some cases that is useful. Let us try to understand where it can be helpful and when that function gets called, given that there cannot be an object of an abstract class.

Let us say we have a base class Aeroplane, with AeroplaneA and AeroplaneB derived from it. Class Aeroplane has a method called fly() that defines a general way of flying. Let us say AeroplaneA and AeroplaneB have their own way of flying because they are special sorts of planes (jets or something :P). How can we design this in C++?

First approach: Make fly() pure virtual and define it in every derived class, which is the intuitive solution.

But let us say another three aeroplanes C, D and E have the same general way of flying (they are not special kinds of planes). In that case you will have to copy the code of the fly() function of the Aeroplane class into the fly() functions of AeroplaneC, AeroplaneD and AeroplaneE. That is duplicate code; moreover, if there is some bug in the flying functionality, you will have to change three functions and you may forget to modify one or two of them. Hence it is really difficult to maintain.

class Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() = 0;
};
class AeroplaneA : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneB : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneC : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // general fly functionality (duplicate code)
    }
};
class AeroplaneD : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // general fly functionality (duplicate code)
    }
};
// similarly for AeroplaneE

Second approach: Make fly() virtual instead of pure virtual. Put the general flying functionality in the fly() function of the Aeroplane class; whichever class wants to override this functionality can do so.

Now we don't need to define the fly() function in AeroplaneC, AeroplaneD and AeroplaneE. For objects of type AeroplaneA or AeroplaneB, their own version of fly() will be called, and for the rest of the aeroplanes the general fly() function of the Aeroplane class will be called.

class Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // general fly functionality
    }
};
class AeroplaneA : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneB : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneC : public Aeroplane {
// constructor, destructor etc.
// no need to define fly() here
};
class AeroplaneD : public Aeroplane {
// constructor, destructor etc.
// no need to define fly() here
};

It looks perfect, doesn't it?

But unfortunately the answer is no. It can also cause some problems, e.g.

Let us say another AeroplaneF (a special plane with its own way of flying) is introduced, and the designer forgets to override the fly() function. The compiler will not complain; your code will just start working without even a warning. It may be caught later as a bug, but what if it is not even caught in testing? For such a crucial and critical design you cannot take that risk. During an actual run, your aeroplane may just crash. :(

So what do we do? Neither approach looks like a good solution; both are bug prone.

Don't worry, we will not let your plane crash. :) C++ gives you a very nice solution for such problems.

Solution: Define the pure virtual function. By giving the pure virtual function a definition, we can put the general functionality in the fly() function of Aeroplane, and since fly() is pure virtual, every concrete class must still redefine it; there it can simply call Aeroplane's fly(). Hence we have taken care of both problems: there is no duplicate code, and the compiler forces you to define the virtual function in each concrete class.

class Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() = 0; // pure virtual, yet defined below
};
// Note: C++ does not allow a pure virtual function to be defined inside
// the class; its definition must be placed outside the class.
void Aeroplane::fly() {
    // general fly functionality
}
class AeroplaneA : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneB : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        // own functionality
    }
};
class AeroplaneC : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        Aeroplane::fly();
    }
};
class AeroplaneD : public Aeroplane {
// constructor, destructor etc.
public:
    virtual void fly() {
        Aeroplane::fly();
    }
};

Reference: an item from Effective C++: 50 Specific Ways to Improve Your Programs and Designs by Scott Meyers.

Dead reference problem and its detection

In the post on the Meyers singleton pattern, we saw how the Meyers implementation of the singleton design pattern destroys the singleton object efficiently. It seems perfect at first glance. However, it has a certain problem called the dead reference problem. Let us try to understand it with the help of an example:

Let us say there are two static objects of classes A and B, and we have a singleton class called log that records errors occurring during the creation or deletion of A or B. As we studied in the last post, the singleton object is created inside the Instance() function when it is called for the first time. A and B call the Instance() function of log during initialization and destruction if some error occurs. Suppose the object of A gets created successfully, but while constructing the object of B some error occurs, so B calls the Instance() function of log; hence the singleton object gets created, and possibly the whole application exits after this. Since the runtime destroys local static objects in the reverse order of their creation, log will be destroyed before the object of A. Now, if some error occurs in the destruction of A, the Instance() function returns an invalid object and the program can behave unexpectedly. This is called the dead reference problem.

Before thinking of a solution, we should understand what the real problem is: we do not control the order of destruction of the objects of A, B and log. A and B should follow the C++ rule (last object created is destroyed first), but log should be exempted from this rule. log should be destroyed after the destruction of A and B, so that it can catch all errors thrown by the creation or destruction of A and B.

We have no way yet to control the lifetime of objects (at least not in this post), but we can at least detect the dead reference problem. This can be done by keeping a static bool member called deleted_ inside the singleton class, which is initialized to false and set to true in the destructor.


class Singleton {
    static bool deleted_;
public:
    static Singleton& Instance() {
        if (!deleted_) {
            static Singleton obj;
            return obj;
        } else {
            throw std::runtime_error("dead reference detected");
        }
    }

    ~Singleton() {
        deleted_ = true;
    }
};

bool Singleton::deleted_ = false;

If some object tries to access the singleton after its destruction, Instance() will throw an exception.

How to automatically include header dependencies in makefile

Why automate inclusion of header file dependencies: While working on a new project, I had to write a makefile from scratch. I found it a little challenging to figure out how to automatically include header dependencies in the makefile, because you cannot modify your makefile every time you include some header file in a source file. So I thought there must be some mechanism that takes care of it automatically. If you search on Google, you will find various ways. A manual solution is to use the sed command, which searches for #includes in your source code and puts all those dependencies into your makefile. But with a modern compiler, you don't need to do much; it will do the job for you. You just need to pass some flags.


For instance, with gcc version 4.1.2 20080704 (Red Hat 4.1.2-51), the target to build an object file was, before including header dependencies:

./%.o : ../%.cpp
	g++ -c $(INC_PATH) $(FLAGS) $< -o $@

and after:

./%.o : ../%.cpp
	g++ -c $(INC_PATH) $(FLAGS) -MMD -MP $< -o $@

-include $(SRC_FILES:.cpp=.d)

You just need to pass the -MMD and -MP flags and include the generated dependency (.d) files.


See the example below that demonstrates it.

Let us say we have 3 C++ source files and 3 header files:


  1. test.cxx
  2. add.cxx
  3. sub.cxx
  4. test.h
  5. sub.h
  6. add.h
add.cxx includes add.h
sub.cxx includes sub.h
test.cxx includes test.h , add.h and sub.h

Following is a makefile that only includes dependencies on the .cxx files:

---------------------------------------------------------------------------------
### declaration of all variables
FLAGS := -g
COMPILER := g++
SRC_FILES := $(wildcard *.cxx)
OBJ_FILES := $(SRC_FILES:%.cxx=%.o)
.PHONY  : all clean
EXE := my_exe
STATIC_LIB := my_exe.a
all : $(EXE)

### compile source files to make object files
### expand this section for each object file to include dependency of header files
## NAAAA I am not gonna do it..  :P
%.o : %.cxx
        $(COMPILER) -c $(FLAGS) $< -o $@

### make static library of object files
$(STATIC_LIB) : $(OBJ_FILES)
    ar -crv $@ $^

### make executable  
$(EXE) : $(STATIC_LIB)
    $(COMPILER) $^ -o $@

# clean all the files
clean :
    rm -rf $(OBJ_FILES) $(EXE) $(STATIC_LIB)
---------------------------------------------------------------------------------


If you make any change in a .h file and run "make", it will say nothing to be done for 'all', as we have nowhere mentioned that any target depends on a header file.

Hence, to include header file dependencies I would have to list the header files explicitly in my makefile. But what if I add one more header file to test.cxx? I would again need to modify the makefile, which is cumbersome and impractical in a real project.

Today's compilers provide a facility to automatically generate the dependency graph. All you need to do is pass some flags.

e.g. with g++ you should pass the -MMD -MP flags. The compiler will generate a dependency file (.d) for each object file; you then include those dependency files in the makefile as follows:

---------------------------------------------------------------------------------
## not including variable declaration part

%.o : %.cxx
        $(COMPILER) -c $(FLAGS) -MMD -MP $< -o $@

### include all dependency files

-include $(SRC_FILES:.cxx=.d)

$(STATIC_LIB) : $(OBJ_FILES)
    ar -crv $@ $^
   
$(EXE) : $(STATIC_LIB)
    $(COMPILER) $^ -o $@

clean :
    rm -rf $(OBJ_FILES) $(EXE) $(STATIC_LIB)
---------------------------------------------------------------------------------

test.d :


test.o: test.cxx add.h sub.h test.h

add.h:

sub.h:

test.h:

This is how a dependency file looks. Now test.o depends on test.h, add.h and sub.h along with test.cxx. Similarly, add.o includes a dependency on add.h, and sub.o takes sub.h into account.

Whenever the time stamp of any header file changes, the corresponding object file will be rebuilt, which solves our problem. :)

Meyers' Singleton pattern

The singleton pattern is my all-time favorite. Some people call it an anti-pattern, as people use it even when it is not actually required. Today I was reading about the singleton pattern and found many interesting things about it.

static data + static functions != singleton pattern

A static class (a class that contains only static data and functions) and the singleton pattern are not the same. In some cases they may seem the same, but static classes have some problems, e.g.

  1. You cannot extend its functionality by inheritance, since static functions cannot be virtual.
  2. Initialization and cleanup are very difficult in static classes, since there is no central point of initialization or cleanup; i.e. the data is not bound to a constructor or destructor, as all the data is static.

The singleton design pattern takes care of these issues while ensuring that there is a single object during the execution of the program.

Given below is one design variant of singleton design pattern :

class singleton {
    static singleton *pinst_;
    singleton() {}
    singleton(const singleton&);
    singleton& operator=(const singleton&);
public:
    static singleton* Instance() {
        if (pinst_ == NULL)
            pinst_ = new singleton();
        return pinst_;
    }
};
// in the implementation file
singleton* singleton::pinst_ = NULL;

Since all constructors are private, the user is not allowed to create an instance, so uniqueness is enforced at compile time. If the Instance() function is never called, no object is created at all; the singleton instance is created when Instance() is called for the first time. This optimization is useful if the class is big enough to be expensive to construct. But what if the singleton object is not big?

In that case, one can keep a static object of the singleton class instead of a pointer, as follows:

Another design variant:

class singleton {
    static singleton inst_;
    singleton() {}
    singleton(const singleton&);
    singleton& operator=(const singleton&);
public:
    static singleton& Instance() {
        return inst_; // return a reference to the static object
    }
};

In the implementation file:

singleton singleton::inst_; // defining the static object
Although everything else is the same in the second variant (the second approach keeps a static object while the first keeps a pointer), this is not a good solution.

In the second approach, inst_ is initialized dynamically at runtime, while in the first approach the pointer is initialized statically (it is a type without a constructor, initialized with a compile-time constant). The compiler performs static initialization before the very first assembly statement gets executed, but C++ does not define the order of initialization for dynamically initialized objects found in different translation units (a translation unit is a compilable source file). So for

int global = singleton::Instance().do_something();

depending on the order chosen to initialize global and inst_, singleton::Instance() may return an object that has not even been constructed yet.


Destroying the singleton: As discussed above, the first approach is more reliable than the second one, but it raises the problem of when and how to destroy the static singleton object. Just as there is a function Instance(), we can provide a public function destroy() that calls the destructor, but one has to be very careful that nobody accesses the object after it has been destroyed. We would not face this problem with the second approach, but that one is definitely more dangerous. Scott Meyers came up with another approach; therefore people refer to it as the Meyers singleton.

singleton& singleton::Instance() {
   static singleton obj; //function static object
   return obj;
}

A function static object is initialized when the control flow hits the function for the first time. Primitive static variables are initialized with compile-time constants, e.g.
int func() {
  static int x=100;
  return x++;
}

In this case, x is initialized to 100 before any code in the program is executed, most likely at load time. When the static variable is not a compile-time constant, or is an object with a constructor, it is initialized when control hits it for the first time.

A pseudo C++ code generated by compiler : 

singleton& singleton::Instance() {
    // functions generated by the compiler
    extern void __ConstructorSingleton(void *memory);
    extern void __DestroySingleton();
    // objects created by the compiler
    static bool __initialized = false;
    static char __buffer[sizeof(singleton)];
    if (!__initialized) {
        __ConstructorSingleton(__buffer);
        atexit(__DestroySingleton);
        __initialized = true;
    }
    return *reinterpret_cast<singleton*>(__buffer);
}

The main part here is the atexit function, provided by the standard C library. It allows a registered function to be called automatically during the program's exit.

Each call to atexit pushes its parameter onto a private stack maintained by the C runtime library. During the application's exit, these functions are called in the reverse order of registration.

P.S
In subsequent posts we will discuss the following in more detail:

  1. The compiler performs static initialization before the very first assembly statement gets executed, but the order of initialization for dynamically initialized objects in different translation units is not defined.
  2. The atexit function and problems associated with it.
  3. reinterpret_cast

Comparison between Array, Linked List and Vector

Arrays, linked lists and vectors are all used to store collections of elements. Each has its own merits and demerits in terms of memory occupied, speed of traversal and complexity. In this post, we will make a brief but concise comparison between the three:

Array : An array is a data structure that can store a fixed number of elements of the same type at contiguous memory locations.
In C, the simplest way to declare an array is as shown below:

int a[10];

This statement creates an array of 10 integer elements located at contiguous locations, as can be seen in figure 1 alongside. Let us say the first element is located at address 100 in memory; then the second element will be at address 104, assuming the size of an integer is 4 bytes (although the size of an integer depends on the machine and compiler), the third element at address 108, and so on.

Figure 1: Storage of array elements

NOTE: The memory locations in this example are not real; they are shown just for the sake of understanding.

Since array elements are located at contiguous memory locations, the time to access any element of an array is constant, O(1). The nth element can be accessed at:

    base address + (n-1) * size of element

e.g. the 3rd element is present at address 100 + 2*4 = 108.

Advantages of array : As explained, an array has a high access rate, of order O(1), i.e. constant-time access. The access time even for a very large array remains reasonably small.

Disadvantages : 1) The size of an array is fixed. It has to be pre-determined, as that much contiguous memory has to be reserved, which can lead to memory wastage.
2) If memory is fragmented into smaller chunks, array creation may fail, because array elements must be stored at contiguous locations. E.g. let us say there is a space of 1024 bytes in memory, but it is fragmented into 2 chunks of 512 bytes. Then the maximum allowed size of an array will be 512 bytes, although the total available space is 1024 bytes.
3) Insertion and deletion in the middle of an array are costly operations, because the elements to the right of the required position have to be moved.

Linked List : Due to the disadvantages of arrays related to insertion and deletion time, the linked list came into existence. A linked list is a linear data structure that stores a collection of data elements. The number of elements need not be fixed; it can grow dynamically, and the elements need not be stored at contiguous locations. Since the elements are not kept at contiguous locations, each element contains the location (address) of the next element.

Hence, data elements in Linked list consist of two parts:

  1. Data and
  2. Address of next element. 
The structure representing an element in a linked list can be defined in C as shown below
struct link_list {
    int data;
    struct link_list *next;
};
Figure 2: Linked list representation
Advantages of Linked List : 1) The size of a linked list need not be fixed; it can grow dynamically at run time.
2) Insertion and deletion in the middle of a linked list, given the node, are of order O(1), i.e. constant time; they just require changing pointers. E.g. if you want to add an element at the 3rd position, i.e. after the node at 200, the 2nd node will point to location 250 instead of 300, and the new node at 250 will point to 300.

Disadvantages : 1) The access rate is lower than that of an array. The complexity of accessing an element is of order O(n), where n is the number of elements in the list, because one has to traverse all the elements to reach the last one.
2) Extra memory cost. An element in a list contains not only data but a pointer to the next element as well. If the data is some huge object, say 128 bytes, then a pointer of 4 bytes (8 bytes on 64-bit platforms) is not a big overhead; but if the data is itself 4 bytes, like an integer, the size of each element is doubled.

Vector : A vector is a data structure that provides the advantages of both linked lists and arrays. The major advantage of an array is its higher access rate, and that of a linked list is its dynamic size. A vector is a hybrid of the two.
A vector internally uses an array of some user-defined size, let us say N. When the user tries to put the (N+1)th element into the vector, the following operations take place:
  1. It internally creates a new array of double the size, i.e. 2N.
  2. It copies the elements of the previous array into the newer one.
  3. It destroys the previous array.
  4. The new array takes the place of the previous one.
From this, we can see that extending the size of a vector is a heavy operation; hence there is always a trade-off between speed and memory. If you choose a smaller initial size, you make sure that memory is not wasted, but as soon as the size reaches the threshold and a new element is to be inserted, the operations described above take place and slow things down. On the other hand, if you choose a larger initial size, you can avoid these operations, but then memory may get wasted. In cases where we over-estimate the size, there is no difference between an array of maximum size and a vector.

So choosing the initial size of a vector is tricky; you have to choose wisely, and the right choice varies from application to application. If you don't mention a size while constructing a vector, a default initial capacity is chosen, which depends on the library implementation.

There are many other linear data structures, like the skip list and deque, that try to optimize performance and combine the advantages provided by linked lists and arrays.


Noise margins



In the real world, nothing is ideal. A signal travelling along a wire/cable/transmission line is susceptible to noise from the surroundings. Also, there is degradation in the signal due to the parasitic elements of the line. Moreover, the output signal produced by the transmitter itself only resembles the ideal signal, worsening the scenario. There are repeaters/buffers along the line to minimize the impact of noise. But there is a limit up to which degradation is allowed, beyond which the receiver is unable to sense the correct value of the signal. This allowed degradation is measured in terms of noise margins. One can find the topic discussed in all textbooks related to digital logic and system design, be it CMOS, TTL or any other logic family.

Let us illustrate the concept of noise margins with the help of an example. Assume that a signal has to travel from a transmitter to a receiver through an interconnect element (commonly called a net) which will only degrade the signal, since there is no active element between transmitter and receiver. The output signal produced by the transmitter (Tx) will deviate from the ideal voltage levels, as shown in figures 1 and 2 for logic level ‘1’. In addition, there will be signal degradation by the interconnect element as well as noise induced from the surroundings. As a result, the band of voltages that can be present at the receiver input for logic ‘1’ will widen further. Now, there are two cases:

  1. If the band of voltages recognized as logic ‘1’ by the receiver is a super-set of the band of voltages that can exist at the receiver input, as shown in figure 1, the receiver will recognize the transmitted logic ‘1’ in all cases. This is the desired scenario, as no transmitted logic ‘1’ will be missed by the receiver. In this scenario, the noise induced by the surroundings is such that the range of voltages present at the receiver does not violate the band of voltages recognized as logic '1' by the receiver, so the signal is correctly recognized as logic '1'.

Figure 1: Figure showing the noise induced is less than noise margin


2) If the band of voltages recognized as logic ‘1’ by the receiver is a sub-set of the band of voltages that can exist at the receiver input, as shown in figure 2, there will be some cases that are intended to be recognized as logic ‘1’ but are not. So a loss of information, or incorrect transmission of information, is possible in such cases. In this scenario, the noise induced by the surroundings makes the band of voltages at the receiver's input larger than what can be decoded correctly as logic '1' by the receiver, so there is no guarantee that the signal will be perceived as logic '1'.

Figure 2: Figure showing the noise induced is greater than noise margin
Let us now label each of these regions to make the discussion more meaningful. The lowest voltage that will be produced as logic ‘1’ by the transmitter is termed VOH and, let us say, the highest is VDD (we are concerned with the lower level only). So the range of voltages produced by the transmitter is (VDD – VOH). And let the receiver accept as logic ‘1’ voltages higher than VIH; the range of voltages accepted by the receiver is then (VDD – VIH). So the maximum degradation that can happen over the communication channel is (VOH – VIH), which is nothing but the noise margin. If the degradation is less than this figure, the logic ‘1’ will be recognized correctly by the receiver; otherwise it won’t. So the noise margin equation can be given as below for logic '1':


Noise margin for logic '1' (NM) = VOH – VIH
Where
VOH = Lowest level of voltage that can be produced as logic '1' by the transmitter
VIH = Lowest level of voltage that can be recognized as logic '1' by the receiver

Similarly, for logic ‘0’, the range of outputs that can be produced by the transmitter is (0 - VOL) and the range of input voltages that can be detected by the receiver is (0 – VIL), thereby providing the noise margin as:
Noise margin for logic '0' (NM) = VIL – VOL

Where

VIL = Highest level of voltage that can be recognized as logic ‘0’ by the receiver.
VOL = Highest level of voltage that can be produced as logic ‘0’ by the transmitter.

Figure 3 shows all these levels for the example we had taken earlier to demonstrate the concept of noise margins.

Noise margin calculation.
Figure 3: Noise margin

From our preceding discussion, if the degradation over the communication channel is more than the noise margin, the signal will not be detected correctly by the receiver. So it is imperative for the designer to design accordingly.


Definition of noise margin: Thus, we can conclude this post by defining noise margin as below:
"Noise margin is the difference between the worst signal voltage produced by the transmitter and the worst signal that can be detected by receiver."