Note that this is an ongoing field of research, so there is no perfect general solution yet, but there are several good ways of measuring distance with cameras.
The easiest case is of course when you have a calibration object of known size in your field of view (e.g. a ruler of known length), but this is rarely possible outside of controlled setups.
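If you do have an object of known size in view, the math is just the pinhole camera model: an object of real width W at distance Z appears f·W/Z pixels wide, so Z = f·W/w. A minimal sketch, where the focal length and measured widths are made-up example values (in practice the focal length in pixels comes from camera calibration):

```python
# Single-image distance estimate from an object of known size,
# using the pinhole camera model:
#   pixel_width = focal_px * real_width / distance
# rearranged to:
#   distance = focal_px * real_width / pixel_width

def distance_from_known_size(focal_px, real_width_m, pixel_width_px):
    """Estimate distance (m) to an object of known real-world width."""
    return focal_px * real_width_m / pixel_width_px

# Example with placeholder numbers: a 0.30 m ruler that appears
# 150 px wide in an image taken with a focal length of 1000 px.
d = distance_from_known_size(1000.0, 0.30, 150.0)
print(d)  # 2.0 (metres)
```

The focal length here must be in pixel units; OpenCV's calibration routines (e.g. `cv2.calibrateCamera`) report it that way in the camera matrix.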
If you have two cameras filming the same scene from different positions, it can be done with several stereo algorithms, some of which are provided by OpenCV. See for example the documentation here:
http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#stereosgbm
Using two (or more) cameras for a precise depth measurement is probably the most common solution.
There are also many tutorials about this online which you can find easily (keywords: Stereo vision, depth map, distance measurement).
If you have only one camera, you can try structure-from-motion algorithms. For example, this blog post explains them with an example:
http://www.morethantechnical.com/2012/02/07/structure-from-motion-and-3d-reconstruction-on-the-easy-in-opencv-2-3-w-code/
As the name implies, the camera has to move through the scene for this kind of algorithm to work.
Another possibility is to combine the camera with some kind of distance sensor (e.g. a Kinect, a laser rangefinder, or acoustic/ultrasonic sensors).
Good luck.