3D is set to revolutionize the multimedia industry. While 3D video has enjoyed some commercial success in theatres, obstacles remain that prevent 3D from overtaking 2D as the dominant display format for both cinema and television. These include difficulty capturing content, displaying legacy content, and display limitations such as the need to wear glasses. Our group is addressing these and other problems by carrying out research in life-like high dynamic range 3D video, 2D to 3D conversion, viewer quality of experience, 3D video post-processing, and interactive 3D content. Detailed descriptions of our current projects are provided below.
Our current projects:
High Dynamic Range 3D Video - True-to-Life Immersive Experience: 3D TV broadcast services will only be successful in the long term if the perceived image quality and viewing comfort are significantly better than those of 2D HDTV. The combination of 3D TV and HDR will accomplish this by providing a true-to-life immersive experience. To this end, we are developing appropriate algorithms, as well as analysis and design tools, that will lead to novel and practical HDR 3D capturing, compression, and display solutions.
2D to 3D Video Conversion: To meet the initial demand for 3D video content, it is not realistic to rely only on the production of new 3D videos. One alternative is to convert existing popular 2D movies and documentaries into 3D format, enabling content owners and providers to resell existing products. 2D to 3D conversion has therefore received a lot of attention recently from the research and commercial sectors. However, conversion is very challenging, since it requires approximation of the scene's depth information based on monocular depth cues. The quality of the approximated depth map directly affects the quality of the rendered 3D content. Thus, there is a clear need to develop effective depth-estimation methods.
We have developed effective 2D to 3D video conversion techniques that utilize monocular depth cues (motion, texture variation, sharpness, perspective, occlusion, etc.) extracted from 2D video to compute the depth-map of the scene. 3D content can then be generated in the format of stereo videos via depth-image based rendering. One of our proposed schemes exploits the existing relationship between the motion of objects and their distance from the camera, to estimate the depth map of the scene in real-time. The other proposed scheme integrates multiple monocular depth cues and estimates the depth map model using a Random Forests machine learning approach. Our proposed methods can be used at the transmitter and receiver ends. It is preferable to employ them at the receiver end so that they do not increase the transmission bandwidth requirements.
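The motion-based depth cue described above can be sketched as follows. This is only a minimal illustration under simplifying assumptions: it takes per-block motion vectors as already given (e.g., extracted from an H.264 bitstream) and maps motion magnitude to depth with a simple inverse relation, which is a stand-in for the actual estimation method.

```python
import numpy as np

def depth_from_motion(motion_vectors, alpha=1.0, eps=1e-6):
    """Illustrative sketch (not the actual algorithm): map per-block
    motion magnitude to relative depth, assuming faster-moving blocks
    are closer to the camera. `motion_vectors` is an (H, W, 2) array of
    per-block (dx, dy) vectors. Returns a depth map normalized to
    [0, 1], where 0 = near and 1 = far.
    """
    magnitude = np.linalg.norm(motion_vectors, axis=-1)
    # Inverse relation: large motion -> small depth (near).
    depth = alpha / (magnitude + eps)
    # Normalize to [0, 1] for use in depth-image-based rendering.
    depth = (depth - depth.min()) / (depth.max() - depth.min() + eps)
    return depth

# Toy example: a fast-moving foreground region and a slow background.
mv = np.zeros((4, 4, 2))
mv[:2, :2] = (8.0, 0.0)   # fast foreground block
mv[2:, 2:] = (0.5, 0.0)   # slow background block
d = depth_from_motion(mv)
```

The resulting depth map would then drive depth-image-based rendering to synthesize the second view.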
3D Video Compression: Transmission of 3D video involves huge amounts of data, requiring the development of highly efficient coding schemes. A straightforward approach for 3D video coding is simulcast coding, which compresses each video stream independently. While this scheme exploits temporal and spatial correlations within each stream, it does not benefit from the existing correlation between different views. Because of this correlation, 3D video coding has a different structure from 2D video coding techniques. We have developed an efficient 3D video coding scheme based on a novel inter-view prediction structure. Performance evaluations show that the proposed scheme outperforms H.264/MVC in terms of both compression efficiency and random-access delay.
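The benefit of exploiting inter-view correlation can be shown with a toy sketch (not the actual coding scheme). Here "coding cost" is approximated by residual energy, and the second view is modeled as a disparity-shifted copy of the first plus noise; a real coder would apply transforms and entropy coding.

```python
import numpy as np

# Sketch: two views of the same scene are highly correlated, so coding
# the disparity-compensated inter-view residual is much cheaper than
# coding the second view independently (as simulcast coding would).
rng = np.random.default_rng(1)
left = rng.uniform(0, 255, size=(64, 64))
# Right view: same scene shifted by a 2-pixel disparity, plus sensor noise.
right = np.roll(left, 2, axis=1) + rng.normal(0, 2.0, size=(64, 64))

disparity = 2
predicted = np.roll(left, disparity, axis=1)  # disparity-compensated prediction
residual = right - predicted

independent_cost = np.mean(right ** 2)    # proxy for simulcast coding cost
interview_cost = np.mean(residual ** 2)   # proxy for inter-view prediction cost
```

The residual energy is orders of magnitude smaller than the raw-frame energy, which is the correlation our prediction structure is designed to exploit.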
Unsynchronized Zoom Correction in 3D Video: One of the shooting conditions that can degrade perceived 3D quality is unsynchronized zooming of the dual cameras. Precisely synchronizing the optical zooming of two identical cameras is not an easy task. If the two cameras have different zoom factors, an object will have different sizes in the left and right views, and thus vertical parallax will be introduced. This causes eyestrain and interferes with the fusion of the two images. We have developed an efficient post-processing algorithm that corrects for unsynchronized zooming. First, we find a set of matching points between the left and right views using the SIFT algorithm. Then we perform a least squares regression on the y coordinates of these matching points to determine which view needs to be scaled, and to estimate the amount of scaling and translation needed to align the views. Experimental results show our method produces videos with negligible scale difference and vertical parallax.
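The least-squares step can be sketched as follows. The SIFT matching itself is omitted; the matched y coordinates are assumed given, and the simple linear model y_right ~ s * y_left + t is an illustration of the regression, not the full algorithm.

```python
import numpy as np

def estimate_zoom_mismatch(y_left, y_right):
    """Sketch of the least-squares regression described above: fit
    y_right ~ s * y_left + t over matched keypoints to estimate the
    relative scale s and vertical translation t needed to align the
    two views.
    """
    A = np.column_stack([y_left, np.ones_like(y_left)])
    (s, t), *_ = np.linalg.lstsq(A, y_right, rcond=None)
    return s, t

# Synthetic matches: right view zoomed by 5% and shifted 2 px vertically,
# with a little keypoint-localization noise.
rng = np.random.default_rng(0)
y_l = rng.uniform(0, 1080, size=200)
y_r = 1.05 * y_l + 2.0 + rng.normal(0, 0.5, size=200)
s, t = estimate_zoom_mismatch(y_l, y_r)
```

With s and t recovered, the over-zoomed view can be rescaled and shifted so that vertical parallax is removed.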
Effect of Brightness on the Quality of Visual 3D Perception: Producing visually pleasing 3D video is challenging in both the commercial and research settings. There are many factors and parameters that have an effect on the perceptual quality of 3D media. One of these factors is brightness. Our objective is to obtain a good understanding of the effect that brightness has on the visual quality of 3D content and compare it to that of 2D content. This understanding will help us identify criteria related to brightness and utilize them in the 3D production pipeline to enhance the viewers' 3D quality of experience. In our study, we capture outdoor and indoor scenes with different exposures and then run extensive subjective quality assessment experiments to test how brightness levels affect the perceived quality of the 3D experience.
Guidelines for Capturing High Quality Stereoscopic Content Based on a Systematic Subjective Evaluation: Although 3D TVs have already been introduced to the consumer market, the availability of stereoscopic (3D) content remains a serious challenge. Another challenge is that while some manufacturers are introducing 3D cameras and Hollywood is using proprietary solutions, there are no guidelines for consistently capturing high quality stereoscopic content. We compiled a comprehensive stereoscopic image database with content captured at various distances from the camera lenses. We then conducted subjective tests to assess the perceived 3D quality of these images, which were shown on displays of different sizes. Based on these results, we have established guidelines for acquisition distances between the cameras and captured objects.
Quality of Experience of Stereoscopic Content on Displays of Different Sizes: 3D-capable devices have entered the consumer market. The depth that is perceived with stereoscopic content is strongly linked to the size of the screen on which it is displayed. We have tested the effect that different display sizes have on the quality of experience for viewers of 3D content and offer some recommendations.
Correcting Colour and Sharpness Mismatches in 3D and Multiview Video: Capturing 3D video requires two or more cameras capturing the scene from slightly different viewpoints. Due to slight variations in capture parameters (i.e., exposure, aperture, shutter speed, focus), there can be inconsistencies between the videos captured with different cameras. Therefore the videos may differ in brightness, colour, sharpness, etc. These inconsistencies can reduce the perceived quality of the 3D video, and will also negatively affect performance when the videos are compressed. In our research, we are developing methods for correcting inconsistencies in 3D and multiview video sets. We have developed a method for correcting the colour of multiview video sets so that all the views match the average colour of the captured videos. Our method produces videos that are visually consistent in colour, and it greatly improves the compression efficiency of multiview video coding. We have also developed a technique for making the views in a stereo image match in sharpness. This not only improves the visual consistency but also improves the accuracy of automatic depth estimation.
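The average-colour principle can be illustrated with a minimal sketch. The per-channel mean shift below is an assumption made for illustration; the actual published method is more elaborate.

```python
import numpy as np

def match_to_average_colour(views):
    """Illustrative sketch of the average-colour idea: shift each view's
    per-channel mean colour to the mean colour of all views, so no single
    view is privileged as the reference. `views` is a list of (H, W, 3)
    float arrays.
    """
    target = np.mean([v.mean(axis=(0, 1)) for v in views], axis=0)
    return [v - v.mean(axis=(0, 1)) + target for v in views]

# Two toy views of the same scene with a per-channel colour offset
# (e.g., from mismatched exposure / white balance).
rng = np.random.default_rng(2)
base = rng.uniform(50, 200, size=(8, 8, 3))
views = [base, base + np.array([10.0, -5.0, 3.0])]
corrected = match_to_average_colour(views)
```

After correction both views share the same mean colour, which is what makes inter-view prediction (and hence multiview coding) more efficient.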
Crosstalk Compensation for 3D Displays: One of the major problems with current 3D displays is crosstalk, where a portion of the light intended for one eye reaches the other one. Crosstalk can cause the viewer to see a “double” or “ghost” image, which can reduce the quality of the 3D effect or even completely prevent the viewer from fusing the 3D image. We are performing work on compensating for crosstalk in 3D displays by preprocessing the images such that crosstalk will be cancelled out when the light reaches the viewer’s eyes. A major limitation of current crosstalk reduction techniques is that they reduce the image contrast, which lowers the perceived image quality. We are working on techniques for performing crosstalk reduction that lower image contrast only in local regions that suffer from crosstalk, rather than globally reducing the contrast. This allows for effective crosstalk cancellation with minimal loss of visual quality.
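Subtractive crosstalk cancellation can be sketched under a simple symmetric additive leakage model (each eye perceives its own image plus a fraction c of the other eye's image); this model and the code are illustrative assumptions, not our exact display model. The clipping step makes the contrast problem concrete: dark pixels cannot go negative, which is why some contrast reduction (globally, or locally as in our approach) is required.

```python
import numpy as np

def cancel_crosstalk(L, R, c):
    """Sketch of subtractive crosstalk cancellation. Assumed model:
    perceived_left = shown_left + c * shown_right (and symmetrically for
    the right eye). Solving that 2x2 linear system gives the images to
    display; clipping to the valid range is where residual crosstalk
    remains in dark regions.
    """
    denom = 1.0 - c * c
    L_show = np.clip((L - c * R) / denom, 0.0, 255.0)
    R_show = np.clip((R - c * L) / denom, 0.0, 255.0)
    return L_show, R_show

# Mid-grey test patch: compensation exactly recovers the intended image.
L = np.full((4, 4), 100.0)
R = np.full((4, 4), 180.0)
L_show, R_show = cancel_crosstalk(L, R, c=0.05)
perceived_L = L_show + 0.05 * R_show  # what the left eye actually sees
```

For very dark pixels the subtraction would go negative and be clipped, leaving visible ghosting; local contrast reduction raises only those regions rather than the whole image.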
Augmented Video in Interactive 3D TV: 3D TV can dramatically enhance the TV viewing experience, and users’ expectations of realistic 3D images are very high. Users also have high expectations concerning interactivity, but much less research attention has been paid to this problem. Previous research has mainly focused on changing the point of view in multi-view video and free-viewpoint video. One way to enhance interactivity in video is through the use of augmented video. In augmented video, additional objects are rendered over the video to produce an image enhanced with features not present in the original stream. These objects provide an augmented reality experience by allowing users to interact with them. In our current research, we are addressing problems that can arise when a 3D TV stream is augmented with extra objects. Such problems include appropriately rendering the additional objects to produce a correct perception of depth and a realistic image.
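Placing an augmented object at a chosen depth comes down to rendering it into the two views with the right horizontal offset (disparity). The helper below is a hypothetical illustration of that principle, not our rendering pipeline; it ignores occlusion against the scene's depth map, which is one of the problems noted above.

```python
import numpy as np

def overlay_at_depth(left, right, sprite, x, y, disparity):
    """Hypothetical helper: draw `sprite` into both views of a stereo
    pair with a horizontal offset of `disparity` pixels, so the object
    is perceived at a corresponding depth. Assumes x - disparity >= 0
    and that the sprite fits inside both frames; a real renderer would
    also resolve occlusion against the scene geometry.
    """
    h, w = sprite.shape
    left, right = left.copy(), right.copy()
    left[y:y + h, x:x + w] = sprite
    right[y:y + h, x - disparity:x - disparity + w] = sprite
    return left, right

# 2x2 white sprite composited over blank 10x10 views with 2 px disparity.
sprite = np.ones((2, 2))
l, r = overlay_at_depth(np.zeros((10, 10)), np.zeros((10, 10)),
                        sprite, x=5, y=3, disparity=2)
```

If the overlay's disparity disagrees with the surrounding scene, the object appears to float at the wrong depth, which breaks the realism of the augmented stream.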
Selected Publications:
- C. Doutre and P. Nasiopoulos, “Sharpness Matching in Stereo Images”, Journal of Virtual Reality and Broadcasting, accepted for publication, February 2011, 11 pages.
- V. Sanchez, P. Nasiopoulos and R. Abugharbieh, "3D Scalable Medical Image Compression with Optimized Volume of Interest Coding," IEEE Transactions on Medical Imaging, Vol. 29, No. 10, pp. 1808–1820, 2010.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “Generating the Depth Map from the Motion Information of H.264-Encoded 2D Video Sequence”, EURASIP Journal on Image and Video Processing, Vol. 2010, Article ID 108584, 13 pages, doi:10.1155/2010/108584.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, "An H.264-based Scheme for 2D to 3D Video Conversion", IEEE Transactions on Consumer Electronics, Vol. 55, No. 2, pp. 742-748, May 2009.
- C. Doutre and P. Nasiopoulos, "Color Correction Preprocessing For Multiview Video Coding." IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 9, pp. 1400-1406, Sept. 2009.
- V. Sanchez, R. Abugharbieh, and P. Nasiopoulos, “Symmetry-Based Scalable Lossless Compression of 3D Medical Image Data”, IEEE Transactions on Medical Imaging, Vol. 28, No. 7, pp. 1062-1072, July 2009.
- V. Sanchez, P. Nasiopoulos and R. Abugharbieh, “Efficient Lossless Compression of Four Dimensional (4D) Medical Images Based on the Advanced Video Coding (AVC) Scheme”, IEEE Transactions on Information Technology in Biomedicine, Vol. 12, No. 4, pp. 442 – 446, July 2008.
- Z. Mai, M. T. Pourazad, and P. Nasiopoulos “Effect of contrast on the quality of 3D visual perception”, accepted to be presented at the 3rd International Conference on Creative Content Technologies (CONTENT 2011), 4 pages, September 2011.
- Z. Mai, M. T. Pourazad, and P. Nasiopoulos, “Influence of contrast on the quality of 2D and 3D visual perception”, accepted to be presented at the 3rd International Workshop on Quality of Multimedia Experience (QoMEX), 4 pages, September 2011.
- C. Doutre and P. Nasiopoulos, "Crosstalk Cancellation in 3D Video With Local Contrast Reduction", accepted to be presented at the European Signal Processing Conference (EUSIPCO-2011), 4 pages, August 2011.
- M. T. Pourazad, Z. Mai, P. Nasiopoulos, K. Plataniotis, R.K. Ward, “Effect of brightness on the quality of visual 3D perception”, accepted to be presented at the 2011 IEEE International Conference on Image Processing (ICIP 2011), 4 pages, September 2011.
- Z. Mai, C. Doutre, P. Nasiopoulos and R. Ward, "Subjective Evaluation of Tone-Mapping Methods on 3D Images," accepted to be presented at the 17th IEEE International Conference on Digital Signal Processing (DSP 2011), 6 pages, July 2011.
- C. Doutre and P. Nasiopoulos, "Optimized Contrast Reduction For Crosstalk Cancellation in 3D Displays", accepted to be presented at the 3DTV Conference 2011, Antalya, Turkey, 4 pages, May 2011.
- V. Sanchez and P. Nasiopoulos, “3D medical Image Coding with Optimal Channel Protection for Wireless Transmission”, accepted to be presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011), Prague, 4 pages, May 2011.
- M. T. Pourazad and P. Nasiopoulos, “Preserving Quality in 2D to 3D Video Conversion”, Electronic Visualization and the Arts (EVA) 2011, 4 pages, May 2011.
- M. T. Pourazad, A. Bashashati, and P. Nasiopoulos, “Random Forests-based Approach for Estimating Depth of Human Body Gestures Using a Single Video Camera”, IEEE International Conference on Consumer Electronics (ICCE 2011), Las Vegas, NV, USA, pp. 661-662, January 2011.
- L. E. Coria, D. Xu, and P. Nasiopoulos, “Quality of Experience of Stereoscopic Content on Displays of Different Sizes: A Comprehensive Subjective Evaluation”, IEEE International Conference on Consumer Electronics (ICCE 2011), Las Vegas, NV, USA, pp. 778-779, January 2011.
- M. T. Pourazad, P. Nasiopoulos and A. Bashashati, “Random Forests-Based 2D-to-3D Video Conversion”, the 17th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2010, December 2010.
- D. Xu, L. Coria, and P. Nasiopoulos, “Guidelines for Capturing High Quality Stereoscopic Content Based on a Systematic Subjective Evaluation”, IEEE International Conference on Electronics, Circuits, and Systems (ICECS 2010), Athens, Greece, pp. 166-169, December 2010.
- C. Doutre and P. Nasiopoulos, "A Stereo Matching Data Cost Robust To Blurring," IEEE International Conference on Image Processing (ICIP 2010), pp. 1773-1776, Hong Kong, Sept. 2010.
- C. Doutre, M. T. Pourazad, A. Tourapis, P. Nasiopoulos, and R. K. Ward, “Correcting Unsynchronized Zoom in 3D Video”, IEEE International Symposium on Circuits and Systems (ISCAS), Paris, France, pp. 3244-3247, May 2010.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “Conversion of H.264-Encoded 2D to 3D Video Format,” IEEE Conference on Consumer Electronics (ICCE 2010), January 2010.
- C. Doutre and P. Nasiopoulos, “Correcting Sharpness Variations in Stereo Image Pairs,” 6th Conference for Visual Media Production 2009, London, UK, pp. 45-51, November 2009.
- C. Doutre and P. Nasiopoulos, "Fast Vignetting Correction and Color Matching For Panoramic Image Stitching," IEEE International Conference on Image Processing (ICIP 2009), Cairo, Egypt, pp. 709-712, Nov. 2009.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “An efficient low random access delay panorama-based multiview video coding scheme,” IEEE International Conference on Image Processing (ICIP) 2009, Cairo, Egypt, November 2009.
- V. Sanchez, R. Abugharbieh and P. Nasiopoulos, “3D Scalable Lossless Compression of Medical Images Based on Global and Local Symmetries”, IEEE International Conference on Image Processing (ICIP) 2009, Cairo, Egypt, November 2009.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “A New Prediction Structure for Multiview Video Coding”, 16th IEEE International Conference on Digital Signal Processing (DSP), Santorini, Greece, 5 pages, July 2009.
- C. Doutre, and P. Nasiopoulos, "Colour Correction Of Multiview Video With Average Color As Reference," IEEE International Symposium on Circuits and Systems (ISCAS) 2009, Taiwan, pp. 860-863, May 2009.
- M. T. Pourazad, P. Nasiopoulos and R.K. Ward, “Converting H.264-derived Motion Information into Depth Map”, 15th International MultiMedia Modeling Conference (MMM2009), France, pp. 108-118, January 2009.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “An H.264-based Scheme for 2D to 3D Video Conversion”, IEEE Conference on Consumer Electronics, Las Vegas, 2 pages, January 2009.
- C. Doutre and P. Nasiopoulos, "A Colour Correction Preprocessing Method For Multi-view Video Coding," European Signal Processing Conference (EUSIPCO) 2008, Switzerland, 5 pages, August 2008.
- V. Sanchez, P. Nasiopoulos, and R. Abugharbieh, “Efficient 4D Motion Compensated Lossless Compression of Dynamic Volumetric Medical Image Data”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2008, pp. 549 – 552, Las Vegas, April 2008.
- M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, “An H.264-based Video Encoding Scheme for 3D TV”, European Signal Processing Conference (EUSIPCO), Florence, Italy, September 2006.
- V. Sanchez, P. Nasiopoulos and R. Abugharbieh, “Lossless Compression of 4-D Medical Images Using H.264/AVC”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2006, pp. 1116-1119, France, May 2006.