Introduction to OpenCV -- image smoothing technology

About image sampling

There are many ways to obtain digital images. Usually, a continuous image is digitized by sampling it: the sampling points form a grid, and each sampling point corresponds to a pixel in the image. Applied to an existing image, sampling is the process of changing the image resolution, and it is divided into upsampling and downsampling.

Upsampling: the resolution of the image is increased.

Downsampling: the resolution of the image is reduced.

Contents

0x01 nearest neighbor interpolation

0x02 bilinear interpolation

0x03 image pyramid

0x04 Fourier transform

There are two common image scaling approaches in OpenCV:

  • The resize function provided by the geometric transformation module.

  • The image pyramid functions pyrDown and pyrUp, based on multi-resolution theory.

0x01 nearest neighbor interpolation

Nearest neighbor interpolation is the simplest image scaling method. Its principle is to take the value of the nearest pixel in the source image as the value of the corresponding point in the target image. Let the resolution of the source image f(x,y) be w*h and the resolution of the scaled target image f(x',y') be w'*h'. The nearest neighbor transformation is then: f(x', y') = f(round(x' * w / w'), round(y' * h / h')) (the code below uses cvFloor for the rounding).

The key step in nearest neighbor scaling is to compute the scaling factor and use it to map each target pixel back to its source neighborhood. OpenCV provides functions for converting floating-point numbers to integers: cvRound returns the integer closest to its argument, cvFloor returns the largest integer not greater than its argument, and cvCeil returns the smallest integer not less than its argument.
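For reference, a minimal sketch of how these rounding helpers behave (they are available through opencv2/core.hpp):

#include <iostream>
#include <opencv2/core.hpp>

int main()
{
	// cvFloor rounds down, cvCeil rounds up, cvRound picks the nearest integer
	std::cout << cvFloor(2.7) << std::endl;   // 2
	std::cout << cvCeil(2.3) << std::endl;    // 3
	std::cout << cvRound(-1.6) << std::endl;  // -2
	return 0;
}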

Now let's try nearest neighbor interpolation. In fact, this approach has long been used for image scaling on microcontrollers:

Put simply, the algorithm computes a scaling factor n, and then every n pixels of the source image are treated as a single pixel of the target image.

The code is as follows:

#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include "opencv2/core/core.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;
// Nearest neighbor interpolation image scaling
cv::Mat nNeighbourInterpolation(cv::Mat srcImage)
{
	// Judge input validity
	CV_Assert(srcImage.data != NULL);
	int rows = srcImage.rows;
	int cols = srcImage.cols;
	// Build target image
	cv::Mat dstImage = cv::Mat(
		cv::Size(cols / 2, rows / 2), srcImage.type(),
		cv::Scalar::all(0));
	int dstRows = dstImage.rows;
	int dstCols = dstImage.cols;
	// Coordinate conversion to obtain scaling multiple
	float cx = (float)cols / dstCols;
	float ry = (float)rows / dstRows;
	std::cout << "cx: " << cx << "ry:" << ry << std::endl;
	// Traverse the image to complete the zoom operation
	for (int i = 0; i < dstCols; i++)
	{
		// Rounding obtains the corresponding coordinates of the target image in the source image
		int ix = cvFloor(i * cx);
		for (int j = 0; j < dstRows; j++)
		{
			int jy = cvFloor(j * ry);
			// Boundary processing prevents pointer from crossing the boundary
			if (ix > cols - 1)
				ix = cols - 1;
			if (jy > rows - 1)
				jy = rows - 1;
			// Mapping matrix
			dstImage.at<cv::Vec3b>(j, i) =
				srcImage.at<cv::Vec3b>(jy, ix);
		}
	}
	return  dstImage;
}
int main()
{
	// Image source acquisition and verification
	cv::Mat srcImage = cv::imread("./image/AA.png");
	if (!srcImage.data)
		return -1;
	// Nearest neighbor interpolation scaling operation
	cv::Mat dstImage = nNeighbourInterpolation(srcImage);
	cv::imshow("srcImage", srcImage);
	cv::imshow("dstImage", dstImage);
	cv::waitKey(0);
	return 0;
}

0x02 bilinear interpolation

Bilinear interpolation is one of the most widely used image scaling methods; it is stable and has good time complexity. Its principle is to compute a weighted average of the pixel values in the 2 * 2 neighborhood around the mapped position in the source image and use it as the pixel value of the corresponding point in the target image.

In an implementation, the mapped coordinate is computed, the four nearest neighboring points around it are found, and they are combined. The calculation formula is as follows:
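f(x', y') = Σ (k = 1..4) wk · yk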

Where yk are the four nearest neighboring points and wk are the corresponding weights. Note that most of the operations involved in bilinear interpolation are floating-point operations, so the coordinate conversion is worth optimizing; a common optimization is to center the mapping: ((i+0.5)*w/w'-0.5, (j+0.5)*h/h'-0.5). Because bilinear interpolation uses a local neighborhood, the pixel values of the transformed image vary continuously, but high-frequency information is weakened to some extent and image contours become somewhat blurred.

The function is implemented as follows:

cv::Mat BilinearInterpolation(cv::Mat srcImage)
{
	CV_Assert(srcImage.data != NULL);
	int srcRows = srcImage.rows;
	int srcCols = srcImage.cols;
	int srcStep = srcImage.step;
	// Build target image
	cv::Mat dstImage = cv::Mat(
		cv::Size(150, 150), srcImage.type(),
		cv::Scalar::all(0));
	int dstRows = dstImage.rows;
	int dstCols = dstImage.cols;
	int dstStep = dstImage.step;
	// Data definition and conversion
	IplImage src = cvIplImage(srcImage);
	IplImage dst = cvIplImage(dstImage);
	std::cout << "srcCols:" << srcCols << " srcRows:" <<
		srcRows << "srcStep:" << srcStep << std::endl;
	std::cout << "dstCols:" << dstCols << " dstRows:" <<
		dstRows << "dstStep:" << dstStep << std::endl;
	// Coordinate definition
	float srcX = 0, srcY = 0;
	float t1X = 0, t1Y = 0, t1Z = 0;
	float t2X = 0, t2Y = 0, t2Z = 0;
	for (int j = 0; j < dstRows - 1; j++)
	{
		for (int i = 0; i < dstCols - 1; i++)
		{
			// Scaling mapping relationship 
			srcX = (i + 0.5) * ((float)srcCols) / (dstCols)-0.5;
			srcY = (j + 0.5) * ((float)srcRows) / (dstRows)-0.5;
			int iSrcX = (int)srcX;
			int iSrcY = (int)srcY;
			// Three channel neighborhood weighted value 1 (row iSrcY)
			t1X = ((uchar*)(src.imageData + srcStep * iSrcY))[iSrcX * 3 + 0] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * iSrcY))[(iSrcX + 1) * 3 + 0] *
					(srcX - iSrcX);
			t1Y = ((uchar*)(src.imageData + srcStep * iSrcY))[iSrcX * 3 + 1] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * iSrcY))[(iSrcX + 1) * 3 + 1] *
					(srcX - iSrcX);
			t1Z = ((uchar*)(src.imageData + srcStep * iSrcY))[iSrcX * 3 + 2] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * iSrcY))[(iSrcX + 1) * 3 + 2] *
					(srcX - iSrcX);
			// Three channel neighborhood weighted value 2 (row iSrcY + 1)
			t2X = ((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[iSrcX * 3 + 0] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[(iSrcX + 1) * 3 + 0] *
					(srcX - iSrcX);
			t2Y = ((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[iSrcX * 3 + 1] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[(iSrcX + 1) * 3 + 1] *
					(srcX - iSrcX);
			t2Z = ((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[iSrcX * 3 + 2] *
					(1 - std::abs(srcX - iSrcX)) +
				((uchar*)(src.imageData + srcStep * (iSrcY + 1)))[(iSrcX + 1) * 3 + 2] *
					(srcX - iSrcX);
			// The target image weighting is solved according to the formula
			((uchar*)(dst.imageData + dstStep * j))[i * 3 + 0] =
				t1X * (1 - std::abs(srcY - iSrcY)) + t2X * std::abs(srcY - iSrcY);
			((uchar*)(dst.imageData + dstStep * j))[i * 3 + 1] =
				t1Y * (1 - std::abs(srcY - iSrcY)) + t2Y * std::abs(srcY - iSrcY);
			((uchar*)(dst.imageData + dstStep * j))[i * 3 + 2] =
				t1Z * (1 - std::abs(srcY - iSrcY)) + t2Z * std::abs(srcY - iSrcY);
		}
		// Column operation
		((uchar*)(dst.imageData + dstStep * j))[(dstCols - 1) * 3] =
			((uchar*)(dst.imageData + dstStep * j))[(dstCols - 2) * 3];
		((uchar*)(dst.imageData + dstStep * j))[(dstCols - 1) * 3 +
			1] = ((uchar*)(dst.imageData + dstStep * j))[(
				dstCols - 2) * 3 + 1];
		((uchar*)(dst.imageData + dstStep * j))[(dstCols - 1) * 3
			+ 2] = ((uchar*)(dst.imageData + dstStep * j))[(
				dstCols - 2) * 3 + 2];
	}
	// Line operation
	for (int i = 0; i < dstCols * 3; i++)
	{
		((uchar*)(dst.imageData + dstStep * (dstRows - 1)))[i] =
			((uchar*)(dst.imageData + dstStep * (dstRows - 2)))[i];
	}
	return  dstImage;
}
int main()
{
	cv::Mat srcImage = cv::imread("./image/BB.png");
	if (!srcImage.data)
		return -1;
	cv::Mat dstImage = BilinearInterpolation(srcImage);
	cv::imshow("srcImage", srcImage);
	cv::imshow("dstImage", dstImage);
	cv::waitKey(0);
	return 0;
}

The resize function provided by OpenCV performs image resizing, and its default interpolation method is bilinear interpolation. So which of the three approaches is more efficient? The nearest neighbor and bilinear implementations built into resize both have relatively low time complexity and are generally faster than the hand-written versions above; in practice, (bi)linear interpolation is the better trade-off between speed and quality.
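For comparison, here is a minimal sketch of scaling with resize (assuming the same sample image path used earlier, ./image/AA.png):

#include <opencv2/opencv.hpp>

int main()
{
	cv::Mat src = cv::imread("./image/AA.png");
	if (!src.data)
		return -1;
	cv::Mat nearest, linear;
	// Shrink to half size with nearest neighbor interpolation
	cv::resize(src, nearest, cv::Size(), 0.5, 0.5, cv::INTER_NEAREST);
	// Shrink to half size with bilinear interpolation (the default)
	cv::resize(src, linear, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
	cv::imshow("nearest", nearest);
	cv::imshow("linear", linear);
	cv::waitKey(0);
	return 0;
}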

0x03 image pyramid

An image pyramid is a collection of images at multiple resolutions, all derived from the same original image; it is often used for image scaling or image segmentation. The pyramid is a data structure for multi-resolution image processing: downsampling builds the Gaussian pyramid, while upward reconstruction uses the Laplacian pyramid.

(1) Gauss pyramid

The generation of a Gaussian pyramid consists of Gaussian kernel convolution and downsampling. Let the source image be G0(x,y) with resolution M*N; G0 is the bottom layer (layer 0) of the Gaussian pyramid, i.e., the source image itself. To obtain the pyramid layer Gi+1, first convolve layer Gi with a Gaussian kernel, and then remove all even-numbered rows and columns. The values of a commonly used Gaussian kernel are as follows:

 
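w = 1/256 *
	[ 1   4   6   4   1
	  4  16  24  16   4
	  6  24  36  24   6
	  4  16  24  16   4
	  1   4   6   4   1 ]

(this 5 * 5 generating kernel is also the one used internally by OpenCV's pyrDown)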

The downward reduction (Reduce) of the Gaussian pyramid is as follows:
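Gi(x, y) = Σ (m = -2..2) Σ (n = -2..2) w(m, n) · Gi-1(2x + m, 2y + n)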

For layer i, Gi = Reduce[Gi-1], where w(m, n) is the generating kernel, a window function that acts as a low-pass filter. The restrictions placed on the generating kernel guarantee not only the low-pass property but also smooth brightness after the image is expanded, without boundary seams.

It can be seen from the formula above that each pyramid layer is obtained by low-pass filtering and then subsampling the layer below it, and the current layer is 1/4 the size of the previous layer.

(2) Laplace pyramid

The Laplacian pyramid operation reconstructs the image upward. Whereas the Gaussian pyramid keeps shrinking the image, the Laplacian pyramid works in the opposite direction.

Interpolate Gi to obtain the enlarged image Gi*, so that Gi* has the same size as Gi-1, which can be expressed by the following formula:
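Gi*(x, y) = 4 · Σ (m = -2..2) Σ (n = -2..2) w(m, n) · Gi((x + m) / 2, (y + n) / 2)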

Here i ranges over (0, N), x over [0, M) and y over [0, N). For the term Gi((x + m)/2, (y + n)/2) in the formula above, the source pixel is used only when (x + m)/2 and (y + n)/2 are integers; in all other cases the term is 0.

The book gives a fuller explanation of this formula.

For image pyramids, OpenCV provides the pyrUp and pyrDown functions for upsampling and downsampling.

void pyrDown( InputArray src, 
			  OutputArray dst,
              const Size& dstsize = Size(), 	//Output image size
              int borderType = BORDER_DEFAULT );

void pyrUp( InputArray src, 
            OutputArray dst,
            const Size& dstsize = Size(), 
            int borderType = BORDER_DEFAULT );
  • dstsize: the size of the output image. For pyrDown the default is Size((src.cols+1)/2, (src.rows+1)/2), and a supplied size must satisfy |dstsize.width*2 - src.cols| <= 2 and |dstsize.height*2 - src.rows| <= 2. For pyrUp the default is Size(src.cols*2, src.rows*2), and a supplied size must satisfy |dstsize.width - src.cols*2| <= (dstsize.width mod 2) and |dstsize.height - src.rows*2| <= (dstsize.height mod 2).

Let's look at the effect through the code:

We use the following picture to do the experiment:

First look at the code:

#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include "opencv2/core/core.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/opencv.hpp>

// Image pyramid sampling operation
void Pyramid(cv::Mat srcImage)
{
	// Judge whether to scale according to the size of the image source
	if (srcImage.rows > 400 && srcImage.cols > 400)
		cv::resize(srcImage, srcImage, cv::Size(), 0.5, 0.5);
	else
		cv::resize(srcImage, srcImage, cv::Size(), 1, 1);
	cv::imshow("srcImage", srcImage);
	cv::Mat pyrDownImage, pyrUpImage;
	// Down sampling process
	pyrDown(srcImage, pyrDownImage,
		cv::Size(srcImage.cols / 2, srcImage.rows / 2));
	cv::imshow("pyrDown", pyrDownImage);
	// Up sampling process
	pyrUp(srcImage, pyrUpImage,
		cv::Size(srcImage.cols * 2, srcImage.rows * 2));
	cv::imshow("pyrUp", pyrUpImage);
	// Reconstruction of down sampling process
	cv::Mat pyrBuildImage;
	pyrUp(pyrDownImage, pyrBuildImage,
		cv::Size(pyrDownImage.cols * 2, pyrDownImage.rows * 2));
	cv::imshow("pyrBuildImage", pyrBuildImage);
	// Compare refactoring performance
	cv::Mat diffImage;
	cv::absdiff(srcImage, pyrBuildImage, diffImage);
	cv::imshow("diffImage", diffImage);
	cv::waitKey(0);
}
int main()
{
	cv::Mat srcImage = cv::imread("./image/BB.png");
	if (!srcImage.data)
		return -1;
	Pyramid(srcImage);
	return 0;
}

This function first performs the upsampling and downsampling, then reconstructs the downsampled image by enlarging it again, and finally compares the reconstructed image with the original:

First, compare the downsampled image:

We can see that the edges of the downsampled image are quite blurred.

Next, the upsampled image:

We can also see that the enlarged picture becomes a little blurred.

Then we reconstruct the downsampled image by enlarging it by a factor of two, and compare the source image, the downsampled image and the reconstructed image:

It's very blurry... like looking at it through nearsighted eyes...

Now look at the difference between the source image and the reconstructed image:

I was pleasantly surprised to find that this difference image basically draws an outline. How to put it? It seems well suited to processing blurred images.

Indeed, part of the jagged edges is filtered out. The effect is quite satisfactory.

0x04 Fourier transform

(1) Mask operation image

The mask operation of an image recalculates the value of each pixel through a mask operator (kernel): the mask operator describes how much the neighboring pixels influence the new pixel value, and the original pixels are weighted and averaged according to the weight factors in the mask operator.

Image mask operations are commonly used in image smoothing, edge detection and feature analysis. There are two common ways to perform the mask computation in OpenCV:

(1) Pixel based neighborhood traversal

For the source image data f(x,y) and a 3 * 3 convolution kernel, the 4-neighborhood mean mask of the source image can be computed with the following formula:
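g(x, y) = [ f(x - 1, y) + f(x + 1, y) + f(x, y - 1) + f(x, y + 1) ] / 4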

Then, for the image matrix, the above formula can be transformed into the following matrix operation:
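K = 1/4 * [ 0  1  0
            1  0  1
            0  1  0 ]

(this is exactly the kern matrix built in the filter2D example below)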

Pixel-based neighborhood traversal operates directly on the source data matrix: using the formula above, take the current pixel as the center of the calculation, slide the mask operator template pixel by pixel over the source image, and update the value of each corresponding pixel in the new image.

(2) Based on filter2D function

OpenCV provides the filter2D function, which is dedicated to convolving an image with a kernel:

void filter2D( InputArray src, 
			   OutputArray dst, 
			   int ddepth,							//Output image depth
               InputArray kernel, 					//Convolution kernel operator
               Point anchor = Point(-1,-1),			//Convolution kernel anchor
               double delta = 0, 					//Value added to the filtered pixels
               int borderType = BORDER_DEFAULT );
  • ddepth: if it is set to a negative value, its depth is the same as that of the input source data. Otherwise, it needs to be set according to the depth of the input source image.

    If src.depth() = CV_8U, then ddepth = -1/CV_16S/CV_32F/CV_64F;

    If src.depth() = CV_16U/CV_16S, then ddepth = -1/CV_32F/CV_64F;

    If src.depth() = CV_32F, then ddepth = -1/CV_32F/CV_64F;

    If src.depth() = CV_64F, then ddepth = -1/CV_64F.

  • The kernel parameter is the convolution kernel operator, a single-channel floating-point matrix. If different kernels need to be applied to different channels, split the image into its channels first and process each one separately.

  • The anchor parameter is the anchor point of the convolution kernel; the default (-1, -1) means the center of the kernel.

  • The delta parameter is an optional value added to each filtered pixel before it is stored in the output image.

The filter2D function is commonly used for linear filtering. When the convolution kernel reaches past the image border, the missing pixels are extrapolated according to the specified border type.

Note that this function actually computes correlation rather than true convolution. The formula filter2D evaluates is:
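dst(x, y) = Σ (0 <= x' < kernel.cols, 0 <= y' < kernel.rows) kernel(x', y') · src(x + x' - anchor.x, y + y' - anchor.y)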

The implementation of the two mask operations is as follows:

#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include "opencv2/core/core.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;
// Pixel based neighborhood mask operation
cv::Mat Myfilter2D(cv::Mat srcImage)
{
	const int nChannels = srcImage.channels();
	cv::Mat resultImage(srcImage.size(), srcImage.type());
	for (int j = 1; j < srcImage.rows - 1; ++j)
	{
		// Get neighborhood pointer
		const uchar* previous = srcImage.ptr<uchar>(j - 1);
		const uchar* current = srcImage.ptr<uchar>(j);
		const uchar* next = srcImage.ptr<uchar>(j + 1);
		uchar* output = resultImage.ptr<uchar>(j);
		for (int i = nChannels; i < nChannels * (srcImage.cols - 1); ++i)
		{
			// 4-neighborhood mean mask operation 
			*output++ = saturate_cast<uchar>(
				(current[i - nChannels] + current[i + nChannels] +
					previous[i] + next[i]) / 4);
		}
	}
	// Boundary treatment
	resultImage.row(0).setTo(Scalar(0));
	resultImage.row(resultImage.rows - 1).setTo(Scalar(0));
	resultImage.col(0).setTo(Scalar(0));
	resultImage.col(resultImage.cols - 1).setTo(Scalar(0));
	return resultImage;
}
// Native library mask operation
cv::Mat filter2D_(cv::Mat srcImage)
{
	cv::Mat resultImage(srcImage.size(), srcImage.type());
	Mat kern = (Mat_<float>(3, 3) << 0, 1, 0,
		1, 0, 1,
		0, 1, 0) / (float)(4);
	filter2D(srcImage, resultImage, srcImage.depth(), kern);
	return resultImage;
}
int main()
{
	cv::Mat srcImage = cv::imread("./image/flower.png");
	if (!srcImage.data)
		return 0;
	cv::Mat srcGray;
	cvtColor(srcImage, srcGray, cv::COLOR_BGR2GRAY);
	imshow("srcGray", srcGray);
	cv::Mat resultImage = Myfilter2D(srcGray);
	imshow("resultImage", resultImage);
	cv::Mat resultImage2 = filter2D_(srcGray);
	imshow("resultImage2", resultImage2);
	cv::waitKey(0);
	return 0;
}

First look at the original grayscale image:

Operation based on pixel neighborhood mask:

Operation based on filter2D:

In fact, there doesn't seem to be much difference. Side by side:

I just feel that the tone of the third image, processed with filter2D, is slightly yellowish... I still prefer the effect of the pixel neighborhood mask operation.
