### Introduction

In the article we will look at algorithm for Fast 2D Convolution.

### Separable Symmetric/Anti-Symmetric Convolution

- This article presents a convolution algorithm involving a separable symmetric/anti symmetric kernel.
- Such kernels are common in image processing like blurring,edge detection etc
- This optimization helps in speeding up many routines .
- Another requirement being addressed is we may have multiple kernels that need to be applied to the same image in application such as computing the basis representation of rectangular path of the image.
- since we are traversing the image,we can perform computation of all these kernels simultaneously instead of independent traversals.
- It is to be noted that this optimization is specific to symmetric/antisymmetric separable filters.
- We will be assuming that all the kernel are of the same size as well
- Since kernels are separable we will perform vertical convolution followed by horizontal convolution.
- One of the considerations is to minimize the row accesses .
- Another consideration is to perform a single pass over the entire source image
- Let (n*2+1) be the kernel size.
- The convolution filter is initialized by accepting all the row and column filter coefficients as well as the information if the filter is symmetric or not
- A class called SeperableSConvolution implemented this algorithm

01 class SeperableSConvolution 02 { 03 public: 04 SeperableSConvolution(); 05 vector<mat> rowk;//vector of row filter 06 vector<mat> colk;//vector of column filters 07 vector<bool> symmetric; //info if filter is symmetric or not 08 int N;//number of channels in the output image 09 //function to set the filter parameters 10 void setKernel(Mat row,Mat col,bool symm); 11 //function to clear the filter parameters 12 void clearKernel(); 13 //apply a set of separable symmetric/anti-symmetric filters on 14 //the source image and return a multi channel destination image 15 ////function to perform separable filtering 16 void apply(Mat &src,Mat &dst); 17 };

### Vertical Convolution

- First vertical convolution is considered
- If kernel size is (2N+1) ,we need to access same number of rows
- This if we are applying the filter at row j,we need to access elements from rows from (j-k) to (j+k) where (k \in 0..N)
$\begin{eqnarray} dst(x,y) = \sum_{i,k=-N}^{k=N} g(k+N)I(x,y+k) \end{eqnarray}$
- The above needs to be processed for each (x,y)
- Since we need to reduce the number of row access,access rows from (j-N,j+N) are performed for each j,compute the convolution for each x.
- Since with change in x no new rows are being accessed,hence we compute convolution for each element of the present row.
- Each of the rows need to be multiplied with suitable filter coefficients.
- since the filter is symmetric or anti symmetric the filter coefficients (g_k) is multiplied with elements of rows (j-k) and (j+k)
- The resultant sum of accumulated,this is performed for all the rows.

01 for(int y=0;y<s.height;y++) 02 { 03 //pointer to the source and destination 04 float *srow0 = (float*)(src.data + src.step*y),*srow1=0; 05 float *drow = (float*)(dst.data + dst.step*y); 06 07 08 //performing vertical convolution 09 //using the row kernel 10 //computing g[0]*f(x,y) 11 for( x = 0; x < s.width; x++ ) 12 { 13 for(int l=0;l<ch;l++) 14 { 15 row[x*ch+l] = srow0[x]*rowk[l].at<float>(n); 16 } 17 18 } 19 20 //computing g[-n/2]*f(x,y-n/2) 21 //performing vertical convolution accessing n rows about current row 22 //accessing a 2 rows at a time,performing computation 23 for(int k=1;k<=n;k++) 24 { 25 26 //accessing the vertical abouts about the current row 27 srow0 = (float*)(src.data + src.step*std::max(y-k,0)); 28 srow1 = (float*)(src.data + src.step*std::min(y+k,s.height-1)); 29 for(int x=0;x<s.width;x++) 30 { 31 //applying different vertical kernels 32 //accumulating the sum 33 for(int l=0;l<ch;l++) 34 { 35 float p=srow0[x]+(1*(symmetric[l]?1:-1))*srow1[x]; 36 row[x*ch+l]+=rowk[l].at<float>(k)*(p); 37 } 38 } 39 }

### Horizontal Convolution

- Once vertical convolution is done we proceed to perform horizontal convolution
- Since in horizontal convolution there is only a single row access ,it is relatively simple process.
- The output image is a multi channel image,containing number of channels as desired number of input kernels being applied to the source image.

01 for(x=0;x<s.width;x++) 02 { 03 04 //apply horizontal kernels 05 //g[0]*f(x,y) 06 for(int l=0;l<ch;l++) 07 { 08 res[l]=row[x*ch+l]*colk[l].at<float>(n); 09 } 10 11 //accumulating the sum for g[-N/2+k]f(x-n/2+k) 12 for(int k=1;k<=n;k++) 13 { 14 15 for(int l=0;l<ch;l++) 16 { 17 float p=(row[(x+k)]+(1*symmetric[l]?1:-1) 18 *row[(x-k)])*colk[l].at<float>(k); 19 res[l]=res[l]+p; 20 } 21 } 22 //storing the result of convolution 23 //output image contains number of channels as different input filters 24 for(int l=0;l<ch;l++) 25 { 26 drow[x*ch+l]=res[l]; 27 } 28 }

### Code

- The class SeperableSConvolution defines a class for performing separable symmetric/ant-symmetric convolution/ Code is available in repository https://github.com/pi19404/OpenVision/ at Improper/convolution.hpp and ImgProc/convolution.cpp files.