machine learning: Colorization using Optimization
Published: 2019-04-21

Today, we introduce an article by Siggraph 2004: Colorization using Optimization, which uses the optimization method to color gray images. Here, we use the very classical Poisson equation and the linear optimization of sparse matrices.In a nutshell, it is to color a gray-scale image first, and then fill other areas without color with an optimized method.These processes are all carried out in YUV color space.

Given a Y-channel image, we hope to restore the U and V channels of the image based on certain prior knowledge.An important assumption here is that for two adjacent pixels, if their brightness is similar, then their colors should also be similar.

assuming r \mathbf{r}, s \mathbf{s} represents the position of two adjacent pixels, then we want to minimize the following objective function:

J(U) =\sum_{ \mathbf{r} } \left( U(\mathbf{r}) - \sum_{\mathbf{s} \in N(\mathbf{r})} w_{ \mathbf{rs}} U(\mathbf{s} ) \right)^{2}

For V channel, we can establish a similar objective function, and the coefficient WRSW _ {\ MathBF {RS} can be represented by Y channel:

w_{ \mathbf{rs}} = e^{-( Y( \mathbf{r}) - Y( \mathbf{s}))^2 / 2 \sigma^{2}}

Given the Sigma \ sigma and the pixel position relationship, we can easily find the coefficient wrs w_{ \mathbf{rs}}.

First of all, some simple coloring is required for the image. We can obtain a series of color values of pixel points ri \mathbf{r}_i, where U (Ri) = Uiu (\ Mathbf {r} _ i) = U _ i, V (Ri) = Viv (\ Mathbf {r} _ i) = V _ i. According to the color values of these preset pixel points,Combined with the above objective function, we can establish a large sparse linear equation group. Assuming that the size of the image is M×NM \times N, then the number of pixels in the image is np=MN np = MN, the equation group we want to solve will be NP, and the size of the sparse matrix is NP× np×npnp \times np, such as an 800×600800 \For the image of times 600, the equations to be solved will be 480,000, and the size of the sparse matrix will be 480,000× 48,000,000 \ Times 480,000. This is a very large matrix.However, because this is sparse, there are many standard solutions.

matlab code is given below


g_name='example.bmp';
c_name='example_marked.bmp';
out_name='example_res.bmp';

%set solver=1 to use a multi-grid solver 
%and solver=2 to use an exact matlab "\" solver
solver=2; 

gI=double(imread(g_name))/255;
cI=double(imread(c_name))/255;
colorIm=(sum(abs(gI-cI),3)>0.01);
colorIm=double(colorIm);

sgI=rgb2ntsc(gI);
scI=rgb2ntsc(cI);

ntscIm(:,:,1)=sgI(:,:,1);
ntscIm(:,:,2)=scI(:,:,2);
ntscIm(:,:,3)=scI(:,:,3);


max_d=floor(log(min(size(ntscIm,1),size(ntscIm,2)))/log(2)-2);
iu=floor(size(ntscIm,1)/(2^(max_d-1)))*(2^(max_d-1));
ju=floor(size(ntscIm,2)/(2^(max_d-1)))*(2^(max_d-1));
id=1; jd=1;
colorIm=colorIm(id:iu,jd:ju,:);
ntscIm=ntscIm(id:iu,jd:ju,:);

if (solver==1)
  nI=getVolColor(colorIm,ntscIm,[],[],[],[],5,1);
  nI=ntsc2rgb(nI);
else
  nI=getColorExact(colorIm,ntscIm);
end

figure, imshow(nI)

imwrite(nI,out_name)


function [nI,snI]=getColorExact(colorIm,ntscIm)

n=size(ntscIm,1); m=size(ntscIm,2);
imgSize=n*m;


nI(:,:,1)=ntscIm(:,:,1);

indsM=reshape([1:imgSize],n,m);
lblInds=find(colorIm);

wd=1; 

len=0;
consts_len=0;
col_inds=zeros(imgSize*(2*wd+1)^2,1);
row_inds=zeros(imgSize*(2*wd+1)^2,1);
vals=zeros(imgSize*(2*wd+1)^2,1);
gvals=zeros(1,(2*wd+1)^2);


for j=1:m
   for i=1:n
      consts_len=consts_len+1;

      if (~colorIm(i,j))   
        tlen=0;
        for ii=max(1,i-wd):min(i+wd,n)
           for jj=max(1,j-wd):min(j+wd,m)

              if (ii~=i)|(jj~=j)
                 len=len+1; tlen=tlen+1;
                 row_inds(len)= consts_len;
                 col_inds(len)=indsM(ii,jj);
                 gvals(tlen)=ntscIm(ii,jj,1);
              end
           end
        end
        t_val=ntscIm(i,j,1);
        gvals(tlen+1)=t_val;
        c_var=mean((gvals(1:tlen+1)-mean(gvals(1:tlen+1))).^2);
        csig=c_var*0.6;
        mgv=min((gvals(1:tlen)-t_val).^2);
        if (csig<(-mgv/log(0.01)))
       csig=-mgv/log(0.01);
    end
    if (csig<0.000002)
       csig=0.000002;
        end

        gvals(1:tlen)=exp(-(gvals(1:tlen)-t_val).^2/csig);
        gvals(1:tlen)=gvals(1:tlen)/sum(gvals(1:tlen));
        vals(len-tlen+1:len)=-gvals(1:tlen);
      end


      len=len+1;
      row_inds(len)= consts_len;
      col_inds(len)=indsM(i,j);
      vals(len)=1; 

   end
end


vals=vals(1:len);
col_inds=col_inds(1:len);
row_inds=row_inds(1:len);


A=sparse(row_inds,col_inds,vals,consts_len,imgSize);
b=zeros(size(A,1),1);


for t=2:3
    curIm=ntscIm(:,:,t);
    b(lblInds)=curIm(lblInds);
    new_vals=A\b;   
    nI(:,:,t)=reshape(new_vals,n,m,1);    
end

snI=nI;
nI=ntsc2rgb(nI);