FCM fuzzy clustering algorithm

How to understand fuzzy clustering

The boundaries between things, some are clear, others are blurred. When clustering involves fuzzy boundaries between things, fuzzy cluster analysis methods need to be used.
How to understand the "fuzziness" of fuzzy clustering: Assume that there are two sets A and B, and one member a. The traditional classification concept a belongs to either A or B. In the concept of fuzzy clustering, a can belong to A at 0.3 , 0.7 belongs to B, which is the concept of "fuzzy".

There are two basic methods of fuzzy clustering analysis: systematic clustering method and stepwise clustering method.

My personal understanding of the system clustering method is similar to the density clustering algorithm, and the stepwise clustering method is the central point clustering method. (Please correct me if I am wrong here)

Stepwise clustering method is a fuzzy cluster analysis method based on fuzzy partition. It predetermines that the samples to be classified should be divided into several categories, and then classifies according to the optimal principle, and iterates for many times until the classification is reasonable. In the classification process, it can be considered that a certain sample belongs to a certain category with a certain degree of membership, and belongs to another category with a certain degree of membership. In this way, samples are not explicitly belonging or not belonging to a certain class. If there are n samples in the sample set to be divided into c categories, then its fuzzy partition matrix is ​​c×n.
This matrix has the following properties:
①. The sum of the membership degrees of each sample belonging to each category is 1.
②. Each type of fuzzy subset is not an empty set.

Fuzzy C-means Clustering Algorithm

Fuzzy c-means clustering algorithm fuzzy c-means (FCM). Among many fuzzy clustering algorithms, the fuzzy C-means (FCM) algorithm is the most widely used and successful. It obtains the membership degree of each sample point to all cluster centers by optimizing the objective function, so as to automatically classify samples.

FCM algorithm principle

Suppose we have a data set X, we want to classify the data in X, if these data are divided into c classes, then there are c class centers corresponding to Ci, and each sample Xj belongs to a certain class Ci The degree of membership is set as Uij, then define an FCM objective function and its constraints as follows:



The objective function (Equation 1) is composed of the membership degree of the corresponding sample multiplied by the distance from the sample to the center of each category. Equation 2 is a constraint condition, that is, the sum of the membership degrees of a sample belonging to all classes must be 1.
m in formula 1 is a factor of membership degree, generally 2, and ||Xj - Ci|| represents the Euclidean distance from Xj to the center point Ci.

The smaller the objective function J, the better. Since we require the minimum value of the objective function J, we will not deduce how to find the minimum value here (if you are interested in the derivation, you can read this article: https://blog.csdn .net/on2way/article/details/47087201), directly give the conclusion:

The iterative formula of Uij:

The iterative formula of Ci:

We found that Uij and Ci are interrelated and contain each other. Then the problem comes. When the fcm algorithm starts, there is neither Uij nor Ci, so how to solve it? It's very simple. At the beginning of the program, we will randomly generate a Uij, as long as the value satisfies the conditions, and then start to iterate, calculate Ci through Uij, and then calculate Uij with Ci, and repeat, the objective function in this process J has been changing, gradually crepe to stable. Then when J is not changing, it is considered that the algorithm converges to a better result.

Python FCM support

Reference article: https://blog.csdn.net/FrankieHello/article/details/79581315
Install related libraries

pip install -U scikit-fuzzy

skfuzzy.cmeans function description

def cmeans(data, c, m, error, maxiter, init=None, seed=None)

parameter:

  • data: training data, here you need to pay attention to the format of data, which should be (number of features, number of data), which is just the opposite of the shape of many training data.
  • c: The number of clusters that need to be specified.
  • m: Membership index, which is a weighted index, generally 2.
  • error: When the change of the membership degree is less than this, the iteration ends early.
  • maxiter: The maximum number of iterations.

return value:

  • cntr: cluster center
  • u: the final membership matrix
  • u0: initialized membership matrix
  • d: is a matrix that records the Euclidean distance from each point to the cluster center
  • jm: is the optimization history of the objective function
  • p: p is the number of iterations
  • fpc: The full name is fuzzy partition coefficient, which is an index to evaluate the quality of classification. Its range is 0 to 1, and 1 means the best effect. It can be used to select the number of clusters later.
Code
import numpy as np
import matplotlib.pylab as plt
from sklearn.cluster import KMeans
from skfuzzy.cluster import cmeans

cp = np.random.uniform(1, 100, (100, 2))
train = cp[:50]
test = cp[50:]

train = train.T


center, u, u0, d, jm, p, fpc = cmeans(train, c=3, m=2, error=0.005, maxiter=1000)
for i in u:
    label = np.argmax(u, axis=0)

for i in range(50):
    if label[i] == 0:
        plt.scatter(train[0][i], train[1][i], c = 'r')
    elif label[i] == 1:
        plt.scatter(train[0][i], train[1][i], c = 'g')
    elif label[i] == 2:
        plt.scatter(train[0][i], train[1][i], c = 'b')
plt.show()

operation result

FCM algorithm implementation (python)

Reference article: https://blog.csdn.net/a19990412/article/details/89361038

Algorithm process
  1. Randomly initialize the fuzzy matrix U (describe the degree of membership of each point in different classes)

  2. With the fuzzy matrix U, the class center is calculated by the following formula

  3. Update the fuzzy matrix U according to the calculated class center by the following formula

  4. End the iteration when the change of U is not large, otherwise go back to the second step

Code
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
print(iris.data.shape)

def FCM(X, c_clusters=3, m=2, eps=10):
    membership_mat = np.random.random((len(X), c_clusters))   # Generate a random two-dimensional array shape(150,3), and randomly initialize the membership matrix
    # The operation of this step is to make the sum of the membership degrees of Xi be 1
    membership_mat = np.divide(membership_mat, np.sum(membership_mat, axis=1)[:, np.newaxis])

    while True:
        working_membership_mat = membership_mat ** m   # shape->(150,3)
        # Calculate the cluster center point according to the formula Centroids.shape->(3,4)
        Centroids = np.divide(np.dot(working_membership_mat.T, X), np.sum(working_membership_mat.T, axis=1)[:, np.newaxis])

        # This matrix holds the Euclidean distances of all real points to each cluster center
        n_c_distance_mat = np.zeros((len(X), c_clusters)) # shape->(150,3)
        for i, x in enumerate(X):
            for j, c in enumerate(Centroids):
                n_c_distance_mat[i][j] = np.linalg.norm(x-c, 2)   # Calculate the l2 norm (Euclidean distance)

        new_membership_mat = np.zeros((len(X), c_clusters))

        # Calculate the fuzzy matrix U according to the formula
        for i, x in enumerate(X):
            for j, c in enumerate(Centroids):
                new_membership_mat[i][j] = 1. / np.sum((n_c_distance_mat[i][j] / n_c_distance_mat[i]) ** (2 / (m-1)))
        if np.sum(abs(new_membership_mat - membership_mat)) < eps:
            break
        membership_mat = new_membership_mat
    return np.argmax(new_membership_mat, axis=1)

print(FCM(iris.data))

The code is completely implemented according to the above process and formula, and it is very simple to understand. The iteration exit condition can be optimized, and the rest can not be changed.

Experimental dataset

The data set used in the above code experiment is the Iris data set, which contains 3 kinds of iris data, each with 50 data, a total of 150 data, each data contains 4 attributes: sepal length, sepal width, petal length , the petal width, you can use these four attributes to predict which category the iris flower belongs to.

Data source: https://www.kaggle.com/benhamner/python-data-visualizations/data

JAVA implementation

Reference article: https://blog.csdn.net/u010498696/article/details/45507071#commentsedit

Using the same iris data set, the membership matrix iteration function of this article is different from the iteration function above, readers should pay attention.

Implementation code:

import java.io.*;
import java.rmi.server.ExportException;

public class FcmRealize {
    private static final String FILE_DATA_IN = "/home/pzs/husin/FCM/iris.data";
    private static final String FILE_PAR = "";
    private static final String FILE_CENTER = "/home/pzs/husin/FCM/center.data";
    private static final String FILE_MATRIX = "/home/pzs/husin/FCM/umaxtrix.txt";

    public int numpattern;
    public int dimension;
    public int cata;
    public int maxcycle;
    public double m;
    public double limit;

    public FcmRealize(int numpattern, int dimension, int cata, int maxcycle, double m, double limit){
        this.numpattern = numpattern;   // Number of samples
        this.dimension = dimension;     // Dimensions of each sample point
        this.cata = cata;               // The number of categories that need to be clustered
        this.maxcycle = maxcycle;       // The maximum number of iterations
        this.m = m;                     // parameter m
        this.limit = limit;             // Iteration ends prematurely
    }

    /**
     * read sample
     * @param pattern Array to save iris data
     * **/
    public boolean getPattern(double[][] pattern){
        BufferedReader br = null;
        try{
            br = new BufferedReader(new FileReader(FILE_DATA_IN));
        }catch (FileNotFoundException e){
            e.printStackTrace();
        }
        String line = null;
        String regex = ",";  // regex = ","
        int row = 0;
        while(true){
            try{
                line = br.readLine();
            }catch(IOException e){
                e.printStackTrace();
            }
            if(line == null){
                break;
            }
            String[] split = line.split(regex);
            for(int i=0;i<split.length;i++){
                pattern[row][i] = Double.valueOf(split[i]);
            }
            row++;
        }
        return true;
    }

    /**
     * Find the maximum and minimum values ​​of an array
     * @param a input array
     * @return minmax an array with two elements
     * **/
    public static double[] min_max_fun(double[] a){
        double minmax[] = new double[2];
        double minValue = a[0];
        double maxValue = a[1];
        for(int i=1;i<a.length;i++){
            if(a[i] < minValue){
                minValue = a[i];
            }
            if(a[i] > maxValue){
                maxValue = a[i];
            }
        }
        minmax[0] = minValue;
        minmax[1] = maxValue;
        return minmax;
    }

    /**
     * Standardized sample, why should it be standardized here, it can be omitted
     * @param pattern sample
     * @param numpattern Number of samples
     * @param dimension The number of sample attributes
     * @return
     * **/
    public static boolean Normalized(double pattern[][], int numpattern, int dimension){
        double min_max[] = new double[2];
        double a[] = new double[pattern.length];
        double copypattern[][] = new double[numpattern][dimension];
        for(int i = 0;i<pattern.length;i++){
            for(int j=0;j<pattern[i].length;j++){
                copypattern[i][j] = pattern[i][j];
            }
        }
        for(int j=0;j<pattern[0].length;j++){
            for(int k=0;k<pattern.length;k++){
                a[k] = pattern[k][j];
            }
            for(int i=0;i<pattern.length;i++){
                min_max = min_max_fun(a);
                double minValue = min_max[0];
                double maxValue = min_max[1];
                pattern[i][j] = (copypattern[i][j] - minValue) / (maxValue-minValue);

            }
        }
        return true;
    }

    /**
     * Find the sum of the j th column of a matrix
     * @param array
     * @param j
     * @return sum
     * **/
    public static double sumArray(double[][] array, int j){
        double sum = 0;
        for(int i=0;i<array.length;i++){
            sum += array[i][j];
        }
        return sum;
    }

    /**
     * Realize the FCM clustering algorithm according to the formula
     *
     * **/
    public boolean Fcm_myself_fun(double[][] pattern, int dimension, int numpattern, int cata, double m,
                                  int maxcycle, double limit, double[][] umatrix, double[][] rescenter, double result){
        // Verify the rationality of input parameters
        if(cata >= numpattern || m<=1){
            return false;
        }

        // standardized sample
        Normalized(pattern, numpattern, dimension);

        int dai = 0, testflag = 0; // iteration count, iteration end flag

        // Randomly initialize the membership matrix
        double temp[][] = new double[cata][numpattern];
        for(int i=0; i<umatrix.length;i++){
            for(int j=0;j<umatrix[i].length;j++){
                umatrix[i][j] = Math.random();
                temp[i][j] = umatrix[i][j];
            }
        }

        // Constraints, the sum of the membership degrees of Xi is 1
        for(int i=0;i<umatrix.length;i++){
            for (int j=0;j<umatrix[i].length;j++){
                umatrix[i][j] = temp[i][j] / sumArray(temp, j);
            }
        }

        double[][] umatrix_temp = new double[cata][numpattern];
        while(testflag == 0){
            // The membership matrix before each save update
            for(int i=0;i<umatrix_temp.length;i++){
                for(int j=0;j<umatrix_temp[i].length;j++){
                    umatrix_temp[i][j] = umatrix[i][j];
                }
            }

            // Update the cluster center according to the formula
            for (int t=0; t<cata; t++){
                for (int i=0; i<dimension; i++){
                    double a = 0, b = 0;
                    for (int k=0; k<numpattern; k++){
                        a += Math.pow(umatrix[t][k], m) * pattern[k][i];
                        b += Math.pow(umatrix[t][k], m);
                    }
                    rescenter[t][i] = a / b;
                }
            }

            // update membership
            double c=0, d=0;
            for(int t=0; t<cata; t++){
                for (int k=0; k<numpattern; k++){
                    double e = 0;
                    for(int j=0; j<cata; j++){
                        for(int i=0; i<dimension; i++){
                            c += Math.pow(pattern[k][i] - rescenter[t][i], 2);   /// m
                            d += Math.pow(pattern[k][i] - rescenter[j][i], 2);
                        }
                        e += c/d;
                        c = 0; d = 0;
                    }
                    umatrix[t][k] = 1/e;
                }
            }

            // Determine whether to converge or reach the maximum number of iterations
            double cha[][] = new double[cata][numpattern];
            for(int i=0; i<umatrix_temp.length; i++){
                for (int j=0; j<umatrix_temp[i].length; j++){
                    cha[i][j] = Math.abs(umatrix_temp[i][j] - umatrix[i][j]);
                }
            }

            double f = 0; // record the maximum value in the matrix
            for(int i=0;i<cata;i++){
                for(int j=0; j<numpattern; j++){
                    if(cha[i][j] > f){
                        f = cha[i][j];
                    }
                }
            }
            if(f <= 1e-6 || dai>maxcycle){
                testflag = 1;
                System.out.print("maxcycle : ");
                System.out.println(maxcycle);
            }
            dai += 1;


        }

        return true;
    }

    /**
     * Output membership matrix and cluster center
     * @param umatrix
     * @param rescenter
     * **/
    public void Export(double[][] umatrix, double[][] rescenter){
        String str = null;
        String tab = "\t";
        // Matrix transpose for easy display in txt
        double[][] new_umatrix = new double[numpattern][cata];
        for(int i=0; i<new_umatrix.length; i++){
            for (int j=0; j<new_umatrix[i].length; j++){
                new_umatrix[i][j] = umatrix[j][i];
            }
        }

        // output affiliation label
        for(int i=0; i<new_umatrix.length;i++){
            int maxVlueIndex = 0;
            double maxVlue = new_umatrix[i][0];
            for (int j=0; j<new_umatrix[i].length; j++){
                if(new_umatrix[i][j] > maxVlue){
                    maxVlue = new_umatrix[i][j];
                    maxVlueIndex = j;
                }
            }
            System.out.println(maxVlueIndex);
        }

        // output membership matrix
        try{
            FileWriter maxtrixFileWrite = new FileWriter(FILE_MATRIX);
            for(int i=0; i<numpattern; i++){
                str = "";
                for(int j=0; j<cata; j++){
                    str += new_umatrix[i][j] + tab;
                }
                str += "\n";
                maxtrixFileWrite.write(str);
            }
            maxtrixFileWrite.close();
        }catch (IOException e){
            e.printStackTrace();
        }
        // output cluster center
        try{
            FileWriter centerFileWriter = new FileWriter(FILE_CENTER);
            for(int i=0; i<cata; i++){
                str = "";
                for(int j=0; j<dimension; j++){
                    str += rescenter[i][j] + tab;
                }
                str += "\n";
                centerFileWriter.write(str);
            }
            centerFileWriter.close();
        }catch (IOException e){
            e.printStackTrace();
        }
    }


    public void runFcm_myself(){
        double[][] pattern = new double[numpattern][dimension];
        double[][] umatrix = new double[cata][numpattern];
        double[][] rescenter = new double[cata][dimension];
        double result = 0;

        // get samples
        getPattern(pattern);
        // run fcm
        Fcm_myself_fun(pattern, dimension, numpattern, cata, m, maxcycle, limit, umatrix, rescenter, result);
        // output result
        Export(umatrix, rescenter);
    }

    public static void main(String[] atgs){
        FcmRealize fcm = new FcmRealize(150, 4, 3, 100, 2, 0.00001);
        fcm.runFcm_myself();
    }

}

Disadvantages of FCM s

FCM is to find the minimum value of the J objective function, which means that the result we get may be the local extreme point or saddle point of the objective function.

Tags: Algorithm clustering

Posted by Blob on Sat, 17 Dec 2022 23:51:12 +0300