Specification for the Reduction Algorithm

 

Consider the following data set:

Table 1

A1   A2   A3   A4   A5   Value
1    1    0    1    1    v1
1    1    0    1    1    v2
1    0    0    1    0    v3
0    0    1    1    1    v4
1    1    0    1    1    v5
0    0    1    1    1    v6

 

Each Ai is an attribute (feature), and each row is a sample.

 

We then apply the reduction algorithm from Rough Set Theory, a data mining method, to remove redundant attributes.

 

In Table 1, rows 1, 2 and 5 have identical attribute values, so we group them into a subset S1 = {v1, v2, v5}. In the same way, row 3 gives S2 = {v3}, and rows 4 and 6 give S3 = {v4, v6}.
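
The grouping step can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `TABLE_1` and `partition` are hypothetical, and the table is assumed to be stored as a mapping from each sample label to its tuple of attribute values.

```python
from collections import defaultdict

# Table 1 from the text: sample label -> values of A1..A5.
TABLE_1 = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}


def partition(table):
    """Group sample labels whose attribute values are identical."""
    groups = defaultdict(list)
    for label, values in table.items():
        groups[values].append(label)
    # Dicts keep insertion order, so subsets appear in first-seen order.
    return list(groups.values())


subsets = partition(TABLE_1)
print(subsets)  # [['v1', 'v2', 'v5'], ['v3'], ['v4', 'v6']]
```

The three lists correspond to S1, S2 and S3.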

 

Next, remove the attributes one at a time.

We remove A1 first, which gives a new table:

Table 2

A2   A3   A4   A5   Value
1    0    1    1    v1
1    0    1    1    v2
0    0    1    0    v3
0    1    1    1    v4
1    0    1    1    v5
0    1    1    1    v6

 

In Table 2, we derive the subsets in the same way as before: S1 = {v1, v2, v5}, S2 = {v3}, S3 = {v4, v6}. These are the same subsets as in Table 1, so A1 is a redundant attribute: the information it carries is already provided by the combination of A2, A3, A4 and A5.
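
The redundancy check for a single attribute can be sketched in Python as follows. This is an illustrative sketch only: `ATTRS`, `ROWS`, `partition` and `is_redundant` are hypothetical names, and an attribute is taken to be redundant when removing it leaves the subsets unchanged.

```python
ATTRS = ["A1", "A2", "A3", "A4", "A5"]

# Table 1 from the text: sample label -> values of A1..A5.
ROWS = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}


def partition(rows, columns):
    """Partition sample labels by their values on the given column indices."""
    groups = {}
    for label, values in rows.items():
        key = tuple(values[c] for c in columns)
        groups.setdefault(key, set()).add(label)
    # A set of frozensets compares equal regardless of subset order.
    return {frozenset(group) for group in groups.values()}


def is_redundant(rows, columns, candidate):
    """True if dropping `candidate` from `columns` keeps the partition."""
    remaining = [c for c in columns if c != candidate]
    return partition(rows, remaining) == partition(rows, columns)


all_columns = list(range(len(ATTRS)))
a1_redundant = is_redundant(ROWS, all_columns, ATTRS.index("A1"))
print(a1_redundant)  # True
```

Comparing partitions as sets of frozensets means only the grouping matters, not the order in which the subsets were found.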

 

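Repeat this for the remaining attributes, permanently dropping each attribute whose removal leaves the subsets unchanged. Removing A2 as well leaves the subsets unchanged, so A2 is redundant. A4 has the value 1 in every row, so it cannot distinguish any samples and is also redundant. Removing A3 would merge S1 with S3, and removing A5 would merge S1 with S2, so both must be kept. The result is a smallest attribute set with no redundant attributes: {A3, A5}.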
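
The repeated removal can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `ROWS`, `partition` and `reduce_attributes` are hypothetical, and it assumes attributes are tried in the order A1 to A5, with each redundant attribute dropped for good before the next one is tested.

```python
ATTRS = ["A1", "A2", "A3", "A4", "A5"]

# Table 1 from the text: sample label -> values of A1..A5.
ROWS = {
    "v1": (1, 1, 0, 1, 1),
    "v2": (1, 1, 0, 1, 1),
    "v3": (1, 0, 0, 1, 0),
    "v4": (0, 0, 1, 1, 1),
    "v5": (1, 1, 0, 1, 1),
    "v6": (0, 0, 1, 1, 1),
}


def partition(rows, columns):
    """Partition sample labels by their values on the given column indices."""
    groups = {}
    for label, values in rows.items():
        key = tuple(values[c] for c in columns)
        groups.setdefault(key, set()).add(label)
    return {frozenset(group) for group in groups.values()}


def reduce_attributes(names, rows):
    """Drop attributes one by one while the original partition is preserved."""
    kept = list(range(len(names)))
    target = partition(rows, kept)  # subsets S1, S2, S3 of the full table
    for col in range(len(names)):
        trial = [c for c in kept if c != col]
        if partition(rows, trial) == target:
            kept = trial  # attribute `col` is redundant, drop it for good
    return [names[c] for c in kept]


reduct = reduce_attributes(ATTRS, ROWS)
print(reduct)  # ['A3', 'A5']
```

On Table 1 this returns ['A3', 'A5']: A4 is dropped because it has the value 1 in every row and so never separates two samples. The outcome depends on the order in which attributes are tried; a different order can end at a different set.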