A Comparative Study on the Performance of Fuzzy Rule Base and Artificial Neural Network towards Classification of Yeast Data

Authors: Shrayasi Datta, J. Paulchoudhury

Journal: International Journal of Information Technology and Computer Science (IJITCS)

Issue: Vol. 7, No. 5, 2015


Classification of yeast data plays an important role in the preparation of medicines and of various chemical components. If the type of yeast can be recognized at an early stage from its initial characteristics, many technical procedures can be avoided in the preparation of chemical and medical products. In this paper, the performance of two classification methodologies, namely artificial neural network and fuzzy rule base, is compared for the classification of proteins. The objective of this work is to classify proteins into their respective cellular localization sites, based on their amino acid sequences, using each methodology. The yeast dataset from the UCI machine learning repository has been used for this purpose. The results show that classification using the artificial neural network gives better prediction than the fuzzy rule base on the basis of average error.


Protein Localization, Classification, Neural Network, Fuzzy Rule Base, Yeast Dataset

Short URL: https://sciup.org/15012283

IDR: 15012283


Published Online April 2015 in MECS

A cell usually contains approximately 1 billion ($10^9$) protein molecules [1], [2]. These protein molecules reside in various compartments of the cell, which are usually called "protein subcellular locations". Information about these subcellular locations helps in understanding the functions of the cell and the biological processes it executes. This information has also been used for the identification of drug targets ([3], [4]). Determining the subcellular localization of a protein by conducting biochemical experiments is a laborious and time-consuming task. With the development of machine learning techniques [5] in computer science, together with a growing dataset of proteins of known localization, fast and accurate localization predictions have been made successfully for many organisms. This is due to the nature of machine learning approaches, which perform well in domains where there is a vast collection of data but little theory, which describes the situation in bioinformatics well [5]. Among the various prokaryotic and eukaryotic organisms, yeast is important because it is widely used in medicine and in food technology. The biological structure of yeast has also attracted the attention of researchers for many years because of its similarity to the human cell.

The first approach for predicting the subcellular localization of yeast proteins was developed by Kanehisa and Nakai ([6], [7]). Horton and Nakai [8] proposed a probabilistic model that uses features identified by an expert and learns its parameters from a set of training data. The same authors also implemented and tested three machine learning techniques, namely the k-nearest neighbor algorithm, the binary decision tree and the naïve Bayes classifier, on the yeast and E. coli datasets [9]. They compared the performance of these three techniques with the probabilistic method [8] and showed that the k-nearest neighbor algorithm performs best among the four. Chen [10] implemented three machine learning classification algorithms (decision tree, perceptron and two-layer feedforward network) for predicting the subcellular localization sites of proteins in the yeast and E. coli datasets, and concluded that the three techniques have similar performance on these two datasets. Qasim et al. [11] proposed an automated fuzzy inference system for protein subcellular localization. Jin et al. [12] proposed and designed an SVM with a fuzzy hybrid kernel based on the TSK fuzzy model and showed that the fuzzy hybrid kernel achieves better performance in SVM classification. Further work on the prediction of protein subcellular localization has been reported in ([13]-[16]); of these, ([13]-[15]) use support vector machine techniques. Considerable work has also been done on web server design for subcellular prediction ([17]-[20]). Algorithms based on the fuzzy rule base technique have been proposed for heart disease diagnosis and for packet delivery time ([21]-[23]).

Classification has been done with widely used machine learning techniques such as KNN, multilayered feed forward neural network and SVM ([6]-[16]), but most of this work is based on comparison across several datasets, such as E. coli and fungi. These studies mostly concentrate on the algorithm, i.e. on which algorithm is best suited to the classification of medical datasets in general. Which algorithm is most efficient for one particular dataset has not been checked, and that is the motivation for the work described in this paper. Here, a popular and important protein subcellular localization dataset, yeast, has been taken for classification, and the multilayered feed forward neural network and the fuzzy rule base technique have been applied to it and compared. The yeast dataset from the UCI machine learning repository has been used. Each input of the dataset corresponds to a protein, and the output is the predicted localization site of that protein. After implementation, the performance of the two techniques has been evaluated and compared on the basis of average error.

In this research work, the yeast dataset obtained from the UCI machine learning repository has been used [24]. The objective of this dataset is to determine the cellular localization of yeast proteins. The yeast dataset, which represents the eukaryotes, consists of 9 features (1 sequence name and 8 attributes). The attributes are mcg, gvh, alm, mit, erl, pox, vac and nuc. Each attribute is a score (between 0 and 1) corresponding to a certain feature of the protein sequence and is used to classify the localization site of the protein. The higher the score, the more likely it is that the protein sequence has that feature. Proteins are classified into 10 classes: cytosolic or cytoskeletal (CYT), nuclear (NUC), mitochondrial (MIT), membrane protein without N-terminal signal (ME3), membrane protein with uncleaved signal (ME2), membrane protein with cleaved signal (ME1), extracellular (EXC), vacuolar (VAC), peroxisomal (POX) and endoplasmic reticulum lumen (ERL).

The paper is organized as follows. Section 1 describes the importance of this research work and gives a brief literature review. Section 2 gives a brief theoretical introduction to the techniques used in this work, together with a description of the dataset. Section 3 deals with the detailed procedure of the work and its results, including the error calculation. Finally, Section 4 concludes the paper.

II. Methodology

A. Artificial Neural Network.

An artificial neural network (ANN) follows a computational paradigm that is inspired by the structure and functionality of the brain. The ANN consists of an interconnected group of artificial neurons that process the information to compute the result.

B. Multilayered Feed Forward Neural Network

A multilayered feed forward neural network (MLFFNN) is made of multiple layers. It possesses an input layer and an output layer, and also has one or more intermediary layers called hidden layers (Fig. 1). The computational units of a hidden layer are known as hidden neurons or hidden units.
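The layered structure described above can be illustrated with a minimal forward-pass sketch. The layer sizes, the random weights and the use of a sigmoid on every layer are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Propagate one input vector through a single hidden layer to one output."""
    # Each hidden neuron takes a weighted sum of all inputs plus its bias.
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    # The output neuron takes a weighted sum of all hidden activations.
    output = sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)
    return hidden, output


# Hypothetical toy network: 3 inputs, 4 hidden neurons, 1 output.
rng = random.Random(0)
n_in, n_hidden = 3, 4
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
out_w = [rng.uniform(-1, 1) for _ in range(n_hidden)]

hidden, y = forward([0.2, 0.5, 0.9], hidden_w, hidden_b, out_w, 0.0)
print(len(hidden), y)
```

Information flows strictly from input to hidden to output, which is what "feed forward" refers to; no neuron feeds back into an earlier layer.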

Fig. 1. A Multilayered feed forward neural network

C. Fuzzy Inference System

A fuzzy inference system (FIS) is a system that transforms a given input into an output with the help of fuzzy logic (Fig. 2). The procedure followed by a fuzzy inference system is known as the fuzzy inference mechanism, or simply fuzzy inference.

Fig. 2. A fuzzy inference system

The entire fuzzy inference process consists of five steps: fuzzification of the input variables, application of the fuzzy operators to the antecedent parts of the rules, evaluation of the fuzzy rules, aggregation of the fuzzy sets across the rules, and defuzzification of the resulting aggregate fuzzy set.
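The five steps can be traced in a small sketch. It assumes a Mamdani-style system with `min` as the AND operator, clipping as implication, `max` as aggregation and centroid defuzzification over a discretized output range; the paper does not state which operators were used, and the two rules and all membership parameters below are hypothetical.

```python
def tri(x, a, m, b):
    """Triangular membership: 0 outside (a, b), rising to 1 at the peak m."""
    if x <= a or x >= b:
        return 0.0
    return (x - a) / (m - a) if x <= m else (b - x) / (b - m)


def infer(score1, score2):
    """Run two hypothetical rules on two inputs and return a crisp output."""
    # Step 1: fuzzification of each input into "low" and "high" degrees.
    s1_low, s1_high = tri(score1, -1.0, 0.0, 1.0), tri(score1, 0.0, 1.0, 2.0)
    s2_low, s2_high = tri(score2, -1.0, 0.0, 1.0), tri(score2, 0.0, 1.0, 2.0)

    # Step 2: fuzzy AND (min) over the antecedents of each rule.
    fire_low = min(s1_low, s2_low)      # IF s1 is low AND s2 is low THEN out is low
    fire_high = min(s1_high, s2_high)   # IF s1 is high AND s2 is high THEN out is high

    # Steps 3 and 4: clip each consequent set by its rule strength (implication),
    # then combine the clipped sets point by point with max (aggregation).
    ys = [i / 100 for i in range(101)]
    aggregate = [
        max(min(fire_low, tri(y, -0.5, 0.0, 0.5)),
            min(fire_high, tri(y, 0.5, 1.0, 1.5)))
        for y in ys
    ]

    # Step 5: centroid defuzzification of the aggregate set.
    total = sum(aggregate)
    if total == 0.0:
        return 0.5  # hypothetical fallback when no rule fires
    return sum(y * mu for y, mu in zip(ys, aggregate)) / total


print(infer(0.1, 0.1), infer(0.9, 0.9))
```

Low inputs pull the crisp output toward the "low" output set and high inputs toward the "high" set, which is the behaviour the rule base is meant to encode.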

D. Fuzzy Membership Function

A fuzzy membership function determines the degree to which an object belongs to a fuzzy set. In other words, a membership function provides a measure of the degree of similarity of an element to a fuzzy set. Membership functions come in different shapes: triangular, trapezoidal, piecewise-linear, Gaussian, bell-shaped, etc.

a. Trapezoidal Membership Function

It is defined by a lower limit a, an upper limit d, a lower support limit b, and an upper support limit c, where a < b < c < d.

$$
\mu_A(x) =
\begin{cases}
0, & \text{if } x < a \text{ or } x > d \\
\dfrac{x - a}{b - a}, & \text{if } a \le x \le b \\
1, & \text{if } b \le x \le c \\
\dfrac{d - x}{d - c}, & \text{if } c \le x \le d
\end{cases}
\tag{1}
$$
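As a minimal sketch, the trapezoidal membership function can be written as a plain function, with the plateau of full membership lying between b and c. The parameter values in the usage line are hypothetical and chosen only to show the three regions.

```python
def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a < b < c < d)."""
    if x < a or x > d:
        return 0.0                  # outside the support
    if x < b:
        return (x - a) / (b - a)    # rising edge
    if x <= c:
        return 1.0                  # plateau of full membership
    return (d - x) / (d - c)        # falling edge


# Hypothetical parameters: support [0.2, 0.8], plateau [0.4, 0.6].
print(trapezoidal(0.3, 0.2, 0.4, 0.6, 0.8), trapezoidal(0.5, 0.2, 0.4, 0.6, 0.8))
```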

b. Gaussian Membership Function

It is defined by a central value m and a standard deviation k > 0. The smaller k is, the narrower the "bell" is.

$$
\mu_A(x) = e^{-\frac{(x - m)^2}{2k^2}} \tag{2}
$$
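A short sketch of the Gaussian membership function follows, using the same symbols m and k as the definition above. The numeric values are hypothetical; the second call illustrates that a smaller k gives a narrower bell.

```python
import math


def gaussian(x, m, k):
    """Gaussian membership centred at m with standard deviation k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.exp(-((x - m) ** 2) / (2.0 * k ** 2))


# Same distance from the centre, two different widths.
wide = gaussian(0.6, 0.5, 0.2)
narrow = gaussian(0.6, 0.5, 0.05)
print(wide, narrow)
```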

c. Triangular Membership function

It is defined by a lower limit a, an upper limit b, and a value m, where a < m < b.

$$
\mu_A(x) =
\begin{cases}
0, & \text{if } x \le a \\
\dfrac{x - a}{m - a}, & \text{if } a < x \le m \\
\dfrac{b - x}{b - m}, & \text{if } m < x < b \\
0, & \text{if } x \ge b
\end{cases}
\tag{3}
$$
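The triangular membership function, with lower limit a, peak m and upper limit b, can be sketched in the same way. The parameter values in the usage line are hypothetical.

```python
def triangular(x, a, m, b):
    """Triangular membership with feet a, b and peak m (a < m < b)."""
    if x <= a or x >= b:
        return 0.0                  # outside the support
    if x <= m:
        return (x - a) / (m - a)    # rising edge up to the peak
    return (b - x) / (b - m)        # falling edge after the peak


# Hypothetical parameters: support [0.2, 0.8], peak at 0.5.
print(triangular(0.35, 0.2, 0.5, 0.8), triangular(0.5, 0.2, 0.5, 0.8))
```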

E. Error Analysis

The performance of the two methods of classification has been evaluated by estimated error and average error.

The estimated error $E_i$ of an individual instance $i$ is given by (4):

$$
E_i = \frac{\lvert P_i - T_i \rvert}{T_i} \tag{4}
$$

where $P_i$ is the output class value estimated for a given instance and $T_i$ is the actual output class value for that instance.

The average error is derived using (5):

$$
A = \frac{1}{n} \sum_{i=1}^{n} E_i \tag{5}
$$

where $E_i$ is the estimated error and $n$ is the number of instances.
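The two error measures can be sketched as follows. The sketch assumes that the estimated error is the absolute deviation relative to the actual value, which is what the tabulated values suggest (for example, an actual value of 0.1 and an estimated value of 0.473 are listed with an error of 3.73). The function names are illustrative.

```python
def estimated_error(predicted, actual):
    """Relative absolute error of one instance (assumed form of the estimated error)."""
    return abs(predicted - actual) / actual


def average_error(predicted_values, actual_values):
    """Mean of the per-instance estimated errors."""
    errors = [estimated_error(p, t) for p, t in zip(predicted_values, actual_values)]
    return sum(errors) / len(errors)


# Two instances taken from the Gaussian2-Gaussian2 column of Tables 11 and 12.
print(estimated_error(0.473, 0.1))              # approximately 3.73
print(average_error([0.473, 0.517], [0.1, 0.2]))
```

Because the error is divided by the actual class value, a given absolute deviation weighs more heavily on instances with small class codes than on instances with large ones.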

III. Implementation and Result

A. Implementation.

a. Dataset Preprocessing.

Step 1.

As stated previously, each record of the yeast dataset [24] consists of 10 fields: the sequence name, the 8 attributes and the class name. First, the sequence name is discarded, as it is not necessary for the classification task.

Step 2.

The output class names are non-numeric, for example MIT, CYT, VAC, etc. These are replaced by the numeric values 1, 2, 3, etc. The class names with their numeric values are listed in Table 1.

Table 1. Class name and numerical value

Class name	Numerical value
MIT	1
NUC	2
CYT	3
ME1	4
EXC	5
ME2	6
ME3	7
VAC	8
POX	9
ERL	10

The dataset now consists of 9 fields, of which the 8 attributes are taken as input and the last one is the class. All class names have been changed to the numerical values furnished in Table 1. The dataset is now ready to be classified using both the artificial neural network and the fuzzy rule base.
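The two preprocessing steps can be sketched as below, assuming each record is a whitespace-separated line with the sequence name first, the 8 attribute scores next and the class name last. The sample row, its sequence name and the function name are hypothetical; only the class-to-number mapping is taken from Table 1.

```python
# Class name to numerical value, as listed in Table 1.
CLASS_CODES = {
    "MIT": 1, "NUC": 2, "CYT": 3, "ME1": 4, "EXC": 5,
    "ME2": 6, "ME3": 7, "VAC": 8, "POX": 9, "ERL": 10,
}


def preprocess_row(line):
    """Drop the sequence name and encode the class name as a number."""
    fields = line.split()
    # fields[0] is the sequence name (discarded in Step 1),
    # fields[1:9] are the 8 attribute scores, fields[9] is the class name.
    features = [float(value) for value in fields[1:9]]
    label = CLASS_CODES[fields[9]]      # Step 2: class name -> numeric value
    return features, label


# Hypothetical record in the assumed layout.
sample = "SEQ_0001  0.58  0.61  0.47  0.13  0.50  0.00  0.48  0.22  MIT"
features, label = preprocess_row(sample)
print(features, label)
```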

b. Classification Using Fuzzy Rule Base.

Step 1.

A single fuzzy inference system (FIS) with 8 inputs and 1 output has been used.

Step 2.

The ranges of the input and output variables are first retrieved and then decomposed into fuzzy sets based on the range of their values. These are furnished in Table 2 to Table 8. Note that there are 8 attributes: mcg, gvh, alm, mit, erl, pox, vac and nuc. Of these, the attribute pox has not been used, since it has the value 0.00 in all the data items considered.

Table 2. Classification of Attribute 1 (mcg)

Range	Fuzzy set value
0.42 to 0.64	Low1
0.33 to 0.61	Low2
0.40 to 0.73	Low3
0.91 to 0.70	Medium1
0.49 to 0.89	Medium2
0.54 to 0.94	Medium3
0.28 to 0.54	High1
0.28 to 0.80	High2
0.32 to 0.68	High3
0.7 to 0.86	Very high

Table 3. Classification of Attribute 2 (gvh)

Range	Fuzzy set value
0.40 to 0.67	Low1
0.31 to 0.60	Low2
0.39 to 0.63	Low3
0.66 to 0.88	Medium1
0.39 to 0.87	Medium2
0.42 to 0.75	Medium3
0.24 to 0.58	High1
0.32 to 0.82	High2
0.27 to 0.68	High3
0.56 to 0.92	Very high

Table 4. Classification of Attribute 3 (alm)

Range	Fuzzy set value
0.45 to 0.66	Low1
0.43 to 0.69	Low2
0.42 to 0.60	Low3
0.30 to 0.47	Medium1
0.36 to 0.58	Medium2
0.33 to 0.58	Medium3
0.21 to 0.42	High1
0.26 to 0.57	High2
0.43 to 0.59	High3
0.38 to 0.58	Very high

Table 5. Classification of Attribute 4 (mit)

Range	Fuzzy set value
0.13 to 0.65	Low1
0.13 to 0.43	Low2
0.11 to 0.35	Low3
0.23 to 0.78	Medium1
0.23 to 0.37	Medium2
0.4 to 0.49	Medium3
0.12 to 0.31	High1
0.08 to 0.28	High2
0.10 to 0.49	High3
0.25 to 0.40	Very high

Table 6. Classification of Attribute 5 (erl)

Range	Fuzzy set value
0.00 to 0.1	low
1.00 to 1.11	high

Table 7. Classification of Attribute 7 (vac)

Range	Fuzzy set value
0.22 only	Low1
0.22 to 0.34	Low2
0.22 to 0.40	Low3
0.22 to 0.63	Medium1
0.22 only	Medium2
0.22 to 0.35	Medium3
0.22 to 0.66	High1
0.22 to 0.40	High2
0.22 to 0.41	High3
0.53 to 0.58	Very high

Table 8. Classification of Attribute 8 (nuc)

Range	Fuzzy set value
0.46 to 0.53	Low1
0.47 to 0.68	Low2
0.49 to 0.58	Low3
0.43 to 0.58	Medium1
0.39 to 0.56	Medium2
0.40 to 0.59	Medium3
0.43 to 0.55	High1
0.39 to 0.60	High2
0.40 to 0.54	High3
0.53 to 0.58	Very high

Based on the input and output data, a rule base has been created, which is furnished in Table 9.

Membership functions have then been applied to all input variables and to the output variable. Four combinations of membership functions for the input and output variables have been applied; these are listed in Table 10. For serial no. 1 in Table 10, for example, Gaussian2 membership functions are used for both the input and the output variables; that is, a Gaussian2 membership function is applied to every input attribute in each rule. The other rows of the table are read in the same way.

Table 9. Rule base

Rule no.	Rules
1.	If (att1 is low1) and (att2 is low1) and (att3 is low1) and (att4 is low1) and (att5 is a5) and (att6 is a6) and (att7 is low1) and (att8 is lowc1) then (output1 is class1) (1)
2.	If (att1 is low2) and (att2 is low2) and (att3 is low2) and (att4 is low2) and (att5 is a5) and (att6 is a6) and (att7 is low2) and (att8 is low2) then (output1 is class2) (1)
3	If (att1 is low3) and (att2 is low3) and (att3 is low3) and (att4 is low3) and (att5 is a5) and (att6 is a6) and (att7 is low3) and (att8 is low3) then (output1 is class3) (1)
4	If (att1 is medium1) and (att2 is medium1) and (att3 is medium1) and (att4 is medium1) and (att5 is a5) and (att6 is a6) and (att7 is medium1) and (att8 is medium1) then (output1 is class4) (1)
5	If (att1 is medium2) and (att2 is medium2) and (att3 is medium2) and (att4 is medium2) and (att5 is a5) and (att6 is a6) and (att7 is medium2) and (att8 is lowc1) then (output1 is class5) (1)
6	If (att1 is medium3) and (att2 is medium3) and (att3 is medium3) and (att4 is medium3) and (att5 is a5) and (att6 is a6) and (att7 is medium3) and (att8 is medium3) then (output1 is class6) (1)
7	If (att1 is high1) and (att2 is high1) and (att3 is high1) and (att4 is high1) and (att5 is a5) and (att6 is a6) and (att7 is high1) and (att8 is high1) then (output1 is class7) (1)
8	If (att1 is high2) and (att2 is high2) and (att3 is high2) and (att4 is high2) and (att5 is a5) and (att6 is a6) and (att7 is high2) and (att8 is high2) then (output1 is class8) (1)
9	If (att1 is high3) and (att2 is high3) and (att3 is high3) and (att4 is high3) and (att5 is a5) and (att6 is a6) and (att7 is high3) and (att8 is high3) then (output1 is class9) (1)
10	If (att1 is very_high) and (att2 is very_high) and (att3 is very_high) and (att4 is very_high) and (att5 is a5c10) and (att6 is a6) and (att7 is very_high) and (att8 is very_high) then (output1 is class10) (1)
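Each rule in Table 9 is a conjunction of one fuzzy set per attribute. A minimal sketch of how such rules can be evaluated is given below, assuming trapezoidal input sets and `min` as the AND operator. The two rules, their attribute names and all trapezoid parameters are hypothetical, and picking the class of the strongest rule is a simplification that replaces the defuzzified output used in the paper.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and plateau [b, c]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical rule base: output class -> {attribute: (a, b, c, d)}.
RULES = {
    1: {"att1": (0.40, 0.45, 0.60, 0.65), "att2": (0.38, 0.45, 0.62, 0.68)},
    4: {"att1": (0.68, 0.72, 0.88, 0.92), "att2": (0.64, 0.70, 0.85, 0.90)},
}


def firing_strength(sample, antecedents):
    """AND all antecedents of one rule using the minimum membership degree."""
    return min(trapezoid(sample[name], *params) for name, params in antecedents.items())


def classify(sample):
    """Return the class of the strongest rule and all rule strengths."""
    strengths = {cls: firing_strength(sample, ants) for cls, ants in RULES.items()}
    best = max(strengths, key=strengths.get)
    return best, strengths


print(classify({"att1": 0.50, "att2": 0.50}))
print(classify({"att1": 0.80, "att2": 0.75}))
```

Because the AND is a minimum, a single attribute that falls outside its fuzzy set drives the whole rule strength to zero, which is why heavily overlapping ranges across rules make the rules hard to tell apart.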

Table 10. Input and Output membership functions

Sl. No.	Membership function for Input variable	Membership function for Output variable
1	Gaussian2	Gaussian2
2	Gaussian2	Triangular
3	Trapezoidal	Trapezoidal
4	Trapezoidal	Triangular

The estimated output has been calculated for 50 data items using each combination of membership functions listed in Table 10 and the fuzzy rule base furnished in Table 9. The output is furnished in Table 11.

Based on the actual output (available in the dataset) and the estimated output (as calculated), the estimated error has been calculated for every input-output membership function combination and is furnished in Table 12.

The average error for each input-output membership function combination has been calculated and is furnished in Table 13.

Table 11. Input and Output fuzzy values

Index no.	Best output value in FIS	Estimated output for Trapezoidal-Triangular combination for input-output membership function	Estimated output for Trapezoidal-Trapezoidal combination for input-output membership function	Estimated output for Gaussian2-Gaussian2 combination for input-output membership function	Estimated output for Gaussian2-Triangular combination for input-output membership function
1.	0.1	0.1	0.1	0.473	0.463
2.	0.1	0.1	0.1	0.396	0.411
3.	0.1	0.1	0.1	0.475	0.48
4.	0.1	0.1	0.1	0.465	0.462
5.	0.1	0.196	0.195	0.483	0.484
6.	0.1	0.1	0.1	0.472	0.472
7.	0.2	0.2	0.2	0.517	0.523
8.	0.2	0.5	0.5	0.398	0.399
9.	0.2	0.317	0.315	0.555	0.555
10.	0.2	0.2	0.2	0.465	0.465
11.	0.2	0.462	0.461	0.524	0.526
12.	0.2	0.345	0.347	0.59	0.489
13.	0.2	0.622	0.628	0.59	0.569
14.	0.2	0.3	0.3	0.59	0.561
15.	0.3	0.341	0.34	0.451	0.449
16.	0.3	0.2	0.2	0.453	0.448
17.	0.3	0.333	0.333	0.447	0.446
18.	0.3	0.3	0.3	0.548	0.547
19.	0.3	0.34	0.342	0.541	0.543
20.	0.3	0.346	0.345	0.463	0.445
21.	0.3	0.2	0.2	0.463	0.552
22.	0.3	0.391	0.39	0.463	0.552
23.	0.4	0.4	0.4	0.54	0.538
24.	0.4	0.4	0.4	0.54	0.489
25.	0.4	0.4	0.4	0.54	0.552
26.	0.5	0.5	0.5	0.52	0.526
27.	0.5	0.5	0.5	0.604	0.602
28.	0.5	0.5	0.5	0.604	0.602
29.	0.6	0.6	0.6	0.612	0.612
30.	0.6	0.559	0.559	0.484	0.482
31.	0.6	0.457	0.455	0.505	0.499
32.	0.7	0.2	0.2	0.708	0.711
33.	0.7	0.2	0.2	0.663	0.661
34.	0.7	0.2	0.2	0.662	0.66
35.	0.7	0.2	0.2	0.662	0.66
36.	0.7	0.2	0.2	0.637	0.631
37.	0.7	0.5	0.5	0.745	0.747
38.	0.7	0.7	0.7	0.698	0.703
39.	0.7	0.3	0.3	0.504	0.498
40.	0.7	0.5	0.5	0.748	0.749
41.	0.8	0.5	0.5	0.61	0.604
42.	0.8	0.561	0.565	0.551	0.551
43.	0.8	0.5	0.5	0.523	0.518
44.	0.8	0.5	0.5	0.562	0.557
45.	0.9	0.5	0.5	0.5	0.5
46.	0.9	0.5	0.5	0.5	0.5
47.	0.9	0.5	0.5	0.5	0.5
48.	1.0	0.5	0.5	0.5	0.5
49.	1.0	0.5	0.5	0.5	0.5
50.	1.0	0.5	0.5	0.5	0.5

Table 12. Estimated error for input-output membership function combination

Index no.	Estimated Error for Trapezoidal-Triangular combination for input-output membership function	Estimated Error for Trapezoidal-Trapezoidal combination for input-output membership function	Estimated Error for Gaussian2-Gaussian2 combination for input-output membership function	Estimated Error for Gaussian2-Triangular combination for input-output membership function
1.	0.0	0.0	3.73	3.63
2.	0.0	0.0	2.96	3.10
3.	0.0	0.0	3.75	3.8
4.	0.0	0.0	3.65	3.61
5.	0.96	0.95	3.83	3.84
6.	0.0	0.0	3.71	3.71
7.	0.0	0.0	1.585	1.615
8.	1.499	1.49	0.99	0.995
9.	0.585	0.575	1.775	1.775
10.	0.0	0.0	1.325	1.325
11.	1.31	1.306	1.61	1.63
12.	0.72	0.73	1.949	1.44
13.	2.11	2.13	1.949	1.84
14.	0.499	0.49	1.949	1.805
15.	0.136	0.133	0.50	0.49
16.	0.333	0.333	0.51	0.49
17.	0.11	0.11	0.49	0.48
18.	0.0	0.14	0.82	0.82
19.	0.133	0.0	0.80	0.81
20.	0.15	0.149	0.54	0.48
21.	0.333	0.333	0.54	0.84
22.	0.30	0.30	0.54	0.84
23.	0.0	0.0	0.35	0.345
24.	0.0	0.0	0.35	0.222
25.	0.0	0.0	0.35	0.38
26.	0.0	0.0	0.04	0.052
27.	0.0	0.0	0.207	0.203
28.	0.0	0.0	0.207	0.203
29.	0.0	0.0	0.02	0.02
30.	0.068	0.06	0.19	0.19
31.	0.23	0.24	0.15	0.16
32.	0.71	0.71	0.011	0.01
33.	0.71	0.71	0.05	0.055
34.	0.71	0.71	0.054	0.0571
35.	0.71	0.71	0.054	0.0571
36.	0.71	0.71	0.08	0.0985
37.	0.28	0.28	0.06	0.0671
38.	0.0	0.0	0.00	0.004
39.	0.571	0.571	0.27	0.2885
40.	0.28	0.28	0.06	0.070
41.	0.375	0.375	0.23	0.245
42.	0.298	0.293	0.3112	0.311
43.	0.375	0.375	0.346	0.3525
44.	0.375	0.375	0.2975	0.3037
45.	0.44	0.44	0.44	0.44
46.	0.44	0.44	0.44	0.44
47.	0.44	0.44	0.44	0.44
48.	0.5	0.5	0.5	0.5
49.	0.5	0.5	0.5	0.5
50.	0.5	0.5	0.5	0.5

Table 13. Average error for input and output membership function

Sl. No.	Membership function for Input variable	Membership function for Output variable	Average Error
1.	Trapezoidal	Triangular	0.36806
2.	Trapezoidal	Trapezoidal	0.3751
3.	Gaussian2	Gaussian2	0.92
4.	Gaussian2	Triangular	2.59

From Table 13, it is observed that the average error is minimum when the Trapezoidal membership function is used for the input variables and the Triangular membership function for the output variable. Therefore the Trapezoidal-Triangular input-output membership function combination should be used for the classification of yeast data with the fuzzy rule base.

c. Classification Using Multi-Layered Feed Forward Artificial Neural Network.

Step 1.

In order to improve the performance, a feed forward back propagation neural network (8 input nodes, 10 hidden nodes and 1 output node) has been used. Its characteristics are listed in Table 14.

Table 14. Neural Network characteristics

Architecture	Multilayer feedforward neural network (MLFFNN)
Training Method	Backpropagation training algorithm
Learning method	Supervised Learning
Activation function	sigmoid

Of the 1484 samples in the dataset, 154 samples have been used for training and 102 samples for testing. From those, the estimated output and estimated error of 50 samples are furnished in Table 15. The average error has been found to be 0.3416.

Table 15. Estimated output and Estimated error using MLFFNN

Index no.	Best output value in neural network	Estimated output	Estimated Error using ANN
1.	0.1	0.0916	0.08
2.	0.1	0.3117	2.116
3.	0.1	-0.0183	1.183
4.	0.1	0.0774	0.22
5.	0.1	0.4646	3.646
6.	0.1	-0.0085	1.085
7.	0.2	0.3071	0.535
8.	0.2	0.2609	0.3045
9.	0.2	0.3909	0.9545
10.	0.2	0.4819	1.409
11.	0.2	0.2217	0.108
12.	0.2	0.1435	0.28
13.	0.2	0.2452	0.225
14.	0.2	0.2648	0.323
15.	0.3	0.3335	0.111
16.	0.3	0.3989	0.329
17.	0.3	0.3121	0.040
18.	0.3	0.2569	0.14
19.	0.3	0.2970	0.01
20.	0.3	0.2764	0.07
21.	0.3	0.3558	0.186
22.	0.3	0.3626	0.208
23.	0.4	0.40680	0.016
24.	0.4	0.4393	0.09
25.	0.4	0.4638	0.15
26.	0.5	0.5002	0.00
27.	0.5	0.5035	0.00
28.	0.5	0.5035	0.00
29.	0.6	0.6093	0.01
30.	0.6	0.6284	0.04
31.	0.6	0.6322	0.05
32.	0.7	0.6951	0.006
33.	0.7	0.7229	0.032
34.	0.7	0.7563	0.080
35.	0.7	0.7563	0.080
36.	0.7	0.7507	0.072
37.	0.7	0.7302	0.0431
38.	0.7	0.7112	0.016
39.	0.7	0.7728	0.104
40.	0.7	0.7071	0.0101
41.	0.8	0.7115	0.11
42.	0.8	0.6794	0.15
43.	0.8	0.6817	0.14
44.	0.8	0.5080	0.365
45.	0.9	0.8808	0.021
46.	0.9	0.8702	0.033
47.	0.9	0.9029	0.003
48.	1.0	0.99	0.01
49.	1.0	1.01	0.01
50.	1.0	1.09	0.09
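A network of the stated size (8 inputs, 10 hidden nodes, 1 output) trained by backpropagation can be sketched in plain Python as below. Several choices are assumptions rather than details from the paper: the hidden layer uses a sigmoid while the output node is linear (the estimated outputs in Table 15 fall slightly outside the 0 to 1 range, which a sigmoid output could not produce), training is per-sample gradient descent on squared error, and the learning rate, epoch count and toy data are hypothetical. The class name `TinyMLP` is illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """8-10-1 feed forward network: sigmoid hidden layer, linear output (assumed)."""

    def __init__(self, n_in=8, n_hidden=10, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def train_step(self, x, target, lr=0.05):
        """One backpropagation update for the loss 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        err = y - target                                  # gradient at the output
        for j, hj in enumerate(h):
            # Gradient reaching hidden unit j, computed before w2[j] is changed.
            grad_h = err * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err


def mean_squared_error(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


# Hypothetical toy data: 4 random 8-feature samples with scaled class targets.
rng = random.Random(1)
data = [([rng.random() for _ in range(8)], t) for t in (0.1, 0.4, 0.7, 1.0)]

net = TinyMLP()
before = mean_squared_error(net, data)
for _ in range(500):
    for x, t in data:
        net.train_step(x, t)
after = mean_squared_error(net, data)
print(before, after)
```

The targets follow the scaling seen in the tables, where the class code is divided by 10, so that the network is trained as a regressor on a single output rather than as a 10-way classifier.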

B. Result.

A comparative study has been made on the basis of the average error of the fuzzy rule base using the Trapezoidal-Triangular (input-output) membership functions and of the neural network. The result is furnished in Table 16. It is observed that the multilayer feed forward back propagation neural network is preferable to the fuzzy rule base. Therefore the multilayer feed forward back propagation neural network can be used for classification of yeast data.

Table 16. Methodology versus average error

Methodology	Average Error
Fuzzy rule base with Trapezoidal membership function for input and Triangular membership function for output	0.36806
Multilayered feed forward neural network	0.34158

IV. Conclusion and Future Scope

In this work, two methods for classifying the yeast dataset have been evaluated using MATLAB. It is concluded that the multilayered feed forward neural network is more suitable for this classification. For the fuzzy rule base, it has further been observed that the Trapezoidal membership function for input combined with the Triangular membership function for output is preferable to the other combinations of membership functions. The same technique may be used in other classification problems.

References

[1] B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, J.D. Watson, Molecular Biology of the Cell, Garland, New York, 1994.
[2] H. Lodish, D. Baltimore, A. Berk, S.L. Zipursky, P. Matsudaira, J. Darnell, Molecular Cell Biology, Scientific American Books, New York, 1995.
[3] Z.-P. Feng, "An overview on predicting the subcellular location of a protein," In Silico Biol. 2 (3) (2002) 291–303.
[4] Q. Cui, T. Jiang, B. Liu, S. Ma, "Esub8: a novel tool to predict protein subcellular localizations in eukaryotic organisms," BMC Bioinformatics 5 (1) (2004) 1–7.
[5] J. Shavlik, L. Hunter, D. Searls, "Introduction," Machine Learning 21 (1995) 5–10.
[6] Nakai and Kanehisa, "Expert system for predicting protein localization sites in gram negative bacteria," PROTEINS: Structure, Function, and Genetics 11 (1991) 95–110.
[7] Nakai and Kanehisa, "A knowledge base for predicting protein localization sites in eukaryotic cells," Genomics 14 (1992) 897–911.
[8] Horton and Nakai, "A probabilistic classification system for predicting the cellular localization sites of proteins," in Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology, St. Louis, AAAI Press, 1996, pp. 109–115.
[9] Paul Horton, Kenta Nakai, "Better Prediction of Protein Cellular Localization Sites with the k Nearest Neighbors Classifier," Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pp. 147–152, June 21–26, 1997.
[10] Yetian Chen, "Predicting the Cellular Localization Sites of Proteins Using Decision Tree and Neural Networks," http://www.cs.iastate.edu/~yetianc/cs572/files/CS572_Project_YETIANCHEN.pdf, unpublished.
[11] R. Qasim, K. Begum, N. Jahan, T. Ashrafi, S. Idris, R.M. Rahman, "Subcellular localization of proteins using automated fuzzy inference system," 2013 International Conference on Informatics, Electronics & Vision (ICIEV), May 2013, pp. 1–5.
[12] Bo Jin, Yuchun Tang, Yan-Qing Zhang, Chung-Dar Lu, Irene Weber, "Support Vector Machine with the Fuzzy Hybrid Kernel for Protein Subcellular Localization Classification," The 2005 IEEE International Conference on Fuzzy Systems, pp. 420–423.
[13] X.-B. Zhou, C. Chen, Z.-C. Li, X.-Y. Zou, "Improved prediction of subcellular location for apoptosis proteins by the dual-layer support vector machine," Amino Acids 35 (2008) 383–388.
[14] Ana Carolina Lorena, André C.P.L.F. de Carvalho, "Protein cellular localization prediction with Support Vector Machines and Decision Trees," Computers in Biology and Medicine 37 (2007) 115–125.
[15] Jing Huang, Feng Shi, "Support vector machines for predicting apoptosis proteins types," Acta Biotheoretica 53 (2005) 39–47, Springer.
[16] Ru-Ping Liang, Shu-Yun Huang, Shao-Ping Shi, Xing-Yu Sun, Sheng-Bao Suo, Jian-Ding Qiu, "A novel algorithm combining support vector machine with the discrete wavelet transform for the prediction of protein subcellular localization," Computers in Biology and Medicine 42 (2012) 180–187.
[17] K.C. Chou, H.B. Shen, "Euk-Mploc: A Fusion Classifier for Large-Scale Eukaryotic Protein Subcellular Location Prediction by Incorporating Multiple Sites," J. Proteome Research, vol. 6, no. 5, pp. 1728–1734, 2007.
[18] H.B. Shen, K.C. Chou, "Nuc-Ploc: A New Web-Server for Predicting Protein Subnuclear Localization by Fusing Pseaa Composition and Psepssm," Protein Eng. Design and Selection, vol. 20, no. 11, pp. 561–567, 2007.
[19] K.C. Chou, H.B. Shen, "Large-Scale Plant Protein Subcellular Location Prediction," J. Cellular Biochemistry, vol. 100, no. 3, pp. 665–678, 2007.
[20] H.B. Shen, K.C. Chou, "Gpos-Ploc: An Ensemble Classifier for Predicting Subcellular Localization of Gram-Positive Bacterial Proteins," Protein Eng. Design and Selection, vol. 20, no. 1, pp. 39–46, 2007.
[21] P.S. Banerjee, J. Palchoudhury, S.R. Bhadra Choudhury, "Fuzzy membership function as a Trust Based AODV for MANET," I.J. Computer Network and Information Security, 2013, 12, pp. 27–34.
[22] M. Barman, J. Palchoudhury, S. Biswas, "A Framework for the Neuro Fuzzy Rule Base System in the Diagnosis of Heart Disease," International Journal of Scientific and Engineering Research, vol. 4, issue 11, November 2013.
[23] M. Barman, J. Palchoudhury, "A Framework for Selection of Membership Function Using Fuzzy Rule Base System for the Diagnosis of Heart Disease," I.J. Information Technology and Computer Science, vol. 5, no. 11, October 2013, pp. 62–70.
[24] UCI machine learning repository, http://archive.ics.uci.edu/ml.
