The SNP data set comprises 676 swine samples from 21 breeds. The data were collected from the Porcine Colonization of the Americas data set [1] and from two data sets of pigs raised in Thailand [2] and Vietnam [3]. All data went through a data cleansing step. Some missing values remained afterwards; these were estimated by single imputation, using the modes of the three source data sets that make up the full data set. Each sample contains 10,210 SNPs.
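As a rough illustration of mode-based single imputation, the sketch below fills each missing genotype call with the most frequent observed value in the same SNP column. It is not the code used to build this data set. The 0/1/2 genotype coding, the use of `None` for a missing call, the per-column computation of the mode, and the function name `impute_mode` are all assumptions made for the example. Under the description above, such a routine would presumably be applied to each source data set separately.

```python
from collections import Counter


def impute_mode(samples, missing=None):
    """Fill missing calls in each SNP column with that column's mode.

    `samples` is a list of rows (one per animal); every row has one
    entry per SNP. The input is not modified; a new list is returned.
    """
    n_snps = len(samples[0])
    modes = []
    for j in range(n_snps):
        observed = [row[j] for row in samples if row[j] != missing]
        # most_common(1) gives the most frequent observed value; on a
        # tie, the value encountered first wins. A column with no
        # observed values at all is left as missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else missing)
    return [
        [modes[j] if value == missing else value for j, value in enumerate(row)]
        for row in samples
    ]


# Hypothetical toy data: 4 samples x 3 SNPs, genotypes coded 0/1/2.
toy = [
    [0, 1, None],
    [0, None, 2],
    [None, 1, 2],
    [1, 1, 0],
]
imputed = impute_mode(toy)
```

Mode imputation is simple and keeps every value inside the set of observed genotype codes, but it pulls rare alleles toward the majority call, which is worth keeping in mind when interpreting results for small breeds.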
The details of each class are shown in the following table.
Class | Breed | Number of Samples |
---|---|---|
1 | Creole | 90 |
2 | Monteiro | 10 |
3 | Zungo | 10 |
4 | Jiangquhai | 11 |
5 | Jinhua | 16 |
6 | Meishan | 16 |
7 | Xiang pig | 11 |
8 | Iberian | 15 |
9 | Duroc | 44 |
10 | Landrace | 146 |
11 | Largewhite | 149 |
12 | Semi-feral | 10 |
13 | Wild boar | 13 |
14 | Yucatan | 10 |
15 | Mixed breed pig | 48 |
16 | Hampshire | 14 |
17 | Guinea hog | 15 |
18 | HU_TN | 11 |
19 | BA_ME | 11 |
20 | CP_SO | 12 |
21 | Bisaro | 14 |
Total | | 676 |
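The class sizes in the table are quite uneven, which matters when splitting the data or choosing an evaluation metric. The short sketch below copies the counts from the table into a dictionary (the variable names are illustrative only) and checks them against the stated total.

```python
# Sample counts per breed, copied from the table above.
class_counts = {
    "Creole": 90, "Monteiro": 10, "Zungo": 10, "Jiangquhai": 11,
    "Jinhua": 16, "Meishan": 16, "Xiang pig": 11, "Iberian": 15,
    "Duroc": 44, "Landrace": 146, "Largewhite": 149, "Semi-feral": 10,
    "Wild boar": 13, "Yucatan": 10, "Mixed breed pig": 48,
    "Hampshire": 14, "Guinea hog": 15, "HU_TN": 11, "BA_ME": 11,
    "CP_SO": 12, "Bisaro": 14,
}

total = sum(class_counts.values())
largest_breed = max(class_counts, key=class_counts.get)
smallest_size = min(class_counts.values())

# Share of all samples contributed by each breed, largest first.
shares = sorted(
    ((breed, count / total) for breed, count in class_counts.items()),
    key=lambda item: item[1],
    reverse=True,
)
```

The largest class (Largewhite, 149 samples) is roughly 15 times the size of the smallest classes (10 samples each), so stratified splitting is a reasonable default when working with these data.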
If you use this dataset in your own study, please cite the following article:
Wanthanee Rathasamuth and Kitsuchart Pasupa, "A Modified Binary Flower Pollination Algorithm: A Fast and Effective Combination of Feature Selection Techniques for SNP Classification", In: Proceedings of the 11th International Conference on Information Technology and Electrical Engineering (ICITEE 2019), 10-11 October 2019, Pattaya, Thailand.
[1] Burgos-Paz, W., et al. (2013), "Porcine colonization of the Americas: a 60k SNP story," Heredity, 110(4):321-330.
[2] Tuangsithtanon, K. (2019), "Population structure in porcine," Figshare, Dataset, https://doi.org/10.6084/m9.figshare.8830799.v1.
[3] Ishihara, S., et al. (2018), "Genetic relationships among Vietnamese local pigs investigated using genome-wide SNP markers," Animal Genetics, 49(1):86-89.