
Table 2 Performance of PMFFRC on H. sapiens dataset

From: PMFFRC: a large-scale genomic short reads compression optimizer via memory modeling and redundant clustering

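Compression columns: cs, pss, cr, crg, cpm, ct. Decompression columns: dpm, dt.

| Algorithm | Parameter | cs (GB) | pss (%) | cr (bits/base) | crg (%) | cpm (GB) | ct (h) | dpm (GB) | dt (h) |
|---|---|---|---|---|---|---|---|---|---|
| HARC | Without | 7.15 | – | 1.61 | – | 1.78 | 1.37 | 1.57 | 1.51 |
| HARC | -u10, k3 | 4.68 | 34.55 | 1.06 | 52.77 | 9.49 | 1.29 | 5.22 | 1.65 |
| HARC | -u20, k1 | 3.27 | 45.74 | 0.74 | 118.54 | 19.61 | 1.01 | 5.89 | 1.74 |
| SPRING | Without | 7.16 | – | 1.62 | – | 3.64 | 0.97 | 1.22 | 0.26 |
| SPRING | -u10, k4 | 4.75 | 33.66 | 1.07 | 50.68 | 7.09 | 1.39 | 2.85 | 0.19 |
| SPRING | -u20, k3 | 4.19 | 41.48 | 0.95 | 70.91 | 10.29 | 1.26 | 4.63 | 0.22 |
| SPRING | -u40, k1 | 3.16 | 55.87 | 0.71 | 126.95 | 20.71 | 1.20 | 4.95 | 0.23 |
| Mstcom | Without | 6.13 | – | 1.38 | – | 13.60 | 8.97 | 6.22 | 1.32 |
| Mstcom | -u40, k4 | 5.70 | 7.01 | 1.29 | 7.50 | 38.09 | 12.25 | 33.71 | 1.51 |
| Mstcom | -u80, k3 | 3.72 | 39.31 | 0.84 | 64.57 | 53.71 | 10.24 | 69.08 | 0.94 |
| Mstcom | -u120, k1 | 2.94 | 52.04 | 0.66 | 108.51 | 107.39 | 10.91 | 69.08 | 0.82 |
| FastqCLS | Without | 7.99 | – | 1.80 | – | 23.08 | 8.46 | 5.32 | 5.08 |
| FastqCLS | -u40, k4 | 7.98 | 0.13 | 1.80 | 0.22 | 35.35 | 6.88 | 6.45 | 5.34 |
| FastqCLS | -u80, k3 | 7.89 | 1.25 | 1.78 | 1.24 | 35.43 | 6.76 | 6.47 | 5.31 |
| FastqCLS | -u120, k1 | 7.63 | 4.51 | 1.72 | 4.76 | 35.48 | 7.12 | 6.46 | 5.36 |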

  1. The parameter "Without" indicates that the PMFFRC algorithm is not used for compression optimization. (-u10, k3) denotes the PMFFRC clustering parameter K = 3 when Uram = 10 GB. HARC, SPRING, Mstcom, and FastqCLS are used in with-order mode (lossless). Compression parameters: Pr = 20, T = 8 (threads for the cascaded algorithm), βHARC = 1.05, βSPRING = 0.15, βFastqCLS = 0.15, βMstcom = 0.30, x1 = 100, x2 = 100,100. The compression gains obtained by the cascaded PMFFRC algorithms are marked in boldface. "–" means the result is not optimized by the PMFFRC algorithm. For the experimentally optimized algorithms, we ensure the data integrity of the lossless compression optimization by comparing data hash fingerprints.
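
The note above says that losslessness was confirmed by comparing data hash fingerprints, without naming the hash function or the tooling. The following is a minimal sketch of such a check, not the authors' procedure: the choice of SHA-256, the function names `file_fingerprint` and `is_lossless_round_trip`, and the chunk size are all assumptions made for illustration. It also assumes an order-preserving mode, so that the decompressed file is expected to be byte-identical to the original.

```python
import hashlib


def file_fingerprint(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    SHA-256 is an assumed default; the source does not name a hash function.
    Chunked reading keeps memory use flat for multi-gigabyte read files.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_lossless_round_trip(original_path, restored_path, algorithm="sha256"):
    """Return True when both files have the same fingerprint.

    Hypothetical helper: equal digests are taken as evidence that the
    compress/decompress round trip reproduced the original bytes.
    """
    return (file_fingerprint(original_path, algorithm)
            == file_fingerprint(restored_path, algorithm))
```

Calling `is_lossless_round_trip("reads.fastq", "reads.restored.fastq")` with two placeholder paths would report whether the round trip preserved the file. In a mode that reorders reads, a byte-level digest would differ even when no read is lost, so this sketch applies only to the with-order setting described in the note.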