List of final codas in the written corpus SYN2010, with their frequency characteristics.
Simple codas | CC codas | CCC codas | CCCC codas |
type | coda | abs_freq (tokens) | rel_freq (tokens) | IPM (tokens) | abs_freq (types) | rel_freq (types) |
---|---|---|---|---|---|---|
C | 17 codas | 20 036 751 | 100,00% | 164 684,61 | 11 298 | 100,00% |
C | m | 4 040 096 | 20,16% | 33 206,06 | 2 594 | 22,96% |
C | t | 3 621 355 | 18,07% | 29 764,38 | 2 021 | 17,89% |
C | l | 2 750 731 | 13,73% | 22 608,61 | 1 906 | 16,87% |
C | x | 2 010 314 | 10,03% | 16 523,03 | 1 792 | 15,86% |
C | k | 1 865 738 | 9,31% | 15 334,74 | 823 | 7,28% |
C | š | 1 592 837 | 7,95% | 13 091,73 | 220 | 1,95% |
C | n | 1 118 785 | 5,58% | 9 195,44 | 557 | 4,93% |
C | s | 780 010 | 3,89% | 6 411,00 | 273 | 2,42% |
C | c | 560 352 | 2,80% | 4 605,60 | 173 | 1,53% |
C | r | 365 264 | 1,82% | 3 002,15 | 259 | 2,29% |
C | j | 295 445 | 1,47% | 2 428,30 | 120 | 1,06% |
C | f | 264 357 | 1,32% | 2 172,78 | 196 | 1,73% |
C | ť | 228 982 | 1,14% | 1 882,03 | 51 | 0,45% |
C | p | 183 403 | 0,92% | 1 507,41 | 131 | 1,16% |
C | č | 131 793 | 0,66% | 1 083,22 | 60 | 0,53% |
C | ř | 113 907 | 0,57% | 936,22 | 74 | 0,65% |
C | ň | 113 382 | 0,57% | 931,90 | 48 | 0,42% |
type | coda | abs_freq (tokens) | rel_freq (tokens) | IPM (tokens) | abs_freq (types) | rel_freq (types) |
---|---|---|---|---|---|---|
CC | 54 codas | 1 026 850 | 100,00% | 8 439,81 | 682 | 100,00% |
CC | st | 510 462 | 49,71% | 4 195,55 | 339 | 49,71% |
CC | nt | 86 873 | 8,46% | 714,02 | 63 | 9,24% |
CC | ct | 51 098 | 4,98% | 419,98 | 14 | 2,05% |
CC | rt | 49 174 | 4,79% | 404,17 | 39 | 5,72% |
CC | kt | 44 670 | 4,35% | 367,15 | 19 | 2,79% |
CC | mš | 42 095 | 4,10% | 345,98 | 10 | 1,47% |
CC | xš | 33 608 | 3,27% | 276,23 | 5 | 0,73% |
CC | př | 23 004 | 2,24% | 189,07 | 3 | 0,44% |
CC | tř | 21 134 | 2,06% | 173,70 | 5 | 0,73% |
CC | šť | 20 106 | 1,96% | 165,25 | 17 | 2,49% |
CC | nš | 16 945 | 1,65% | 139,27 | 1 | 0,15% |
CC | sk | 15 568 | 1,52% | 127,96 | 17 | 2,49% |
CC | ks | 14 268 | 1,39% | 117,27 | 18 | 2,64% |
CC | nk | 13 346 | 1,30% | 109,69 | 17 | 2,49% |
CC | lm | 12 908 | 1,26% | 106,09 | 3 | 0,44% |
CC | rk | 8 156 | 0,79% | 67,04 | 13 | 1,91% |
CC | jn | 5 901 | 0,57% | 48,50 | 6 | 0,88% |
CC | lf | 5 588 | 0,54% | 45,93 | 8 | 1,17% |
CC | lt | 5 542 | 0,54% | 45,55 | 12 | 1,76% |
CC | rs | 4 211 | 0,41% | 34,61 | 4 | 0,59% |
CC | jl | 3 916 | 0,38% | 32,19 | 4 | 0,59% |
CC | jť | 3 596 | 0,35% | 29,56 | 2 | 0,29% |
CC | pt | 3 377 | 0,33% | 27,76 | 4 | 0,59% |
CC | jš | 3 051 | 0,30% | 25,08 | 3 | 0,44% |
CC | jt | 3 028 | 0,29% | 24,89 | 3 | 0,44% |
CC | nc | 2 995 | 0,29% | 24,62 | 3 | 0,44% |
CC | js | 2 217 | 0,22% | 18,22 | 2 | 0,29% |
CC | št | 2 147 | 0,21% | 17,65 | 4 | 0,59% |
CC | xť | 1 308 | 0,13% | 10,75 | 1 | 0,15% |
CC | ls | 1 220 | 0,12% | 10,03 | 3 | 0,44% |
CC | ns | 1 059 | 0,10% | 8,70 | 2 | 0,29% |
CC | mf | 1 045 | 0,10% | 8,59 | 1 | 0,15% |
CC | mp | 1 007 | 0,10% | 8,28 | 3 | 0,44% |
CC | kš | 995 | 0,10% | 8,18 | 3 | 0,44% |
CC | lp | 920 | 0,09% | 7,56 | 2 | 0,29% |
CC | rn | 903 | 0,09% | 7,42 | 2 | 0,29% |
CC | sť | 896 | 0,09% | 7,36 | 2 | 0,29% |
CC | rx | 828 | 0,08% | 6,81 | 1 | 0,15% |
CC | jm | 807 | 0,08% | 6,63 | 1 | 0,15% |
CC | rf | 738 | 0,07% | 6,07 | 3 | 0,44% |
CC | ps | 713 | 0,07% | 5,86 | 2 | 0,29% |
CC | lc | 691 | 0,07% | 5,68 | 2 | 0,29% |
CC | rš | 576 | 0,06% | 4,73 | 1 | 0,15% |
CC | ft | 550 | 0,05% | 4,52 | 2 | 0,29% |
CC | rm | 512 | 0,05% | 4,21 | 2 | 0,29% |
CC | rp | 467 | 0,05% | 3,84 | 2 | 0,29% |
CC | nč | 433 | 0,04% | 3,56 | 2 | 0,29% |
CC | jf | 415 | 0,04% | 3,41 | 1 | 0,15% |
CC | rč | 401 | 0,04% | 3,30 | 1 | 0,15% |
CC | rc | 359 | 0,03% | 2,95 | 1 | 0,15% |
CC | jk | 273 | 0,03% | 2,24 | 1 | 0,15% |
CC | xt | 265 | 0,03% | 2,18 | 1 | 0,15% |
CC | lk | 244 | 0,02% | 2,01 | 1 | 0,15% |
CC | lč | 241 | 0,02% | 1,98 | 1 | 0,15% |
type | coda | abs_freq (tokens) | rel_freq (tokens) | IPM (tokens) | abs_freq (types) | rel_freq (types) |
---|---|---|---|---|---|---|
CCC | 6 codas | 9 986 | 100,00% | 82,08 | 9 | 100,00% |
CCC | kst | 6 827 | 68,37% | 56,11 | 3 | 33,33% |
CCC | jsk | 934 | 9,35% | 7,68 | 1 | 11,11% |
CCC | nkt | 711 | 7,12% | 5,84 | 1 | 11,11% |
CCC | rkt | 677 | 6,78% | 5,56 | 1 | 11,11% |
CCC | rks | 637 | 6,38% | 5,24 | 2 | 22,22% |
CCC | jls | 200 | 2,00% | 1,64 | 1 | 11,11% |
type | coda | abs_freq (tokens) | rel_freq (tokens) | IPM (tokens) | abs_freq (types) | rel_freq (types) |
---|---|---|---|---|---|---|
CCCC | 1 coda | 429 | 100,00% | 3,53 | 1 | 100,00% |
CCCC | rnst | 429 | 100,00% | 3,53 | 1 | 100,00% |
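
The relative-frequency and IPM columns in the tables above appear to follow from the absolute token counts by simple arithmetic. The Python sketch below illustrates this under two assumptions that the page itself does not state: that relative frequency is the share of a coda within its own coda type (C, CC, CCC, CCCC), and that IPM means instances per million corpus tokens. The corpus size is not given here, so the sketch estimates it from the totals row of the single-consonant table. The function names are illustrative only.

```python
# Values copied from the single-consonant (C) table above.
C_TOTAL_TOKENS = 20_036_751   # totals row, absolute token frequency
C_TOTAL_IPM = 164_684.61      # totals row, IPM
M_TOKENS = 4_040_096          # row for coda "m"

# Assumption: IPM = abs_freq / corpus_size * 1,000,000.
# The corpus size is not stated on the page, so it is backed out of the totals row.
corpus_size_estimate = C_TOTAL_TOKENS / C_TOTAL_IPM * 1_000_000


def rel_freq_percent(abs_freq: int, group_total: int) -> float:
    """Share of one coda within its coda type, in percent (assumed definition)."""
    return 100 * abs_freq / group_total


def ipm(abs_freq: int, corpus_size: float) -> float:
    """Instances per million corpus tokens (assumed definition)."""
    return abs_freq / corpus_size * 1_000_000


m_rel = round(rel_freq_percent(M_TOKENS, C_TOTAL_TOKENS), 2)
m_ipm = round(ipm(M_TOKENS, corpus_size_estimate), 2)

print(f"estimated corpus size: {corpus_size_estimate:,.0f} tokens")
print(f"coda m: {m_rel} % of C codas, about {m_ipm} per million tokens")
```

Under these assumptions the computed values for the coda m agree with the table row (20,16% and 33 206,06) up to rounding, which suggests the columns were derived this way. The same check can be repeated for any other row by substituting its absolute frequency and the total of its coda type.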
© 2016 Filozofická Fakulta Univerzity Karlovy v Praze / Pavel Šturm, David Lukeš