comparison discussion.roff @ 122:c234656329e0

Wrote about modularization.
author markus schnalke <meillo@marmaro.de>
date Fri, 29 Jun 2012 22:51:25 +0200
parents edbc6e1dc636
children 740f4128dea7
comparison
121:edbc6e1dc636 122:c234656329e0
2536 Section 1.6 of 2536 Section 1.6 of
2537 .[ [ 2537 .[ [
2538 kernighan pike practice of programming 2538 kernighan pike practice of programming
2539 .], p. 23] 2539 .], p. 23]
2540 demands: ``Don't belabor the obvious.'' 2540 demands: ``Don't belabor the obvious.''
2541 Hence, I simply removed comments like the following: 2541 Hence, I simply removed all the comments in the following code excerpt:
2542 .VS 2542 .VS
2543 context_replace(curfolder, folder); /* update current folder */ 2543 context_replace(curfolder, folder); /* update current folder */
2544 seq_setcur(mp, mp->lowsel); /* update current message */ 2544 seq_setcur(mp, mp->lowsel); /* update current message */
2545 seq_save(mp); /* synchronize message sequences */ 2545 seq_save(mp); /* synchronize message sequences */
2546 folder_free(mp); /* free folder/message structure */ 2546 folder_free(mp); /* free folder/message structure */
2952 2952
2953 2953
2954 2954
2955 .H2 "Modularization 2955 .H2 "Modularization
2956 .P 2956 .P
2957 The \fIMH library\fP 2957 Mmh's code base is split into two directories,
2958 .Fn libmh.a 2958 .Fn sbr
2959 collects a bunch of standard functions that many of the MH tools need, 2959 (``subroutines'')
2960 like reading the profile or context files. 2960 and
2961 This doesn't hurt the separation. 2961 .Fn uip
2962 .P 2962 (``user interface programs'').
2963 whatnowproc 2963 The directory
2964 .Fn sbr
2965 contains the sources of the \fIMH library\fP
2966 .Fn libmh.a .
2967 It includes functions that mmh tools usually need.
2968 Among them are MH-specific functions for profile, context, sequence,
2969 and folder handling, but also
2970 MH-independent functions, such as advanced string processing functions,
2971 portability interfaces and error-checking wrappers for critical
2972 functions of the standard library.
2973 .P
2974 The MH library is a standard library for the source files in the
2975 .Fn uip
2976 directory.
2977 That directory holds the sources of the programs of the mmh toolchest.
2978 Each tool has a source file with the same name.
2979 For example,
2980 .Pn rmm
2981 is built from
2982 .Fn uip/rmm.c .
2983 Some source files are used by multiple programs.
2984 For example,
2985 .Fn uip/scansbr.c
2986 is used by both
2987 .Pn scan
2988 and
2989 .Pn inc .
2990 In nmh, 49 tools were built from 76 source files.
2991 That is a ratio of 1.6 source files per program.
2992 17 programs depended on the equally named source file only.
2993 32 programs depended on multiple source files.
2994 In mmh, 39 tools are built from 51 source files.
2995 That is a ratio of 1.3 source files per program.
2996 21 programs depend on the equally named source file only.
2997 18 programs depend on multiple source files.
2998 The MH library as well as shell scripts and multiple names for the
2999 same program were ignored.
3000 .P
3001 Splitting the source code of one program into multiple files can
3002 increase the readability of its source code.
3003 This applies primarily to complex programs.
3004 Most of the mmh tools, however, are simple and straightforward programs.
3005 With the exception of the MIME handling tools,
3006 .Pn pick
3007 is the largest tool.
3008 It contains 1\|037 lines of source code (measured with
3009 .Pn sloccount ), excluding the MH library.
3010 Only the MIME handling tools (\c
3011 .Pn mhbuild ,
3012 .Pn mhstore ,
3013 .Pn show ,
3014 etc.)
3015 are larger.
3016 Splitting programs with fewer than 1\|000 lines of code into multiple
3017 source files seldom leads to better readability.
3018 For such tools, splitting makes sense
3019 when parts of the code are reused in other programs
3020 and the reused code fragment is not general enough
3021 for inclusion in the MH library,
3022 or if it has dependencies on a library that only a few programs need.
3023 .Fn uip/packsbr.c ,
3024 for instance, provides the core program logic for the
3025 .Pn packf
3026 and
3027 .Pn rcvpack
3028 programs.
3029 .Fn uip/packf.c
3030 and
3031 .Fn uip/rcvpack.c
3032 mainly wrap the core function appropriately.
3033 No other tools use the folder packing functions.
3034 .P
3035 The task of MIME handling is complex enough that splitting its code
3036 into multiple source files improves the readability.
3037 The program
3038 .Pn mhstore ,
3039 for instance, is compiled out of seven source files with 2\|500
3040 lines of code in total.
3041 The main code file
3042 .Fn uip/mhstore.c
3043 consists of 800 lines; the rest is reused in the other MIME handling tools.
3044 It might be worthwhile to bundle the generic MIME handling code into
3045 an MH-MIME library, in resemblance to the MH standard library.
3046 This is left open for the future.
3047 .P
3048 The work already done focussed on the non-MIME tools.
3049 The amount of code compiled into each program was reduced.
3050 This eased the understanding of the code base.
3051 In nmh,
3052 .Pn comp
3053 was built from six source files:
3054 .Fn comp.c ,
3055 .Fn whatnowproc.c ,
3056 .Fn whatnowsbr.c ,
3057 .Fn sendsbr.c ,
3058 .Fn annosbr.c ,
3059 and
3060 .Fn distsbr.c .
3061 In mmh, it builds from only two:
3062 .Fn comp.c
3063 and
3064 .Fn whatnowproc.c .
3065 Instead of invoking the
3066 .Pn whatnow ,
3067 .Pn send ,
3068 and
3069 .Pn anno
3070 programs,
3071 their core functions were compiled into nmh's
3072 .Pn comp .
3073 This avoided the need to
3074 .Fu fork()
3075 and
3076 .Fu exec() ,
3077 two expensive system calls.
3078 Whereas this approach improved the time performance,
3079 it interwove the source code.
3080 Core functionalities were not encapsulated into programs but into
3081 functions, which were then wrapped by programs.
3082 For example,
3083 .Fn uip/annosbr.c
3084 included the function
3085 .Fu annotate() .
3086 Each program that wanted to annotate messages included the source file
3087 .Fn uip/annosbr.c .
3088 The programs called
3089 .Fu annotate() ,
3090 which required seven parameters, reflecting the command line switches of
3091 .Pn anno .
3092 When another pair of command line switches was added to
3093 .Pn anno ,
3094 a rather ugly hack was implemented to avoid adding another parameter
3095 to the function.
3096 .Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f
3097 .P
3098 Separation simplifies the understanding of program code
3099 because the area influenced by any particular statement is smaller.
3100 Separation on the program level is stricter than separation
3101 on the function level.
3102 In mmh, the relevant code of
3103 .Pn comp
3104 comprises the two files
3105 .Fn uip/comp.c
3106 and
3107 .Fn uip/whatnowproc.c ,
3108 together 210 lines of code,
3109 the standard libraries excluded.
3110 In nmh,
3111 .Pn comp
3112 comprises six files with 2\|450 lines.
3113 Of course, not all of the code in these six files was actually used by
3114 .Pn comp ,
3115 but the code reader needs to understand the code first to know which.
3116 .P
3117 Having read a lot of the code base during the last two years in order to
3118 understand it, I learned which parts are easy and which are difficult.
3119 The smaller the influenced code area is, the stricter the boundaries
3120 are defined, and the more straightforward the code is written,
3121 the easier it is to understand.
3122 Reading
3123 .Pn rmm 's
3124 source code in
3125 .Fn uip/rmm.c
3126 is my recommendation for a beginner's entry point into the code base of nmh.
3127 The reasons are that the task of
3128 .Pn rmm
3129 is straightforward and it consists of only one small source file,
3130 yet its source includes code constructs typical for MH tools.
3131 With the introduction of the trash folder in mmh,
3132 .Pn rmm
3133 became a bit more complex, because it invokes
3134 .Pn refile .
3135 Still, it is a good example of a simple tool with clear sources.
3136 .P
3137 Understanding
3138 .Pn comp
3139 requires reading 210 lines of code in mmh, but more than ten times as many in nmh.
3140 In the aforementioned hack in
3141 .Pn anno
3142 to save the additional parameter, information passed through the program's
3143 source base in obscure ways.
3144 To understand
3145 .Pn comp ,
3146 one needed to understand the inner workings of
3147 .Fn uip/annosbr.c
3148 first.
3149 To be sure, to fully understand a program, its whole source code needs
3150 to be examined.
3151 Otherwise it would be a leap of faith, assuming that the developers
3152 have avoided obscure programming techniques.
3153 By separating the tools on the program-level, the boundaries are
3154 clearly visible and technically enforced.
3155 The interfaces are calls to
3156 .Fu exec()
3157 rather than arbitrary function calls.
3158 In order to understand
3159 .Pn comp ,
3160 it is no longer necessary to read
3161 .Fn uip/sendsbr.c .
3162 In mmh,
3163 .Pn comp
3164 no longer sends messages.
3165 In nmh, there surely is
3166 .Pn send ,
3167 but
3168 .Pn comp
3169 \&... and
3170 .Pn repl
3171 and
3172 .Pn forw
3173 and
3174 .Pn dist
3175 and
3176 .Pn whatnow
3177 and
3178 .Pn viamail
3179 (!) ... all have the same message sending function included.
3180 The clear separation on the surface \(en the toolchest approach \(en
3181 is violated on the level below.
3182 This violation is for the sake of time performance.
3183 On systems where
3184 .Fu fork()
3185 and
3186 .Fu exec()
3187 are expensive, the quicker response might be noticeable.
3188 In the old times, sacrificing readability and conceptual beauty for speed
3189 might even have been necessary to prevent MH from being unusably slow.
3190 Whatever the reasons were, today they are gone.
3191 No longer should we sacrifice readability and conceptual beauty.
3192 No longer should we violate the Unix philosophy's ``one tool, one job''
3193 guideline.
3194 No longer should we keep speed improvements that are unnecessary today.
3195 .P
3196 In mmh, the different jobs are divided among separate programs that
3197 invoke each other as needed.
3198 The clear separation on the surface is still visible on the level below.
3199
2964 3200
2965 3201
2966 3202
2967 .H2 "Separation 3203 .H2 "Separation
2968 3204