docs/master

changeset 122:c234656329e0

Wrote about modularization.
author markus schnalke <meillo@marmaro.de>
date Fri, 29 Jun 2012 22:51:25 +0200
parents edbc6e1dc636
children 740f4128dea7
files discussion.roff
diffstat 1 files changed, 243 insertions(+), 7 deletions(-)
line diff
     1.1 --- a/discussion.roff	Tue Jun 26 22:06:20 2012 +0200
     1.2 +++ b/discussion.roff	Fri Jun 29 22:51:25 2012 +0200
     1.3 @@ -2538,7 +2538,7 @@
     1.4  kernighan pike practice of programming
     1.5  .], p. 23]
     1.6  demands: ``Don't belabor the obvious.''
     1.7 -Hence, I simply removed comments like the following:
     1.8 +Hence, I simply removed all the comments in the following code excerpt:
     1.9  .VS
    1.10  context_replace(curfolder, folder);  /* update current folder  */
    1.11  seq_setcur(mp, mp->lowsel);  /* update current message */
    1.12 @@ -2954,13 +2954,249 @@
    1.13  
    1.14  .H2 "Modularization
    1.15  .P
    1.16 -The \fIMH library\fP
    1.17 -.Fn libmh.a
    1.18 -collects a bunch of standard functions that many of the MH tools need,
    1.19 -like reading the profile or context files.
    1.20 -This doesn't hurt the separation.
    1.21 +Mmh's code base is split into two directories,
    1.22 +.Fn sbr
    1.23 +(``subroutines'')
    1.24 +and
    1.25 +.Fn uip
    1.26 +(``user interface programs'').
    1.27 +The directory
    1.28 +.Fn sbr
    1.29 +contains the sources of the \fIMH library\fP
    1.30 +.Fn libmh.a .
    1.31 +It includes functions that mmh tools usually need.
    1.32 +Among them are MH-specific functions for profile, context, sequence,
    1.33 +and folder handling, but also
    1.34 +MH-independent functions, such as advanced string processing functions,
    1.35 +portability interfaces and error-checking wrappers for critical
    1.36 +functions of the standard library.
    1.37  .P
    1.38 -whatnowproc
    1.39 +The MH library is a standard library for the source files in the
    1.40 +.Fn uip
    1.41 +directory.
    1.42 +There reside the sources of the programs of the mmh toolchest.
    1.43 +Each tool has a source file with the same name.
    1.44 +For example,
    1.45 +.Pn rmm
    1.46 +is built from
    1.47 +.Fn uip/rmm.c .
    1.48 +Some source files are used by multiple programs.
    1.49 +For example,
    1.50 +.Fn uip/scansbr.c
    1.51 +is used by both
    1.52 +.Pn scan
    1.53 +and
    1.54 +.Pn inc .
    1.55 +In nmh, 49 tools were built from 76 source files.
    1.56 +That is a ratio of 1.6 source files per program.
    1.57 +17 programs depended on the equally named source file only.
    1.58 +32 programs depended on multiple source files.
    1.59 +In mmh, 39 tools are built from 51 source files.
    1.60 +That is a ratio of 1.3 source files per program.
    1.61 +21 programs depend on the equally named source file only.
    1.62 +18 programs depend on multiple source files.
    1.63 +The MH library as well as shell scripts and multiple names to the
    1.64 +same program were ignored.
    1.65 +.P
    1.66 +Splitting the source code of one program into multiple files can
    1.67 +increase the readability of its source code.
    1.68 +This applies primarily to complex programs.
    1.69 +Most of the mmh tools, however, are simple and straightforward programs.
    1.70 +With the exception of the MIME handling tools,
    1.71 +.Pn pick
    1.72 +is the largest tool.
    1.73 +It contains 1\|037 lines of source code (measured with
    1.74 +.Pn sloccount ), excluding the MH library.
    1.75 +Only the MIME handling tools (\c
    1.76 +.Pn mhbuild ,
    1.77 +.Pn mhstore ,
    1.78 +.Pn show ,
    1.79 +etc.)
    1.80 +are larger.
    1.81 +Splitting programs with fewer than 1\|000 lines of code into multiple
    1.82 +source files seldom leads to better readability.
    1.83 +For such tools, splitting makes sense
    1.84 +when parts of the code are reused in other programs,
    1.85 +and the reused code fragment is not general enough
    1.86 +for including it in the MH library,
    1.87 +or if it has dependencies on a library that only a few programs need.
    1.88 +.Fn uip/packsbr.c ,
    1.89 +for instance, provides the core program logic for the
    1.90 +.Pn packf
    1.91 +and
    1.92 +.Pn rcvpack
    1.93 +programs.
    1.94 +.Fn uip/packf.c
    1.95 +and
    1.96 +.Fn uip/rcvpack.c
    1.97 +mainly wrap the core function appropriately.
    1.98 +No other tools use the folder packing functions.
    1.99 +.P
   1.100 +The task of MIME handling is complex enough that splitting its code
   1.101 +into multiple source files improves the readability.
   1.102 +The program
   1.103 +.Pn mhstore ,
   1.104 +for instance, is compiled out of seven source files with 2\|500
   1.105 +lines of code in total.
   1.106 +The main code file
   1.107 +.Fn uip/mhstore.c
   1.108 +consists of 800 lines; the rest is reused in the other MIME handling tools.
   1.109 +It might be worthwhile to bundle the generic MIME handling code into
   1.110 +an MH-MIME library, in resemblance to the MH standard library.
   1.111 +This is left open for the future.
   1.112 +.P
   1.113 +The work already done focussed on the non-MIME tools.
   1.114 +The amount of code compiled into each program was reduced.
   1.115 +This eased the understanding of the code base.
   1.116 +In nmh,
   1.117 +.Pn comp
   1.118 +was built from six source files:
   1.119 +.Fn comp.c ,
   1.120 +.Fn whatnowproc.c ,
   1.121 +.Fn whatnowsbr.c ,
   1.122 +.Fn sendsbr.c ,
   1.123 +.Fn annosbr.c ,
   1.124 +and
   1.125 +.Fn distsbr.c .
   1.126 +In mmh, it builds from only two:
   1.127 +.Fn comp.c
   1.128 +and
   1.129 +.Fn whatnowproc.c .
   1.130 +Instead of invoking the
   1.131 +.Pn whatnow ,
   1.132 +.Pn send ,
   1.133 +and
   1.134 +.Pn anno
   1.135 +programs,
   1.136 +their core functions were compiled into nmh's
   1.137 +.Pn comp .
   1.138 +This saved the need to
   1.139 +.Fu fork()
   1.140 +and
   1.141 +.Fu exec() ,
   1.142 +two expensive system calls.
   1.143 +Whereas this approach improved the time performance,
   1.144 +it interwove the source code.
   1.145 +Core functionalities were not encapsulated into programs but into
   1.146 +functions, which were then wrapped by programs.
   1.147 +For example,
   1.148 +.Fn uip/annosbr.c
   1.149 +included the function
   1.150 +.Fu annotate() .
   1.151 +Each program that wanted to annotate messages included the source file
   1.152 +.Fn uip/annosbr.c .
   1.153 +The programs called
   1.154 +.Fu annotate() ,
   1.155 +which required seven parameters, reflecting the command line switches of
   1.156 +.Pn anno .
   1.157 +When another pair of command line switches was added to
   1.158 +.Pn anno ,
   1.159 +a rather ugly hack was implemented to avoid adding another parameter
   1.160 +to the function.
   1.161 +.Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f
   1.162 +.P
   1.163 +Separation simplifies the understanding of program code
   1.164 +because the area influenced by any particular statement is smaller.
   1.165 +The separation on the program level is stricter than the separation
   1.166 +on the function level.
   1.167 +In mmh, the relevant code of
   1.168 +.Pn comp
   1.169 +comprises the two files
   1.170 +.Fn uip/comp.c
   1.171 +and
   1.172 +.Fn uip/whatnowproc.c ,
   1.173 +together 210 lines of code,
   1.174 +the standard libraries excluded.
   1.175 +In nmh,
   1.176 +.Pn comp
   1.177 +comprises six files with 2\|450 lines.
   1.178 +Of course, not all of the code in these six files was actually used by
   1.179 +.Pn comp ,
   1.180 +but the code reader needs to understand the code first to know which.
   1.181 +.P
   1.182 +As I have read a lot in the code base during the last two years to
   1.183 +understand it, I learned about the easy and the difficult parts.
   1.184 +The smaller the influenced code area is, the stricter the boundaries
   1.185 +are defined, and the more straight-forward the code is written,
   1.186 +the easier it is to understand.
   1.187 +Reading
   1.188 +.Pn rmm 's
   1.189 +source code in
   1.190 +.Fn uip/rmm.c
   1.191 +is my recommendation for a beginner's entry point into the code base of nmh.
   1.192 +The reasons are that the task of
   1.193 +.Pn rmm
   1.194 +is straightforward and it consists of one small source code file only,
   1.195 +yet its source includes code constructs typical for MH tools.
   1.196 +With the introduction of the trash folder in mmh,
   1.197 +.Pn rmm
   1.198 +became a bit more complex, because it invokes
   1.199 +.Pn refile .
   1.200 +Still, it is a good example for a simple tool with clear sources.
   1.201 +.P
   1.202 +Understanding
   1.203 +.Pn comp
   1.204 +requires reading 210 lines of code in mmh, but more than ten times as many in nmh.
   1.205 +In the aforementioned hack in
   1.206 +.Pn anno
   1.207 +to save the additional parameter, information passed through the program's
   1.208 +source base in obscure ways.
   1.209 +To understand
   1.210 +.Pn comp ,
   1.211 +one needed to understand the inner workings of
   1.212 +.Fn uip/annosbr.c
   1.213 +first.
   1.214 +To be sure, to fully understand a program, its whole source code needs
   1.215 +to be examined.
   1.216 +Otherwise it would be a leap of faith, assuming that the developers
   1.217 +have avoided obscure programming techniques.
   1.218 +By separating the tools on the program level, the boundaries are
   1.219 +clearly visible and technically enforced.
   1.220 +The interfaces are calls to
   1.221 +.Fu exec()
   1.222 +rather than arbitrary function calls.
   1.223 +In order to understand
   1.224 +.Pn comp ,
   1.225 +it is no longer necessary to read
   1.226 +.Fn uip/sendsbr.c .
   1.227 +In mmh,
   1.228 +.Pn comp
   1.229 +no longer sends messages.
   1.230 +In nmh, there surely is
   1.231 +.Pn send ,
   1.232 +but
   1.233 +.Pn comp
   1.234 +\&... and
   1.235 +.Pn repl
   1.236 +and
   1.237 +.Pn forw
   1.238 +and
   1.239 +.Pn dist
   1.240 +and
   1.241 +.Pn whatnow
   1.242 +and
   1.243 +.Pn viamail
   1.244 +(!) ... all have the same message sending function included.
   1.245 +The clear separation on the surface \(en the toolchest approach \(en
   1.246 +is violated on the level below.
   1.247 +This violation is for the sake of time performance.
   1.248 +On systems where
   1.249 +.Fu fork()
   1.250 +and
   1.251 +.Fu exec()
   1.252 +are expensive, the quicker response might be noticeable.
   1.253 +In the old times, sacrificing readability and conceptual beauty for speed
   1.254 +might even have been necessary to prevent MH from being unusably slow.
   1.255 +Whatever the reasons had been, today they are gone.
   1.256 +No longer should we sacrifice readability and conceptual beauty.
   1.257 +No longer should we violate the Unix philosophy's ``one tool, one job''
   1.258 +guideline.
   1.259 +No longer should we keep speed improvements that are unnecessary today.
   1.260 +.P
   1.261 +In mmh, the different jobs are divided among separate programs that
   1.262 +invoke each other as needed.
   1.263 +The clear separation on the surface is still visible on the level below.
   1.264 +
   1.265  
   1.266  
   1.267
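
Note on the program-level interface described in the Modularization text above: the point that tools invoke each other through fork() and exec() rather than through linked-in functions can be illustrated with a minimal C sketch. The helper name run_tool() is hypothetical and does not come from mmh or nmh; it only shows the general POSIX pattern, under the assumption that the tool is found via PATH and that its exit status is all the caller needs.

```c
/* Sketch only: run_tool() is a hypothetical helper, not part of mmh or nmh. */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program named by argv[0] with the NULL-terminated argument
 * vector argv and wait for it to finish.
 * Returns the child's exit status, or -1 if the child could not be
 * created or did not exit normally. A status of 127 means that the
 * program could not be executed.
 */
int
run_tool(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1) {
		return -1;  /* fork failed */
	}
	if (pid == 0) {
		/* child: replace the process image with the tool */
		execvp(argv[0], argv);
		_exit(127);  /* only reached if execvp failed */
	}

	/* parent: wait for the child, retrying if interrupted by a signal */
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR) {
			return -1;
		}
	}
	if (!WIFEXITED(status)) {
		return -1;  /* killed by a signal or otherwise abnormal */
	}
	return WEXITSTATUS(status);
}
```

With such a helper, the only things shared between caller and callee are the argument vector and the exit status, which is what makes the boundary visible and technically enforced. The cost is one fork() and one exec() per invocation.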