docs/master

diff discussion.roff @ 123:740f4128dea7

Reworked and extended the text about Modularization.
author markus schnalke <meillo@marmaro.de>
date Sat, 30 Jun 2012 15:00:04 +0200
parents c234656329e0
children 3d30fd938aa9
     1.1 --- a/discussion.roff	Fri Jun 29 22:51:25 2012 +0200
     1.2 +++ b/discussion.roff	Sat Jun 30 15:00:04 2012 +0200
     1.3 @@ -2820,15 +2820,17 @@
     1.4  .Fu snprintf()
     1.5  function, for instance, was standardized with C99 and is available
     1.6  almost everywhere because of its high usefulness.
     1.7 -In mmh, the included implementation of
     1.8 +The project's own implementation of
     1.9  .Fu snprintf()
    1.10 -was dropped in favor for using the one of the standard library.
    1.11 -Such decisions could limit the portability of mmh,
    1.12 +was dropped in March 2012 in favor of using the one from the
    1.13 +standard library.
    1.14 +.Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32
    1.15 +Such decisions limit the portability of mmh
    1.16  if systems don't support these standardized and widespread functions.
    1.17 -This compromise is made.
    1.18 +This compromise is made because mmh focuses on the future.
    1.19  .P
    1.20 -I am not even thirty years old and my C and Unix experience comprises
    1.21 -only five to seven years.
    1.22 +I am not yet thirty years old and my C and Unix experience comprises
    1.23 +only half a dozen years.
    1.24  Hence, I need to learn about the history in retrospective.
    1.25  I have not used those ancient constructs myself.
    1.26  I have not suffered from their incompatibilities.
    1.27 @@ -2837,29 +2839,40 @@
    1.28  were well established already.
    1.29  I have only read a lot of books about the (good) old times.
    1.30  This puts me in a difficult position when working with old code.
    1.31 -I needed to acquire knowledge about old code constructs and ancient
    1.32 -programming styles.
    1.33 -Other know these things by heart from their own experience.
    1.34 +I need to freshly acquire knowledge about old code constructs and ancient
    1.35 +programming styles, whereas older programmers know these things by
    1.36 +heart from their own experience.
    1.37  .P
    1.38 -Being aware of these facts, I rather let people with more historic
    1.39 -experience solve the task of replacing ancient code constructs
    1.40 -with standardized ones.
    1.41 +Being aware of the situation, I would rather let people with more historic
    1.42 +experience replace ancient code constructs with standardized ones.
    1.43  Lyndon Nerenberg covered large parts of this task for the nmh project.
    1.44  He converted project-specific functions to POSIX replacements,
    1.45  also removing the conditional compilation of now standardized features.
    1.46 +Ken Hornstein and David Levine had their part in the work, too.
    1.47  Often, I only needed to pull over changes from nmh into mmh.
    1.48  These changes include many commits; these are among them:
    1.49  .Ci 768b5edd9623b7238e12ec8dfc409b82a1ed9e2d
    1.50  .Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32 .
    1.51 -
    1.52  .P
    1.53 -During my own work, I have replaced the
    1.54 +During my own work, I tidied up the \fIMH standard library\fP,
    1.55 +.Fn libmh.a ,
    1.56 +which is located in the
    1.57 +.Fn sbr
    1.58 +(``subroutines'') directory in the source tree.
    1.59 +The MH library includes functions that mmh tools usually need.
    1.60 +Among them are MH-specific functions for profile, context, sequence,
    1.61 +and folder handling, but also
    1.62 +MH-independent functions, such as auxiliary string functions,
    1.63 +portability interfaces and error-checking wrappers for critical
    1.64 +functions of the standard library.
    1.65 +.P
    1.66 +I have replaced the
    1.67  .Fu atooi()
    1.68  function with calls to
    1.69 -.Fu strtoul() ,
    1.70 +.Fu strtoul()
    1.71  with the third parameter \(en the base \(en set to eight.
    1.72  .Fu strtoul()
    1.73 -is part of C89 and thus safe to use.
    1.74 +is part of C89 and thus considered safe to use.
    1.75  .Ci c490c51b3c0f8871b6953bd0c74551404f840a74
    1.76  .P
    1.77  I did remove project-included fallback implementations of
    1.78 @@ -2879,19 +2892,20 @@
    1.79  In contrast to
    1.80  .Fu strcpy() ,
    1.81  it returns a pointer to the terminating null-byte in the destination area.
    1.82 +The code was adjusted to replace
    1.83  .Fu copy()
    1.84 -was replaced with
    1.85 +with
    1.86  .Fu strcpy() ,
    1.87  except within
    1.88  .Fu concat() ,
    1.89  where
    1.90  .Fu copy()
    1.91 -is more convenient.
    1.92 -Of course,
    1.93 +was more convenient.
    1.94 +Therefore, the definition of
    1.95  .Fu copy()
    1.96 -is locally defined in the source file of
    1.97 +was moved into the source file of
    1.98  .Fu concat()
    1.99 -now.
   1.100 +and its visibility is now limited to it.
   1.101  .Ci 552fd7253e5ee9e554c5c7a8248a6322aa4363bb
   1.102  .P
   1.103  The function
   1.104 @@ -2909,14 +2923,14 @@
   1.105  became desirable.
   1.106  Unfortunately, many of the 54 calls to
   1.107  .Fu r1bindex()
   1.108 -depended on its special behavior,
   1.109 +depended on a special behavior,
   1.110  which differed from the POSIX specification for
   1.111  .Fu basename() .
   1.112  Hence,
   1.113  .Fu r1bindex()
   1.114  was kept but renamed to
   1.115 -.Fu mhbasename()
   1.116 -and its second argument, the delimiter, was fixed to the slash.
   1.117 +.Fu mhbasename() ,
   1.118 +fixing the delimiter to the slash.
   1.119  .Ci 240013872c392fe644bd4f79382d9f5314b4ea60
   1.120  For possible uses of
   1.121  .Fu r1bindex()
   1.122 @@ -2947,6 +2961,9 @@
   1.123   * If yes, then return 1, else return 0.
   1.124   */
   1.125  VE
   1.126 +Two months later, it was completely removed by replacing it with
   1.127 +.Fu strncmp() .
   1.128 +.Ci b0b1dd37ff515578cf7cba51625189eb34a196cb
   1.129  
   1.130  
   1.131  
   1.132 @@ -2954,53 +2971,34 @@
   1.133  
   1.134  .H2 "Modularization
   1.135  .P
   1.136 -Mmh's code base is split into two directories,
   1.137 -.Fn sbr
   1.138 -(``subroutines'')
   1.139 -and
   1.140 +The source code of the mmh tools is located in the
   1.141  .Fn uip
   1.142 -(``user interface programs'').
   1.143 -The directory
   1.144 -.Fn sbr
   1.145 -contains the sources of the \fIMH library\fP
   1.146 -.Fn libmh.a .
   1.147 -It includes functions that mmh tools usually need.
   1.148 -Among them are MH-specific functions for profile, context, sequence,
   1.149 -and folder handling, but as well
   1.150 -MH-independent functions, such as advanced string processing functions,
   1.151 -portability interfaces and error-checking wrappers for critical
   1.152 -functions of the standard library.
   1.153 -.P
   1.154 -The MH library is a standard library for the source files in the
   1.155 -.Fn uip
   1.156 -directory.
   1.157 -There reside the sources of the programs of the mmh toolchest.
   1.158 -Each tools has a source file with the name name.
   1.159 +(``user interface programs'') directory.
   1.160 +Each tool has a source file with the same name.
   1.161  For example,
   1.162  .Pn rmm
   1.163  is built from
   1.164  .Fn uip/rmm.c .
   1.165 -Some source files are used by multiple programs.
   1.166 +Some source files are used for multiple programs.
   1.167  For example
   1.168  .Fn uip/scansbr.c
   1.169 -is used by both,
   1.170 +is used for both
   1.171  .Pn scan
   1.172  and
   1.173  .Pn inc .
   1.174  In nmh, 49 tools were built from 76 source files.
   1.175 -That is a ratio of 1.6 source files per program.
   1.176 -17 programs depended on the equally named source file only.
   1.177 -32 programs depended on multiple source files.
   1.178 +This is a ratio of 1.6 source files per program.
   1.179 +32 programs depended on multiple source files;
   1.180 +17 programs depended on one source file only.
   1.181  In mmh, 39 tools are built from 51 source files.
   1.182 -That is a ratio of 1.3 source files per program.
   1.183 -21 programs depended on the equally named source file only.
   1.184 -18 programs depended on multiple source files.
   1.185 -The MH library as well as shell scripts and multiple names to the
   1.186 -same program were ignored.
   1.187 +This is a ratio of 1.3 source files per program.
   1.188 +18 programs depend on multiple source files;
   1.189 +21 programs depend on one source file only.
   1.190 +(These numbers and the ones in the following text ignore the MH library
   1.191 +as well as shell scripts and multiple names for the same program.)
   1.192  .P
   1.193 -Splitting the source code of one program into multiple files can
   1.194 +Splitting the source code of a large program into multiple files can
   1.195  increase the readability of its source code.
   1.196 -This applies primary to complex programs.
   1.197  Most of the mmh tools, however, are simple and straightforward programs.
   1.198  With the exception of the MIME handling tools,
   1.199  .Pn pick
   1.200 @@ -3014,12 +3012,12 @@
   1.201  etc.)
   1.202  are larger.
   1.203  Splitting programs with less than 1\|000 lines of code into multiple
   1.204 -source files leads seldom to better readability.
   1.205 -The such tools, splitting makes sense,
   1.206 +source files seldom leads to better readability.
   1.207 +For such tools, splitting makes sense
   1.208  when parts of the code are reused in other programs,
   1.209  and the reused code fragment is not general enough
   1.210  for including it in the MH library,
   1.211 -or, if has depencencies on a library that only few programs need.
   1.212 +or if the code has dependencies on a library that only a few programs need.
   1.213  .Fn uip/packsbr.c ,
   1.214  for instance, provides the core program logic for the
   1.215  .Pn packf
   1.216 @@ -3031,6 +3029,14 @@
   1.217  .Fn uip/rcvpack.c
   1.218  mainly wrap the core function appropriately.
   1.219  No other tools use the folder packing functions.
   1.220 +As another example,
   1.221 +.Fn uip/termsbr.c
   1.222 +provides termcap support, which requires linking with a termcap or
   1.223 +curses library.
   1.224 +Including
   1.225 +.Fn uip/termsbr.c
   1.226 +in the MH library would require every program to be linked with
   1.227 +termcap or curses, although only a few of the programs require it.
   1.228  .P
   1.229  The task of MIME handling is complex enough that splitting its code
   1.230  into multiple source files improves the readability.
   1.231 @@ -3040,14 +3046,15 @@
   1.232  lines of code in summary.
   1.233  The main code file
   1.234  .Fn uip/mhstore.c
   1.235 -consists of 800 lines; the rest is reused in the other MIME handling tools.
   1.236 -It might be worthwhile to bundle the generic MIME handling code into
   1.237 -a MH-MIME library, in resemblence of the MH standard library.
   1.238 +consists of 800 lines; the other 1\|700 lines of code are reused in
   1.239 +other MIME handling tools.
   1.240 +It seems worthwhile to bundle the generic MIME handling code into
   1.241 +an MH-MIME library, as a companion to the MH standard library.
   1.242  This is left open for the future.
   1.243  .P
   1.244 -The work already done focussed on the non-MIME tools.
   1.245 +The work already done focussed on the non-MIME tools.
   1.246  The amount of code compiled into each program was reduced.
   1.247 -This eased the understanding of the code base.
   1.248 +This eases the understanding of the code base.
   1.249  In nmh,
   1.250  .Pn comp
   1.251  was built from six source files:
   1.252 @@ -3062,15 +3069,16 @@
   1.253  .Fn comp.c
   1.254  and
   1.255  .Fn whatnowproc.c .
   1.256 -Instead of invoking the
   1.257 +In nmh's
   1.258 +.Pn comp ,
   1.259 +the core functions of
   1.260  .Pn whatnow ,
   1.261  .Pn send ,
   1.262  and
   1.263  .Pn anno
   1.264 -programs
   1.265 -their core function was compiled into nmh's
   1.266 +were compiled into
   1.267  .Pn comp .
   1.268 -This saved the need to
   1.269 +This saved the need to execute these programs with
   1.270  .Fu fork()
   1.271  and
   1.272  .Fu exec() ,
   1.273 @@ -3084,11 +3092,14 @@
   1.274  included the function
   1.275  .Fu annotate() .
   1.276  Each program that wanted to annotate messages, included the source file
   1.277 -.Fn uip/annosbr.c .
   1.278 -The programs called
   1.279 -.Fu annotate() ,
   1.280 -which required seven parameters, reflecting the command line switches of
   1.281 -.Pn anno .
   1.282 +.Fn uip/annosbr.c
   1.283 +and called
   1.284 +.Fu annotate() .
   1.285 +Because the function
   1.286 +.Fu annotate()
   1.287 +was used like the tool
   1.288 +.Pn anno ,
   1.289 +it had seven parameters, reflecting the command line switches of the tool.
   1.290  When another pair of command line switches was added to
   1.291  .Pn anno ,
   1.292  a rather ugly hack was implemented to avoid adding another parameter
   1.293 @@ -3105,21 +3116,27 @@
   1.294  .Fn uip/comp.c
   1.295  and
   1.296  .Fn uip/whatnowproc.c ,
   1.297 -together 210 lines of code,
   1.298 -the standard libraries excluded.
   1.299 +together 210 lines of code.
   1.300  In nmh,
   1.301  .Pn comp
   1.302  comprises six files with 2\|450 lines.
   1.303 -Of course, not all of the code in these six files was actually used by
   1.304 +Not all of the code in these six files was actually used by
   1.305  .Pn comp ,
   1.306 -but the code reader needs to understand the code first to know which.
   1.307 +but the code reader needed to read all of the code first to know which
   1.308 +parts were used.
   1.309  .P
   1.310 -As I have read a lot in the code base during the last two years to
   1.311 -understand it, I learned about the easy and the difficult parts.
   1.312 -The smaller the influenced code area is, the stricter the boundaries
   1.313 -are defined, and the more straight-forward the code is written,
   1.314 -the easier is it to be understood.
   1.315 -Reading the
   1.316 +As I have read a lot in the code base during the last two years,
   1.317 +I learned about the easy and the difficult parts.
   1.318 +Code is easy to understand if:
   1.319 +.BU
   1.320 +The influenced code area is small
   1.321 +.BU
   1.322 +The boundaries are strictly defined
   1.323 +.BU
   1.324 +The code is written in a straightforward way
   1.325 +.P
   1.326 +.\" XXX move this paragraph somewhere else?
   1.327 +Reading
   1.328  .Pn rmm 's
   1.329  source code in
   1.330  .Fn uip/rmm.c
   1.331 @@ -3137,36 +3154,42 @@
   1.332  Understanding
   1.333  .Pn comp
   1.334  requires reading 210 lines of code in mmh, but ten times as many in nmh.
   1.335 -In the aforementioned hack in
   1.336 +Due to the aforementioned hack in
   1.337  .Pn anno
   1.338  to save the additional parameter, information passed through the program's
   1.339  source base in obscure ways.
   1.340 -To understand
   1.341 +Thus, understanding
   1.342  .Pn comp ,
   1.343 -one needed to understand the inner workings of
   1.344 +required understanding the inner workings of
   1.345  .Fn uip/annosbr.c
   1.346  first.
   1.347 -To be sure, to fully understand a program, its whole source code needs
   1.348 +To be sure to fully understand a program, its whole source code needs
   1.349  to be examined.
   1.350 -Otherwise it would be a leap of faith, assuming that the developers
   1.351 +Not doing so is a leap of faith, assuming that the developers
   1.352  have avoided obscure programming techniques.
   1.353  By separating the tools on the program-level, the boundaries are
   1.354  clearly visible and technically enforced.
   1.355  The interfaces are calls to
   1.356  .Fu exec()
   1.357  rather than arbitrary function calls.
   1.358 -In order to understand
   1.359 -.Pn comp ,
   1.360 -it is no more necessary to read
   1.361 -.Fn uip/sendsbr.c .
   1.362 -In mmh,
   1.363 +.P
   1.364 +But the real problem is a different one:
   1.365 +Nmh violates the golden ``one tool, one job'' rule of the Unix philosophy.
   1.366 +Understanding
   1.367  .Pn comp
   1.368 -does no longer send messages.
   1.369 -In nmh, there surely is
   1.370 +requires understanding
   1.371 +.Fn uip/annosbr.c
   1.372 +and
   1.373 +.Fn uip/sendsbr.c
   1.374 +because
   1.375 +.Pn comp
   1.376 +itself annotates and sends messages.
   1.377 +In nmh, there surely exists the tool
   1.378  .Pn send ,
   1.379 -but
   1.380 +which does (almost) nothing but send messages.
   1.381 +But
   1.382  .Pn comp
   1.383 -\&... and
   1.384 +and
   1.385  .Pn repl
   1.386  and
   1.387  .Pn forw
   1.388 @@ -3175,10 +3198,20 @@
   1.389  and
   1.390  .Pn whatnow
   1.391  and
   1.392 -.Pn viamail
   1.393 -(!) ... all have the same message sending function included.
   1.394 +.Pn viamail ,
   1.395 +they all (!) have the same message sending function included, too.
   1.396 +As a result,
   1.397 +.Pn comp
   1.398 +sends messages without using
   1.399 +.Pn send .
   1.400 +The situation is the same as if
   1.401 +.Pn grep
   1.402 +paged its output without
   1.403 +.Pn more
   1.404 +just because both programs are part of the same code base.
   1.405 +.P
   1.406  The clear separation on the surface \(en the toolchest approach \(en
   1.407 -it is violated on the level below.
   1.408 +is violated on the level below.
   1.409  This violation is for the sake of time performance.
   1.410  On systems where
   1.411  .Fu fork()
   1.412 @@ -3186,16 +3219,44 @@
   1.413  .Fu exec()
   1.414  are expensive, the quicker response might be noticeable.
   1.415  In the old times, sacrificing readability and conceptual beauty for speed
   1.416 -might even have been necessary to prevent MH from being unusably slow.
   1.417 +might even have been a must to prevent MH from being unusably slow.
   1.418  Whatever the reasons had been, today they are gone.
   1.419 -No longer should we sacrifice readability and conceptional beauty.
   1.420 +No longer should we sacrifice readability or conceptual beauty.
   1.421  No longer should we violate the Unix philosophy's ``one tool, one job''
   1.422  guideline.
   1.423 -No longer should we keep speed improvements that are unnecessary today.
   1.424 +No longer should we keep speed improvements that became unnecessary.
   1.425  .P
   1.426 -In mmh, the different jobs are divided among separate programs that
   1.427 +Therefore, mmh's
   1.428 +.Pn comp
   1.429 +no longer sends messages.
   1.430 +In mmh, different jobs are divided among separate programs that
   1.431  invoke each other as needed.
   1.432 -The clear separation on the surface is still visible on the level below.
   1.433 +Consequently,
   1.434 +.Pn comp
   1.435 +invokes
   1.436 +.Pn whatnow
   1.437 +which in turn invokes
   1.438 +.Pn send .
   1.439 +The clear separation on the surface is maintained on the level below.
   1.440 +Human users and the tools use the same interface \(en
   1.441 +annotations, for example, are made by invoking
   1.442 +.Pn anno ,
   1.443 +no matter if requested by programs or by human beings.
   1.444 +The decrease in tools built from multiple source files and thus
   1.445 +the decrease in
   1.446 +.Fn uip/*sbr.c
   1.447 +files confirm the improvement.
   1.448 +.P
   1.449 +One disadvantage must be accepted with this change:
   1.450 +The compiler can no longer check the integrity of the interfaces.
   1.451 +When the command line interface of a tool changes, it is
   1.452 +the developer's job to adjust the invocations of this tool as well.
   1.453 +As this is a manual task and regression tests, which could detect such
   1.454 +problems, are not available yet, it is prone to errors.
   1.455 +These errors will not be detected at compile time but at run time.
   1.456 +Setting up regression tests remains to be done.
   1.457 +In the best case, a uniform way of invoking tools from other tools
   1.458 +can be developed to allow automated testing at compile time.
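
Note on the last hunk: the text describes tools invoking each other with fork() and exec() instead of calling compiled-in functions, and points out that interface mismatches then only show up at run time. A minimal sketch of such an invocation in C might look as follows. The helper name run_tool() and its return convention are assumptions made for illustration; they are not taken from the mmh sources.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of mmh: run another tool as a
 * separate process and return its exit status.
 * Returns -1 if the child could not be created or did not exit
 * normally, and 127 if the program could not be executed.
 */
int
run_tool(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();
	if (pid == -1) {
		return -1;  /* fork failed */
	}
	if (pid == 0) {
		/* Child: replace the process image with the tool. */
		execvp(argv[0], argv);
		_exit(127);  /* only reached if execvp() failed */
	}

	/* Parent: wait for the child, retrying on interrupted calls. */
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR) {
			return -1;
		}
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build an argument vector such as `{"anno", ..., NULL}` and check the returned status. A misspelled switch in that vector is exactly the kind of error the compiler cannot catch, which is the disadvantage the paragraph names.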
   1.459  
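
Note on the atooi() hunk: the diff states that atooi() was replaced with calls to strtoul() with the base set to eight. A minimal sketch of that pattern is shown below. The wrapper name parse_octal() is hypothetical and only serves the illustration; the actual call sites in mmh may look different.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper, not part of mmh: convert a string of octal
 * digits (for example a file mode such as "0644") to a number.
 * Parsing stops at the first character that is not an octal digit.
 */
unsigned long
parse_octal(const char *s)
{
	assert(s != NULL);
	/* The third argument is the base; 8 selects octal. */
	return strtoul(s, NULL, 8);
}
```

Passing NULL as the second argument discards the end pointer. Code that needs to reject trailing garbage would pass a `char *` there and inspect it afterwards.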
   1.460  
   1.461