INTRODUCTION

In this chapter the applicability of HGA to compiler construction is considered. This forms the core of the proof of the thesis, spanning experimental studies over about two decades. P. B. Hansen has pointed out that the general technique of HGA was used similarly by him throughout his life in the development of portable compilers. Here the first pioneering compilers in India for many major programming languages, developed by the author or as implementations of his design, are considered; all of them used HGA for their development.

The sixth application of HGA is a summary of applications of the method to pioneering compiler development efforts for major real-life programming languages in an industrial environment in the seventies and eighties in India, using moderately reliable computer systems. The yield of HGA has been the first experimental/commercial compilers/interpreters in India for PASCAL (sequential and concurrent), APL (LL(1) syntax directed compiler), CORAL-66 (employed in real-time applications), BLISS (experimental implementation), subsets of ADA, subsets of CHILL, FORTRAN-IV and FORTRAN-77 (scientific applications); LISP, SNOBOL, PROLOG (for computer science education); subsets of PL/1 (experimental); ALGOL-60 (commercial); C (experimental and limited commercial use), etc. The HGA development efforts used LL(1) uniformly, mostly in the form of top-down parsing without backup, with one symbol of look-ahead. An Extended-BNF formulation of the syntax was employed, and such features as did not fit the LL(1) framework were coerced to do so by (ugly(?)) semantic routines. The initial HLL version was worked out by the author in PASCAL with little debugging and extensive testing, and hand-coding was done through M.Phil thesis work (of the Hyderabad Central University) and by trainees in Computer Science and Engineering and Information Systems Engineering. The hardware platforms were the slowly stabilising TDC-316 and System 332 (a 32-bit medium-large copy of the IRIS-55 of CII-Honeywell Bull, from which, as Prof. J. Saltzer commented on his visit, very little software was inherited). Extensions of the studies of the applicability of HGA have been to various TWS including the SLR(1), LR(1) and LALR(1) techniques.
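By way of illustration, a minimal recursive-descent fragment in a PASCAL-like dialect is sketched below for a tiny hypothetical expression grammar. The grammar, the routine names (getsym, expect, error) and the end-of-input marker are illustrative only and are not taken from any of the actual compilers; what the sketch shows is the one-symbol look-ahead, top-down-without-backup style described above.

program LL1Sketch;
{ A minimal LL(1) recursive-descent sketch for the (hypothetical) grammar
    E -> T { ('+' | '-') T }
    T -> F { ('*' | '/') F }
    F -> digit | '(' E ')'
  Top-down without backup, with one symbol of look-ahead. }
var
  look : char;      { the single look-ahead symbol }
  line : string;    { illustrative input buffer }
  pos  : integer;

procedure error(msg : string);
begin
  writeln('syntax error at position ', pos, ': ', msg);
  halt
end;

procedure getsym;
begin
  if pos <= length(line) then
  begin look := line[pos]; pos := pos + 1 end
  else look := '#'                     { end-of-input marker }
end;

procedure expect(c : char);
begin
  if look = c then getsym else error('expected ' + c)
end;

procedure expression; forward;

procedure factor;
begin
  if look in ['0'..'9'] then getsym
  else if look = '(' then
  begin getsym; expression; expect(')') end
  else error('factor expected')
end;

procedure term;
begin
  factor;
  while look in ['*', '/'] do begin getsym; factor end
end;

procedure expression;
begin
  term;
  while look in ['+', '-'] do begin getsym; term end
end;

begin
  line := '(1+2)*3';
  pos := 1;
  getsym;
  expression;
  if look = '#' then writeln('accepted') else error('trailing input')
end.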

6.1 High level programming languages (HLL) made their appearance around the 1950s in the West and circa the early seventies in the indigenous environment. Early programming was at the level of machine/assembly language programming. The early languages were BASIC, FORTRAN, COBOL and ALGOL-60. The mastering of the translator writing area led to the acceptance of HLL programming with its advantages of ease of portability of software, ease of entry into the field of programming, and ease of the realisation of software reuse. A separation of the abstract or symbolic component of the program from the machine level abstractions needed to use the computer system was thus attempted and realised to some extent. Once the early resistance to symbolic programming was overcome, an explosion in symbolic programming languages occurred. The next step was to consider specifications of the problem in precise natural language which could then be converted to the symbolic programming language in a number of steps. The major HLLs that found their first experimental/commercial applications in the environment are considered below.

i) FORTRAN, whose name is derived from FORmula TRANslation, was the first and oldest programming language to find commercial acceptance. It was originally developed at IBM for large scale numerically oriented problems. It initiated the historical debate as to whether a compiler can totally do away with assembly language handcoding. It has been extensively used for HGA purposes as reported in this chapter.

ii) COBOL, which stands for COmmon Business Oriented Language, is the work-horse of business applications. It was originally created in 1959 and the major standardisations considered are COBOL-74 and COBOL-85. The implementation was on the TDC-316 and Hansen's Approach (HA) was suggested for the same, but not used. HGA was used with COBOL-74 extensively, as reported in chapter 9, for educational purposes and for software upgradation or downgrading of software, as reported in the appendices. The permanence of COBOL is a perennial debate, but no effective replacement has been found.

iii) ALGOL-60 stands for ALGOrithmic Language and the '60' refers to the year of definition, circa 1960. It is the basic historical foundation of HLL development and the origin of practical computer science. The first Indian implementation, by the author on the TDC-316, is reported in chapter 14 and HC is based on the same. HGA was used for scientific application packages on the same. It was redefined as ALGOL-68, which has not found much acceptance. However, an experimental implementation of a subset of ALGOL-68 using HGA was done by the author, as reported in this chapter. This is the only such attempt in India.

iv) BASIC stands for Beginner's All-purpose Symbolic Instruction Code and was defined in 1964 at Dartmouth College to make programming as easy as possible. It was meant to be learnt in a few hours. It was used with HGA for instructional purposes, and the first Indian implementation was by the author in 1969 for his Master's thesis at the Indian Institute of Technology, Kharagpur. The first commercial implementation using HGA was on the TDC-312.

v) LISP stands for LISt Processing and is the second oldest language after FORTRAN. An experimental implementation, the first in India, was done by the author using HGA with assembly language as the target language. It was used for instructional purposes and is reported in this chapter.

vi) SNOBOL, created in the early sixties at Bell Labs, is a string processing language with some number handling capabilities. An experimental implementation, the first in India, was done by the author on the TDC-332. It used FORTRAN and assembly language as the target implementation vehicles, was used for instructional purposes, and is reported in this chapter.

vii) PASCAL, named after the French philosopher Blaise Pascal, was designed by N. Wirth. The stated intention in defining the language was to teach the fundamentals of computer programming to beginners and still have an efficient implementation. With its extensions to Object PASCAL and Concurrent PASCAL it has been the basic tool for HGA studies for two decades. The first implementation in India was by the author, using a variety of target languages and hardware platforms. All implementations were studies of, and used, HGA as reported in this chapter.

viii) EULER, named after the famous mathematician, was a generalisation of ALGOL-60 by Wirth and Weber. It used simple precedence in its published formal algorithm of definition. Eminently suited to the use of HC, it was implemented by the author (along with Bernd Krieg) as part of a Compiler Writing course under Prof. David Gries. The first experimental implementation of the same in India was done using HGA by the author with FORTRAN as the target language on the TDC-332/IRIS-55.

ix) MODULA-2, born in 1980, is a descendant of the original MODULA and of PASCAL. It was designed by Prof. N. Wirth. The first implementation of a large subset was done for experimental purposes on the IRIS-55 using HGA with FORTRAN as the handcoded target vehicle.

x) PL/1 stands for Programming Language One and is as old as 1966. Meant as a programming language for all purposes, it has not met its target. This is because there seems to be an intrinsic requirement for a plurality of programming languages, especially when one considers applications over a very wide spectrum. The only indigenous implementations were experimental realisations based on subsets, like the ones defined by R. C. Holt, implemented on the TDC platforms. The implementations were all studies of, and used, HGA.

xi) APL stands for A Programming Language and was developed by K.Iverson at IBM in 1962. It is a general purpose language using generalised matrix operations. The implementation using LL(1) syntax was the first in India and was by the author using HGA.

xii) ADA named after the world's first programmer Lady Augusta Ada Byron, has been ordained by the Department of Defence (DoD) of USA as being a requirement for military applications. An experimental implementation using HGA has been carried out by the author for a subset.

xiii) PROLOG stands for PROgramming in LOGic and was first implemented in 1972 in the West. It is the vehicle of the Fifth Generation Project in Japan. An experimental implementation using HGA, the first in India, was by the author and is reported in this chapter.

xiv) The C programming language, defined in 1972, is the current workhorse of infrastructure programming in Controls, Communications and Computers in the environment. Its availability is the main alternative to HGA, especially with highly optimising versions available. It does not find use in critical nuclear applications except as a vehicle for the use of HGA. Implementations of C in the early eighties by the author used HGA uniformly.

xv) CORAL-66 stands for COmmon Real time Language and was defined in 1966 as the British standard for real-time software. An implementation, the first in India, was done by the author using HGA and was used for real-time software.

xvi) CHILL stands for CCITT HIgh Level Language and has been ordained by the CCITT, the international body for standardisation in Telegraphy and Telephony, as the vehicle for communication software. The only implementation of the same in India, of a very large subset for experimental purposes using HGA, was by the author.

xvii) Software command languages. Attempts in the early to mid-eighties at word processors and spreadsheets in the environment using HGA for commercial purposes were swamped by the availability of LOTUS 1-2-3 and WORDSTAR. HGA may not be of any use for such dedicated applications.

6.2 The experimental systems programs reported below were developed over the last two decades, mainly on the TDC family of machines, and used HGA. Some of the concrete experimentation with HGA is reported.

6.3 Around 1975 there were only a couple of hundred computer systems in India. This was also the time when the TDC effort at the Electronics Corporation of India Limited(ECIL) was at its peak to establish the indigenous computer industry, by national policy.

6.4 The TDC-312 (Trombay Digital Computer, 3 for the generation and 12 for the word length) and the 16 bit word length TDC-316 were developed and productionised and efforts were underway to develop and productionise the medium large System 332.

6.5 To aid the software development on the indigenous machines the IRIS-55 of CII-Honeywell-Bull had been procured. This had the FORTRAN and COBOL language processors, working in a batch environment.

6.6 At the same time, around 1975, there were only a handful of Universities in India offering Computer Science/Engineering/Technology programs. The bulk of the personnel with a formal computer background were more available to the West than to India (circa 1975). The University of Hyderabad initiated an academic program, offering an M.Phil in Computer Methods, and the entire program was conducted by the staff of the Computer Group of ECIL for a decade (1975-84).

6.7 The non-availability of any ALGOL-like or PASCAL-like programming language on the IRIS-55 led to difficulties in instruction in Programming Methodology and practical Systems Programming, as the only languages available were FORTRAN and COBOL. The visits of the eminent Prof. Hansen and Prof. C.A.R. Hoare around this time were crucial in motivating the development of the Compiler Writing projects envisaged, which were also used for instructional and commercial implementations.

6.8 The use of HGA led to the successful implementation of a couple of dozen practical systems programs.

6.9 One of the severe problems in some academic programs in Computer Science/Engineering/Technology in the seventies was that the students never got down to developing, testing and debugging any programs. Circa 1975, the micro-computer revolution had not come to India, and this led to computer professionals who had hardly ever written a program. In some cases this crucial and fundamental aspect (experience of extensive practical programming) was not even realised.

6.10 Computer Science/Engineering/Technology suffers from the drawback of having involved terminology or jargon for which quite often no accepted standards exist. This has led to the phenomenon, in India, of quite a few people using the terminology or jargon intuitively. To avoid this it was decided that the M.Phil program and the internal training programs be primarily oriented around intensive and extensive practical systems programming.

6.11 As Programming Methodology can be naturally taught in PASCAL or ALGOL-60, does not come as naturally in COBOL, and comes perhaps not at all in FORTRAN, the method advocated by Prof. Hansen proved to be crucial.

6.12 The method of practical Systems Programming allows one to see through a program: to see the assembly language and machine language equivalents and, with a little bit of effort, the microprogrammed equivalents of the program.

6.13 Apart from the above one picks up certain essential skills: the unwinding of recursion to iteration, the mapping of some high level data structures onto the addressing modes of the machine instruction set, the postfix representation of expressions, control structures and all HLL constructs, microprogrammed equivalents of HLL constructs and at times even the realisations in hardware.

6.14 It may appear that today, in 1994, such a method has outlived its usefulness, especially with the easy availability of microcomputers and the PC revolution; however, it is a crucial requirement in real-time programming that the HGA experience be substantial.

6.15 The basic aim of the systems programs given in the table was to experiment with a few ideas at the system level, so that an appreciation is gained of the amount of effort involved when one tries out alternative ideas, and also to appreciate in depth what goes on when one uses jargon to describe the ideas involved in a non-trivial systems program.

6.16 Most of the experimentation was with error recovery and error correction ideas involving compiler development, as most of the work in the seventies in the TDC effort was in the area of language processors. The method of development followed allowed the students/trainees to get a good feel of compiler writing and programming methodology. The culture of using flowcharts, so prevalent in the TDC environment, was also sought to be changed in this approach.

6.17 The students/trainees were supplied with three gross descriptions:

a) The algorithm stated grossly, with only a few steps of refinement, in a mixture of ALGOL-like, PASCAL-like or PL/1-like programming languages, with abstract data types at times. For most of the first compilers the IHLL was Standard PASCAL and HGA was used.

b) The algorithm stated in Standard PASCAL (a handwritten and undebugged version). Towards the early eighties it was possible to subject the IHLL to some degree of debugging and testing.

c) The algorithm with FORTRAN-like control and data structures (but handwritten and undebugged). This uses an MA for FORTRAN-IV representations of PASCAL constructs by handcoding.

6.18 The above gross descriptions were first subjected to a detailed study and then set aside. The implementation was then worked out systematically in a number of steps of refinement to obtain the final implementation in FORTRAN and/or assembly language.
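To indicate the flavour of this final hand mapping, a small hypothetical fragment is sketched below: a PASCAL while loop for a trivial summation, with a typical FORTRAN-IV style IF/GOTO equivalent, into which such a loop would be hand-coded, shown in the comments. The labels and variable names are illustrative only and are not drawn from any of the actual projects.

program HandMapSketch;
{ Summing the first n integers: the IHLL (PASCAL) form.      }
{ A typical hand-coded FORTRAN-IV equivalent (illustrative): }
{       I = 1                                                }
{       ISUM = 0                                             }
{  10   IF (I .GT. N) GO TO 20                               }
{       ISUM = ISUM + I                                      }
{       I = I + 1                                            }
{       GO TO 10                                             }
{  20   CONTINUE                                             }
var
  i, n, sum : integer;
begin
  n := 10;
  sum := 0;
  i := 1;
  while i <= n do
  begin
    sum := sum + i;
    i := i + 1
  end;
  writeln('sum = ', sum)
end.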

6.19 To consider larger and larger subsets of the language, the entire lexical analyser (using a finite automaton) and the entire syntax analyser for the whole language were first constructed with the LL(1) strategy. The semantics were then added for subsets of the language of increasing complexity.

6.20 This approach was followed for some of the other programming language implementations on other platforms, not reported here.

6.21 The M.Phil program in Computer Methods created around 100 trained and qualified personnel, and the in-house training at the Computer Group, ECIL, around 700 trained personnel in computer science. These figures do not include customer training and dedicated hardware training programs on the TDC series.

6.22 A record of optimisations (RO) was maintained as discussed in chapter 0 for documentation purposes.

6.23 Though quite a few of these personnel have been garnered by the West, a sizable portion of them are spread over the Indian computer industry and academic institutions offering programs in computer science/engineering/technology.

6.24 Though the TDC effort has supplied the country with around 1000 systems, one of the main achievements of the effort was the creation of trained personnel for the Indian computer industry and academic institutions.

6.25 A few comments may be in order to discuss the overall experimental conclusions:

6.25.1 Experiment 5 dealt with different symbol table strategies for a standard PASCAL compiler, and the elegance of the solution used in the PASCAL(P) technique was appreciated.

6.25.2 Experiment 9 is interesting as APL is a real-life language which requires a right to left parse. The syntax analyser was LL(1) based. By and large the experiments (spread out over a decade or two) indicated that LL(1) was a sufficient practical technique for most programming languages. The IHLL was PASCAL; a later implementation was verified in the IHLL and then reduced by handcoding to FORTRAN-IV and assembly language.

6.25.3 Experiment 6 was support material for the TDC-316 commercial ALGOL-60 compiler, which followed the Whetstone Compiler design of Randell and Russell. The Whetstone compiler stands out as a published version of a more or less correct compiler.

6.25.4 Experiments 3, 5, 7 and 12 used a hand-conversion of recursive descent to FORTRAN. Though this was done, it was found to be an error prone process. The use of syntax scaffolding using LL(1), either formally or intuitively, with the extended Backus Normal Form (BNF) notation, but using the same solutions for semantics, proved to be much better. In the extended BNF notation a bracket structure indicating one or more iterations was found useful, along with the usually found bracket structure for zero or more iterations. These correspond to the repeat and while control structures found in a PASCAL-like language. It was found that handcoding recursive routines using HGA is not easy.
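A minimal sketch of the two bracket structures and their realisation as parsing loops is given below. The grammar fragments and routine names are hypothetical, an identifier being taken as a single letter purely for brevity; the point illustrated is only the correspondence of the one-or-more bracket with repeat and the zero-or-more bracket with while.

program EbnfLoops;
{ A minimal sketch (hypothetical grammar) of how the two extended-BNF
  bracket structures map onto PASCAL control structures:
    one or more  : number = digit { digit }       -> repeat loop
    zero or more : idlist = ident { "," ident }   -> while loop }
var
  look : char;
  line : string;
  pos  : integer;

procedure error(msg : string);
begin writeln('error: ', msg); halt end;

procedure getsym;
begin
  if pos <= length(line) then
  begin look := line[pos]; pos := pos + 1 end
  else look := '#'
end;

procedure number;                       { one or more digits }
begin
  if not (look in ['0'..'9']) then error('digit expected');
  repeat
    getsym                              { consume one digit }
  until not (look in ['0'..'9'])        { one-or-more bracket: repeat }
end;

procedure ident;                        { a single letter, for brevity }
begin
  if look in ['a'..'z'] then getsym else error('ident expected')
end;

procedure idlist;                       { ident { "," ident } }
begin
  ident;
  while look = ',' do                   { zero-or-more bracket: while }
  begin getsym; ident end
end;

begin
  line := 'a,b,c'; pos := 1; getsym;
  idlist;
  if look = '#' then writeln('idlist accepted');
  line := '1994'; pos := 1; getsym;
  number;
  if look = '#' then writeln('number accepted')
end.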

6.25.5 Experiments 18 and 19 allowed the conclusion that it really does not matter what method of syntax analysis is used, so long as it is sufficient to serve as a scaffolding on which to hang the semantic routines. LL(1) is perhaps the easiest to understand and the most appropriate. HGA can be used to obtain the syntax analyser by handcoding rather than by a formal use of a construct for the syntax of traditional major real-life HLLs.

6.25.6 Experiment 14 used a small subset of ALGOL-60 for which LL(1) was used in the syntax analyser. In fact the subset was so chosen as to use LL(1) only.

6.25.7 Experiment 8 indicated that simple precedence error recovery is very poor, and it is difficult to talk about error correction with this technique.

6.25.8 Experiment 13 can be looked upon as a culmination of the introduction of macro facilities in the assembler development on the TDC family of machines. Here HGA was used with PASCAL as the IHLL, handcoded to assembly language.

6.25.9 Experiment 17 is perhaps the only SNOBOL interpreter implemented in India. Here the IHLL was PASCAL and the handcoding was to FORTRAN-IV and assembly language.

6.25.10 Experiment 20 consisted of a number of projects to study various syntax analysis techniques. The results were used for student projects in compiler writing courses in the M.Phil and the computer science/engineering oriented in-house training programs. Here the IHLL was PASCAL and the handcoding was to FORTRAN-IV and assembly language.

6.25.11 Experiment 1 used subsets of PASCAL, FORTRAN, ALGOL-60 and COBOL to study error correction strategies and hand experiments in semi-automatically generated transition matrices. The grammars used were historically tiny as the technique is demanding in space, time and code.

6.25.12 Experiments 9, 11, 14, 15, 18 and 19 indicated that though recursive descent (as in experiments 3, 12 and 15) is a technique which allows the syntax analyser to be written almost as fast as one can write, using LL(1) intuitively with the extended Backus Normal Form one can, with a little bit of practice, use the stack explicitly and still generate the syntax analyser almost as fast as one can write; this is thus an application of HGA.

6.26 One major conclusion drawn was that it is sufficient to use formal syntax analysis methods for guidance and to use patches which use state variables, as in the Whetstone Compiler. Thus a rigid conformance to formal techniques is not necessary.

6.27 Experiments 2 and 4 used LL(1) (recursive descent) equivalents and parsers using the extended BNF, the parsers being developed by handcoding.

6.28 Experiment 20 showed the use of a fast transitive closure algorithm especially as one was restricted by the memory size and segmentation feature of the IRIS-55. A fast transitive closure algorithm using the adjacency representation cuts down the time and memory required as discussed in chapter 2.

6.29 The experiments indicated that simple algorithms/solutions for syntactic error correction over a variety of languages most probably do not exist.

6.30 The code generation in some of the experiments was restricted to the compile-and-go concept, where inefficient code is generated and immediately executed. In some of the experiments studies were made to generate code for the TDC-312 and TDC-316 etc. as well, so that experience and knowhow were gained in code generation and code optimisation for 0-address, 1-address and 2-address machines. The bypassing of the code generation phase, or keeping it simple, made the application of HGA more tractable. (A miniature sketch of the compile-and-go idea for a 0-address machine is given after 6.37 below.)

6.31 Experiment 20 used as test data large subsets of C, PASCAL, ALGOL-60, BASIC, COBOL and FORTRAN. The LL(1) parser experiment also used LISP and APL as test data.

6.32 The basic philosophy generally followed was that theory and formal techniques only guide one's thinking and need not be rigidly adhered to. Thus most of the scanners only intuitively and broadly used the finite automata concept, and, except in experiment 20, most syntax analysers used the overall formal techniques as gross guidelines of the parsing technique involved.

6.33 In all the experiments extensive code optimisation techniques were not used, as the TDC environment did not require the same. A reasonable amount of optimisation was however used. This may have made HGA application easy.

6.34 Experiment 20 also experimented with the possibility of breaking up a grammar to see which technique fits which part. The test data was however for very small grammars.

6.35 The absence of any relevant reliable compiler (let alone an optimising one) was the motivation for using HGA in all the above experiments. The high unreliability of the compilers developed by HGA had to be taken into account, owing to the high unreliability of the hardware. The background of the programmers who practised HGA is as indicated. The current availability of technology makes a repetition of the above using HGA unnecessary except for expositional purposes.

6.36 Formal verification techniques were used with the view that they need not be formally applied but mainly guide one's thinking. These were mainly used on the PASCAL-like HLL specifications. The above implementations are not necessarily production software oriented except for (6).

6.37 The conclusion drawn was that HGA aids in controlling the complexity involved in 'thinking' out the implementation, though in these applications the assembly language equivalents are not generated by hand except for experimental FORTRAN subsets on the TDC-316 and TDC-332 systems.
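The miniature sketch referred to in 6.30 follows. The 'code' is simply a postfix string of single digits and operators, as would be emitted by a front end such as the recursive-descent sketch given earlier in this chapter, and it is executed immediately on a simulated operand stack; the driver string and the stack size are illustrative only.

program CompileAndGoSketch;
{ A minimal sketch (hypothetical) of the compile-and-go idea for a
  0-address (stack) machine: postfix 'code' is executed at once. }
const
  maxstack = 50;
var
  code  : string;                          { '12+3*' is postfix for (1+2)*3 }
  stack : array[1..maxstack] of integer;
  top, i, a, b : integer;
  c : char;

procedure push(v : integer);
begin top := top + 1; stack[top] := v end;

function pop : integer;
begin pop := stack[top]; top := top - 1 end;

begin
  code := '12+3*';
  top := 0;
  for i := 1 to length(code) do
  begin
    c := code[i];
    if c in ['0'..'9'] then
      push(ord(c) - ord('0'))              { load an immediate operand }
    else
    begin
      b := pop; a := pop;                  { 0-address: operands are implicit }
      case c of
        '+': push(a + b);
        '-': push(a - b);
        '*': push(a * b);
        '/': push(a div b)
      end
    end
  end;
  writeln('result = ', pop)                { prints 9 for (1+2)*3 }
end.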


PRACTICAL SYSTEMS PROGRAMMING ORIENTED COMPILER AND COMPILER RELATED PROJECTS WITH EMPHASIS ON A STUDY OF SYNTACTIC ERROR RECOVERY AND CORRECTION AND WITH THE USE OF HGA.

COMPUTING ENVIRONMENT: TDC-332/System 332/IRIS-55/PC-286,-386,-486, Pentium/ND series/CDC series/PC-LAN.

S.No.  Project and Purpose                                  Assistance/Implementation

01  Syntactic error recovery/correction with the use of transition matrices in compiling.
    Purpose: Experimental study of ad hoc strategies by the use of HGA.
    P. Gopalakrishnan, M.Phil Thesis, 1975-76.

02  A study of syntactic error correction using the LL(1) strategy of parsing.
    Purpose: Experimental study as per Floyd's approach.
    J. Vaidyanathan, M.Phil Thesis, 1975-76.

03  An implementation of PL/0 through a programming technique (HGA).
    Purpose: A study of Hansen's suggestion with the IHLL as PASCAL and the THLL as Assembly/Machine/FORTRAN.
    K. Gangaram, M.Phil Thesis, 1977-78.

04  Review of error correction through LL(1) techniques using HGA.
    Purpose: A refined experimental study as per Floyd's approach with the IHLL as PASCAL and the THLL as PASCAL.
    T. Rajmouli, M.Phil Thesis, 1977-78.

05  Generalised symbol table organisation for a PASCAL compiler using HGA with IHLL as PASCAL and THLL as Assembly/Machine/FORTRAN-IV.
    Purpose: An unwinding of a generalisation of the PASCAL(P) symbol table using IHLL as PASCAL and THLL as assembly/machine/FORTRAN.
    G.V. Subramaniam, M.Phil Thesis, 1977-78.

06  Design and implementation of features for TDC-316 ALGOL-60.
    Purpose: A review of the literature of I/O of ALGOL-60 and practical alternative implementations of I/O with FORTRAN as IHLL and assembly as THLL.
    B. Sukumaran Nair, M.Phil Thesis, 1975-76.

07  Anatomy of a typical PASCAL compiler.
    Purpose: An unwinding by hand of PASCAL(P) to FORTRAN as the implementation language.
    M. Mohan Reddy, M.Phil Thesis, 1982-83.

08  Experimental implementation with syntactic error-correction of an EULER interpreter.
    Purpose: An experimental study of simple precedence error recovery and correction.
    K. Subramaniam, M.Phil Thesis, 1981-82.

09  An experimental syntax oriented implementation of APL.
    Purpose: A syntax directed compiler for APL and the use of APL as a hardware description language for the TDC family of machines.
    Rita Siviah, M.Phil Thesis, 1980-81.

10  A critical study of implementation of the PASCAL(P) stack machine.
    Purpose: Towards the first implementation of PASCAL in India.
    IIT Madras Summer Trainees, 1976.

11  Syntax analyser with error correction of Concurrent PASCAL.
    Purpose: A study of Concurrent PASCAL for systems programming using HGA.
    N.T. Sreekumar, M.Phil Thesis, 1983-84.

12  An implementation of ALGOL-60 using the description of Grau et al.
    Purpose: An unwinding by hand, using HGA, of the ALGOL-60 translator which uses recursive descent.
    Trainees Project, 1978-79.

13  An implementation of GPM using HGA.
    Trainees Project, 1975.

14  An implementation of an ALGOL-68 subset using HGA.
    Purpose: A small subset of ALGOL-68 with IHLL as PASCAL and THLL as FORTRAN using HGA.
    Trainees Project, 1980.

15  An experimental LISP interpreter.
    Purpose: An implementation of Common LISP with PASCAL as IHLL and FORTRAN as THLL on the IRIS-55.
    Trainees Project, 1976.

16  Implementation of PL/1 subsets with HGA.
    Purpose: A small subset of PL/1 implemented as per the SP(K) subsets of Holt with PASCAL as IHLL and FORTRAN as THLL on the IRIS-55/System 332.
    Trainees Project, 1980-82.

17  An implementation of SNOBOL using HGA.
    Purpose: The first implementation of SNOBOL in India with PASCAL as IHLL and FORTRAN as THLL on the IRIS-55.
    Trainees Project, 1982.

18  A syntax directed implementation of FORTRAN.
    Purpose: An implementation of a large FORTRAN subset using 5 different parsing techniques but the same semantics.
    Trainees Project, 1976.

19  Syntax directed implementations of PASCAL using HGA with FORTRAN as THLL and PASCAL as IHLL.
    Purpose: An implementation of PASCAL using 5 different parsing techniques but with the same semantics, using HGA on the IRIS-55/System 332.
    Trainees Project, 1979.

20  Implementation of parser generators for simple precedence, operator precedence, extended precedence, transition matrices, LL(1), LR(1) and LALR(1) techniques.
    Purpose: The first such attempts in India.
    Trainees Project, 1974-82.

21  An experimental implementation of LOGO.
    Purpose: To develop knowhow in graphics.
    Trainees Project, 1983.

22  An experimental implementation of SmallTalk using HGA with PASCAL as IHLL and C++ as THLL.
    Trainees Project, 1985.

23  An experimental implementation of PHIGS with PASCAL as IHLL and C++ and assembly as THLL.
    Trainees Project, 1986.

24  An experimental implementation of ADA subsets using HGA.
    C. Subramaniam, M.Tech. Thesis, 1990-92.

25  An experimental implementation of CHILL using HGA for communication software.
    M.Tech. Thesis, 1992-93.

26  An experimental implementation of FORTH using HGA with PASCAL as IHLL and assembly as THLL, for astronomy departments.
    M.Tech. Thesis, 1986-87.

27  An experimental word-processor on the ND-55 using HGA with PASCAL as IHLL and C as THLL.
    Trainees Project, 1989.

28  An experimental implementation of MODULA using IHLL as PASCAL and THLL as C++.
    K. Neeraja, M.Phil Thesis, 1985.

29  An experimental implementation of PROLOG with C as THLL and PASCAL as IHLL.
    H. Padmashree, R&D Project, 1987.

30  An experimental implementation of CORAL-66 with PASCAL as IHLL and assembly as THLL.
    R&D Project, 1977.

 

INTRODUCTION

The seventh application of HGA elaborates on the third and shows a logical extension to bottom-up parsing techniques: from the constants in the parsing table that standard precedence techniques (and most generalisations of the same) consider, the ALCOR group considered algorithms in the parse table. The incorporation of error-correction suggested augmentation of the algorithms with data structures, and hence encapsulation in an object is a logical extension of the generalisation. Thus an object in a table of objects is to be messaged based on the top of the parse stack and the incoming symbol. The necessity of having to deal with different abstractions of the problem at the same level in the third application generalises to the concepts of Data Abstraction and Polymorphism in a very logical extension of ideas. The development of the TM parser, first without error-correction and then with error-correction incorporated as a subsequent step (with syntactic and semantic backup), allows the development of the Inheritance concept.

Thus we have the concepts of Data Abstraction, Polymorphism, Encapsulation and Inheritance from the intrinsic ideas involved in the third application, and this generalises to the concept of OOP (Object Oriented Programming). The parse table of objects obtained contains in effect a data-base based on the syntax of the programming language considered, and this generalises to the concept of OODBMS (Object Oriented Data Base Management Systems).

Thus a hypothetical point of view can be taken that OOP and OODBMS follow as logical generalisations of the intrinsic ideas involved in the efforts of the ALCOR group. This generalises HGA to use Object PASCAL in HLL descriptions especially in the Graphics Specialism.

7.1 To explore the use of HGA with OOP, a cosmetic software package for a civil engineering application has been underway since 1993. This is the development of a special Estimation and Costing Package for use by PC users, oriented to small to medium size (by Indian urban standards) construction of residential buildings/complexes and aiming at a target of corresponding builders/contractors. The special features of this package are:

a) The input is through a special purpose interactive cum procedural language 'CIVIL' which, with the FONT PROCESSING of the custom-made GIST card (Department of Electronics, Government of India) through CDAC (Pune, India), allows a single Machine Translation to be sufficient for a multi-lingual interface involving India's 15 major languages.

b) The language CIVIL's design is simplified by the standardisation of terminology and methods of the Costing and Estimation sub-area of Civil Engineering. LEX/YACC under UNIX are used here for a conversion from CIVIL (multi-lingual) to C++ for the procedural features, the interactive features being mapped onto the 'breakpoint' feature.

c) A dBase oriented database (with a multi-lingual interface that GIST allows) is to be used for the package for spatial/temporal/historical information.

d) Extensions to the man-machine interface by pen-input are planned for Graphical I/O, along with primitive intelligent Script Text Processing using neural nets.

e) Automatic OCR and Primitive Intelligent script processing techniques to input building plans that exist as a historical backlog or existing current information repositories.

f) Spoken input/output interfaces in the syllabaries in which Indian languages are written. For input, primitive neural nets with a limited vocabulary are being considered.

g) Extensions to other aspects of the Contractor/Builder's work by a suitable adaptive (temporal/historical) data-base with primitive transaction processing is being considered.

h) Application of existing 3-D graphics packages and picture data-bases to be integrated into the software to aid visual (cosmetic) views.

i) For commercialization, suitable cosmetic engineering using graphics and multi-media is being considered, like 'overdone' HELP and DIAGNOSTICS.

j) Existing standard PC based integrated packages for DSS are being exploited with a suitable WINDOWS environment along with the multi-lingual interface which GIST provides.

k) The aim of the project is to apply and study the use of HGA in emerging technologies.

7.2 A use of OOP with HGA is in pattern matching over a range of geometries. An elementary but encyclopaedic treatment of Geometry is found in Klein [Kle,39]. An extension of the methods of chapter 4 is possible to affine geometries and more generalised geometries.

7.3 In chapter 4 a pattern-matching problem of a sub-pattern against a template of a full master-pattern was considered. There the digitisation was by representing both the patterns by finite sets of computable points. The patterns, by generalisation, can be considered to be digitised by a finite number of computable geometric manifolds, i.e. either as a collection of lines, planes, or in general the equivalent algebraic manifolds.

7.4 To perform the pattern-matching subject to a threshold T it is only necessary to partition the patterns suitably and apply the redistribution theorem.

7.5 In affine geometry one can consider the transformation of points as being defined by the formulae:

x' = a1x + b1y + c1z + d1
y' = a2x + b2y + c2z + d2
z' = a3x + b3y + c3z + d3

7.5.1 Thus we have 12 unknowns and given 4 points we have enough linear equations to obtain the transformation.
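Since each of x', y', z' depends linearly on (x, y, z) through its own four coefficients, the 12 unknowns decouple into three 4x4 linear systems, one per output coordinate. The sketch below illustrates this counting argument with hypothetical sample points and a plain Gaussian elimination without pivoting safeguards; it is an illustration only, not production code from the package.

program AffineFitSketch;
{ Recover x' = a1 x + b1 y + c1 z + d1 (and similarly y', z') from
  4 point correspondences: one independent 4x4 system per coordinate. }
type
  vec4 = array[1..4] of real;
  mat4 = array[1..4, 1..4] of real;
var
  p, q   : array[1..4, 1..3] of real;   { p = source points, q = their images }
  m      : mat4;
  rhs, s : vec4;
  i, j, co : integer;

procedure solve4(a : mat4; b : vec4; var x : vec4);
var i, j, k : integer; f : real;
begin
  for k := 1 to 3 do                     { forward elimination, no pivoting }
    for i := k + 1 to 4 do
    begin
      f := a[i, k] / a[k, k];
      for j := k to 4 do a[i, j] := a[i, j] - f * a[k, j];
      b[i] := b[i] - f * b[k]
    end;
  for i := 4 downto 1 do                 { back substitution }
  begin
    f := b[i];
    for j := i + 1 to 4 do f := f - a[i, j] * x[j];
    x[i] := f / a[i, i]
  end
end;

begin
  { four sample points and their images under the hypothetical map
    x' = x + 2,  y' = 2y,  z' = x + z }
  p[1,1]:=1; p[1,2]:=0; p[1,3]:=0;   q[1,1]:=3; q[1,2]:=0; q[1,3]:=1;
  p[2,1]:=0; p[2,2]:=1; p[2,3]:=0;   q[2,1]:=2; q[2,2]:=2; q[2,3]:=0;
  p[3,1]:=0; p[3,2]:=0; p[3,3]:=1;   q[3,1]:=2; q[3,2]:=0; q[3,3]:=1;
  p[4,1]:=0; p[4,2]:=0; p[4,3]:=0;   q[4,1]:=2; q[4,2]:=0; q[4,3]:=0;

  for co := 1 to 3 do                    { one 4x4 system per coordinate }
  begin
    for i := 1 to 4 do
    begin
      for j := 1 to 3 do m[i, j] := p[i, j];
      m[i, 4] := 1.0;                    { coefficient of the constant d }
      rhs[i] := q[i, co]
    end;
    solve4(m, rhs, s);
    writeln('coordinate ', co, ': a=', s[1]:5:2, ' b=', s[2]:5:2,
            ' c=', s[3]:5:2, ' d=', s[4]:5:2)
  end
end.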

7.5.2 In projective geometry the transformation is:

x' =(a1x + b1y + c1z +d1)/(a4x + b4y + c4z + d4)

y' =(a2x + b2y + c2z + d2)/(a4x + b4y + c4z + d4)

z' =(a3x + b3y + c3z +d3)/(a4x + b4y + c4z + d4)

7.5.3 Here we have 16 unknowns, or 15 after leaving out a constant factor. Given 5 points we can determine the transformation parameters.

7.6 The above is formalised in Klein's Erlangen program as saying that affine geometry is represented by the affine group G12 and projective geometry by the projective group G15. For naive pattern matching it is only necessary to ensure that the required number of points k (k = 4 for affine geometry and k = 5 for projective geometry) lies in one partition when applying the pigeonhole principle. If T/(k-1) - 1 partitions are created and the points in the master-pattern or sub-pattern distributed among the same, then we are sure by the pigeonhole principle that if a match of threshold T or more exists then k points in the match occur in some partition. We are also sure that all matches with T or more points matching will have k points in some partition. In Euclidean space we can optimise by discarding points based on the matching metric of lengths. The redistribution theorem can then be used to suppress matches of fewer than T points.
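A minimal sketch of the partitioning step is given below. The point counts, the threshold and the round-robin distribution are illustrative; the partition count is written as (T-1) div (k-1), which coincides with the text's T/(k-1) - 1 whenever k-1 divides T exactly.

program PigeonholeSketch;
{ With P = (T-1) div (k-1) partitions, any set of T or more matching
  points distributed among the partitions must place at least k of them
  in some one partition, by the pigeonhole principle.  The 'points' here
  are just indices distributed round-robin. }
const
  M = 20;                 { points in the master-pattern (illustrative)  }
  T = 9;                  { match threshold                              }
  k = 4;                  { points needed to fix an affine transformation }
var
  partitionOf : array[1..M] of integer;
  count       : array[1..M] of integer;
  P, i : integer;
begin
  P := (T - 1) div (k - 1);               { largest safe partition count }
  writeln('number of partitions P = ', P);

  for i := 1 to M do                      { round-robin distribution }
    partitionOf[i] := ((i - 1) mod P) + 1;

  { suppose points 1..T happen to be the matching points: count them }
  for i := 1 to P do count[i] := 0;
  for i := 1 to T do
    count[partitionOf[i]] := count[partitionOf[i]] + 1;

  for i := 1 to P do
    if count[i] >= k then
      writeln('partition ', i, ' holds ', count[i],
              ' matching points (>= k = ', k, ')')
end.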

7.7 A generalisation is to use geometric manifolds instead of points. If the master-pattern consists of a finite number of computable manifolds (lines, areas, volumes, algebraic manifolds) and the sub-pattern is similarly considered to be a finite number of geometric manifolds (lines, areas, volumes, algebraic manifolds), and a requirement exists of at least T manifolds matching, it is only necessary to create (T/(k-1)) - 1 partitions. Here k is the number of points that determine the manifold to be considered. By distributing the points among the partitions we ensure by the pigeonhole principle that k points occur in some one partition. Thus by considering the points in a partition in groups of k one is certain to consider a manifold in a match, if it exists. Depending on the particular geometry considered one can apply the redistribution theorem, or cut down the number of cases depending on some metric in metrical geometry.

7.8 The simplest cases are those where a manifold in the master-pattern can be considered to have a unique corresponding manifold in the sub-pattern (or vice versa). In the case of the geometry of continuous deformations we have a group of infinite elements and the method breaks down.

7.9 Once we have determined one or two matching manifolds, the transformation is to be applied to all the points in the sub-pattern (master-pattern). Thus we still have a 'try all possibilities' component, as we still try all groups of k points in a partition. This 'try all possibilities' component can benefit from HGA and OOP, the latter for software reuse purposes with HC.

7.10 It is known that in the application of affine geometry the procedure of estimating motion parameters from a pair of range images by a solution of linear equations based on 4 points is naive, as one has to consider errors in the matching of points. A generalisation of the techniques of chapter 5 can be applied in this case, though what is normally used is to isolate features as the geometric manifolds.

INTRODUCTION

The eighth application of HGA is in the non-traditional area of the applicability of real-life programming language (ALGOL-like, PASCAL-like or COBOL-like) descriptions to aid the popularisation of education and literacy in Automata Theory and the Theory of Computation in the Indian environment, and thus applies HGA to (reliable (!)) abstract target machines. It was found that the use of the RAM and RASP turned out to be crucial in the training, instruction and mastery of the machine/assembly language instruction repertoire of a given processor. One can easily master the entire instruction repertoire, repeating elementary programming examples, using HGA, by isolating universal subsets of instructions in the repertoires. In this one takes abstract views of the instruction repertoire subsets, as dealing with a set manipulating machine or a propositional calculus machine at one end, and at the other extreme end the full power of a Universal machine: a string manipulating machine or a machine doing fixed and floating point arithmetic through algorithms. Such studies and training, in a totally industrial environment, were crucial in training assembly/machine language programmers, from the dilettante programmer to hard-core systems programmers, with proven success for a decade and a half. The TDC-12 and TDC-312 (Trombay Digital Computer, 3rd generation, 12 bit word length) employed octal machine language programming over a range of applications in Controls, Communications and Computers. An upgradation to assembly language took place only with the TDC-316, as the assembler (paper tape oriented on the TDC-12, -312) was too slow.

8.1 An extension of HGA, not too popular, was to use other abstract machines at the lower end of the hand-coding. While it was found that the RAM and RASP were directly useful, the equivalent abstract machines of the Chomsky hierarchy only served as cultural background, barring the finite automaton (which is everywhere). HGA, when used for elementary Computational Complexity theorems, is effective but laborious. It was concluded that while it is useful for introductory and moderate theorems (results), there is no other way to study, train, think, teach or do the area except as propounded by the pre-eminent Prof. J.E. Hopcroft (1969, 1979). Similar attempts are seen recently and similar conclusions can be drawn.

8.1.2 A different line of generalisation of HGA is to the abstraction of Automata. Automata are nothing but computing agents, and the full description of their behaviour is cumbersome and at times complex. HLL specifications have been usefully employed to describe their behaviour and construction.

8.1.3 The extreme length and complexity of a detailed description can at times be shortened by an algorithmic description. This can be considered a useful application of HGA. Thus all constructions in Formal Language and Automata Theory can be described in detail by an algorithmic specification in a real-life PASCAL-like language, with suitable ADTs, and these can be reduced by hand to the detailed abstract formalisations.

8.1.4 An extension of such constructions to Post machines and program machines is relatively straightforward.

8.1.5 However, such applications of HGA are fruitful, in a practical sense, only for simple constructions. More advanced constructions, though effective, turn out to be meaningless unless they are integrated by the use of the 'intuitive arguments' normally employed in more advanced constructions. Furthermore, demonstrations by counter-examples, as in the non-closure of cfg's under intersection, seem to have no mapping into HGA. Demonstrations like the decidability of equivalence of fa seem to be easier in the mathematical formulation of sets rather than in PASCAL-like descriptions, though these aid in controlling the abstraction of the formalism. The HLL descriptions in such cases may not be considered to be practical algorithms, but effectively an aid to control complexity and undue abstraction.

8.1.6 HGA as used here is an aid to describe the constructions and an aid in understanding 'what is done', but no aid in mastering the intuitive arguments and techniques that go into how to 'do the area'.

8.2 HGA applications to the area of syntax analysis techniques are extremely practical and have been employed, but in the pure area of Formal Language and Automata Theory perhaps a new environmental language and/or technique has to develop; just as ADTs help in controlling complexity, a new type of ADT oriented towards this area has to develop.

8.2.2 It is felt that this should be on the lines of tradeoffs between complexity of formalism, complexity of abstraction, complexity of HLL specifications/descriptions, complexity arising from different abstractions operating at the same level of abstraction, and complexity of hand-translation. The extended PASCAL to be employed in HGA thus varies depending on the domain of problems considered.

8.2.3 Thus a universalisation of HGA will require augmentation by suitable borrowings of features from different programming environments, and the techniques, skills and methodologies will vary for effective practical applicability.

8.2.4 Thus we have logical extensions to HGA: to choose the appropriate 'source specification', the 'translation specification' and the final 'target specification'. Thus if ultimately the target is machine language, we have to go through a variety of HGA stages. HGA is therefore not a single stage application, but in general will involve stages of greater and greater refinement.

8.2.5 Thus the HGA process will have to be considered by itself as a meta-process of stepwise refinements of the HGA process.

8.2.6 Assembly Language Programming (the use of RASP and Post Machines and Program Machines):

8.2.7 The instruction repertoire of real life digital computers, whether character-oriented (IBM 1401), decimal-oriented (IBM 1620), or binary machines, is partitioned into subsets such that each subset is universal in the sense of being able to compute the partial recursive functions.

8.2.8 An obvious subset consists of the arithmetic operations, the compare and conditional jumps, and the unconditional jumps. By reducing multiplication/division to addition, making at times the assumption of unlimited word length, one gets more and more universal sets.

8.2.9 By dropping the restriction on the register size, assuming it to be unlimited in size, one gets new register-oriented universal sets.

8.2.10 By reducing arithmetic to repeated incrementation and operators one isolates smaller universal sets.
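To illustrate the spirit of the reduction exercise, the sketch below realises multiplication by repeated addition and addition by repeated incrementation; in the actual exercise each routine would then be hand-coded into the isolated universal subset of the machine's instructions. The routine names and operand values are illustrative, and non-negative operands are assumed.

program UniversalSubsetSketch;
{ Multiplication reduced to repeated addition, and addition reduced to
  repeated incrementation/decrementation, so that only increment,
  decrement and a conditional test (here: while) are needed. }
var x, y : integer;

function add(a, b : integer) : integer;   { a + b by repeated INC/DEC, b >= 0 }
var r : integer;
begin
  r := a;
  while b > 0 do
  begin r := r + 1; b := b - 1 end;
  add := r
end;

function mul(a, b : integer) : integer;   { a * b by repeated addition, b >= 0 }
var r : integer;
begin
  r := 0;
  while b > 0 do
  begin r := add(r, a); b := b - 1 end;
  mul := r
end;

begin
  x := 6; y := 7;
  writeln(x, ' * ', y, ' = ', mul(x, y))   { prints 42 }
end.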

8.2.11 Once the coding scheme is understood (BCD, EBCDIC or ASCII), one proceeds to isolate string oriented operations, using string sizes that fit only one word, two words, multiple words, or at times ignoring the word size. One then obtains an abstraction of a real-life digital computer as a string manipulating machine. A class of sub-machines considered is to view the real-life digital computer as a set manipulating machine or a propositional calculus machine, through their equivalent realisations as bit string operations, as in standard PASCAL implementations.

8.2.12 The above process of isolation of subsets is generalised to stack operations. The exercise of reducing the instruction repertoire to universal subsets allows a comprehensive study and use, with elementary programming examples, of the instruction repertoire. A cultural skill and gain is the ability to migrate to different machines in the same environment without much cultural shock.

8.2.13 Before the above method was used in instruction, the migration of assembly language programming from the TDC-12 (PDP-8 like) to the TDC-316 (PDP-11 like) was found to involve some mental blocks, owing to the powerful addressing modes of the latter, not found in the former.

8.3 The mastery and appreciation of the use of the entire instruction repertoire is thus easily achieved and then applied to more sophisticated assignments; and if the on-the-job performance is considered, the method has turned out good entry-level assembly language programmers in the areas of real-time applications, communications software and systems programming.

8.4 The elementary programming techniques used were the PASCAL-like control structures and data structures (with limited use of pointers), and the aim was to think out the programming examples in the High Level Language, map the same systematically to unoptimised assembly language equivalents, and in a final pass optimise the assembly language.

8.5 Introducing many programming languages simultaneously

In an obvious extension of finding universal subsets of features, one could study a particular programming language, or many programming languages simultaneously.

8.5.2 The crucial concept of effective computations through RASPs, now equipped more as subset-FORTRAN, subset-COBOL, subset-PASCAL, etc., is essential to allow the entry-level programmer to migrate from one programming language to another.

8.5.3 A practical application of HGA is to the problem of Pattern Matching in Euclidean space considered in chapters 4 and 5. HGA is applied to obtain a practical algorithm which when mapped onto a LAN for the chance print matching problem leads to a viable solution.

8.5.4 It was informally opined by J. E. Hopcroft (circa 1972) that perhaps the only practical result of Automata Theory is Cook's theorem. The theorem states that if a 2-way deterministic pushdown automaton can perform a computation, then it can be simulated in linear time by a RAM, and hence a practical linear algorithm emerges on current general purpose digital computers.

8.5.6 The traditional application of the theorem has been the determination of a substring in a given string, or variants of the problem. By a suitable string encoding, the pattern matching problem in Euclidean space is solved by an extension and application of the theorem, using HGA to obtain a viable solution.

8.6 An extension of Cook's theorem is to the common subsequence problem. Given two strings x = a1a2---an and y = b1b2---bm, a common subsequence is c1c2c3---ck such that there exist j1 < j2 < j3 --- < jk and l1 < l2 < l3 --- < lk, with all ji in 1..n and li in 1..m, such that a(ji) = b(li) = ci, for i in 1..k. The common subsequence problem can be solved in time O(mn) [Hop,74]. A more efficient solution shows that, in the present application, it can be solved in time O(m log n) [Hun,77].
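A minimal sketch of the O(mn) tabular solution for the length of the longest common subsequence is given below; the array bound and the driver strings are illustrative only.

program LcsSketch;
{ The O(mn) dynamic-programming table for the longest common subsequence
  of x = a1..an and y = b1..bm:
    L[i, j] = length of the longest common subsequence of a1..ai and b1..bj. }
const
  maxlen = 50;
var
  x, y : string;
  L : array[0..maxlen, 0..maxlen] of integer;
  i, j, n, m : integer;
begin
  x := 'compiler'; y := 'complex';        { illustrative strings }
  n := length(x); m := length(y);

  for i := 0 to n do L[i, 0] := 0;
  for j := 0 to m do L[0, j] := 0;

  for i := 1 to n do
    for j := 1 to m do
      if x[i] = y[j] then
        L[i, j] := L[i - 1, j - 1] + 1
      else if L[i - 1, j] >= L[i, j - 1] then
        L[i, j] := L[i - 1, j]
      else
        L[i, j] := L[i, j - 1];

  writeln('length of longest common subsequence = ', L[n, m])   { prints 6 }
end.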

8.7 A mapping of a 2DPDA computation to a RAM can be considered to be an application of HGA, as the mapping is to a practical algorithm. In the pattern matching problem the data structure that practically arises is the position tree. Given a string x we form a string x$ with a new symbol $, not in the vocabulary of x. If x$ = a1a2---an a(n+1), then the indices refer to the positions in the string of the symbols of x. Associated with each position i is the shortest substring identifier of x, which is a substring of x starting from ai and uniquely identifying the position i. A position tree is a tree whose leaves are the positions 1 to n+1, the path from the root to a leaf being the substring identifier for that position. It is known that a position tree always exists and can be constructed in O(n**2) time, or even O(n) time by compacting the chains of nodes which have only one son. The common subsequence problem can be solved by using position trees in a straightforward manner.

8.8 The algorithm for the determination of the largest common subsequence is used as a scaffolding in the pattern matching problem in Euclidean space, by a suitable string encoding. The existence of a common subsequence in the encoding is shown to yield a coarse match which is then refined to obtain a fine match.

8.9 The more efficient O(m log n) algorithm [Hun,77] uses a variant. Given two strings (sequences) x and y, a threshold matrix Tik is set up, where the value of Tik is the portion of the string y, starting from the first position, that should be considered to obtain a subsequence of k matches with the first i symbols of x. Tik can be computed in a straightforward manner from T(i-1)k and T(i-1)(k-1). By compacting the data structures the efficient algorithm arises.

Definition: Given a pattern M and a pattern N, a base vector ik is any line of M in the i-bunch-vector and a base vector jl is any line of N in the j-bunch-vector.

Definition: For an i-bunch-vector (of M) and a j-bunch-vector (of N), if vertices k, p of M and l, q of N are considered, with i not= k not= p and l not= q not= j, then the included angles kip and ljq are referred to as alpha(kip) and beta(ljq) respectively.

Definition: An ordering of lines in the i-bunch-vector (of M) is a clockwise (or anticlockwise) ordering of lines (vectors) subject to the following conditions:

a) (A i,k,p,r)(i, k, p, r in M and i not= k not= p not= r): alpha(kip) < alpha(kir) implies ir occurs later in the ordering; alpha(kip) = alpha(kir) implies ir occurs later in the ordering if L(ip) < L(ir); and alpha(kip) = alpha(kir) with L(ip) = L(ir) implies ip occurs before ir (arbitrary choice).

Definition: A string encoding of the i-bunch-vector of M is defined as P1i alpha1i P2i alpha2i --- PMi alphaMi where (Aj)(j in 1..M) Pji is the length of the line ji and alpha(ji) is the included angle between the lines ji and (j+1)i.

Definition: A string encoding of the j-bunch-vector of N is defined as P1j beta1j P2j beta2j --- PNj betaNj where (Ai)(i in 1..N) Pij is the length of the line ij and beta(ij) is the included angle between the lines ij and (i+1)j.

Terminology: The string encoding of the i-bunch-vector of M is abbreviated as C(r=1 to M) Pri alphari and the string encoding of the j-bunch-vector of N as C(r=1 to N) Prj betarj. The C operator is analogous to the pi and sigma operators used for continued products and sums and is here used for concatenation of strings.

Definition: A string encoding of M, with all the bunch-vectors, is X = C(i=1 to M) ((C(r=1 to M) Pri alphari)##).

Definition: A string encoding of N, with all the bunch-vectors, is Y = C(j=1 to N) ((C(s=1 to N) Psj betasj)##).

Definition: A string encoding of the pattern matching problem is (C(i=1 to M) (((C(r=1 to M) Pri alphari)##) Y ###))#.

Definition: A matching sequence of lengths is any subsequence of x = C(r=1 to M) Pri and y = C(s=1 to N) Psj such that, if x and y are rewritten as C(k=l to l+T-1) Pmk dmk and C(k=l to l+T-1) Pnk dnk respectively, then (Az)(z=l to l+T-1)(Pmz = Pnz), i.e. the lengths Pmz, Pnz match; the dmk and dnk are arbitrary subsequences of x and y.

Lemma: A matching sequence of lengths is a necessary condition for a match to exist.

Proof: If a match exists a T-polygon exists and hence a vertex i exists such that the i-bunch-vector has a matching sequence of lines which is the longest common matching subsequence of lengths.

Definition: The included angle associated with dmz is defined as alpha(t-1) + alpha(t)+ --- + alpha(u-1) + alpha(u) where dmz is of the form alpha(t-1)Pt alpha(t) P(t+1)--- alpha(u-1)P(u)alpha(u) in M.

Definition: The included angle associated with dnz is defined as beta(t-1) + beta(t) + beta(t+1) + --- + beta(v-1) + beta(v) where dnz is of the form beta(t-1) Pt beta(t) P(t+1) --- beta(v-1) Pv beta(v) in N.

Lemma: For three points i, k, p in M with i not= k not= p and three points j, l, q in N such that j not= l not= q, a three point match i-->j, k-->l, p-->q exists iff dmz = dnz, where dmz = alpha(ki(k+1)) P(k+1)i alpha((k+1)i(k+2)) --- P(p-1)i alpha((p-1)ip) and dnz = beta(lj(l+1)) P(l+1)j beta((l+1)j(l+2)) --- P(q-1)j beta((q-1)jq).

Proof: The included angles alphakip and betaljq must be equal.

Definition: A linear encoding of the patterns is defined as (C(i=1 to M)(C(r=1 to M) Pri alphari)#)#(C(j=1 to N)(C(s=1 to N) Psj betasj)#)##.

Definition: A prima-facie one-point match of a point i in M and a point j in N is defined as having at least T lines in the i-bunch-vector matching at least T lines in the j-bunch-vector insofar as the lengths of the vectors are concerned.

Lemma: A coarse pattern matching can be done in time O((M**2 N**2 / T**2) log max(M,N)).

Proof: In the linear encoding of the pattern matching problem only points with a prima-facie one-point match need be considered. Thus only M/T + 1 and N/T + 1 points need be considered for M and N respectively, by the pigeonhole principle. To determine prima-facie one-point matches with sorted M-lines and N-lines for each node requires O(MM log N + NN log M) time, i.e. O(KK log K) time, where K = max(M,N). To find the largest common subsequence of matching lengths of the bunch-vectors associated with points i and j (in M and N respectively) takes O(MN) time, by a straightforward algorithm. Thus the total time taken is O((MMNN/TT) log K).

Lemma: The coarse matching can be done in O((KKK/TT) log K log K) time.

Proof: By using the more efficient algorithm for determining the largest common subsequence, the lemma follows.

Lemma: A fine pattern match takes O(MMNNlogK) time in the worst case.

Proof: The previous coarse match yields sets of T points that match. For a refinement, the diagonals of the T-polygon associated with the T points have to be checked, and this takes O(T(T-1)/2) or O(TT) time for each set of T points.

Lemma: In the case of a match of fewer than T points existing, an O(MM log M) algorithm is possible.

Proof: By partitioning and redistribution one can avoid pattern matching, as all the pigeonhole cases will arise. The time taken is only for sorting and comparison of lines. Here M >= N >= T is assumed.

Lemma: In the case where a match of >= T points occurs, an O(MM log M) time algorithm is possible.

Proof: By partitioning and redistribution the only time taken is for sorting and comparison of line lengths and the final pattern matching. Here M >= N >= T is assumed.

Lemma: A practical algorithm for the pattern matching problem takes O(M log M) time with a LAN of M nodes.

Proof: A straightforward parallel processing speed-up of the algorithm, with the LAN considered as M parallel processors, yields the result. Here M >= N >= T is assumed.

Outline of the final Algorithm:

[Step 1] Form the i-bunch-vectors and j-bunch-vectors.

[Step 2] Sort M-lines and N-lines using a precomputed table to determine the Pythagorean length from the cartesian coordinates.

[Step 3] Partition and redistribute either M or N and check for matching lines.

[Step 4] Have a matrix [1..M, 1..N] to record one-point matches, which is set up in step 3.

[Step 5] For each one-point match (i,j) with i in M and j in N, use the algorithm for determining the longest common subsequences to set up count := maximum number of lengths of the i-bunch-vector that match lengths in the j-bunch-vector.

[Step 5] if count for all one-point matches is < T then the match fails.

[Step 6] Proceed to a fine match by considering all triangles in M and N with vertices i and j in the M-polygon and N- polygon to see if a match with T points exists.

Lemma: If a match of T points exists the time taken cannot be more than O(TTKlogK).

Proof: At most TT one point matches exist and the time taken to determine the longest common subsequences is O(KlogK). To determine a fine match one has to check TT triangles.

Lemma: If a match of T points exists then the time taken cannot be more than O(TTlogK) with a LAN of K nodes.

Proof: Obvious.

INTRODUCTION

The eighth application of HGA is in the non-traditional area of the applicability of real-life programming language (ALGOL-like, PASCAL-like or COBOL-like) descriptions to aid the popularisation of education and literacy in Automata Theory and the Theory of Computation in the Indian environment, and thus applies HGA to (reliable (!)) abstract target machines. It was found that the use of the RAM and RASP turned out to be crucial in the training, instruction and mastery of the machine/assembly language instruction repertoire of a given processor. One can easily master the entire instruction repertoire, repeating elementary programming examples using HGA, by isolating universal subsets of instructions in the repertoires. In this one takes abstract views of the instruction repertoire subsets: as dealing with a set manipulating machine or a propositional calculus machine at one end, and at the other extreme the full power of a Universal machine, a string manipulating machine or a machine doing fixed and floating point arithmetic through algorithms. Such studies and training, in a totally industrial environment, were crucial in training assembly/machine language programmers, from the dilettante programmer to hard-core systems programmers, with proven success for a decade and a half. The TDC-12 & TDC-312 (Trombay Digital Computer, 3rd generation, 12 bit word length) employed octal machine language programming over a range of applications in Controls, Communications and Computers. An upgradation to assembly language took place only with the TDC-316, as the assembler (paper tape oriented on the TDC-12, -312) was too slow.

8.1 An extension of HGA, not too popular, was to use other abstract machines at the lower end of the hand-coding. While it was found that the RAM and RASP were directly useful, the equivalent abstract machines of the Chomsky hierarchy only served as cultural background, barring the finite automaton (which is everywhere). HGA, when used for elementary Computational Complexity theorems, is effective but laborious. It was concluded that while it is useful for introductory and moderate theorems (results), there is no other way to study, train, think, teach or do the area except as propounded by the pre-eminent Prof. J.E. Hopcroft (1969, 1979). Similar attempts are seen recently and similar conclusions can be drawn.

8.1.2 A different line of generalisation of HGA is to the abstraction of automata. Automata are nothing but computing agents, and the full description of their behaviour is cumbersome and at times complex. HLL specifications have been usefully employed to describe their behaviour and construction.

8.1.3 The extreme length and complexity of a detailed description can at times be shortened by algorithmic descriptions. This can be considered a useful application of HGA. Thus all constructions in Formal Language and Automata Theory can be described in detail by an algorithmic specification in a real-life PASCAL-like language, with suitable ADTs, and these can be reduced by hand to the detailed abstract formalisations.

8.1.4 An extension of such constructions to Post machines and program machines is relatively straightforward.

8.1.5 However, such applications of HGA are fruitful, in a practical sense, only for simple constructions. More advanced constructions, though effective, turn out to be meaningless unless they are integrated with the 'intuitive' arguments normally employed in more advanced constructions. Furthermore, demonstrations by counter-examples, as in the non-closure of context-free languages under intersection, seem to have no mapping into HGA. Demonstrations like the decidability of the equivalence of finite automata seem to be easier in the mathematical formulation of sets than in PASCAL-like descriptions, though these aid in controlling the abstraction of the formalism. The HLL descriptions in such cases may not be considered practical algorithms, but effectively an aid to control complexity and undue abstraction.

8.1.6 HGA as used here is an aid to describe the constructions and an aid in understanding 'what is done', but it is no aid in mastering the intuitive arguments and techniques that go into how to 'do the area'.

8.2 HGA applications to the area of syntax analysis techniques are extremely practical and have been employed, but in the pure area of Formal Language and Automata Theory perhaps a new environmental language and/or technique has to develop; just as ADTs help in controlling complexity, a new type of ADT oriented towards this area has to develop.

8.2.2 It is felt that this should be on the lines of tradeoffs between complexity of formalism, complexity of abstraction, complexity of HLL specifications/descriptions, complexity arising from different abstractions operating at the same level of abstraction, and complexity of hand-translation. The extended PASCAL to be employed in HGA thus varies depending on the domain of problems considered.

8.2.3 Thus a universalisation of HGA will require, augmentation by suitable borrowings of features from different programming environments and the techniques, skills and methodologies will vary for effective practical applicability.

8.2.4 Thus we have logical extensions to HGA: to choose the appropriate 'source specification', the 'translation specification' and the final 'target specification'. If ultimately the target is machine language, we have to go through a variety of HGA stages. Thus HGA is not a single stage application, but will in general involve stages of greater and greater refinement.

8.2.5 Thus the HGA process will have to be considered by itself as a meta-process of stepwise refinements of the HGA process.

8.2.6 Assembly Language Programming (the use of RASP and Post Machines and Program Machines):

8.2.7 The instruction repertoire of real-life digital computers, whether character-oriented (IBM 1401), decimal-oriented (IBM 1620) or binary machines, is partitioned into subsets such that each subset is universal in the sense of being able to compute the partial recursive functions.

8.2.8 An obvious subset consists of the arithmetic operations, the compare and conditional jumps, and the unconditional jumps. By reducing arithmetic multiplication/division to addition, at times making the assumption of unlimited word length, one gets more and more universal sets.
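As an elementary illustration (not part of the original training material), a minimal PASCAL sketch of the kind of exercise involved is given below: multiplication realised through the reduced universal subset of addition, comparison and jumps (rendered here as a while loop). The operand values are assumptions for the example only.

program MulByAdd;
var
  multiplicand, multiplier, product, count: integer;
begin
  multiplicand := 7;
  multiplier := 13;
  product := 0;
  count := 0;
  { repeated addition stands in for the MUL instruction }
  while count < multiplier do
  begin
    product := product + multiplicand;
    count := count + 1
  end;
  writeln(multiplicand, ' * ', multiplier, ' = ', product)
end.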

8.2.9 By dropping the restriction on the register size, assuming it to be unlimited in size, one gets new register-oriented universal sets.

8.2.10 By reducing arithmetic to repeated incrementation one isolates smaller universal sets.

8.2.11 Once the coding scheme is understood (BCD, EBCDIC or ASCII), one proceeds to isolate string oriented operations, using string sizes that fit only one word, two words, multiple words, or at times ignoring the word size. Then we obtain an abstraction of a real-life digital computer as a string manipulating machine. Another class of sub-machines considered is to view the real-life digital computer as a set manipulating machine or a propositional calculus machine, through their equivalent realisations as bit string operations, as in standard Pascal implementations.
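As an illustration (not part of the original training material), a minimal PASCAL sketch of the set manipulating machine view is given below: a set over 0..15 is held as a 16-bit word (a bit string), and membership and union are realised by elementary arithmetic on that word. The function names and example sets are assumptions.

program BitSetMachine;
var
  a, b, u: integer;

function Pow2(k: integer): integer;
var
  i, p: integer;
begin
  p := 1;
  for i := 1 to k do
    p := p * 2;
  Pow2 := p
end;

function Member(s, e: integer): boolean;
begin
  { test bit e of the word s }
  Member := ((s div Pow2(e)) mod 2) = 1
end;

function AddElem(s, e: integer): integer;
begin
  { set bit e of the word s }
  if Member(s, e) then
    AddElem := s
  else
    AddElem := s + Pow2(e)
end;

function SetUnion(x, y: integer): integer;
var
  e, w: integer;
begin
  w := x;
  for e := 0 to 15 do
    if Member(y, e) then
      w := AddElem(w, e);
  SetUnion := w
end;

begin
  a := AddElem(AddElem(0, 3), 5);   { the set {3,5} as the word 40 }
  b := AddElem(AddElem(0, 5), 7);   { the set {5,7} as the word 160 }
  u := SetUnion(a, b);
  writeln('union as a bit word: ', u);   { 168, i.e. bits 3, 5 and 7 set }
  writeln('is 7 a member? ', Member(u, 7))
end.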

8.2.12 The above process of isolation of subsets is generalised to stack operations. The exercise of reducing the instruction repertoire to universal subsets allows a comprehensive study, and use with elementary programming examples, of the instruction repertoire. A cultural skill & gain is the ability to migrate to different machines in the same environment without much cultural shock.

8.2.13 Before the above method was used in instruction, the migration of assembly language programming from the TDC-12 (PDP-8 like) to the TDC-316 (PDP-11 like) was found to encounter some mental blocks, owing to the powerful addressing modes of the latter, not found in the former.

8.3 The mastery and appreciation of the use of the entire instruction repertoire is thus easily achieved and then applied to more sophisticated assignments; and if the on-the-job performance is considered, the method has turned out good entry-level assembly language programmers in the areas of Real-Time Applications, Communications software and systems programming.

8.4 The elementary programming techniques used were the PASCAL-like control structures and data structures (with limited use of pointers), and the aim was to think out the programming examples in the High Level Language, map the same systematically to unoptimised assembly language equivalents, and in a final pass optimise the assembly language.

8.5 Introducing many programming languages simultaneously

In an obvious use of finding universal subsets of features, one could study a particular programming language, or many programming languages simultaneously.

8.5.2 The crucial concept of effective computations through RASPs, now viewed as subset-FORTRAN, subset-COBOL, subset-PASCAL machines, etc., is essential to allow the entry-level programmer to migrate from one programming language to another.

8.5.3 A practical application of HGA is to the problem of Pattern Matching in Euclidean space considered in chapters 4 and 5. HGA is applied to obtain a practical algorithm which when mapped onto a LAN for the chance print matching problem leads to a viable solution.

8.5.4 It was informally opined by J E Hopcroft (circa 1972) that perhaps the only practical result of Automata Theory is Cook's theorem. The theorem states that if a two-way deterministic pushdown automaton can perform a computation, then it can be simulated in linear time by a RAM, and hence a practical linear-time algorithm emerges on current general purpose digital computers.

8.5.6 The traditional application of the theorem has been the determination of a substring in a given string or variants of the problem. By a suitable string encoding the pattern matching problem in Euclidean space is solved by an extension and application of the theorem, and using HGA to obtain a viable solution.

8.6 An extension of Cook's theorem is to the common subsequence problem. Given two strings x = a1a2---an and y = b1b2---bm, a common subsequence is c1c2c3---ck such that there exist j1 < j2 < j3 < --- < jk and l1 < l2 < l3 < --- < lk, with each ji in 1..n and each li in 1..m, for which aji = bli = ci for i in 1..k. The common subsequence problem can be solved in time O(mn) [Hop,74]. A more efficient algorithm [Hun,77], used in the present application, solves it in time O(m log n).
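As an illustration (not part of the original development), a minimal PASCAL sketch of the straightforward O(mn) dynamic programming solution for the length of a longest common subsequence is given below; the two example strings are assumptions chosen only for the demonstration.

program CommonSubsequence;
const
  MaxLen = 100;
var
  x, y: string;
  c: array[0..MaxLen, 0..MaxLen] of integer;
  i, j, m, n: integer;
begin
  x := 'abracadabra';
  y := 'bandana';
  m := length(x);
  n := length(y);
  { c[i,j] = length of a longest common subsequence of x[1..i] and y[1..j] }
  for i := 0 to m do c[i, 0] := 0;
  for j := 0 to n do c[0, j] := 0;
  for i := 1 to m do
    for j := 1 to n do
      if x[i] = y[j] then
        c[i, j] := c[i - 1, j - 1] + 1
      else if c[i - 1, j] >= c[i, j - 1] then
        c[i, j] := c[i - 1, j]
      else
        c[i, j] := c[i, j - 1];
  writeln('length of a longest common subsequence = ', c[m, n])
end.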

8.7 A mapping of a 2DPDA computation to a RAM can be considered an application of HGA, as the mapping is to a practical algorithm. In the pattern matching problem the data structure that practically arises is the position tree. Given a string x we form a string x$ with a new symbol $, not in the vocabulary of x. If x$ = a1a2---an a(n+1), then the indices refer to the positions of the symbols in the string. Associated with each position i is the shortest substring identifier: the shortest substring of x$ starting from ai that uniquely identifies the position i. A position tree is a tree whose leaves are the positions 1 to n+1, the path from the root to a leaf being the substring identifier for that position. It is known that a position tree always exists and can be constructed in O(n**2) time, or even in O(n) time by compacting the chains of nodes which have only one son. The common subsequence problem can be solved by using position trees in a straightforward manner.
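As an illustration (not part of the original development), a minimal PASCAL sketch is given below of the naive determination of the shortest substring identifier of each position of x$; these identifiers are exactly the root-to-leaf paths of the position tree. The example string is an assumption.

program PositionIdentifiers;
var
  xs: string;
  n, i, j, len: integer;
  unique: boolean;
begin
  xs := 'ababb$';          { x followed by the end marker $ }
  n := length(xs);
  for i := 1 to n do
  begin
    len := 0;
    repeat
      { grow the substring starting at i until it occurs nowhere else }
      len := len + 1;
      unique := true;
      for j := 1 to n - len + 1 do
        if (j <> i) and (copy(xs, j, len) = copy(xs, i, len)) then
          unique := false
    until unique or (i + len - 1 = n);
    writeln('position ', i, ': identifier ', copy(xs, i, len))
  end
end.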

8.8 The algorithm for the determination of the largest common subsequence is used as a scaffolding in the pattern matching problem in Euclidean space, by a suitable string encoding. The existence of a common subsequence in the encoding is shown to yield a coarse match, which is then refined to obtain a fine match.

8.9 The more efficient O(m log n) algorithm [Hun,77] uses a variant. Given two strings (sequences) x and y, a threshold matrix Tik is set up, where the value of Tik is the portion of the string y, counted from the first position, that must be considered to obtain a subsequence of k matches with the first i symbols of x. Tik can be computed in a straightforward manner from T(i-1)k and T(i-1)(k-1). By compacting the data structures the efficient algorithm arises.

Definition: Given a pattern M and a pattern N, a base vector ik is any line of M in the i-bunch-vector and a base vector jl is any line of N in the j-bunch-vector.

Definition: For an i-bunch-vector (of M) and a j-bunch-vector (of N), if vertices k, p of M and l, q of N are considered, with i not= k not= p and j not= l not= q, then the included angles kip and ljq are referred to as alpha(kip) and beta(ljq) respectively.

Definition: An ordering of lines in the i-bunch-vector (of M) is a clockwise (or anticlockwise) ordering of the lines (vectors) subject to the following conditions:

a) ((A i,k,p,r)(i,k,p,r in M and i not= k not= p not= r)): alpha(kip) < alpha(kir) implies ir occurs later in the ordering than ip; alpha(kip) = alpha(kir) implies ir occurs later in the ordering if L(ip) < L(ir); and alpha(kip) = alpha(kir) with L(ip) = L(ir) implies ip occurs before ir (an arbitrary choice).

Definition: A string encoding of the i-bunch-vector of M is defined as P1i alpha1i P2i alpha2i --- PMi alphaMi, where (Aj)(j in 1..M) Pji is the length of the line ji and alpha(ji) is the included angle between the lines ji and (j+1)i.

Definition: A string encoding of the j-bunch-vector of N is defined as P1j beta1j P2j beta2j --- PNj betaNj, where (Ai)(i in 1..N) Pij is the length of the line ij and beta(ij) is the included angle between the lines ij and (i+1)j.

Terminology: The string encoding of the i-bunch-vector of M is abbreviated as C(r=1 to M) Pri alphari and the string encoding of the j-bunch-vector of N as C(s=1 to N) Psj betasj. The C operator is analogous to the pi and sigma operators used for continued products and sums, and is here used for the concatenation of strings.
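As an illustration (not part of the original development), a minimal PASCAL sketch of forming the lengths Pri of one i-bunch-vector from Cartesian coordinates (the Pythagorean length of each line from point i), sorted as required for the coarse match, is given below; the coordinates are assumptions.

program BunchVectorLengths;
const
  M = 5;
  i = 1;                       { the vertex whose bunch-vector is formed }
var
  px, py: array[1..M] of real; { pattern coordinates }
  P: array[1..M] of real;      { line lengths Pri, r <> i }
  cnt, r, a, b: integer;
  t: real;
begin
  px[1] := 0.0;  py[1] := 0.0;
  px[2] := 3.0;  py[2] := 4.0;
  px[3] := 6.0;  py[3] := 8.0;
  px[4] := 1.0;  py[4] := 1.0;
  px[5] := 5.0;  py[5] := 12.0;
  cnt := 0;
  for r := 1 to M do
    if r <> i then
    begin
      cnt := cnt + 1;
      P[cnt] := sqrt(sqr(px[r] - px[i]) + sqr(py[r] - py[i]))
    end;
  { simple exchange sort of the M-1 lengths into non-decreasing order }
  for a := 1 to cnt - 1 do
    for b := a + 1 to cnt do
      if P[b] < P[a] then
      begin
        t := P[a]; P[a] := P[b]; P[b] := t
      end;
  for a := 1 to cnt do
    writeln('length ', a, ' of the ', i, '-bunch-vector = ', P[a]:0:2)
end.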

Definition: A string encoding of M, with all the bunch-vectors, is X = C(i=1 to M) ((C(r=1 to M) Pri alphari)##).

Definition: A string encoding of N, with all the bunch-vectors, is Y = C(j=1 to N) ((C(s=1 to N) Psj betasj)##).

Definition: A string encoding of the pattern matching problem is (C(i=1 to M) ((C(r=1 to M) Pri alphari)##) Y ###)#.

Definition: A matching sequence of lengths is any subsequence of x = C(r=1 to M) Pri and y = C(s=1 to N) Psj such that if x and y are rewritten as C(k=l to l+T-1) Pmk dmk and C(k=l to l+T-1) Pnk dnk respectively, then (Az)(z = l to l+T-1)(Pmz = Pnz), i.e. the lengths Pmz and Pnz match; the dmk and dnk are arbitrary substrings of x and y.

Lemma: A matching sequence of lengths is a necessary condition for a match to exist.

Proof: If a match exists a T-polygon exists and hence a vertex i exists such that the i-bunch-vector has a matching sequence of lines which is the longest common matching subsequence of lengths.

Definition: The included angle associated with dmz is defined as alpha(t-1) + alpha(t) + --- + alpha(u-1) + alpha(u), where dmz is of the form alpha(t-1) Pt alpha(t) P(t+1) --- alpha(u-1) Pu alpha(u) in M.

Definition: The included angle associated with dnz is defined as beta(t-1) + beta(t) + --- + beta(v-1) + beta(v), where dnz is of the form beta(t-1) Pt beta(t) P(t+1) --- beta(v-1) Pv beta(v) in N.

Lemma: For three points i, k, p in M with i not= k not= p and three points j, l, q in N such that j not= l not= q, a three point match i-->j, k-->l, p-->q exists iff dmz = dnz, where dmz = alpha(ki(k+1)) P(k+1)i alpha((k+1)i(k+2)) --- P(p-1)i alpha((p-1)ip) and dnz = beta(lj(l+1)) P(l+1)j beta((l+1)j(l+2)) --- P(q-1)j beta((q-1)jq).

Proof: The included angles alpha(kip) and beta(ljq) must be equal.

Definition: A linear encoding of the pattern is defined as (C(i=1 to M)(C(r=1 to M) Pri alphari)#)# (C(j=1 to N)(C(s=1 to N) Psj betasj)#)##.

Definition: A prima-facie one-point match of a point i in M and a point j in N is defined as having at least T lines in the i-bunch-vector matching at least T lines in the j-bunch-vector, insofar as the lengths of the vectors are concerned.

Lemma: A coarse pattern matching can be done in time O((M**2 N**2 / T**2) log max(M,N)).

Proof: In the linear encoding of the pattern matching problem only points with a prima-facie one-point match need be considered. Thus only M/T + 1 and N/T + 1 points need be considered for M and N respectively, by the pigeonhole principle. To determine the prima-facie one-point matches with sorted M-lines and N-lines for each node requires O(M**2 log N + N**2 log M) time, i.e. O(K**2 log K) time, where K = max(M,N). To find the largest common subsequence of matching lengths of the bunch-vectors associated with points i and j (in M and N respectively) takes O(MN) time, by a straightforward algorithm. Thus the total time taken is O((M**2 N**2 / T**2) log K).

Lemma: The coarse matching can be done in O((K**3 / T**2) log K log K) time.

Proof: By using the more efficient algorithm for determining the largest common subsequence, the lemma follows.

Lemma: A fine pattern match takes O(M**2 N**2 log K) time in the worst case.

Proof: The previous coarse match yields sets of T points that match. For a refinement, the diagonals of the T-polygon associated with the T points have to be checked, and this takes O(T(T-1)/2), i.e. O(T**2), time for each set of T points.

Lemma: In the case of a match of fewer than T points existing, an O(M**2 log M) algorithm is possible.

Proof: By partitioning and redistribution one can avoid pattern matching, as all cases will contain holes. The time taken is only for sorting and comparison of lines. Here M >= N >= T is assumed.

Lemma: In the case where a match of >= T points occurs, an O(M**2 log M) time is possible.

Proof: By partitioning and redistribution, the only time taken is for sorting and comparison of line lengths and the final pattern matching. Here M >= N >= T is assumed.

Lemma: A practical algorithm for the pattern matching problem takes O(M log M) time with a LAN of M nodes.

Proof: A straightforward parallel processing speed-up of the algorithm, with the LAN considered as M parallel processors, yields the result. Here M >= N >= T is assumed.

Outline of the final Algorithm:

[Step 1] Form the i-bunch-vectors and j-bunch-vectors.

[Step 2] Sort the M-lines and N-lines, using a precomputed table to determine the Pythagorean length from the Cartesian coordinates.

[Step 3] Partition and redistribute either M or N and check for matching lines.

[Step 4] Have a matrix[1..M,1..N] to record one-point matches, which is set up in Step 3.

[Step 5] For each one-point match (i,j), with i in M and j in N, use the algorithm for determining the longest common subsequence to set count := the maximum number of lengths of the i-bunch-vector that match lengths in the j-bunch-vector.

[Step 6] If the count for every one-point match is < T, then the match fails.

[Step 7] Otherwise proceed to a fine match by considering all triangles in M and N with vertices i and j in the M-polygon and N-polygon, to see if a match with T points exists.
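As an illustration (not part of the original development), the following minimal PASCAL sketch corresponds to Steps 5 and 6 above for one candidate pair (i, j): the count of matching lengths between the two bunch-vectors is obtained as the length of their longest common subsequence and compared with the threshold T. The quantised length values and the value of T are assumptions.

program OnePointMatch;
const
  T = 3;
  MI = 5;     { lines in the i-bunch-vector }
  NJ = 6;     { lines in the j-bunch-vector }
var
  a: array[1..MI] of integer;
  b: array[1..NJ] of integer;
  c: array[0..MI, 0..NJ] of integer;
  r, s, count: integer;
begin
  { sorted, quantised lengths of the two bunch-vectors }
  a[1] := 9;  a[2] := 12; a[3] := 17; a[4] := 21; a[5] := 30;
  b[1] := 9;  b[2] := 11; b[3] := 12; b[4] := 17; b[5] := 25; b[6] := 30;
  for r := 0 to MI do c[r, 0] := 0;
  for s := 0 to NJ do c[0, s] := 0;
  for r := 1 to MI do
    for s := 1 to NJ do
      if a[r] = b[s] then
        c[r, s] := c[r - 1, s - 1] + 1
      else if c[r - 1, s] >= c[r, s - 1] then
        c[r, s] := c[r - 1, s]
      else
        c[r, s] := c[r, s - 1];
  count := c[MI, NJ];
  if count >= T then
    writeln('prima-facie one-point match (count = ', count, ')')
  else
    writeln('no one-point match (count = ', count, ')')
end.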

Lemma: If a match of T points exists, the time taken cannot be more than O(T**2 K log K).

Proof: At most T**2 one-point matches exist, and the time taken to determine the longest common subsequences is O(K log K). To determine a fine match one has to check T**2 triangles.

Lemma: If a match of T points exists, then the time taken cannot be more than O(T**2 log K) with a LAN of K nodes.

Proof: Obvious.

INTRODUCTION

In this chapter the use of HGA is extended, based on empirical experience, to the IS field, and its use with COBOL for IHLL descriptions of core computer science algorithms is outlined. The ninth application of HGA is in the non-traditional area of using COBOL (for quite some time the only language available to quite a few programmers in the environment) as a vehicle for Computer Science and Information Systems Engineering education, and thus attempts to show the applicability of HGA in the business software area. Though FORTRAN-IV and -V have been decried by the eminent Algolists, COBOL-74 has many features to aid structured programming, and step-wise refinement can be made visible when programming in COBOL-74. An attempted mapping, successful judging from the feedback obtained over a decade and a half, has been the use of the algorithms of Knuth D E 1969 [Knu,69] with COBOL-74 and HGA. The environment, as elsewhere in those days, allowed little memory; the COBOL-74 translator had poor optimisation and extensive overhead (both verbosity and time requirements), and HGA was essential to run the final programs in assembly language.

9.1 The advent of COBOL-74 made substantial applications in Information Systems possible. However, the high unreliability of the hardware/software combine required HGA, as extensive patching was required both in the source code and in the target code. (Reliability was achieved only by the mid-eighties.) There have been uses of HGA in the sense of conversion of COBOL packages to assembly language for efficiency. Today, however, the COBOL programmer views his COBOL machine as transparent (as elsewhere).

9.2 If one were to generalise Hansen's method of systems programming in the direction of having a source specification and an effectively equivalent target specification in some HLL, we have the case extended to using COBOL in the non-traditional application of instruction and use in basic Computer Science education: a method traditionally dictated by the requirements of the environment of the indigenous computer industry, and now applied to the DOEACC ALCSS, A, B and C level courses of the Government of India for the certification of computer courses. Traditional instruction in Data Structures, Algorithms, Programming Languages, Translator Writing, Operating Systems, Numerical Methods and HLL descriptions of automata behaviour has relied on assembly, ALGOL-like, PASCAL-like and C-like programming language specifications. The Information Systems field did not fall in this ambit and was essentially based on using COBOL as the vehicle. In the seventies and eighties in the indigenous industrial environment a need arose for instruction in computer science, but the only available HLL was COBOL. Considering the HLL descriptions as the source language, a hand translation was used to convert the same to COBOL-68, -74 or -85. The resulting COBOL specifications were 'optimised', i.e. converted into specifications that experienced COBOL programmers normally use in their programming.

9.3 Such a non-traditional specification can be likened to a logical generalisation of Hansen's method. Historically, Knuth's use of MIX for data structures was applied through HGA to FORTRAN in courses in Data Structures. If one considers MIX as the starting point of substantial and authoritative literature in the art of programming techniques, one can consider employing specifications in MIX as the initial step and hand-translating to COBOL, or vice-versa.

9.4 It was found that COBOL has many advantages as a programming language: it has many constructs for structured programming (but for the do-while construct in earlier versions), produces programs with extensive readability, poses challenging problems in the implementation of some features, and has a built-in TABLE data structure to simulate the main memory, thus enabling the non-traditional introduction of 'pointers' and 'dynamic storage allocation'. The main drawback in the use of COBOL has been its verbosity, but from the point of view of the fraction of time this verbosity occupies in the entire software life cycle, it is but a minor inconvenience.

9.5 The basic technique for instruction was to directly convert algorithms in MIX and the associated algorithmic specifications into COBOL and have them implemented.

9.6 This non-traditional use of COBOL had and has the advantage of fully exploiting the COBOL subroutine facilities and the verbs of the PROCEDURE DIVISION, and essentially satisfies the need of the indigenous industry in 'looking' at computer science through the Information Systems field.

9.7 The 'optimisation' of the target language relied on the empirical experience of the methodologies and preferences of advanced COBOL programming over the indigenous Information Systems life cycles.

9.8 For purposes of clarification it may be noted that COBOL has records, unions through REDEFINES, arrays through TABLES, sub-ranges through CONDITION NAMES, repeat, while (COBOL-85), for through PERFORM, if-then, if-then-else, assignment through COMPUTE and much more. The absence of pointers, however, requires a direct simulation of the heap, usually done through tables as in Knuth vol. I.
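As an illustration (not part of the original course material), a minimal sketch of such a simulation of 'pointers' and dynamic storage through tables, in the style of Knuth vol. I, is given below. It is written in PASCAL for brevity; the same parallel-table layout carries over directly to COBOL TABLES (OCCURS clauses). The sizes and values are assumptions.

program TableHeap;
const
  HeapSize = 10;
  Nil0 = 0;                      { 0 plays the role of the nil link }
var
  info: array[1..HeapSize] of integer;
  link: array[1..HeapSize] of integer;
  avail, head, p, k: integer;

function New0: integer;          { take a cell from the free list }
begin
  New0 := avail;
  if avail <> Nil0 then
    avail := link[avail]
end;

begin
  { build the free list }
  for k := 1 to HeapSize - 1 do
    link[k] := k + 1;
  link[HeapSize] := Nil0;
  avail := 1;
  { push 10, 20, 30 onto a simulated linked list }
  head := Nil0;
  for k := 1 to 3 do
  begin
    p := New0;
    info[p] := 10 * k;
    link[p] := head;
    head := p
  end;
  { traverse the simulated list }
  p := head;
  while p <> Nil0 do
  begin
    writeln(info[p]);
    p := link[p]
  end
end.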

9.9 The use in Numerical Methods was mainly for expository purposes, up to the level of Acton [Act,70], though most of the time a FORTRAN compiler was available in parallel.

9.10 The ALGOL-60 and PASCAL language processors were used mainly for streams in core computer science, but the wider base of requirement of COBOL programmers made this use of HGA indispensable.

9.11 In the Algorithms area the algorithms of Aho A V, Hopcroft J E & Ullman J D [Aho,74] were considered in COBOL, and the effective hand translation did not prove to be difficult. Thus, even if one starts with a specification in ADTs, one finds that COBOL can effectively be used as an algorithmic description and implementation language with the use of HGA.

9.12 A further generalisation of the use of COBOL has been the application of HGA to the classic treatment of operating systems of Shaw A C 1974 [Sha,74] in ALGOL-like descriptions. These were effectively hand-translated, by an application of HGA, to suitable COBOL environments. Thus it was concluded that the classical use of ALGOL, PASCAL and C was merely a historical accident rather than a necessity.

9.13 This method was also essential as the skills expected of a COBOL programmer in the environment were non-traditional, in the sense that most programming had to keep track of the various machine abstractions, 'the opaque machine', owing to the slow stabilisation of hardware, system software and application software within the environmental constraints of the Information Systems life cycles and the product life cycles. This environmental requirement demands skills both in assembly (machine) and COBOL programming at a non-trivial systems programming level.

9.14 The availability of more reliable software and hardware environments allows more and more 'pure' COBOL programming oriented to just the COBOL-machine. However, the environmental problem of non-identical productionised systems, and the effective unreliability of imported, dumped computer components not passing quality tests, still requires the COBOL programmer in the environment to be aware of the hard reality that programming in COBOL is not for the pure, abstract, reliable COBOL-85 machine.

9.15 As most of the straightforward features of COBOL have an almost direct mapping to PASCAL-like control and data structures, the use of formal verification techniques in COBOL is easily applicable. These were at most used intuitively, more as a cultural background than as an essential skill, in line with advanced COBOL programming practices rather than rigid Computer Science practices.

9.16 To summarise, we have an effective and proven use of HGA in a line of generalisation resulting from the use of COBOL for the non-traditional purpose of computer science education in the indigenous environment. A further line of generalisation, and of future validity of HGA, is in the massive front office bank computerisation currently underway in India, where the effective Analyst Programmer merely simulates the existing non-computer based Information Systems. Here the use of 4GLs like ORACLE has been considered to replace COBOL software. This can be considered an 'optimisation' in HGA, where a COBOL program segment is replaced effectively by a 'command' in ORACLE. Such an evaluation is required as the current COBOL solution is much cheaper than the ORACLE solution, and the number of branches, running into thousands, has a multiplicative effect on the overall savings. Thus we have a future line of generalisation of HGA in the direction of 4GLs.

INTRODUCTION

The tenth application of HGA is a summary of the attempts made to apply the technique to the development of communication software, real-time software and parallel processing applications.

10.1 The communication software applications have been oriented to radar development for military and civilian use (historically in assembly language with HGA, and currently involving up to 50,000 lines of 'C' code); the programming of SPC telephone exchanges (historically in assembly language with HGA and REMAL, and currently in C and C++); and applications in distributed architectures in SCADA (Supervisory Control and Data Acquisition) (historically in machine and assembly language with HGA, real-time FORTRAN and CORAL-66, and currently in 'C').

10.2 The parallel processing applications of HGA could have been used in the development of operating systems on mini-computers, though in practice assembly language was directly employed. The O/S implementations were by hard-core, experienced and competent assembly language programmers, to whom HGA did not appeal. The consolation is that HGA with RAMs and RASPs was employed to make them learn assembly language in the first place.

10.3 The tenth application of HGA is in part, in real-time application software and deals with three aspects of applications. One is the use of Simulator Software for Nuclear and Thermal Power Plants ( running plants as opposed to designing them). The second is the development of Human Machine Interfaces for Nuclear Plants. The third area considered is SCADA (Supervisory Control and Data Acquisition) applications in:-

a) Gas/Oil flow as distributed over hundreds of miles (distributed computer architecture) to and from refineries.

b) Distribution of energy in railway traction after generation.

c) Water distribution over hundreds of miles.

10.3.2 Traditional applications used HGA partly but current distributed architectures tolerate the use of 'C' as HLL directly.

10.4 The direct application of HGA in the tenth application has been in the development of compilers for CHILL (Communication Software purposes) and ADA and CORAL-66 (for real-time software). These are experimental implementations. A special language REMAL using HGA was designed and implemented under the direction of the author in the seventies for communication software.

10.5 A natural and crucial application of HGA with parallel processing concepts arises in the area of scientific software in the indigenous environment, with its curious economy and way of life. The human programming effort is cheap, the degree of materialism is low, a PC is not a personal computer but a shared facility, and a workstation is rare, owing to the relative levels of salary/income. However, the LAN as a centralised facility is common, and it can be considered a multiprocessor configuration. This allows the polyalgorithms that naturally and intrinsically arise in the area of Numerical Methods to be mapped onto the nodes of a LAN. The use of HGA allows the complexity/sophistication of the nodes to be less than if an HLL were directly employed, and this has a multiplicative effect in the savings in the cost of the LAN. The high cost of copyrighted imported software will, as the environment develops, lead to indigenisation of the software, and the use of HGA will lead to substantial savings. An elementary case in point has been the use of HGA in the polyalgorithms associated with the determination of the roots of polynomials, using FORTRAN as the HLL, mapped onto the nodes of the LAN. Another line of application is the practical use in industry of the algorithms resulting from the exploding studies in recent years of parallel programming speed-up of Numerical Methods algorithms. When the parallel algorithms are augmented by the use of HGA, the substantial savings in the cost of the LAN allow the practicality of commercial indigenous scientific software. By generalisation it is seen that HGA is applicable to all developing economies.

10.6 The main consideration in real-time systems is the requirement of time criticality in the response to an external event.

10.6.2 The historical uses have been in dataloggers in satellite launching, oceanography, cryogenics, numerical control, steel mills, radio propagation control in mountainous terrains, static testing of turbines, aeronautics, robotics, etc. These have had their software developed in assembly language, as no High Level Language (HLL) was available. However, attempts at initial simulation were done in an HLL, and thus HGA was used to some extent.

10.6.3 The Data Acquisition Systems (DAS) have involved major super-thermal power plants with 250 MW - 500 MW generation capacity, and nuclear reactors including a Fast Breeder Test Reactor (FBTR).

10.6.4 The SCADA (Supervisory Control And Data Acquisition) systems have been in applications involving water, oil & gas flow over thousands of km in distributed architectures; energy management systems (EMS) in power grids over thousands of miles using distributed computer architectures; and process control in steel, cement and fertilizers with a centralised facility.

10.6.5 The primary requirements of practical real-time systems are the total reliability of the software and hardware components of data acquisition and control. The software part has to be of proven reliability (if possible proven theoretically with RPC), in simulation models and finally in the field. An added crucial constraint is the time criticality of the response of various sections.

10.7 The time criticality is shown roughly by the following table which summarizes typical real-time applications considered from real-life experience in the environment over the last 25 years.

--------------------+----------------------------+---------------
Example Area        | Critical Component         | Time
of Application      | Description                | Criticality
--------------------+----------------------------+---------------
a. Information      | Payroll, Inventory         | Day(s)
   Systems          | Control                    |
b. Water, oil & gas | Leakage in the pipeline    | Hour(s)
   flow in pipelines|                            |
   - SCADA          |                            |
c. Process control  | Avalanche effect           | Minute(s)
   in cement, steel |                            |
   & fertilizer     |                            |
   plants           |                            |
d. Energy management| The grid isolation         | Second(s)
   systems e.g.     | problem                    |
   power generation |                            |
   & distribution   |                            |
e. Nuclear reactor  | Response to some           | Millisecond(s)
   applications     | conditions                 |
f. Satellite        | Fast scan response:        | Microsecond(s)
   launching &      | 288 critical inputs to be  |
   missiles         | scanned in 1 second out of |
                    | 900 overall inputs in      |
                    | 9 seconds.                 |
                    | Trajectory control.        |
g. Cryogenics       | Strengths of metals at     | Minute(s)
                    | low temperatures           |
h. Aeronautics      | On-line flight control     | Second(s)
i. Numerical control| Control of machines        | Second(s)
j. Turbine control  | Control of running         | Second(s)
                    | turbines (500 control      |
                    | loops)                     |
k. Oceanography     | Tide, wave statistics      | Minute(s)
l. Radio propagation| In mountainous terrains    | Second(s)
   control          |                            |
m. Railway traction | Restoration of break in    | Minute(s)
                    | traction power             |
n. Power plants     | Sequence of events,        | Second(s)
                    | 100-200, 5 msec resolution,|
                    | avalanche effects          |
o. Energy manage-   | Soaking pit automation     | Second(s)
   ment in steel    | project for energy savings |
   plant            |                            |
p. DAS for power    | Data acquisition fed to    | Second(s)
   plants           | screen, 1000 analog &      |
                    | 1000 digital inputs        |
q. Tankodrome       | Synchronization of         | Second(s)
                    | parallel processes         |
--------------------+----------------------------+---------------

10.8 The associated telecommunication software in distributed architectures has requirements given by the rough metric of up to 1 MB of code, but as it is embedded in the real-time system it has to be totally reliable. In the applications considered, the nature of the software can be gauged by the following table.

---------------------+-------------------------------------------
Communication Medium | Distances involved
---------------------+-------------------------------------------
a. Telephone cables  | 200 km, 300 baud, highly unreliable
                     | telephone lines in India, very high error
                     | correction required.
b. VHF, UHF          | Up to hundreds of km, and costly.
c. Satellite         | Leased channels on both indigenous &
   communication     | foreign satellites, for distances of
                     | thousand(s) of km.
d. Optical fiber     | Highly reliable, upcoming and valid in the
                     | future; long distances; speeds of up to
                     | 1 Mbit/sec easily possible.
---------------------+-------------------------------------------

10.9 The DAS in distributed architectures had its software core in an RPMC (Remote Plant Monitoring & Control) package, developed in OMSI PASCAL under the RSX-11M operating system on the PDP-11 platform. This being proven software, it was subjected to various alterations and transported onto VMS/VAX, Versados/UNIPOWER-30 and UNIX/PC platforms.

10.10 Hansen's Approach (HA) finds extensive use in the simulation of real-time systems. Here the real-time SCADA system is simulated in an HLL and then mapped onto a pair (software on a computational platform, hardware components consisting of, for example, relays or dedicated circuitry). In the case of critical components, a mapping to a triple is done (dedicated software on a dedicated microprocessor, hardware solutions like relays or circuitry, precomputation outside the critical period). An example of this is the use of df/dt relays in the grid isolation problem. Here df/dt has to be detected within one second and the grid isolated within one second; the time criticality is 2 seconds. An HGA solution mapping the problem to mechanical relays involves only the 20 msec mechanical delay of the relays. The relays that are to be tripped are determined by computing the load balancing, in a 5 second sampling period, and determining the load shedding that should take place. The pre-computation in the larger time slot of 5 seconds allows an HLL (real-time FORTRAN-77) to be used for the software of the critical period. The partial solution of the grid isolation problem starts with a simulation of the solution to check out the procedure, and maps it down to an interconnection of relays giving permissive contacts. To obtain the practical solution, repeated applications of HGA are used and checked against the timing constraints. A reverse use of HGA in the environment is in the upgradation of hardware turbine supervisory control logic to a PC based control solution in software.
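As an illustration (not part of the actual installation), a minimal PASCAL sketch of the flavour of the precomputation described above is given below: the load-shedding selection is precomputed in the leisurely 5 second slot, and the time-critical part merely detects df/dt and trips the already-selected relays. All feeder loads, thresholds and names used here are assumptions.

program GridIsolationSketch;
const
  NumFeeders = 5;
  DfDtLimit = -0.5;                    { Hz per second, assumed threshold }
var
  feederLoad: array[1..NumFeeders] of real;   { MW, in shedding priority order }
  shed: array[1..NumFeeders] of boolean;
  deficit, covered, f1, f2, dfdt, dt: real;
  k: integer;
begin
  feederLoad[1] := 20; feederLoad[2] := 35; feederLoad[3] := 15;
  feederLoad[4] := 50; feederLoad[5] := 25;
  { precomputation in the 5 second slot: pick feeders to shed }
  deficit := 60;                       { assumed generation shortfall, MW }
  covered := 0;
  for k := 1 to NumFeeders do
  begin
    shed[k] := covered < deficit;
    if shed[k] then
      covered := covered + feederLoad[k]
  end;
  { time-critical part: detect df/dt from two frequency samples }
  f1 := 50.00; f2 := 49.70; dt := 0.5;
  dfdt := (f2 - f1) / dt;
  if dfdt < DfDtLimit then
    for k := 1 to NumFeeders do
      if shed[k] then
        writeln('trip relay of feeder ', k)
end.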

10.11 Any real-time system can in principle be described by a program in an HLL. Thus a hypothetical view can be taken that a real-time system, including the environment it monitors and possibly controls, is only a computation described by the program (IHLL). The IHLL program is to be mapped into the real-time system. First it is simulated with various versions of the IHLL, and repeated applications of HGA take place to realize the real-time system software. The criticality of the time requirements requires extensive mapping of algorithms to hardware. In a typical Operator Information System (OIS) there is a critical requirement of about 2 seconds response time from the field to the operator's terminal for the roughly 2000 points considered. To effectively use some graphics for the operator, the basic algorithms for line, circle, triangle, polygon and shading of the same have to be realized by hardware. At times firmware is employed to stay within the crucial timing constraints. The mapping of certain algorithms to relays or interconnections of relays is common. At times an isolation of critical parts of the computation, and a mapping of the same to a dedicated micro-computer, is common. Thus the problem is thought out in an HLL, simulated, and then by zero or more applications of HGA mapped onto a combination of software, firmware and hardware realizations. The essential simulation that the process involves, and the repeated applications of HGA, allow the use of software verification techniques to 'think out' the entire real-time system. The final realisation is done by a human agent. It is experienced that each real-time system requires its own unique application specific mappings, and thus precludes the use of any automated general solution.

10.12 Isolation of sub-skills and sub-tasks in Operator Information Systems(OIS) in Supervisory Control & Data Acquisition(SCADA) Applications.

To illustrate the fundamental considerations involved in the real time software, a simplified view of certain aspects of OIS is taken.

10.12.1 In monitoring & supervising a real-time process, one needs a 'mimic' of the process. The process mimic can be either a hard mimic or a soft mimic. In a hard mimic, one has a very large mosaic panel, filling up a wall in the control room of the process, to aid the operator. However, a hard mimic is more for visitors than for actual use by the operator. (An alternative hard mimic is to use TV projection systems to fill up the wall with the process mimic.) The operator prefers soft mimics. Soft mimics are displayed on operator workstations, which are up to four in number. Additionally, two or more workstations may exist remotely for use by supervising senior management. An operator workstation is normally a PC 386/486, with floppy drives, a hard disk and a large (19" and above) color monitor. The advantages of soft mimics are that they are easy to modify and change by software, can display critical sections of the process mimic in detail with high resolution, and are amenable to carrying much more detailed information (especially for analog values) than is possible in a hard mimic. In a typical process having about 10000 parameters monitored, the process mimic is partitioned into about 50 sections.

10.12.2 In a typical process the 8000 - 10000 parameters to be monitored arise from 'points'. A point is a physical parameter arising from a physical component being monitored. The stimulus given by the point may be in the form of an analog, digital or pulse variation of the parameter value. The inputs are subjected to a real-time computation and a response is given within a useful & relevant time frame. The response is in the form of analog, digital or pulse outputs corresponding to the inputs received. The responses are translated into physical actions like closing a valve, changing the speed of a motor, etc. The existence of a response gives rise to a 'control loop'.

10.12.3 The typical analog input arises from the speed of rotation of a motor, a typical digital input arises from relays, and a typical pulse input arises from metering (e.g. coal flow on belts, oil flow in pipelines, energy consumption). An analog input, for example temperature, requires to be digitized. The physical parameter variation is converted by a transducer to an electrical current variation as a first step. The current is then digitized to a number, called the raw data, by an analog to digital converter (ADC). A digital input/output is basically binary in form. A pulse I/O requires a counter.

10.12.4 In the digitisation, the range of variation of the analog parameter may be modeled either linearly, piece-wise linearly or in a non-linear fashion, e.g. logarithmically.

10.12.5 The software process mimics are a mixture of graphics, text and enunciators (e.g. sound enunciators like alarms & hooters). Depending on the environment, the process mimics are specified on paper. This initial paper specification could be Single Line Diagrams (SLD) for power plants and power distribution systems, pipeline mimics for the flow of oil/water in pipelines, etc. These are stated in a precise language arising from the standards and conventions of the environment specialism. A typical simplified SLD of a power substation is shown below.

[Diagram omitted: a highly simplified typical substation SLD, showing a 132 KV feeder with circuit breakers and isolators, a transformer stepping 132 KV down to 33 KV / 11 KV, and the high tension and low tension sides.]

10.12.6 It may be noticed that in the SLD the physical components (points) are shown by icons, and these correspond to icons in the workstation graphics. Dynamic variation of a digital parameter may correspond to an event or alarm. It is mapped onto the static diagram by specifying a variation in colour, shape or shading of an icon, or an audio alarm enunciator. The soft mimics must be able to specify the static and dynamic aspects of the SLD specifications, and thus logically have a static and a dynamic component.

10.12.7 The static components of the soft mimics are normally stored on the operator workstation's local disk and have a relatively leisurely loading time of about 30 seconds. The dynamic components of the soft mimics are sent to the operator's workstation from the host computer and have a typical time criticality of 2-5 seconds. Considering the size of the process mimic, which reflects 10,000 parameters/points/icons, the soft mimics consist of about 50 sections of the process mimic having about 30-50 points per section.

10.12.8 The process of creating software mimics consists of two steps. The first step is the database creation of the points, and the subsequent second step is the picture generation.

10.12.9 In the first step the varying field input/output points are defined. The definition maps the point onto a record in PASCAL or a structure in C. The fields of the record contain the programmer-defined name and the various attributes of the point defined. These attributes could be the type (a classification of the point by the actual physical component type), the number (a unique identification number of the point), the low, high, very low, very high, low return and high return limits (given as digitized raw data numbers, i.e. as X-coordinates), and the engineering units along with the scale and offset (which define the Y-coordinate converted to engineering units as given by y = mx + c, with m as the scale, c as the offset and x as the raw data; the line is assumed to model a linear conversion in the digitisation). The low return & high return values, along with the low and high, in effect define an allowed band in the variation of the parameter. The very low and very high can be considered to give the degree of urgency with which an event should be attended to.
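As an illustration (not part of the actual package), a minimal PASCAL sketch of such a point definition record, together with the linear raw-data to engineering-units conversion y = mx + c, is given below (in C this would be a struct); all field names and values used here are assumptions.

program PointDefinition;
type
  PointRec = record
    name: string[20];
    ptype: integer;          { classification by physical component type }
    number: integer;         { unique identification number }
    low, high: real;         { allowed band, engineering units }
    veryLow, veryHigh: real; { urgency limits }
    lowReturn, highReturn: real;
    scale, offset: real      { m and c of y = m*x + c }
  end;

var
  p: PointRec;

function ToEngineering(var pt: PointRec; raw: integer): real;
begin
  { linear conversion of the ADC raw count to engineering units }
  ToEngineering := pt.scale * raw + pt.offset
end;

begin
  p.name := 'BOILER-TEMP-01';
  p.ptype := 3;
  p.number := 1042;
  p.low := 100.0;    p.high := 540.0;
  p.veryLow := 80.0; p.veryHigh := 560.0;
  p.lowReturn := 110.0; p.highReturn := 530.0;
  p.scale := 0.25;         { engineering units per raw count }
  p.offset := 0.0;
  writeln('raw 2000 -> ', ToEngineering(p, 2000):0:1, ' units')
end.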

10.12.10 The definition of points in the environment of the database creation of the SCADA software is done in the static picture mode of the static picture session.

10.12.11 The second step in the creation of the software mimics is picture generation. The pictures ( like SLDs ) are normally given on the paper by the user in the environment Specialism, with the icons shown and interconnected. Graphic tools are used to generate the pictures in the SCADA environment, with each icon mapping on to a point. The Metamorphoses Algorithm (MA) is thus elementary.

10.12.12 Thus in principle it is possible to design a systems program to implement the MA. In practice, human operators are used for the creation of software mimics.

10.12.13 Following the static picture session, the SCADA environment is modified to the dynamic picture mode and a dynamic picture session takes place.

10.12.14 In a dynamic picture session, the dynamic picture points are defined and the associated visual variation of icon shape, size, colour, shading or alarm enunciators specified. It can be viewed that a dynamic software mimic is superimposed on a static software mimic to reflect the variation of the state of the process. In dynamic pictures, alarms may be either informational (no action to be taken) or critical/urgent (requires operator's intervention).

10.12.15 To create the software mimics, the environment of a SCADA package is considered. A SCADA package (like the RPMC of Ferranti in OMSI PASCAL, S/3 of Texas Instruments (TI) in C or BECOS-30 of Brown Boveri Corporation in CORAL & FORTRAN) has the following components.

a. Telemetry Data Acquisition

b. Telecontrol

c. Data Processing (alarm recognition, reports, implementation associated actions, logging).

d. Picture display function

e. Report generation

f. Interface to application software

10.12.16 The SCADA package gives the environment of a HLL, which is sometimes called a Process Control Language (PCL). PCL has no standards and is normally proprietary.

10.12.17 However, complications to the MA arise from the presence of trend variables and the requirement of animated graphics displays. The trends of various critical or higher priority parameters are to be recorded, either as current trends or as historical trends, and displayed using graphical picturisation. As it would be cumbersome to monitor the trends of all the 8000 parameters, typically up to 400 trend variables are nominated, in a classification of four groups depending on various time frames. The animation of the software mimics could be elementary animation, like picturising the flow of materials such as coal, the filling/emptying of a tank, or the flow of oil/gas/coal/energy through the process. At a more sophisticated level there could be a realistic mapping from the animation to the actual process.

10.12.18 The third phase in the creation of software mimics is report generation. Reports on the status of the process are given per shift, daily, weekly, monthly or yearly.

10.12.19 The highly demanding response times of the dynamic component of the software mimics require that bare software techniques & methods be used and overhead avoided. Thus application specific requirements may dictate the communication protocols to be used, rather than employing a standard protocol like TCP/IP. The nature of the HLL to be used again depends on satisfying the time constraints. The tradeoff between implementations in hardware & software is again dictated by the time constraints.

10.12.20 To simulate, test, study & develop the software mimics an HLL can be used, but the final resolution by a human agent into a tradeoff between software & hardware means that HGA is employed as an intrinsic requirement. The initial development can use formal verification techniques to great advantage.

10.12.21 As an OIS for nuclear power plants is radically different from one for a thermal power plant, a general purpose translator for SLDs cannot be written once & for all. However, certain parts of the OIS software, like the graphics modules, can be reused. This gives rise to the use of OOP with HGA for these modules.

10.12.22 To summarize, it is seen that the following levels of sophistication of the use of HGA emerge.

a. Routine static picture creation which can be done by a BSc(Computer Science).

b. The dynamic components of pictures which require some programming in PCL and process knowledge, MSc/B.Tech(Computer Science).

c. The animation of the software mimics which require graphics specialism (MSc / B.Tech. (Computer Science)).

d. The trend variables which require software, process knowledge & management skills to allocate the trend variables (MSc / B.Tech. with MBA).

e. The implementation of the picture creation tools which require highly specialized skills in real-time and communication software ( a virtuoso real-time & communication software programmer).

10.12.23 An extension & generalization of Hoare's clause (HC) is to an application in a Full Scope Power Plant Training Simulator. Here the process mimics of a power plant in Hong Kong are to be generated as an export project. The database definition relates to about 10000 points and the picture generation to about 300 pictures of 30-50 points each. The report generation consists of test reports per shift, daily, weekly, monthly & yearly. Since the SLDs specified by the user are in a precise and standard format, the colour codes follow the relevant industry standards. In principle, the creation of the process mimics is an application of HC in HGA. Also, by using an extension of the fourth and fifth applications, it is possible to read the SLDs and automatically generate the process mimics, through the development of a dedicated systems program with a reading attachment. In practice, however, programmers are used as data entry operators to create the process mimics, and thus HGA is used.

10.12.24 The above considers SCADA & OIS in the indigenous environment. A state of the art SCADA, as reported by ABB, Germany, for a coal-fired plant generalizes to the use of integrated multimedia, with an integration of the operator into an integrated quintuple (operator, controls, communications, computers with integrated multimedia, the process). The report (IEEE Spectrum, June 1994) uses virtual reality, integrated multimedia, visualization and telecommunications on workstation platforms. However, for nuclear power plants the traditional simplicity is essentially what is recommended.

INTRODUCTION

In the Information Systems (IS) field one has the basic model: the Industry Structure Model (ISM) of the British Computer Society (BCS). As this model has been accepted by the European Commission, by about 40 countries around the world, and even by the DPMA of the USA, a study is undertaken to see the validity of the model in the environment considered here.

A study & survey of Information Systems life cycles, and an audit of the same, covering about 1000 systems ranging from minicomputers to mainframes, was initiated in 1987 in the indigenous environment. A reverse mapping of the said audit to the ISM both yields a rationalisation of the software environment and points out the major bottlenecks and discrepancies in IS life cycles in the indigenous environment.

11.2 It is seen that basically what was involved in IS was Level 3 or Analyst Programmer work. Level 4 work at the Systems Analyst level was much less. The basic bottleneck to IS was the scarcity of the IS Resource. There are major cultural, traditional and ethical attitudes and practices which make the IS Resource difficult to garner and gather. Non-taxed income at all levels leads to a parallel economy which is not recorded. Even the introduction of a Payroll was difficult in quite a few environments.

11.3 Even applications involving available IS resources, like the database on fingerprints or Provident Fund (PF) backlogs, would take decades to get into some computer readable form.

11.4 The absence of any scientific Management Practices (MP) makes the introduction of Decision Support Systems (DSS) difficult. The MP are based more on intuitive, traditional, regional and feudal practices than on scientific, rationalistic principles. The trend is towards acquiring the knowhow, practices and skills in scientific management, as required by Western industry, but successful cases are rare. Ethical, regional and economic considerations are bottlenecks.

11.5 Thus experience above Level 4 is rare. The problem is at first sight confusing, as career development above Level 4 has not been in conformance with ISM requirements of the technical, business and managerial skills required both in the core and non-core areas. The policy of even a major software house like Tata Consultancy Services (TCS), mainly engaged in the export of cheap Level 3 labor, has been to discourage any career development at Level 4 and above in-house, as a consistent policy over the last 25 years. This deliberate policy has led to a shortage of Software Project Management personnel at Level 4 and above in India. The trends in the Computer Maintenance Corporation (CMC) over the last decade have been brighter in developing personnel with skills at Level 4 and above in the indigenous environment.

11.6 In the IS area we have the basic phenomenon of local simplicity and global complexity. The global complexity is more a matter of effective management, which is not too effective at present in the environment owing to the absence of sufficient experience.

11.7 In the Insurance Environment the applications were merely to replace the existing Unit Record equipment by the indigenous computers. There was resistance to any further automation efforts owing both to the fear of unemployment and the more fundamental fear of parting with the Information Resource.

11.8 In the Banking Environment the applications were restricted to front office computerization efforts. Poor communications infrastructure makes interconnected networks of branches a concept of the future. Even the introduction of Payroll and Daily Wage accounting for casual labor proved to be herculean management tasks, as they basically harnessed the Information Resource. Though the 1000 systems considered were in the Environments of Transportation, Shipping, Railways, Healthcare, Dairy development, Steel, Cement, Fertilisers and so on (in practically all sectors of the industry), the use was restricted to Level 3 or at most Level 4 work so far as the Technical Specialisms requirements were concerned, the applications being mainly in Payroll, Accounting and Inventory Control. Decision Support Systems, where they existed, were mainly restricted to MIS reports without much integrity of the data. Operations Research applications were mainly non-existent.

11.9 In the Information Systems area the need is still to get into the automation of the mundane chores of library administration, rather than the use of any databases in Information Science. Exceptions exist in the form of a few primitive databases.

11.10 The integrity of the few national databases that exist is doubtful, as economic, social, historical, traditional and administrative factors make the creation and establishment of databases a one-time affair, with poor maintenance with respect to upgradation of the Information Resource. However, an awareness of the effective use of databases in commercial environments is seen to be the trend, especially in the consumer industry.

11.11 In the Education sector the shortage of faculty in the fields of Computer Science and Engineering and Information Systems Engineering and Management made the fields nascent and not too demanding on IS tools. Exceptions are present in a handful of institutions, but the computer in these environments was restricted to a department or two.

11.12 In the vast Government Environment a resistance to automation arose from the unemployment consequences envisaged and feared, the reluctance to part with and harness the Information Resource, the time frames needed to create files and databases from existing data, a general fear of the loss of freedom arising from automation, the traditional high cost of computation, the time frames taken by bureaucratic clearances, and the gaps between promised or expected performances and delivered realities. The easy availability of PCs and PC-LANs in the present context points to a major trend of change in the above, though it is doubtful if the historical cultural reluctance to harness the Information Resource can be overcome.

11.13 Though it is estimated that at present about half a million PCs exist in the environment, LANs are easily available, and the personnel required up to Level 4 are easy to come by, the Information Resource is seen to be the main bottleneck for the development of sophisticated IS for quite some time to come.

11.14 On the software side the applications merely required a two-level hierarchy of software management structure. This allowed the individualistic approach to succeed despite the unreliability of the hardware/software combine. The trend would still be towards individualistic disciplined programming, though the standard Software Engineering process considerations are gaining ground, in the sense of an improvement in the awareness of the existence of the same.

11.15 The various Software Technology Parks seem to be more of an offshore closed extension of various companies of the West, and their lasting effect on the indigenous environment would only be to generate an awareness, not a transfer of process methodologies knowhow, as they are merely employers of programming labor. The shift to sophistication will only acquire knowhow from an application of the principles enunciated by the academicians of the West, since that is the only source of free and authentic information. Attempts at collaboration with Western industry have always seen the commercial angle predominate when it came to acquiring knowhow, both on the hardware and software fronts.

11.16 In the Technical Specialisms, however, higher levels of the IS have been achieved in some instances. The Software Safety Specialism is nascent and the Knowledge Engineering Specialism is found to be academic in the environment. Substantially higher levels have been achieved in Software Engineering, Graphics etc. The area of Audit in Software is rather new.

11.17 However, if the ISM is applied to study the career growth of personnel in the IS area, a very clear relevance and applicability of the model emerges. The gross career growth over the last 25 years in the environment has shown the dichotomy of the core and non-core areas to be grossly applicable and valid. However, the Marketing Specialism is seen to dominate the Sales Specialism and to be treated as a core area. The Education & Training Specialism is as per the model. The entire IS Service Delivery Specialism is mirrored as per the ISM. The necessity to indigenously manufacture the hardware and software has given a greater importance to the Hardware and Research Specialisms. The growth rate in the Technical Specialisms is as per the model, with the career trajectories reported in IS Management being as in the West and forming a basis for the ISM. There has been a paucity of IS Audit, and the adverse effects of the same are being felt and have created awareness and responsibilities as per the ISM. The IS Sales Support growth exactly mirrors the ISM. The IS Procurement Specialism shows a much higher growth rate owing to the regulations and practices needed to import the components of the hardware portion of the IS Asset. The IS Policy and Management area has been vague, though the levels of responsibility are more or less mirrored by the ISM.

11.18 Thus the above empirical results show that, barring a few isolated instances, the ISM model is by reverse mapping applicable to the indigenous IS industry. The deviations are minor in nature, owing to environmental considerations. This is despite the fact that we are considering an environment where the Cartesian rationalistic philosophy is not generally accepted, the IS Resource is traditionally scarce owing perhaps to the historical psyche of the environment over generations, the environment considers the dichotomy between truth and fiction vague when it comes to short-term or long-term past information, and the Law of Contradiction is accepted as a valid truth in everyday life.

11.19 Note 1: The Audit had access to and collected a large store of information on which the above comments are based. However, most of the concrete facts and information are sensitive, classified and pertain to clients, and thus are not elaborated upon here.

11.20 Note 2: Two cases which show the difficulty, owing to the mindset, in collecting information can be seen from the survey of engineering education and the survey of the computer industry over the last two years by the IEEE Spectrum. Neither mirrors the entire scene considered, and the difficulty of incompleteness can be traced to the cultural and social mores and attitudes towards the Information Resource in the indigenous environment.

11.21 In any software development project one has to consider the life cycle of the software. As it would be proper to consider a general framework, the Industry Structure Model (ISM) of the British Computer Society (BCS) is chosen. This is a Performance Standard for Information Systems Engineering and Management. This model, though operative only from 1986 (Version I) and 1990 (Version II), is considered a general enough framework from which one could extrapolate backwards into the three distinct environments considered in this study. The first environment is the self-reliance period, the second is the collaborative period and the third (current) is the open period being built around PCs and PC-LANs.

11.22 Though the software reported here is more oriented to System Software and Algorithm Design and Analysis, the ISM framework becomes an important frame of reference, and uniformity of view is gained, especially when one considers the fact that one of the primary goals of the environment was the development of Information Systems and their nurture, care and maintenance through their entire life cycles.

11.23 The idealistic self-reliance period is briefly described in the report on the TDC-ALGOL-60 implementation in Chapter 14.

11.24 The concern now is the mapping of the ISM model into the collaborative and open periods.

11.25 The primary consideration vis-a-vis the self-reliance period is that reliable hardware could be taken for granted, and failures of hardware could be attended to by a well organized Service Level Delivery network which is effective in operation, though a large amount of ad hoc techniques and methods are used. Efficiency of production has been achieved in that two different pieces of the same system would be identical as far as software is concerned. The technology available is limited: though in principle, as one moves from the collaborative to the open period, everything is available from the point of view of hardware, financial considerations restrict the availability of technology. Thus the number of systems in the mainframe and super-mini classes is highly limited. Though there is a proliferation of PC and PC-LAN systems, it should be noted that the Personal Computer is still not "personal". Personal ownership of a PC or PC-LAN is still not a very common phenomenon in the environment. Even large organizations cannot afford to freely stock, distribute or make PCs available. The workstation is still a rarity, even for the software professional, even in leading organizations. The availability is through centralized PC facilities. The LAN, though accessible, is still not freely available.

11.26 A few LANs may be found and time on the same is much sought after. However, the reliability has done away with much of the earlier round the clock working with the computer.

11.27 On the system software side the high degree of reliability of Compilers, Operating Systems and Utilities is taken for granted. Bugs are of course discovered but the standard and accepted practice, as anywhere else, is to bypass the bugs and live with the problem.

11.28 The easy availability of system software at a next-to-nothing price is a reality in the environment, especially when it comes to PCs and PC-LANs; software without adequate documentation is more or less a reality. The documentation is either little or non-existent, except for standard PC and PC-LAN software for which the literature is enormous and easily available. The use of GUIs and Windows software is still nascent and has still to develop into a need for more and more uses of the Graphics Specialism.

11.29 One of the primary problems of the environment is the poor Communications infrastructure, which is to be bypassed by cable TV networks or WANs. Hence the degree and need of the Communications Specialism is slowly on the rise.

11.30 The Information Resource is a major bottleneck, and if one considers the Knowledge Engineering Specialism to come after a suitable degree of Database Specialism has been created and achieved, it is perhaps one area where little progress has been made at present. An Environment-based Knowledge Engineering Specialism is a great need.

11.31 The absence of a broad-based education of IS professionals leads to a great vacuum when it comes to the Environment Specialism of the ISM. It is only now that different segments dealing with the Information Resource are maturing in different areas of the economy, and an explosive growth in the Environment Specialism is anticipated. The combination of the Environment Specialism and the Knowledge Engineering Specialism, together with 4GLs and LANs, is expected to be the direction to be considered; in Engineering applications the LAN, considered as a parallel processor, will lead to explosive growth of IS.

11.32 There is sufficient know-how in both software life cycles and IS life cycles, but the knowledge is empirical rather than based on Software Engineering or IS techniques, methods and principles. The Science and Methodology underlying the same has to be adapted to the Environment. The need is to bridge the gap between the Western studies available and the empiricism of the Environment. The HGA applications may be viewed as an attempt in this direction.

11.33 Sophisticated Man Machine Interfaces are a great rarity and only the traditional MMIs are used. An explosive growth in this area, in the use of Multi-lingual Interfaces with cosmetic software, is in the offing. Voice MMIs are one area where much promise is expected in the Environment.

11.34 The low degree of IS so far has bypassed the requirements for IS Audit and Quality Audit. The effects of this in an Environment like the Banking Industry are still to be felt. The trend is now towards IS Security demands, and the simplicity and generality considerations of HGA are in this direction.

11.35 The Environment provides a substantial number of persons at a high level in the ISM insofar as the technical versatility and communication skills are concerned.

11.36 A primary requirement of the high levels of the ISM is the Managerial skills required. It is not known at present on what principles one should map Management Science and Theory as it exists in the West onto the Management levels and practices that exist. The low level of sophistication in the IS life cycles seen so far means that only a few are available with experience in IS Development Management, and only a few instances of IS management exist, e.g. the railway reservation project in India.

11.37 The dichotomy between idealistic and pragmatic Vision and Mission, and its translation into pragmatic rather than idealistic strategy, policies and tactics, does not exist in the environment, as the Indian IS industry does not have the necessary experience. The use of DSS being at a minimum, the use of MIS still very little, and the parallel illicit financial economy leading to an ethical and moral economy that makes the IS Resource a rarity, much more empirical experience has to be gained, collected and analyzed before the relevant principles of System Theory, Decision Theory, Communication Theory, and Management Theory and Science applicable to software Development Management Strategies emerge.

11.38 Level migration is at present based primarily on seniority, rather than as the ISM prescribes. The applicability of standard principles not having been found or demonstrated, the common-sense approach to IS Management at all levels becomes ad hoc and at times can degenerate to feudal practices, especially at higher levels of Management practice, as here only traditional trading practices and principles are the empirical and historical experience available to base the practices on. These practices and principles mostly pre-date the Industrial Era and relate to ethics, morals and value systems based more on Sociological principles than on sound business practices; where they do tend to business or financial practices they tend to rely on traditional experience in pre-Industrial era trades and business practices.

11.39 The mindset which affects the software life cycles, the management practices and the value systems involved results in an IS life cycle which is at times not based on rationalistic philosophy and principles. The non-acceptance of the Cartesian Philosophical dogma, and the mere co-existence with rather than acceptance of the applicability of Scientific Methods and Practices, is felt in the Managerial Practices, which naturally results in a great deal of need for Professional Management Practices.

11.40 The collaborative period has not resulted in a transfer of knowhow either in the technical area, in the form of techniques, methods or tools, or in the managerial area, in the form of professional business practices, and has even led at times to dubious practices and results on both sides, the West and the East; a chapter like many in history which need not be analyzed but merely ignored, as it is the aim and purpose of this study that only positive methods and results will have a lasting impact.

11.41 In the technical specialisms a very high level of the ISM model is evident as having been achieved. The technical specialisms are centered more around individuals, their individual skills and knowledge and their effective operation in the environment, with the technical stream/sub-stream as a major foundation. Especially when we consider the yardstick of a person with a technical specialism migrating to the West and his/her performance there in IS life cycles, we see that all levels of the ISM model are found in the technical specialism sub-stream.

11.42 An aspect that should be considered very important is the multitudinous number of persons operating at a very high level in the IS field from various Environments, who can perhaps best be classified as attempting the mature entry route into the ISM at high levels, mostly at the Hybrid Management levels, but not equipped with the required practical, technical or empirical in-depth knowledge, let alone in-depth or versatile knowledge at the lower levels. It would perhaps be proper to classify them at a Vision Level which is outside the ISM. The industry still has to develop to a point where the vision level translates to IS core and non-core streams. It would perhaps be wise to include in the ISM a Companion Level to accommodate persons of the mature entry route level who do not enter the ISM framework but who are sufficiently involved in IS policy.

11.43 A requirement of the ISM is that a particular job should not have IS functions across levels. The present situation is that, as in the case of the author, one is expected to operate and perform effectively over a number of levels and sub-streams, owing to the paucity of personnel and of the effective distribution of functions as required by the ISM.

11.44 The Vision and Mission of IS have changed drastically over the three periods considered, and one finds that the effective translation into relevant policies, strategies and plans to achieve the same has been affected. Furthermore the psyche, mindset and value system of the environment have affected the strategies and policies. The primary dichotomy between idealism and practicality being neither evident nor present, one finds effective managerial practices in strategy, tactics and operation ad hoc, arbitrary and mostly dubious. Perhaps as experience is gained over a substantial period of time, IS life cycles in the Environment will stabilize to yield effective and relevant policies, techniques, methods, practices and tools just as in the ISM, but with all shades modified to suit the environment. However, as an IS is still to be judged on performance also, one finds that, within the requirements and judgment of the Environments, such IS have been successful in the areas of MIS, Payroll, Police, Insurance, Banks etc., where the adherence is as per a reverse mapping that fits the ISM.

11.45 In the above, only the basic considerations relating the ISM to the Environment have been considered; a detailed mapping, which exists, is beyond the scope of the study considered here.

INTRODUCTION

In the software initiative currently underway to tackle the software crisis in real-time & communication software, one sees a replay of the historical Wagnerian epic as in the West and a debate as to whether software is an art, science, engineering, discipline or just plain management.

Four approaches to software development in the real-time and communications areas seem to be of relevance.

A) INDIVIDUALISTIC DISCIPLINED APPROACH:

The software development in projects with a successful track record has taken the view that software is a disciplined art with relevant, rigid structured disciplines. Thus the environment has taken to, adapted and used the individualistic approach as advocated by the Hoare/Hansen Generalised Approach (HGA).

This has yielded practical results in critical applications in real-time and communication software, with a logical mapping to at most two levels in the hierarchy, and to date no better method seems to be yielding the same quality of results. It seems to be a must in critical nuclear applications software.

B) SOFTWARE PROJECT MANAGEMENT WITH MULTIPLE LEVELS IN THE HIERARCHY.

The growth in the complexity and size of software requires a departure from the above, as in the Full Scope Power Plant Training Simulator Project. Software Management practices have to be introduced, with an isolation of the knowledge, skills and techniques required. A multidisciplinary team of more than two levels, with different skills, is required. This is a major departure from the two-level hierarchy of successful real-time software project experience. The success of Digitex of Sweden, with more than a million lines of ADA code, shows the possibility of real-time software as a Software Management Practice. Such knowhow is however proprietary; the skills and practices have to be rediscovered and generated anew in the environment. This requires a planned, deliberate, systematic effort to isolate and separate software modules and to map the same onto different programmers with different specializations, skills and competence at different levels of the hierarchy.

C) THE PROCESS METHOD OF SOFTWARE DEVELOPMENT.

The need to use less-than-virtuoso programmers, a lenience in the competence & skills level, to allow for some relaxation in the rigid disciplined approaches, and to cater to Technical Specialists and Specialisms, paves the way for the process methods known as Software Engineering Practices. This involves the well-known requirements of (structured) reviews, walkthroughs, (Quality) Audits, Quality Assurance (QA), third party audits and the introduction of standards. These standards can be either those evolved by the institution with ISO-9000 certification, or the higher Software Engineering Institute (SEI) levels, or the still higher Malcolm Baldrige National standards of the USA, or the asymptotically demanding Japanese standards. The applicability of the process method can only be to less demanding, non-critical sections of the real-time & communication software, a case in point being the picture generation portion of the software mimics of SCADA projects. The introduction of a hierarchy with more than two levels demands knowledge & skills in group dynamics and various management skills & practices to be followed.

D) FORMAL METHODS AND TOOLS

Recently, in real-time software for critical applications, the use of formal verification is gaining ground. Here, with HGA, one can be certain of the software correctness, though the complexity of formal verification must be tackled by demanding high simplicity in the real-time software algorithms employed.

The tools for the formal approach are realistic CASE tools, which could possibly use state transition diagrams to model the real-time environment, and the use of relevant computer-aided Formal Verification techniques. In highly critical nuclear real-time software, to enable formal verification, the IAEA (International Atomic Energy Agency) dictates that operating systems, recursion and complexity in algorithms be totally avoided. The KISS (Keep It Simple Stupid) philosophy has been gaining ground, is of especial relevance to real-time software, and has been broadly the philosophy of the environment.
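As an illustration of the simplicity demanded, the following sketch (in C, purely hypothetical and not taken from any of the projects reported here) shows a controller written as a flat state-transition table: there is no recursion, no dynamic memory and no operating-system dependence, so the entire behaviour is open to inspection and to formal argument.

    /* Illustrative sketch only: a hypothetical two-state pump controller
       written as a flat state-transition table, with no recursion, no
       dynamic memory and no operating-system services. */
    #include <stdio.h>

    typedef enum { IDLE, RUNNING, NUM_STATES } state_t;
    typedef enum { EV_START, EV_STOP, NUM_EVENTS } event_t;

    /* next_state[current state][event]: the whole behaviour is visible
       in one finite table, which eases both review and formal argument. */
    static const state_t next_state[NUM_STATES][NUM_EVENTS] = {
        /* IDLE    */ { RUNNING, IDLE },
        /* RUNNING */ { RUNNING, IDLE }
    };

    int main(void)
    {
        state_t s = IDLE;
        event_t demo[] = { EV_START, EV_STOP };  /* stands in for sensor input */
        for (unsigned i = 0; i < sizeof demo / sizeof demo[0]; i++) {
            s = next_state[s][demo[i]];
            printf("state = %d\n", (int)s);
        }
        return 0;
    }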

CONCLUSION

The above four methods and approaches are complementary and co-existing current solutions to the development of real-time & communication software in the environment.

Thus it can be seen that the indigenous real-time and communication software development experience over the last 25 years has resulted in empirical conclusions which are identical with those arrived at in the West. This is despite the fact that the environment considered does not fully subscribe to the Cartesian rationalistic philosophy and the validity of the Law of Contradiction is accepted as a normal way of life.


INTRODUCTION

The short-term future directions considered are the anticipated improvements in the capture of the Information Resource in the Indian Environment, which will enable extensions of HGA to more sophisticated Information System Life Cycles involving substantial use of the Database Specialism, extending HGA to 4GLs (a case in point being the ongoing use of ORACLE in front office bank computerization efforts). At present ethical, cultural, traditional etc. constraints make the capture of the Information Resource the main bottleneck in India in the use, rather than the applicability, of computers. Increased industrialisation and competition will demand more cosmetic applications, extending the use of HGA to the Graphics Specialism (a case in point being the ongoing cosmetics of military radar application software) with associated extensions of HGA to OOP. The long-term future directions are in the applicability of HGA to the Knowledge Engineering Specialism (a case in point being ORACLE based Knowledge bases in medical applications) and in the direction of the lower end of the handcoding: the non-traditional non-Von Neumann Architectures (a case in point being the use of systolic processors for Image Processing) and non-Electrical Engineering based technologies (a case in point being fingerprint recognition and processing with Optical Computers). The advent of Man Machine Interfaces (MMI) makes possible studies of HGA in non-traditional computation specifications at the higher end of the hand translation, i.e. here we move from the HLL specification/representation to the computation specification (cases: diagnostics, comments, computation specifications and results of computation representations using pictorial and audio MMI). As an indication, an overdone cosmetic application in elementary Civil Engineering, an Estimation and Costing package oriented to middle class Indian housing projects, is outlined. This employs pictorial, audio and text as input/output, in the 14 major languages of India (a multi-lingual interface with a single machine translation).

The role of HLL descriptions turns out to be very crucial and, in principle, the full (total) verification of the software the key factor. However, in practice, it appears that what has been pointed out for practical compiler writing, namely that formal techniques need not be used directly but only help to guide one's thinking, is also applicable to the total verification of the HLL specifications and descriptions.
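A minimal sketch of this guiding role is given below (in C, with hypothetical names): the loop invariant is stated as a comment and checked with assert() during testing, so the formal idea shapes the code without a mechanical verification step.

    /* Hypothetical illustration of formal ideas "guiding one's thinking"
       rather than being applied mechanically. */
    #include <assert.h>
    #include <stdio.h>

    /* Returns the sum of a[0..n-1].
       Invariant before each test of the loop condition:
           sum == a[0] + ... + a[i-1]   and   0 <= i <= n            */
    static long sum_array(const int a[], int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i];
            assert(i < n);            /* bounds respected during testing */
        }
        return sum;
    }

    int main(void)
    {
        int a[] = { 3, 1, 4, 1, 5 };
        printf("%ld\n", sum_array(a, 5));   /* prints 14 */
        return 0;
    }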

Thus, in principle, HLL description verifications in the non-traditional computation specifications that advances in Human Machine Interfaces make possible form a useful and crucial study. The paucity of Information Systems Audit and Quality requirements so far will, when the environment develops, require major applications of verification techniques. The paucity of traditional requirements in scientific applications (Numerical Methods software) and the Electronics Design Automation (EDA) areas in the practical scientific, engineering and industrial environments has led only to cursory studies in the area, and these are pointed out as future crucial applications of HGA.

The contemporary drive to streamline software development is to be integrated with the HGA options. The methods and algorithms given by the various generalisations of HGA will need to be considered in the current requirements on software of shortening the time to market, improved quality (by using program verification wherever possible), a need to tailor one's software to available technology, and a need to hold down costs. A study of software life cycles in the indigenous environment has shown that most software (98%) is delivered late and the cost of maintaining software has been 20-30 times the cost of developing the same. The need to reuse, update and re-engineer existing software thus becomes of prime importance.

A recognition of the need for maintainability has given rise to the ISO-9000 awareness and a trend towards the application of rationalistic management and sound engineering principles to software development. A major migration from assembly to C having already taken place, the migration to C++ is actively being contemplated.

The use of editors, debuggers and a searcher for determining software reuse by simpler specification of objects is the trend.

The types of algorithms considered (depth-first search, pattern matching, polyalgorithms for pattern matching etc.) can basically be converted to objects for reuse. Most of these have been considered at machine level, for which response time is critical insofar as execution time is concerned, and such code is time-consuming to write and debug.

Porting of the same to new processors is a very time consuming job as most of the code has to be rewritten. As assembly code is processor specific, it is not modular and thus does not lend itself easily to reusability. The new generation of optimising compilers in the near future is attempting to squeeze the most out of the performance of the processor and is expected to do away with hand-coding, as is being propagated. However, it is seen that the types of optimisation HGA requires involve program modification, conversion of parts of the computation to table lookup, and mapping onto parallel processing configurations for speed-up, and thus in a way constitute 'critical' sections of the overall system program, which will still have to be done by hand. This on first study appears to make portability to the new processors difficult. The much publicized optimising compiler cannot be expected to give more than 5 to 10% larger code size than the best handwritten assembly, but the types of optimisation HGA considers will call for an expert systems base to be incorporated into the optimising compiler, and this seems far away. The shrinking cost of memory is seen to be a major optimisation possibility, using memory-level table look-up or hash tables with pre-computed possibilities of the various computational results.
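A hedged sketch of the memory-for-time trade-off mentioned above follows (in C, with hypothetical names): the bit counts of all byte values are pre-computed once into a table, so that the critical-path cost reduces to table lookups.

    /* Sketch only: replacing a small computation by a precomputed table. */
    #include <stdio.h>

    static unsigned char bit_count[256];

    static void build_table(void)        /* done once, off the critical path */
    {
        for (int v = 0; v < 256; v++) {
            int c = 0;
            for (int b = v; b; b >>= 1)
                c += b & 1;
            bit_count[v] = (unsigned char)c;
        }
    }

    static int popcount16(unsigned v)    /* critical-path version: two lookups */
    {
        return bit_count[v & 0xFF] + bit_count[(v >> 8) & 0xFF];
    }

    int main(void)
    {
        build_table();
        printf("%d\n", popcount16(0xF0F1u));   /* prints 9 */
        return 0;
    }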

Some of the directions shown by the pattern recognition applications of HGA are radar signal processing, control systems, robotics, instrumentation and multimedia applications involving image processing. Here multiprocessing speed-ups will prove to be a major problem of managing software development teams. However, the tasks are easily seen to break down into a subproblem per processor. As the tools at the higher level language level for handling inter-processor communication become available, it will be easier to have parallel processing speed-ups, as in the pattern matching problem; a sketch of such a decomposition is given below.
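The following sketch (in C with POSIX threads, hypothetical and merely indicative) shows the one-subproblem-per-processor decomposition for a naive substring count: the text is split into two chunks, the first chunk reading past the cut by the pattern length so that no straddling match is lost.

    /* Sketch only: naive substring counting split across two worker threads. */
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    struct job { const char *text; size_t lo, hi; const char *pat; int hits; };

    static void *count_matches(void *arg)
    {
        struct job *j = arg;
        size_t m = strlen(j->pat);
        j->hits = 0;
        for (size_t i = j->lo; i + m <= j->hi; i++)
            if (memcmp(j->text + i, j->pat, m) == 0)
                j->hits++;
        return NULL;
    }

    int main(void)
    {
        const char *text = "abcabcababcabc";
        const char *pat  = "abc";
        size_t n = strlen(text), m = strlen(pat), mid = n / 2;

        /* first chunk reads past the cut by m-1 characters, so a match
           straddling the cut is still counted exactly once */
        struct job a = { text, 0,   mid + m - 1, pat, 0 };
        struct job b = { text, mid, n,           pat, 0 };

        pthread_t ta, tb;
        pthread_create(&ta, NULL, count_matches, &a);
        pthread_create(&tb, NULL, count_matches, &b);
        pthread_join(ta, NULL);
        pthread_join(tb, NULL);

        printf("matches = %d\n", a.hits + b.hits);   /* prints 4 */
        return 0;
    }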

The type of multi-processor code-development systems that automatically generate good C code, and are now available, can be used at the higher level. As technology becomes better and more available, the order of magnitude deterioration in speed that a migration to C and C++ involves can be tolerated.

The next step would be to concentrate on the 'what if' portion of the system software by using the HGA developed algorithms as icons. The basic icons represent objects at a different level. In a multi-level approach one could perhaps specify how close to the machine level one should be and what level of shift to a high-level language, with its attendant deterioration in response, is tolerable. At a very high kernel level of operation one could consider the invocation of the subroutines/messaging of the objects as being mapped into the interrupt structure of the processor being used.

The next step of sophistication would be to use the HGA developed core algorithms in a client/server environment using remote procedure calls (RPC).

The high optimisation of critical sections of the code need not be done every time by the designers. As in DSP libraries, one could build up hand-optimised, C-callable assembly routines tailored to a particular processor or LAN configuration for functions critical to peak performance. A possible extension would be to graph, string or matrix algorithms.
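A sketch of this DSP-library style is given below (in C, with a hypothetical routine name): the contract of the critical routine is fixed as a C prototype, a portable reference body is kept for checking, and a hand-optimised assembly body with the same name could be linked in instead on a particular processor.

    /* Sketch only: a C-callable contract with a portable reference body;
       an assembly body (hypothetical dot_asm.s) could replace it at link time. */
    #include <stdio.h>

    long dot_product(const short *a, const short *b, int n);   /* the contract */

    long dot_product(const short *a, const short *b, int n)    /* reference body */
    {
        long acc = 0;
        for (int i = 0; i < n; i++)
            acc += (long)a[i] * b[i];
        return acc;
    }

    int main(void)
    {
        short x[] = { 1, 2, 3 }, y[] = { 4, 5, 6 };
        printf("%ld\n", dot_product(x, y, 3));   /* prints 32 */
        return 0;
    }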

Thus it is seen that, though dramatic breakthroughs have occurred in processor speeds, this will not totally do away with assembly language programming for critical applications. A dent will be made by a migration to C and C++, but the technology will definitely have a higher price tag. HGA as used here is the only solution when the HLL solution gives too high an overhead percentage, and the solution is to take out such modules and hand-code the same using HGA.

HGA combines the biggest advantages of HLL by using their comparatively easy documentation facility, which the final assembly language routines do not have. In the area of embedded software, object oriented programming is easily implemented, and it is here that HGA helps: using the inherent modularity of embedded applications and the C or C++ or PASCAL-like aids to documentation/design/implementation, one finally implements the object in assembly language.
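The following minimal sketch (in C, hypothetical) indicates the style described: the 'object' is documented and exercised in C through a small table of function pointers, and the routine bound to the critical method is the one that would eventually be re-coded in assembly for the target.

    /* Sketch only: an embedded 'object' documented and tested in C. */
    #include <stdio.h>

    typedef struct sensor {
        int (*read_sample)(struct sensor *self);   /* critical method */
        int  last_value;                           /* object state     */
    } sensor_t;

    static int read_sample_c(sensor_t *self)       /* C version; an assembly
                                                       body could replace it */
    {
        self->last_value = 42;                     /* stands in for real I/O */
        return self->last_value;
    }

    int main(void)
    {
        sensor_t s = { read_sample_c, 0 };
        printf("sample = %d\n", s.read_sample(&s));
        return 0;
    }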

The type of HGA applications considered points to another line of approach. One can migrate from one processor to another: if it is known what CPUs are available off-the-shelf, it will be possible to consider different code generation phases for the C/C++ compilers for the target machine.

A major trend in the computer industry is the practically available parallel-computing environments. In the indigenous environment supercomputers, LANs, multiprocessor embedded systems are available.

A choice of programming model is being considered, as traditional versions of HLLs like C and FORTRAN were designed for uniprocessor machines executing a single thread of code. It is expected that FORTRAN 90 will contain data types and statements that allow entire vectors and arrays to be manipulated.

HGA is more oriented to tackling the issue of performance tuning. It is seen that in the area of parallel debuggers and performance monitoring tools few standards exist, and till such time as standards evolve HGA will find a place.

REFERENCES

[1][Act,70] Acton, F.S., Numerical Methods that almost work, Harper & Row, 1970.

[2][Ahi,90] Ahituv,N. & Neumann,S., Principles of Information Systems for Management, Wm.C.Brown, 1990.

[3][Aho,72] Aho,A.V. and Ullman,J.D.,The theory of Parsing,Translation and Compiling, vols. I and II,Prentice- Hall,72.

[4][Aho,74] Aho,A.V. et al., The Design and Analysis of Computer Algorithms, Addison-Wesley, 1974.

[5][Aho,74b] Aho,A.V. and Johnson,S.C., LR Parsing, Computing Surveys of the ACM, 6:2, 99-124, 1974.

[6][Aho,77] Aho,A.V. and Ullman,J.D., Principles of Compiler Design, Addison-Wesley, 1977.

[7][Aho,83] Aho,A.V. et. al., Data Structures and Algorithms, Addison-Wesley,1983.

[8][Aho,86] Aho,A.V. et al., Compilers: Principles, Techniques and Tools, Addison-Wesley, 1986.

[9][Aki,89] Aki, S.G., The design and analysis of parallel algorithms, Prentice-Hall, 1989.

[10][Ala,78] Alagic,s & Arbib, M.A., The design of well- structured and correct programs, Springer-Verlag, 1978.

[11][Ans,74] ANSI COBOL, American National Standards Institute, Inc., 1974.

[][And,86] Fingerprint identification--data format for information interchange, American National Standards Institute, New York, 1986.

[12][Awa,91] Awad,E.M., Systems Analysis and Design, R.D.Irwin Inc., 1991.

[][Bai,84] Baird,H.S., Model-based image matching using location, Ph.D. thesis, Dept. of Electrical Engineering and Computer Science, Princeton University, 1984.

[13][Bar,67] Barron,D.W., Pascal - The Language and its Implementation, Wiley, 1981.

[14][Bat,70] Bates, F., & Douglas, M.L., Programming Language/One, Prentice-Hall, 1970.

[15][Bau,76] Bauer, F. L., & Eickel,J., Compiler Construction: An advanced course, 2nd ed., Lecture Notes in Computer Science 21, Springer-Verlag, 1976.

[16][Bay,67] Bayer,R., et al., The ALCOR ILLINOIS 7090/7094 post-mortem dump, CACM 10, 804-808, 1967.

[][Blu,73] Blum,H., Biological shape and visual science, J. Theoretical Biology, 38, 205-287,1973.

[17][Bro,74] Brooks,F, The mythical man-month, Datamation, 1974.

[][Bun,93] Bunke, H and Buhler, U., Applications of approximate string matching to 2D shape recognition, Pattern Recognition, 12, pp 1797-1812, December 1993.

[][Cow,83] Cowger, J. F., friction Ridge skin: Comparison and Identification of Fingerprints, Elsevier, New York,1983.

[18][CCI,85] CCITT High Level Language, vol VI, fascicle v 1.12, Malaga-Torremolineos, 1985.

[19][Cha,87] Charette, Software Engineering Environments, McGraw Hill, 1987.

[][Cha,92] Chaudhuri,D., et al., A modified metric to compute distances, Pattern Recognition, 25, pp 667-677, 1992.

[20][Cha,93] Chaudhuri,B.B. & Mazumdar,D.D., Two-tone Image Processing and Recognition, Wiley-Eastern, 1993.

[21][Clo,86] Clocksin,W.F. & Mellish,C.S., Programming in Prolog, Springer-Verlag, 1986.

[22][Coh,90] Cohen, D.I.a., Introduction to Computer theory, Wiley, 1990.

[23][Con,63] Conway,M.E., Design of a separable transition-diagram compiler, Comm ACM 6:7, 396-408, 1963.

[24][Con,63b] Conway, R.W. & Maxwell,W.L., CORC- the Cornell Computing Language, Comm ACM 6:6, 317-321, 1963.

[25][Con,70] Conway,R.W., et al., PL/C - A high performance subset of PL/I, TR 70-55, Computer Science Department, Cornell, 1970.

[26][Con,73] Conway,R.W. & Wilcox,T.R., Design & implementation of a diagnostic compiler for PL/1, Comm ACM 16:3, 169-179, 1973.

[][Cor,94] Cortelazzo, G et al, Trademark shapes detection by string matching techniques, Pattern Recognition, 27, pp 1005- 1018, August 1994.

[27][Cow,93] Cowart,R, Mastering Windows 3.1, bpb publications, 1993.

[28][Dat,86] Date,C.J., An Introduction to Database Systems, 4th ed., Addison-Wesley, 1986.

[29][Dav,88] Davis, G.B. & Hoffman, T.R., FORTRAN 77, McGraw Hill,1988.

[30][Der,71] DeRemer, F., Simple LR(k) grammars, comm ACM 14:7, 453-460, 1971.

[31][Dij,76] Dijkstra, E.W., A discipline of programming, Prentice-Hall, 1976.

[32][Fai,85] Fairley,R., Software Engineering Concepts, McGraw-Hill, 1985.

[][Fai,91] Fairhurst et al., Matching structural implementational models in the specification of image classifiers, Pattern Recognition, 24, pp 555-566, 1991.

[][FBI,82] The science of fingerprints: classification and uses, FBI, US Govt. Printing Office, Washington DC, 1984.

[33][Fei,88] Feitelson,D.G., Optical Computers, The MIT Press, 1988.

[34][Flo,68] Feldman,J.A. & Gries,D., Translator writing systems, CACM 11, 77-113, 1968.

[][Gal,1892] Galton,F., Fingerprints, Macmillan, London, 1892.

[][Gao,93] Gao,Q.G., et al., Curves detection based on perceptual organisation, Pattern Recognition, 26, pp 1039-1046, July 1993.

[35][Geh,88] Gehani, N. & McGettrick, A.D. ed., Concurrent Programming, Addison-Wesley, 1988.

[36][Gol,82] Goldschlager,L. & Lister,A., Computer Science - A Modern Introduction, Prentice-Hall, 1982.

[][Gol,92] Goldback,L., What is distance and why do we need the metric model for pattern learning?, Pattern Recognition, 25, pp 431-438, 1992.

[][Gol,92b] Goldgol et al., Matching and motion estimation of three dimensional point and line sets using eigenstructures without correspondences, Pattern Recognition, 25, pp 271-286, 1992.

[37][Gor,88] Gordon,M.J.C., Programming Language Theory and its Implementation, Prentice-Hall, 1988.

[38][Gra,67] Grau, A. A., et al., Translation of Algol-60, Springer-Verlag, 1967.

[39][Gri,65] Gries,D., et al., Some techniques used in the ALCOR ILLINOIS 7090, CACM 8, 496-500, 1965.

[40][Gri,68] Gries, D., The use of transition matrices in compiling, CACM, 26-34, 1968.

[41][Gri,71] Gries,D., Compiler Construction for Digital Computers, John Wiley & Sons, 1971.

[42][Gri,68] Griswold, R.E., et al., The SNOBOL 4 Programming Language, Prentice-Hall, 1968.

[][Gri,90] Griffin, P. M., et al, A methodology for pattern matching of complex objects, Pattern Recognition, 23, pp 245- 254, 1990.

[43][Han,77] Hansen, P.B., The Architecture of Concurrent Programs, Prentice-Hall, 1977.

[44][Har,87] Harrington,S., Computer Graphics - A Programming Approach, McGraw Hill, 1987.

[][Har,91] Haralick,R.M. et al., Glossary of computer vision terms, Pattern Recognition, 24, pp 69-93, 1991.

[][Hen,1900] Henry,E.R., Classification and Uses of Fingerprints, Routledge, London, 1900.

[][Her,1880] Herschel,W.J., Skin furrows on the hand, Nature 23, 76, 1880.

[][Hre,90] Hrechak,A.K. and McHugh,J.A., Automated fingerprint recognition using structural matching, Pattern Recognition, 23, pp 893-904, 1990.

[][Hil,93] Hildebrandt,T.H., et al., Optical recognition of Chinese characters: advances since 1980, Pattern Recognition, 26, pp 205-225, 1993.

[45][Hoa,73] Hoare,C.A.R. & Wirth,N., An axiomatic definition of the programming language PASCAL, Acta Informatica, 2, 1973.

[46][Hop,69] Hopcroft,J.E. & Ullman,J.D., Formal Languages and their Relation to Automata, Addison-Wesley, 1969.

[47][Hop,79] Hopcroft, J.E. & Ullman, J.D., Introduction to Automata Theory, Languages and Computation, Addison-Wesley, 1979.

[48][Hun,77] Hunt,J.W. & Szymanski,T.G., A fast algorithm for computing longest common subsequences, Comm ACM 20:5, 350-353, 1977.

[49][IBM,68] IBM PL/I(F) Compiler-Programming Logic Manual, Form Y28-6800, 1968.

[49][Ind,91] Industry Structure Model, Release 2, British Computer Society, 1991.

[50][Inf,93] Information Technology Software Life Cycle Process, LTD 33(1094)P1, Bureau of Indian Standards, 1993.

[51][Ive,62] Iverson, K, A Programming Language, Wiley, 1962.

[52][Jai,93] Jain, M.K., et al., Numerical Methods for Scientific & Engineering Computation, Wiley Eastern, 1993.

[53][Kah,89] Kahaner, D., et. al., Numerical Methods & Software, Prentice-Hall, 1989.

[54][Ker,78] Kernighan,B.W. and Ritchie,D.M., The C Programming Language, Prentice-Hall, 1978.

[55][Ker,84] Kernighan,B.W. and Pike,R., The UNIX Programming Environment, Prentice-Hall, 1984.

[56][Kle,56] Kleene,S.C., Representation of events in nerve nets and finite automata, in Automata Studies, 3-42, Princeton University Press, 1956.

[57][Kle,39] Klein,F., Geometry, Dover, 1939.

[58][Kne,85] Knepley, Ed. & Platt, R., Modula-2 Programming, Reston, 1985.

[59][Knu,71] Knuth, D.E., An empirical study of FORTRAN programs, Software Practice and Experience, 1:2,105-133, 1971.

[60][Knu,73] Knuth, D.E., The Art of Computer Programming, vols 1-3, Addison-Wesley, 1973.

[][Koc,87] Koch,M.K. and Kashyap,R.L., Using polygons to recognize and locate partially occluded objects, IEEE Trans. Pattern Anal. Mach. Intell., 9(4), 483-494, July 1987.

[][Kun,94] Kumar, S., et al, Parallel algorithms for circle detection in images, Pattern Recognition, 27, pp 1019-1028, August 1994.

[][Kur,94] Kurita,T. et al., Invariant distance measures for planar shapes based on complex autoregressive models, Pattern Recognition, 27, pp 903-911, 1994.

[][Lan,86] Landau, G.M. et al, Efficient string matching with k mismatches, Theoretical Computer Sci., 43, 239-249, 1986.

[][Lav,83] Lavine,D., et al., Recognition of spatial point patterns, Pattern Recognition, 16, 289-295, 1983.

[61][Law,87] Lawrence, P.D. & Marek,K., Real-time microcomputer system design- an introduction, McGraw Hill, 1987.

[62][Lee,87] Lee,D.T. and Preparata,F.P., Computational Geometry: A survey, IEEE Transactions on Computers, vol C-33, no.12, 1072-1101, 1984.

[][Lee,94] Lee,S.H., et al., A dynamic programming approach to line segment matching in stereo vision, Pattern Recognition, 29, pp 65-78, 1994.

[][Li,92] Li,S.Z., et al., Matching: Invariant to translations, rotations and scale changes, Pattern Recognition, 25, pp 583-594, 1992.

[63][Lin,72] Lindsey,C. and van der Meulen,S.G., Informal Introduction to Algol-68, American Elsevier, 1972.

[][Lin,82] Lin,C.H. et al., Fingerprint comparison I: Similarity of prints, J. Forensic Sci., 27, 290-304, 1982.

[][Lin,82b] Lin,C.H., et al., Fingerprint comparison II: On the development of a single fingerprint filing and searching system, J. Forensic Sci., 27, 305-317, 1982.

[][Liu,91] Liu,Y., et al., Determining straight line correspondences from intensity images, Pattern Recognition, 24, pp 489-504, 1991.

[][Liu,92] Liu,Y., et al., Three-dimensional motion determination from real scene images using straight line correspondences, Pattern Recognition, 25, pp 617-639, 1992.

[][Llo,87] Lloyd,S.A., et al., A parallel binocular stereo algorithm utilizing dynamic programming and relaxation labelling, Pattern Recognition, 23, pp 305-317, 1987.

[64][Loe,87] Loeckx,J. & Sieber,K., The Foundations of Program Verification, Wiley-Teubner, 1987.

[][Mas,93] Masuda,T. et al., Detection of partial symmetry using correlation with rotated-reflected images, Pattern Recognition, 8, 1245-1253, 1993.

[65][Mck,70] McKeeman,W.M., et al., A Compiler Generator, Prentice-Hall, 1970.

[66][Mic,87] Microsoft- GW BASIC- User's Guide and Reference Manual,1987.

[67][Mor,72] Morgan,H.L., Spelling correction in systems programs, CACM 13:2, 90-94, 1972.

[68][Mor,87] Morgan,R., et al., Introducing the UNIX System V, McGraw Hill, 1987.

[69][Nor,81] Nori, K.V. et al., Pascal(P) Implementation Notes in Barron[1981], pp 125-170, 1981.

[70][Nau,63] Naur,P., ed., Revised Report on the Algorithmic Language ALGOL-60, CACM 6, 1-17, 1963.

[71][Nau,76] Naur,P., et al., ed., Software Engineering, Concepts & Techniques, Petrocelli, 1976.

[][Oga,86] Ogawa,H., Labelled point pattern matching by Delaunay triangulation and maximal cliques, Pattern Recognition, 19, 35-40, 1986.

[72][Oxf,86] Oxford Dictionary of Computing, Oxford University Press, 1986.

[73][Pak,72] Pakin,S APL\360, Science Research Associates, 1972.

[][Par,93] Park,J.H. et al., Three-dimensional object representation and recognition based on surface normal images, Pattern Recognition, 26, pp 913-922, 1993.

[74][Pec,71] Peck,J.E.L., Algol 68 Implementation, North-Holland, 1971.

[75][Per,77] Perrott,R. H., ed., Software Engineering, Academic Press, 1977.

[76][Phi,86] Phillippakis, A.S. & Kazmier, C.J., Structured COBOL, McGraw Hill, 1986.

[77][Ran,64] Randell,B. & Russell,L.J., Algol 60 Implementation, Academic Press, 1964.

[][Ran,80] Ranade,S. and Rosenfeld,A., Point pattern matching using relaxation, Pattern Recognition, 12, 269-275, 1980.

[][Ran,91] Ranka,S., et al., Two-dimensional pattern matching with k mismatches, Pattern Recognition, 24, pp 31-40, 1991.

[][Rei,94] Rei,S.C., et al., Finding the motion, position and orientation of a planar patch in 3D space from scaled-orthographic projection, Pattern Recognition, 27, pp 9-26, 1994.

[78][Rit,79] Ritchie,D.M., A tour through the UNIX C Compiler, AT & T Bell Laboratories, 1979.

[79][Rit,74] Ritchie, D.M. and Thompson, K., The UNIX time- sharing system, CACM 17:7, 365-375, 1974.

[80][Rob,93] Robbins, J., Mastering DOS 6, BPB Publications, 1993.

[81][Ros,67] Rosen,S., Programming Systems and Languages, McGraw Hill, 1967.

[82][Roy,93] Roy,M.K. & Dastidar,D.G., COBOL Programming, Tata McGraw Hill, 1993.

[83][Sub,91] Sabata,B. & Aggarwal,J.K., Estimation of motion from a pair of range images: A review, Image Understanding, vol 54, no 3, 309-324, 1991.

[84][Sam,69] Sammet, J.E., Programming Languages: History & Fundamentals, Prentice-Hall, 1969.

[85][Sam,60] Samelson,K and Bauer, F.L., Sequential Formula Translation, Comm ACM 3:2,76-83, 1960.

[][Sam,92] Samal, A., et al, Automatic recognition and analysis of faces and facial expressions: a survey, Pattern Recognition, 25, pp 65-77, 1992.

[86][San,81] Sanders, D.H., Computers in Business, McGraw Hill, 1981.

[][Sav,87] Saviers, K.D., Friction skin characteristics: A study and comparison of proposed standards, Garden Grove California Police Department, July(1987).

[87][Sch,90] Scheifler,R.W. & Gettys,J., X Window System, Prentice-Hall, 1990.

[][Sco,51] Scott,D., Fingerprint Mechanics - A Handbook, Charles C. Thomas, Springfield, Illinois, 1951.

[88][Ser,93] Serpes, M., Mastering ORACLE 6.0, BPB Publications, 1993.

[89][Sha,88] Shaw,A.C., The Logical Design of Operating Systems, Prentice-Hall, 1988.

[][Spa,85] Sparrow,M.K. et al., A topological approach to the matching of single fingerprints: Development of algorithms for use on latent fingermarks, National Bureau of Standards Special Publication 500-126, U.S. Govt. Printing Office, Washington D.C., 1985.

[][Spa,85b] Sparrow,M.K. et al., A topological approach to the matching of single fingerprints: Development of algorithms for use on rolled impressions, National Bureau of Standards Special Publication 500-124, U.S. Govt. Printing Office, Washington D.C., 1985.

[][Spa,91] Spragur,A.A. et al., A method for the automatic inspection of printed circuit boards, Image Understanding, 54, pp 401-415, 1991.

[90][Sta,83] Starnes,T.W., Design Philosophy behind Motorola's MC 68000, BYTE, 70-92, April 1983.

[][Sto,86] Stoney,D.A. et al., A critical analysis of quantitative fingerprint models, J. Forensic Sci., 31, 1187-1216, 1986.

[91][She,90] Sheldon,T., Novell Netware, Osborne McGraw-Hill, 1990.

[92][Ste,71] Stearns,R.E., Deterministic top-down parsing, Proc. 5th Annual Princeton Conf. on Information Sciences and Systems, 182-188, 1971.

[93][Ste,84] Steele, G.L., Jr., Common LISP, Digital Press, 1984.

[94][Ste,90] Stevens, W.R., Unix network programming, Prentice-Hall, 1990.

[95][Sto,80] Stoneman Requirements for Ada Programming Support, Department of Defence, USA, 1980.

[96][Str,86] Stroustrup,B., The C++ Programming Language, Addison-Wesley, 1986.

[97][Tan,85] Tanenbaum,A.S., Computer Networks, Prentice-Hall, 1981.

[98][Tan,91] Tanenbaum,A.S., Structured Computer Organisation, Prentice-Hall, 1991.

[99][Ten,81] Tennent,R.D., Principles of Programming Languages, Prentice-Hall, 1981.

[][Ter,84] Teresa,M.A., et al., An iterative Hough procedure for three-dimensional object recognition, Pattern Recognition, 17, pp 621-630, 1984.

[][Ton,89] Ton,J. and Jain,A.K., Registering Landsat images by point matching, IEEE Trans. Geosci. Remote Sensing, 27, 642-651, September 1989.

[100][Toy,72] Toynbee, Arnold., A study of History, Thames & Hudson, 1972.

[101][Tre,85] Tremblay, J & Sorenson, P.G., The theory & practice of compiler writing, McGraw Hill, 1985.

[102][Vic,84] Vick,C.R. & Ramamoorthy,C.V., Handbook of Software Engineering, Van Nostrand, 1984.

[][Vin,93] Vinod,V.V. and Ghose,S., Point matching using asymmetric neural networks, Pattern Recognition, 8, 1207-1214, 1993.

[103][Weg,75] Wegner,P., Programming Languages, Information Structures and Machine Organisation, McGraw-Hill, 1968.

[][Weg,82] Wegstein,J.H., An automated fingerprint identification system, National Bureau of Standards Special Publication 500-589, US Govt. Printing Office, Washington DC, 1982.

[104][Wir,66] Wirth,N. & Weber,H., EULER: a generalisation of ALGOL and its formal definition, Part I, Comm ACM 9:1, 13-23, 1966.

[105][Wir,76] Wirth,N., Algorithms + Data Structures = Programs, Prentice-Hall, 1976.

[][Won,92] Wong,E.K., Model matching in robot vision by subgraph isomorphism, Pattern Recognition, 25, pp 271-286, 1992.

[107][Woo,87] Wood, D., Theory of Computation, John Wiley & Sons, 1987.

[108][Wul,71] Wulf,W.A., et al., BLISS: A language for systems programming, CACM, 14:12, 780-790, 1971.

[109][Wul,75] Wulf, W. A., et. al., The Design of an Optimising Compiler, American Elsevier, 1975.

[][Yan,94] Yang, G et al, Human face detection in a complex background, Pattern Recognition, 27, pp 53-64, Jan 1994.

[][Yoo,93] Yoo, J.H., An ellipse detection method from the polar and pole definition on conics, Pattern Recognition, 26, pp 307-315, 1993.

[110][You,79] Yourdon,E., et al., Learning to Program in Structured COBOL, Prentice-Hall, 1979.

[][Yuc,93] Yuceer,C. et al., A rotation, scaling and translation invariant pattern classification system, Pattern Recognition, 26, pp 687-710, May 1993.