Australian Journal of Educational Technology 1995, 11(1), 36-49.
The teaching of relational database design to business students poses many problems. This paper looks at these problems and outlines an integrated approach which addresses the piecemeal and disjointed procedures common in many textbooks. A number of strategies for dealing with other problem areas of the topic are also described from the perspective of the prevailing business environment.
The lack of a single universally accepted design methodology for relational databases poses many problems for teachers. Designers tend to rely more on experience, trial and error, intuition and educated guesses than on carefully designed steps (Awad & Gotterer, 1992). This makes the subject difficult to teach, particularly to business students, who need to be given a management perspective rather than the usual in-depth treatment of specific technicalities popular in computer science oriented textbooks. A further complication has been a transition from the hierarchical structure of the 60s, through the network structure of the 70s to the relational model of the 80s and 90s. A great deal of the design work which evolved during the early periods, along with the hierarchical and network technology, is not really relevant to today's business students since they will most likely embark on a career in an environment where the relational model predominates. The role object oriented databases will play in business in the future is unclear at the moment due to the developing nature of the technology. Although some relational database software developers have included an object handling facility in their products, object oriented databases have remained predominantly in the niche they carved out in handling complex data such as graphics and annotated text.
Rob and Adam (1990) have cited a tendency in industry to abandon database technology due to problems that arise from faulty database design, and argue that the logical conclusion to this is to improve database design instruction. Research by Carpenter (1992) also casts doubt on the effectiveness of current teaching of database concepts. As a result of a survey of database students and their teachers, he found both to be deficient in even the fundamentals of the subject. His conclusion was that there is a "need to train present and future database teachers in proper database design concepts and techniques". This is a rather disturbing finding, given the number of students passing through our colleges to positions in organisations, having attained degrees and diplomas indicating a knowledge of database. It can only reflect badly on the institutions claiming to teach database courses. Kleen (1993) also expresses doubts about the teaching of database, citing problems of too much button pushing and memorising and not enough of "grasping the wider picture". She claims this will not only restrict the ability of students to recognise the ways that database could enhance their efficiency and effectiveness on the job, but will also prevent them from encouraging the installation of much needed databases in their organisations.
This paper considers the problems inherent in the teaching of database to business computing students in the light of this recent criticism of its effectiveness. Suggestions that university trained database students are not providing business with the type of expertise it needs are borne out by the proliferation of "short courses" offered by third party or vendor organisations. This places an extra financial burden on business when organisations are forced to enrol their employees in these courses to make up for inadequacies in basic training. The paper focuses on business planning and logical/conceptual design and provides a blueprint for business database teachers to help overcome the problems raised.
The main point to emerge from the literature is that business planning must be the prelude or first step in the design of a database. Unfortunately, it has gained a reputation for dubious cost effectiveness and suffers from a lack of acceptance in the business world. There has been more evidence of problems than success for information engineering methodologies such as James Martin's (1981) Strategic Data Planning (SDP) or IBM's (1981) Business Systems Planning (Hoffer et al, 1989; Lederer and Sethi, 1988; Goodhue et al, 1988, 1992).
Numerous database design methodologies have been proposed by theoreticians and practitioners alike in attempts to find the elusive, incontestably correct, logical design. One possible solution to this rather intractable problem could be to give students a smattering of many methodologies so that, hopefully, they can glean enough from each to enable them to get by in the workplace. Alternatively, one methodology could be selected from the many in the hope that it hasn't become discredited or redundant by the time students enter the labour market. Yet another approach could be to adopt a 'good' CASE tool and use it as the basis for a universal methodology.
The enthusiasm devoted to the normalisation process by successive textbook writers leaves everyone with the impression that anything short of domain-key normal form will result in complete failure of the database. From an educational viewpoint, knowing when to stop and how far to delve into the normalisation process is a major hurdle to overcome. Although normalisation is an integral part of the relational model, it isn't necessarily the only way of rationalising data. Koch (1993) explains:
Normalisation is analysis not design. Design encompasses issues, particularly related to performance, ease of use, maintenance, and straightforward completion of business tasks, that are unaccounted for in simple normalisation.
The fact that normal forms past third are rarely used outside of academia is reason enough to be sceptical of teaching them.
Bottom up (Data structuring and normalisation)
The bottom up approach starts with individual attributes or data items and attempts to synthesise these to form viable logical entities. The individual data items are defined by their association with specific applications and corresponding user views. The bottom up approach will produce the optimum logical structure provided it is possible to gather all the relevant data from the mass of detail into which the designer is immediately plunged.
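As a minimal sketch of the bottom up idea, the following Python fragment groups individual data items into candidate relations according to the determinant on which each item depends. The attribute names and dependencies are hypothetical, chosen only for illustration, and the grouping step is a deliberately simplified stand-in for a full synthesis and normalisation procedure.

```python
from collections import defaultdict

# Hypothetical functional dependencies gathered from user views.
# Each pair reads "determinant -> dependent attribute".
DEPENDENCIES = [
    (("customer_no",), "customer_name"),
    (("customer_no",), "customer_address"),
    (("order_no",), "order_date"),
    (("order_no",), "customer_no"),
    (("order_no", "product_no"), "quantity"),
    (("product_no",), "description"),
]


def synthesise(dependencies):
    """Group attributes by determinant; each group is a candidate relation."""
    groups = defaultdict(list)
    for determinant, attribute in dependencies:
        if attribute not in groups[determinant]:
            groups[determinant].append(attribute)
    # The determinant becomes the key of the candidate relation.
    return {key: list(key) + attrs for key, attrs in groups.items()}


relations = synthesise(DEPENDENCIES)
for key, attributes in relations.items():
    print("key:", ", ".join(key), "| attributes:", ", ".join(attributes))
```

The sketch only works if every relevant dependency has been collected beforehand, which is exactly the practical difficulty noted above.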
Top Down (Entity-Relationship approach)
The top down approach starts with an analysis of the organisation at the functional level by identifying all functions, processes and activities. Entities are then devised which represent groups of attributes upon which the functions operate, and relationships between the entities are then analysed to represent 'real world' associations. The approach proceeds in an incremental manner by incorporating new entities, establishing new relationships and resolving any conflicts which arise as a result of this gradual integration. Its basic philosophy is that the integration is driven by an analysis of the semantic properties of the input views until the best compromise is reached. The main problems with this approach include the omission of data and a final organisation of data that does not lend itself to efficient processing.
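The top down idea can be sketched in the same spirit. In the Python fragment below, entities and a one-to-many relationship are declared first, and the relations are derived from them by posting the key of the 'one' side into the 'many' side. The class names, entity names and the mapping rule for one-to-many relationships are assumptions made for illustration; other cardinalities would need further rules.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    key: str
    attributes: list = field(default_factory=list)


@dataclass
class Relationship:
    name: str
    one_side: Entity   # the "one" end of a one-to-many relationship
    many_side: Entity  # the "many" end


def to_relations(entities, relationships):
    """Map each entity to a relation, then post the key of the 'one' side
    into the 'many' side as a foreign key for every 1:N relationship."""
    relations = {e.name: [e.key] + list(e.attributes) for e in entities}
    for r in relationships:
        foreign_key = r.one_side.key
        if foreign_key not in relations[r.many_side.name]:
            relations[r.many_side.name].append(foreign_key)
    return relations


# Hypothetical enterprise: a customer places many orders.
customer = Entity("Customer", "customer_no", ["customer_name"])
order = Entity("Order", "order_no", ["order_date"])
places = Relationship("places", one_side=customer, many_side=order)

schema = to_relations([customer, order], [places])
print(schema)
```

Any attribute that was never attached to an entity simply does not appear in the result, which illustrates how data can be omitted under this approach.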
It is possible to use combinations of these two basic approaches so that one can complement the operation of the other by crosschecking and assessing the results it produces. There are also other methods of classifying design, namely, by the sequence of actions which involve the data. If the processes are used to derive the data structures, then the design is said to be process driven. Here the dynamics of the business (what happens? when? how? how often?) are described first and the data derived from them. If, on the other hand, the structures are derived from general semantics of information used in the business, then the design is data driven. This latter approach concentrates on the fundamental building blocks of systems which, it is claimed, are more stable than processes and hence, lead to better design. Rarely are either of these two methods applied exclusively, and a combination of both seems to be the most appropriate.
The newly emerging object-oriented database model should not be overlooked in database design and, although not within the scope of this paper, many aspects of the integrated approach outlined above can also be applied to it.
Although much more research has still to be done in understanding the human factors side of the design process, it is clear that intuition and imagination must be encouraged rather than stifled by methodologies that induce designers to slavishly follow a set of rules. This is where the integrated approach, as suggested here, can help build a good understanding of the overall process and aid budding designers in gaining enough knowledge so that later they are able to use their imagination. While some knowledge of dependency theory and semantic modelling is necessary, it is not mandatory to deliver a full "computer science" course in these areas in order to equip business students with the ability to apply the basic principles. It is very easy for students to get bogged down in such technicalities as the difference between fourth and fifth normal form, and not be able to see how it can be applied to situations for which it is relevant. The time factor must also be considered. Typically, time allocated to database in business courses is one or two subjects, usually of a semester's duration. This in itself can reduce any attempt at detailed data analysis to a piecemeal look at its various components. Unless an approach is adopted that encompasses a global view of the subject, and teaching is done with the objective of imparting an understanding of the overall concepts, students will be in danger of not being able to see the 'woods for the trees'.
We need to distance ourselves from the piecemeal approach adopted by many textbook writers where the nuts and bolts are covered in great detail and the overall picture is lost.
The myth that there is some mechanical way of deriving the 'best' design from a given set of requirements must be dispelled. This does not mean that formal techniques are not valuable, particularly as a starting point for students who are unfamiliar with the field. The important thing is not to give a false impression of infallibility for any particular technique. This is very well illustrated in Howe (1989), where a 12 step approach to design is presented and the comment made at step 12 that:
Having got this far, you may find that your choice of attributes, entities and relationships is now suspect as an accurate representation of the enterprise. As you now understand the problem better, start again from step 1!
The topic of referential integrity is critical. It must be taught thoroughly and illustrated by examples of databases which have been set up to prevent integrity violations. Assignments given to students must include penalties for ignoring this problem area. Unfortunately many textbooks, although mentioning referential integrity, do not pay much attention to it in the examples they present in the text.
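A small classroom example of a database set up to prevent integrity violations might look like the following sketch, which uses Python's standard sqlite3 module. The table and column names are hypothetical, and SQLite is used here only because it needs no installation; any relational product that enforces foreign keys would serve equally well.

```python
import sqlite3

# In-memory database in autocommit mode, so each statement stands alone.
conn = sqlite3.connect(":memory:", isolation_level=None)
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE customer ("
    " customer_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE customer_order ("
    " order_no INTEGER PRIMARY KEY,"
    " customer_no INTEGER NOT NULL REFERENCES customer(customer_no))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme Pty Ltd')")
conn.execute("INSERT INTO customer_order VALUES (100, 1)")  # valid reference


def try_statement(sql):
    """Return True if the statement is accepted, False if the database
    rejects it because it would violate an integrity constraint."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An order for a customer that does not exist.
orphan_accepted = try_statement("INSERT INTO customer_order VALUES (101, 99)")
# Deleting a customer who still has an order on file.
delete_accepted = try_statement("DELETE FROM customer WHERE customer_no = 1")

print("orphan order accepted:", orphan_accepted)
print("delete of referenced customer accepted:", delete_accepted)
```

Both attempts are refused by the database itself and not by application code, which is the behaviour students should learn to expect and to design for.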
The impact of CASE tools is difficult to judge due to the huge diversity of products on the market. Their major contribution may be in the documentation area due to the ease with which E-R diagrams can be modified graphically. Opinions are divided as to whether designing 'on screen' offers many advantages over pencil and paper, and it is unlikely that CASE will turn a bad designer into a good one. It is in the area of referential integrity that they offer the most promise, provided the CASE package interfaces directly with a database without the need to retype the entities and attributes. It is probably easier to specify the integrity rules directly from the E-R diagram and have them automatically enforced by the database system when it is set up.
We should adopt the Piagetian philosophy of teaching from the concrete to the abstract rather than vice versa. The problem here is how to get started. Although, as Shanks et al (1993) point out, 'there is no mechanical way of proceeding directly from requirements to the best design', some sort of mechanical technique may well be a good concrete starting point for first time designers. It is here that the data structuring and normalisation technique presents a 'mechanical' first step in producing relations free of redundancy. It is important that students take the next step of realising that, with very little practice, the same set of relations can be derived intuitively or non-mechanically using E-R diagrams, and that there is very little difference between the results. An interesting account of research along these lines is given in Shanks et al (1993) and, although the classification of techniques is confusing, it seems to indicate the superiority of the entity-relationship approach.
The role of the data flow diagram as a tool for relational database design needs to be very carefully handled. Students who have completed standard systems analysis and design courses may well be familiar with this technique and be confused when faced with deriving them from E-R diagrams as suggested by Chen (1991).
It is important that we do not get caught up in a bottom-up versus top-down competition. Both approaches have their followers, each attempting to give the impression that their competitors' methods are substandard. While there may be evidence that some techniques give better results than others in a variety of situations (Howe 1989), and that some work better for different classes of user (Batra and Davis, 1992), the exclusive use of one method in a teaching syllabus can only produce narrowly focused students. In an interesting study by Guindon (1990), where experienced systems analysts were studied in depth as they designed data models, results indicated that an opportunistic approach was more in evidence than a preference for either top-down or bottom-up strategies.
When teaching normalisation techniques we should keep in mind a statement on databases by George Koch (1993), vice-president of Oracle Corporation. His prediction was that in the foreseeable future "no major application will run in third normal form". His belief is that demand for information and analysis will probably continue to outpace the ability of machines to process it in a fully normalised fashion. In business planning it may be well to keep as much as possible of the enterprise analysis separated from the data analysis by not mixing enterprise entities with data entities.
Awad, E. M. and Gotterer, M. H. (1992). Database Management. Boyd & Fraser.
Bansler, J. P. and Boedker, K. (1993). A reappraisal of structured analysis: Design in an organisational context. ACM Transactions on Information Systems, 11(2), April.
Batra, D. and Davis, J. (1992). Conceptual data modelling in data base design: Similarities and differences between novices and expert designers. International Journal of Man-Machine Studies, 37, 83-101.
Barker, R. (1989). CASE*METHOD: Entity-Relationship Modelling. Addison-Wesley.
Batini, C., Ceri, S. and Navathe, S. B. (1992). Conceptual Database Design. Benjamin/Cummings, Redwood City, CA.
Carpenter, D. A. (1992). Are we teaching database design properly? Journal of Computer Information Systems, Fall, 9-12.
Chen, P. (1991). The Entity-Relationship Approach to Logical Database Design. MA: QED Information Sciences.
Chen, P. (1976). The entity-relationship model - toward a unified view of data. ACM Transactions On Database Systems, 1(1), 9-36.
Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of ACM, 13(6).
Date, C. J. (1991). An Introduction To Database Systems, 5th Ed. Reading, MA: Addison-Wesley.
Goodhue, D. L., Quillard, J. A. and Rockart, J. R. (1988). Managing the data resource: A contingency perspective. MIS Quarterly, 12(3), September, 373-392.
Goodhue, D. L., Kirsch, L. J., Quillard, J. A. and Wybo, M. D. (1992). Strategic data planning: Lessons from the field. MIS Quarterly, 16(1).
Guindon, R. (1990). Knowledge exploited by experts during software system design. International Journal of Man-Machine Studies, 33, 279-304.
Harrington, J. L. (1994). Database Management for Microcomputers, 2nd ed. Fort Worth, TX: The Dryden Press.
Hoffer, J. A., Michaele, S. J. and Carroll, J. J. (1989). The pitfalls of strategic data and systems planning: A research agenda. Proceedings of the 22nd Annual Hawaii International Conference on Systems Sciences, Kona Hawaii, January.
Howe, D. R. (1989). Data Analysis For Data Base Design, 2nd ed. London: Edward Arnold.
Hawryszkiewycz, I. T. (1991). Database Analysis and Design, 2nd ed. New York: Maxwell-Macmillan.
IBM Corporation (1981). Business Systems Planning, IBM Manual #GE20-0527-3.
Kleen, B. (1993). Are we missing the boat when teaching database concepts and applications? Journal of Computer Information Systems, 34(1), Fall, 1.
Koch, G. (1993). Oracle 7: The Complete Reference. Berkeley, CA: Osborne McGraw-Hill.
Lederer, A. L. and Sethi, V. (1988). The implementation of strategic information systems planning methodologies. MIS Quarterly, 12(2), September, 441-461.
Lu, H. P. (1994). A preliminary study of student responses to different CIS course teaching strategies. Journal of Computer Information Systems, 34(4), Summer, 31-36.
McFadden, F. R. and Hoffer, J. A. (1994). Modern Database Management, 4th ed., Redwood City, CA: Benjamin/Cummings.
Marshall, R. (1992). Relational, entity-relational, object oriented: strengths, weaknesses and complementary usage. Professional Computing, The Magazine of the Australian Computer Society, March Issue, Victoria, Australia.
Martin, J. (1981). Strategic Data-Planning Methodologies. Englewood Cliffs, NJ: Prentice Hall.
Pratt, P. J. and Adamski, J. J. (1994). Database Systems Management and Design. Danvers, MA: Boyd & Fraser.
Robb, P. and Adams, C. N. (1990). Microcomputer databases in the classroom: It's time to pay the (design) piper. Journal of Computer Information Systems, Fall, 18-24.
Sanders, G. L. (1995). Data Modelling. Danvers, MA: Boyd & Fraser.
Schmid, H. A. and Swenson, J. R. (1975). On the semantics of the relational data base model. ACM SIGMOD International Conference on the Management of Data, San Jose, CA, pp211-223.
Shanks, G., Simsion, G. and Rembach, M. (1993). The role of experience in conceptual schema design. Proceedings of 4th Australian Conference on Information Systems, University of Queensland, Qld, Australia, pp365-378.
Whong, S. and Gould, E. L. (1994). Investigation and survey of the use of databases in local organisations. Unpublished Masters thesis, Department of Business Systems, University of Wollongong, NSW, Australia.
Author: Edward Gould is in the Department of Business Systems, University of Wollongong, Northfields Ave, Wollongong NSW 2522, Australia. Email: egould@uow.edu.au
Please cite as: Gould, E. (1995). Database education: Problems for business students. Australian Journal of Educational Technology, 11(1), 36-49. http://www.ascilite.org.au/ajet/ajet11/gould.html