Australasian Journal of Educational Technology
2005, 21(4), 546-566.
This paper reports on a study which contrasts results obtained using semantic and syntactic units of analysis in a context of content analysis of an online asynchronous discussion. The paper presents a review of literature on both types of units. The data set consisted of 80 messages posted by ten participants in an online learning module. Data were coded twice by two coders working independently. In the first instance, each coder divided all messages into semantic units and then coded those units. The second coding was conducted on the basis of a syntactic unit of a paragraph. Analysis at the level of the whole group showed little difference in results between the two types of coding. At the level of individual participants, those differences were greater. Results are discussed within a framework of reliability, capability of the unit to discriminate between behaviors, feasibility of different units, and their identifiability. Implications for research are discussed.
This range in approaches to the choice of unit highlights Rourke, Anderson, Garrison and Archer's claim (2001) that "[t]he selection of the unit of analysis is complex and challenging". The complexity and challenge come from a need to balance the affordances and constraints of the different units. For example, choice of the whole message as a unit of analysis will afford reliability between coders, unlike the thematic unit or unit of meaning. The message also represents a feasible unit in terms of number of cases to be dealt with as compared to the sentence as a unit of analysis. However, the unit of meaning may be more capable of discriminating between behaviours in the construct being observed than will the message. Finally, some units such as the paragraph or sentence may not be easily identifiable depending on the discourse conventions adopted by discussion participants.
This paper presents a study that investigated the implications of choice of unit of analysis. The study contrasted results of coding a discussion transcript using two types of unit: the semantic unit or the unit of meaning versus a syntactic unit in the form of a paragraph. The paper begins with a review of the literature on syntactic and semantic units. Coding results are presented as aggregate measures for the discussion group as a whole, as well as for individual participants in the discussion. Results are also presented for each of the two coders. The discussion highlights issues related to reliability, capability of the unit to discriminate between behaviours, feasibility of the unit in terms of the number of cases to be coded, and the identifiability of the unit. The paper concludes with implications for content analysis of online asynchronous discussions (OADs).
Fahy (2001) advocates the use of a sentence as a unit of analysis since "[s]entences are, after all, what conference participants produce to convey their ideas, and are what transcripts consist of". Fahy (2002) chose to work with a sentence as the unit of analysis in an investigation of interaction patterns in online discussion transcripts. He justified this choice with the claim that a sentence would "permit coding of all components of the transcript" and "provide some basis for comparison of results both internally and externally". Poscente and Fahy (2003) also conducted their analysis on the basis of a sentence, which best served their purpose of identifying triggers, or strategic initial sentences, in computer conferencing transcripts. Hillman (1996) worked at the level of a sentence in her comparison of face to face and computer mediated interactions.
A larger grained syntactic unit of analysis is a paragraph. Hara, Bonk and Angeli (2000) conducted their analysis at the level of a paragraph. They justified their choice of unit by arguing that "college-level students should be able to break down messages into paragraphs" (p. 9). In fact, McKenzie and Murphy (2000), who worked with a meaning based "message unit" as the unit of analysis, observed that it was the paragraph that typically encompassed a complete idea within a message. Rourke et al's (2001) use of a paragraph as a unit of analysis was less successful. Participants in their study frequently used "a full line of space or a tab ... for purposes other than delimiting a single coherent and unified idea". Instead of presenting one complete concept, a paragraph in this case was simply a random segment within a transcript.
Another syntactical unit larger than a paragraph is the whole message. Marcelo, Torres and Perera (2002) chose a complete message as the unit of analysis. They argued that discussion participants tend to express one general idea within the boundaries of a message. Of all syntactic units, the boundaries of a message are very clearly defined in any context of computer conferencing. This characteristic makes it a reliable tool with which to conduct coding. Khine, Yeap and Lok (2003) also worked at the level of a whole message which served their purpose of identifying various "message ideas" representing different types of behaviour within participant postings. In their attempt to discriminate between phases of knowledge construction in an online community, Aviv, Erlich, Ravid and Geva (2003) also conducted analysis on the basis of a message. Their approach was unique in the sense that, like Anderson, Rourke, Garrison and Archer (2001) who also selected a message as the unit of analysis, they allowed for more than one code to be assigned to a message if it reflected more than one type of behaviour.
In their discussion of different units of analysis, Rourke et al (2001) note the advantages of fixed units, such as syntactic units, in that they are "objectively recognizable" in a text. These units also meet Fahy's (2001) criterion that a unit of analysis must be "obvious and constant within transcripts". The objective and unambiguous identification of the syntactic unit in a transcript highlights its reliability for conducting analysis with multiple coders. Despite this advantage, however, the syntactic unit has several limitations. These are related to identifiability, feasibility, and the low discriminant capability of some syntactic units.
In terms of identifiability, the choice of different types of syntactic units may pose a challenge to coders analysing the transcripts. With regard to graphic conventions, Howell-Richardson and Mellar (1996) noted that the modes of communication in computer mediated conferences tend to differ from those used in conventional types of discourse. They attribute this difference in modes to the lack of established norms of discourse for online communication. Rourke et al (2001) confirmed this finding in their analysis of messages in an online discussion. Participants in their study "combined the telegraphic style of email with the informality of oral conversation". That practice required coders to use their judgment in deciding on the boundaries of sentences within transcripts. Their study illustrates the potential problems with coding syntactic units in transcripts from online discussions. Both of these studies highlight how syntactic units may not be easily identified in the context of an online discussion.
A further limitation of the syntactic unit relates to its feasibility. The choice of syntactic unit may place unfeasible demands on the coders and coding process. For example, the choice of a sentence as a unit of analysis may prove problematic with long and multiple transcripts. Rourke et al (2001) reported that participants in their study produced over 2,000 sentences during a 13-week discussion, leading them to conclude that the sentence can yield "an enormous amount of cases". The large number of cases resulting from choice of the syntactic unit can place a burden on resources available in contexts of content analysis and may require considerable time and effort to code. The problem of a large number of coding cases or instances can be overcome by choosing a larger syntactic unit. As the size of the unit of analysis increases, the number of instances of coding the coders will have to perform is reduced: the paragraph will have fewer cases than the sentences, and the message will have even fewer.
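The relationship between unit size and number of coding cases can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that paragraphs are separated by a blank line and that sentences end in a full stop, question mark or exclamation mark followed by whitespace; as noted above, real transcripts may follow neither convention. The function name `count_units` and the sample messages are hypothetical and are not drawn from any of the studies cited.

```python
import re

def count_units(messages):
    """Count coding cases at three syntactic granularities.

    Assumption: paragraphs are delimited by a blank line, and sentences end
    in '.', '!' or '?' followed by whitespace.
    """
    paragraphs = [
        p.strip()
        for m in messages
        for p in re.split(r"\n\s*\n", m)
        if p.strip()
    ]
    sentences = [
        s
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s
    ]
    return {
        "message": len(messages),
        "paragraph": len(paragraphs),
        "sentence": len(sentences),
    }

# Hypothetical two-message transcript, invented for this example.
sample = [
    "I agree with the problem. Parents are not involved.\n\nSchools could send a newsletter.",
    "That might help. But who writes it? Teachers are busy.",
]
counts = count_units(sample)
print(counts)
```

Even in this tiny sample the number of cases falls at each step from sentence to paragraph to message, which is the trade-off between feasibility and granularity described above.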
However, as the number of cases becomes smaller when we move from the sentence to the whole message, so too does "the likelihood that the unit will encompass multiple variables. Or conversely, that one variable will span multiple paragraphs" (Rourke et al, 2001). By increasing the size of the unit of analysis, we risk reducing the capability of the unit to discriminate between different behaviours related to the construct being observed. Rourke et al (2001) note that fixed units such as syntactic units "do not always properly encompass the construct under investigation".
Some researchers have used a coding unit resembling the thematic unit proposed by Henri, without labelling it as one. For example, Veerman, Andriessen, and Kanselaar (1999) divided their discussion transcripts into fragments "in which thematic information had been expressed in relation to the task goal". Blake and Rapanotti (2001) chose as a unit of analysis "an aggregation of statements within the message body that could be recognized as a meaningful whole". Kim and Bonk (2002) report identifying and assigning codes to those 'utterances,' or segments within a sentence, that reflect collaboration between students. In all these cases, the unit of analysis was intended to contain a complete idea or a theme, a characteristic they share with Henri's unit of meaning.
Despite the apparent popularity of the 'thematic unit' in content analysis of online discussions, some researchers have associated a number of constraints or limitations with this unit (e.g., Aviv, 2001; McKenzie & Murphy, 2000). Howell-Richardson and Mellar (1996) describe the thematic unit as "ill-defined" which leads to inconsistency in identifying the unit (p. 51). Fahy (2001) considers the process of identifying units of meaning within discussion transcripts to be a "perilous, even impossible, task". That difficulty results from the fact that units of meaning are "not discrete or identifiable on the basis of consistent criteria" (Fahy, 2002). Also Aviv (2001), who argues in favour of semantic units, admits that it is difficult to discriminate them reliably. The lack of specific criteria for identifying semantic units has a direct impact on reliability, or more specifically reproducibility of a content analysis study where a unit of meaning was chosen for analysis. Howell-Richardson and Mellar (1996) support this argument by saying that "since Henri's explanation of the unit is not grounded in any particular theoretical framework ... it is difficult to see how the method could be reliably used by other researchers" (p. 51). Therefore, contrary to the syntactic unit of analysis, semantic units "properly delimit the construct, but invite subjective and unreliable identification of the unit" (Rourke et al, 2001).
To code the data, we chose an instrument designed by Murphy (2004). While many content analysts choose to develop their own coding schemes (see De Wever, Schellens, Valcke, & Van Keer, 2006 for a review of different content analysis schemes), some researchers strongly recommend using an instrument previously developed and tested (e.g., Gall, Borg, & Gall, 1996; Stacey & Gerbic, 2003; Rourke & Anderson, 2004). Rourke and Anderson argue that, instead of undertaking "an elaborate process of instrument development," previously developed instruments be used (p. 14). By using such instruments researchers "contribute to the accumulating validity of an existing procedure; are able to compare their results with a growing catalogue of normative data; and leapfrog over the instrument construction process" (pp. 14-15).
| P | Total words | Shortest message in words | Longest message in words | Average length of message | Total # of paras | Shortest para in words | Longest para in words | Average length of para |
Murphy's (2004) instrument was designed to support observation and identification of behaviours related to Problem Formulation and Resolution (PFR) in Online Asynchronous Discussions (OADs). The instrument, which represented a second iteration, had undergone testing aimed to identify instances of construct under-representation, construct irrelevance, and lack of discriminant capability. The OAD analysed here was designed to engage students in defining and solving a problem, and Murphy's instrument was the only instrument we could identify developed specifically to measure these behaviours.
The instrument consists of two categories: Problem Formulation (code F) and Problem Resolution (code R), which are further divided into five processes. Each of the processes consists of a number of specific indicators of behaviour related to formulating and resolving a problem. The total number of those indicators across all five processes is 19. The first category of Problem Formulation includes two processes: Defining Problem Space (code FD), which includes seven indicators, and Building Knowledge (FB), with four indicators of behaviour. The second category of Problem Resolution is divided into three processes: Identifying Solutions (RI), with two indicators, Evaluating Solutions (RE), with four indicators, and Acting on Solutions (RA), with two indicators of behaviour. Codes representing all processes along with specific behaviours associated with each process are presented in Table 2 below. Coding involved analysing each unit of analysis and associating it with a category, process, and specific type of behaviour. The Cohen's kappa coefficient of interrater reliability for phase 2 was .724. The kappa coefficient could not be calculated for coding the semantic units in phase 1, since the number and the boundaries of those units were different for each of the coders (for a more in depth discussion of interrater reliability in content analysis of online discussions see Murphy & Ciszewska-Carr, 2005).
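The reason kappa could be calculated for the paragraph coding but not for the semantic units can be seen from how the statistic is computed. The following Python sketch implements the standard Cohen's kappa formula, (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohens_kappa` and the six example codes are hypothetical and are not data from this study. The calculation presupposes that both coders assigned a code to exactly the same set of units.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who coded the same sequence of units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        # With semantic units each coder produces a different set of units,
        # so there is no unit-by-unit pairing on which to compute agreement.
        raise ValueError("both coders must code the same, non-empty set of units")
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical process codes for six paragraphs, invented for illustration.
coder_1 = ["FD", "FB", "RI", "RI", "FB", "RE"]
coder_2 = ["FD", "FB", "RI", "RE", "FB", "RE"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

When the two lists differ in length, as they would for independently chosen semantic units, the function raises an error rather than returning a value, which mirrors the situation in phase 1.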
| FD | Defining problem space |
| RA | Acting on solutions |
Coding was conducted by two graduate research assistants (Coder I and II) with no prior coding experience. Before the coding began, each coder met individually with the principal investigator for a session designed to introduce them to coding with the instrument and to answer any questions about the process. First, the principal investigator, who was also a creator of the instrument, explained the instrument to the coders. She then modelled use of the instrument by coding a sample message from a transcript not used in the study. Each coder then coded other messages along with the principal investigator. To ensure consistent interpretation of the instrument, coders were instructed to code one level at a time. This means that the first coding decision involved determining whether the unit corresponded to Problem Formulation (F) or Resolution (R). Depending on this first decision, the second decision involved determining which of the processes associated with either F or R was reflected in the unit, i.e. if the unit corresponded to F, did it involve defining the problem (FD) or building knowledge (FB)?
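The level-by-level coding decision can be sketched as a small Python data structure. The categories, process codes and indicator counts below are those given in the description of the instrument. The name `PFR_SCHEME`, the function `code_unit`, and the numbering of indicators from 1 upwards are assumptions made for this illustration only and are not part of Murphy's (2004) instrument.

```python
# Category -> process code -> (process name, number of indicators).
PFR_SCHEME = {
    "F": {
        "FD": ("Defining Problem Space", 7),
        "FB": ("Building Knowledge", 4),
    },
    "R": {
        "RI": ("Identifying Solutions", 2),
        "RE": ("Evaluating Solutions", 4),
        "RA": ("Acting on Solutions", 2),
    },
}

TOTAL_INDICATORS = sum(
    n for processes in PFR_SCHEME.values() for _, n in processes.values()
)

def code_unit(category, process, indicator):
    """Validate a coding decision one level at a time: category, process, indicator."""
    if category not in PFR_SCHEME:
        raise ValueError(f"unknown category: {category}")
    processes = PFR_SCHEME[category]
    if process not in processes:
        raise ValueError(f"process {process} does not belong to category {category}")
    _, n_indicators = processes[process]
    # Assumption: indicators are numbered 1..n within each process.
    if not 1 <= indicator <= n_indicators:
        raise ValueError(f"process {process} has only {n_indicators} indicators")
    return (category, process, indicator)

decision = code_unit("F", "FD", 3)
print(decision, TOTAL_INDICATORS)
```

Because each level constrains the next, a unit first judged to be Problem Resolution (R) cannot subsequently be assigned a Problem Formulation process such as FD.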
In phase 1, the two coders were instructed to work independently in order to select the units of meaning. This stage required coders to rely on their own interpretation and judgment to decide on the unit. The purpose of this independent identification of semantic units within the transcripts was to identify potential differences between approaches adopted and results obtained by both coders, and how those differences may relate to the issue of reliability. After coders identified the units of meaning, they independently coded these units in each participant's transcript. In phase 2 of the study, each coder coded the data set a second time using the syntactic unit of a paragraph. Having read all transcripts, the coders realised that participants tended to confine one main idea to each paragraph. Therefore, this unit was chosen over the sentence. This clear organisation into paragraphs by the participants thus facilitated the coding process.
Figure 1 provides an example of a complete message posted by one of the participants. The message consists of two syntactic units, i.e. two paragraphs, divided by the participant, and indicated in the figure by brackets on the right side. Brackets and braces on the left side of the figure indicate the different choices of semantic units by the two coders. In the two-paragraph message, Coder I (CI) identified six units of meaning whereas Coder II (CII) identified two. The two units of meaning identified by Coder II did not, however, correspond to the syntactic units. The underlined portions of the text in Figure 1 delineate the units of meaning identified by Coder I.
| Participant | Semantic units: Coder I | Semantic units: Coder II | Syntactic units: Coder I | Syntactic units: Coder II |
Tables 4 and 5 present a comparison of results achieved by each coder with a semantic versus a syntactic unit of analysis. For both types of units, values presented in the tables are aggregate measures of coding decisions across all 10 participants. The tables contrast results for each of the five processes listed vertically in the table (e.g., FD = Defining Problem Space). The total number of units (#) coded for each process is detailed. In addition, each number was calculated as a percentage (%) of the total number of units coded. Coder I, for example, identified and coded a total of 393 semantic units. Ninety-three of those 393 units were coded as FD, which constitutes 23.7% of all semantic units.
Results of coding using syntactic units are presented in similar format in the same table. For example, of the 355 syntactic units coded by Coder I, 76 were coded as FD, which constitutes 21.4% of all 355 units. The final column shows the percentage difference for each of the five processes between coding with semantic units and coding with syntactic units. For example, of all 393 semantic units coded by Coder I, 25.7% were coded as FB, and of all 355 syntactic units coded by that coder, 31.6% were coded as FB. Thus, the difference in the results for this particular process is 5.9%. Coder II's results are presented in the same format in Table 5.
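The arithmetic behind these percentages can be reproduced with a short Python sketch. The counts and totals are the ones quoted above for Coder I. The helper name `share` is hypothetical, and rounding to one decimal place is assumed from the way the values are reported.

```python
def share(count, total):
    """Percentage of all coded units assigned to one process, to one decimal place."""
    return round(100 * count / total, 1)

# Coder I, Defining Problem Space (FD): counts quoted in the text.
semantic_fd = share(93, 393)    # semantic units coded FD out of 393
syntactic_fd = share(76, 355)   # syntactic units coded FD out of 355

# Coder I, Building Knowledge (FB): percentages quoted in the text.
fb_difference = round(abs(31.6 - 25.7), 1)

print(semantic_fd, syntactic_fd, fb_difference)
```

Note that the difference column is a difference in percentage points between the two shares, not a relative change from one to the other.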
Figure 1: Choices of semantic versus syntactic units by Coder I (CI) and Coder II (CII)
Tables 4 and 5 present aggregate results of coding with semantic and syntactic units for each of the five processes across all 10 participants. The difference presented in the last column is thus an average for all participants. However, aggregate measures may mask a greater or lesser variety in results that may occur at the level of the individual participant. For this reason, in order to observe the actual range of difference that occurred between coding semantic and syntactic units, we need to see individual and not just aggregate group values. We have arbitrarily chosen two participants to illustrate how individual results might have been higher or lower than the mean aggregate results. Table 6 presents percentage differences for participant D for each process coded by Coder I and Coder II. For this participant, the table gives results of coding with semantic and coding with syntactic units and the percentage difference between the two values. Table 7 presents analogous results for participant I.
| Process | Semantic unit (#) | Semantic unit (% of 393) | Syntactic unit (#) | Syntactic unit (% of 355) | Difference (%) |
| Process | Semantic unit (#) | Semantic unit (% of 457) | Syntactic unit (#) | Syntactic unit (% of 355) | Difference (%) |
Tables 8 and 9 provide a summary of differences between results of coding with semantic versus syntactic units. Results are presented for each of the ten participants. The summary shows the range of differences in more detail and allows it to be contrasted with the aggregate values. Table 8 presents differences for each individual participant, identified alphabetically from A to J, for each process coded by Coder I. Table 9 presents percentage differences for each individual participant for each process coded by Coder II.
| A % | B % | C % | D % | E % | F % | G % | H % | I % | J % |
| A % | B % | C % | D % | E % | F % | G % | H % | I % | J % |
In terms of reliability, Table 3 illustrated how coding with the semantic unit resulted in a low and inconsistent level of agreement in the choice of unit itself. The total number of semantic units identified by Coder I was 393 whereas the number identified by Coder II was 457. In addition to the different number of total units identified by each coder, the low and inconsistent level of agreement was also evident in individual participants' transcripts where the number of semantic units identified by both coders may have been the same. Such is the case with participant E where both Coder I and Coder II each identified 48 semantic units. These, however, were not always the same units. In fact, of the 48 units coded by each coder in participant E's transcript, only 18 were the same for both coders in terms of their boundaries.
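The distinction between agreeing on the number of units and agreeing on the units themselves can be made concrete with a small Python sketch. It assumes, for illustration only, that each unit is recorded as a pair of start and end character offsets in the transcript, and it treats a unit as shared only when both boundaries coincide. The function name `shared_units` and the example offsets are hypothetical. The final calculation uses the figures reported above for participant E.

```python
def shared_units(units_a, units_b):
    """Return the units whose start and end boundaries are identical for both coders."""
    return set(units_a) & set(units_b)

# Hypothetical segmentations of one 200-character message, invented for illustration.
coder_1_units = [(0, 40), (40, 95), (95, 130), (130, 200)]
coder_2_units = [(0, 40), (40, 130), (130, 200)]

common = shared_units(coder_1_units, coder_2_units)
agreement = len(common) / max(len(coder_1_units), len(coder_2_units))
print(sorted(common), agreement)

# Participant E: each coder identified 48 units, of which 18 had the same boundaries.
participant_e_agreement = 18 / 48
print(participant_e_agreement)
```

For participant E, identical totals of 48 units therefore conceal boundary agreement on only 37.5% of the units.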
The differences in interpretation of meaning which resulted in inconsistency in the choice of the semantic unit were illustrated in Figure 1. In that example, while Coder I identified six units, Coder II identified two. The different number of units in that message resulted from different approaches to identifying complete ideas or themes in the text. Coder I looked for meaning in relation to a context of Problem Formulation and Resolution (PFR). Thus, a unit of meaning corresponded to one of the 19 behaviours outlined in the PFR instrument. She adopted a one stage approach to coding which involved identifying each unit as she coded. For example, in Figure 1, she identified six behaviours corresponding to six units of meaning, such as specifying ways in which the problem manifests itself (1 CI), agreeing with the problem as presented (2 CI), or agreeing with solutions proposed by others (3 CI).
Coder II, on the other hand, adopted a different approach to identifying units of meaning. For her, meaning referred to themes or key ideas in the participants' messages. Figure 1 shows how she identified two themes, or key ideas, and therefore two units. The first unit grouped all text referring to the messages parents send to children regarding involvement in schools. The second unit grouped all text referring to the messages schools should send to the parents regarding their involvement in their children's education. Unlike Coder I, who identified units as she coded, Coder II used a two stage approach. She first divided all 80 messages into thematic units and subsequently coded each unit according to one of the 19 behaviours in the PFR instrument. The different approaches adopted by the coders highlight the types of inconsistencies that might result from choosing meaning as the unit of analysis. The differences in the coders' approaches and the results each of them obtained using these approaches reveal how, in this case, the unit of meaning resulted in inconsistency in the choice of unit.
While consistency and reliability are important issues in the choice of unit, so too is the capability of the unit to discriminate between behaviours. Tables 4 and 5 show the profiles of engagement of all 10 participants in all five types of behaviour for both semantic and syntactic units of analysis. The percentage differences between results obtained with the semantic versus syntactic unit by each coder are presented in the last column in each of the tables. In general, we can observe that the profiles of engagement in PFR using the semantic unit are similar to those with the syntactic unit when we present them as aggregate measures (the percentage differences range from 0.7% to 5.9%). This means that the discriminant capability of the semantic unit was very similar to the discriminant capability of the syntactic unit. Consequently, the profiles of the group's engagement in PFR that emerged were quite similar regardless of the choice of unit. According to the aggregate results obtained with the semantic unit by Coder I (Table 4), we can conclude that the participants engaged mostly in identifying solutions (31.5%), and less in building knowledge (25.7%) and defining the problem (23.7%). According to the results obtained with the syntactic unit, they engaged only slightly more in building knowledge (31.6%) than they did in identifying solutions (30.1%). According to results obtained by Coder II, participants engaged more in building knowledge (32.8% with semantic units and 29% with syntactic units) than in any other process. Thus, the differences in results at the level of the group were relatively small.
When we consider the profiles of engagement in PFR not for the group as a whole but for individual participants, we can observe that the two types of units discriminated differently between behaviours. For example, the results obtained by Coder I for participant D (Table 6) using the semantic unit indicate that the participant engaged more in identifying solutions than in any other behaviour (RI: 33.3%). The behaviour in which the participant engaged least was acting on solutions (RA: 0%). Considering the results using the syntactic unit, however, participant D engaged more in building knowledge (FB: 35%) than in other types of behaviour, while acting on solutions was still the behaviour s/he engaged in least (RA: 2.5%). Even more interestingly, Coder II's results for participant I (Table 7) with the semantic unit of analysis suggest that the discussant engaged in the process of building knowledge almost half the time in his/her transcript (FB: 43.1%), whereas according to results of coding with the syntactic unit the discussant manifested equal engagement in each of the five types of behaviour, including knowledge building (FB: 21.9%).
Tables 8 and 9 summarise percentage differences between results with the two units for each of the 10 participants for each of the five processes. The results indicate that the discriminant capability of the two units varied from one participant to another. For example, Coder II's results presented in Table 9 show that participant A's and participant E's results were essentially the same regardless of which unit of analysis was used, which indicates that the discriminant capability was very similar for the syntactic and semantic unit. In contrast, larger differences in results, such as those for participant J, indicate that the discriminant capability of the units differed. The different profiles of some participants' engagement in PFR show that, at the level of individual participants, the two types of units discriminated differently between behaviours.
Another important issue related to the choice of unit is that of feasibility, which refers to the number of coding cases to be dealt with in relation to resources available for coding. Use of the paragraph as the unit of analysis proved feasible in this context of coding. The number of cases using this unit was 355. If we had chosen the message to be the unit of analysis, the feasibility would have been even greater, since the number of coding cases would have been only 80. That choice, however, may have affected the issue of discriminant capability. The 355 cases of syntactic units compared to the number of cases using the semantic unit (393 for Coder I, and 457 for Coder II) do not present a notable difference. In terms of time and resources needed for coding the semantic versus the syntactic units, the difference was very small, at least for Coder I. However, choosing the units was less feasible than coding them. For example, the approach adopted by Coder II in the choice of semantic unit was more time consuming compared to the choice of syntactic unit. In fact, because Coder II needed to identify each thematic unit prior to coding, she actually analysed the transcripts twice: first to choose the thematic unit and then to code the unit according to the instrument. This approach resulted in the analysis being more time consuming than if she had used the syntactic unit. When Coder II used the paragraph as the unit, she did not need to analyse the data twice. Thus, for Coder II, the syntactic unit was more feasible in terms of the time and effort required to conduct coding. For Coder I, the feasibility of using the semantic versus the syntactic unit was higher than for Coder II. Coder I analysed the data only once. Her unit of meaning corresponded to a unit of behaviour as outlined in the instrument. She dealt with 355 cases when coding syntactic units, and 393 cases when coding semantic units. 
In the case of this study, therefore, choice of the syntactic unit presented advantages in terms of feasibility, since choice of the semantic unit involved two stages of coding.
The final issue we address in relation to the choice of unit is that of identifiability. Howell-Richardson and Mellar (1996) as well as Rourke et al. (2001) discuss this issue arguing that discourse conventions of computer mediated conferencing make identification of syntactic units a problem. While we can generally agree with this claim, in the case considered in this study, the conventions adopted by participants actually made the syntactic unit of a paragraph highly identifiable. The 80 messages were clearly divided up into paragraphs with each of the participants organising their eight messages into an average of 35 paragraphs. The participants' choice of discourse conventions may have been influenced by the fact that they were participating in a formal, structured context of computer conferencing whereby they responded to specific prompts or tasks. The discussion was also formal in that it took place in the context of a university course. If participants had not used such conventions and if paragraphs had not been easily identifiable for all participants, we may have been required to use a whole message as a syntactic unit. As an alternative, we could have used the sentence, except that this choice may have raised the issue of feasibility in terms of increasing number of cases.
In relation to reliability, results highlighted the implications of choosing the semantic unit over the syntactic unit. Following the practice of content analysts who have worked with the semantic unit, the inconsistency in the choice of unit could have been overcome in this study if, prior to coding, coders had worked together to agree on the approach they were going to adopt and on the boundaries of the semantic units to be coded. As the results of this study indicate, when using the semantic unit, arriving at a consensus regarding the unit prior to beginning coding is, indeed, necessary. While a prior consensus may ensure reliability within one research context, it does not extend to other contexts of coding, i.e. does not ensure reproducibility when the same set of data, or the same set of transcripts, is coded by a different group of coders in a different setting.
The need for prior consensus highlights the issue of feasibility. The results of this study indicate that choice of the semantic unit using two coders would require two stages of coding in order to ensure a reliable choice of unit. In contexts where the provision of resources makes it feasible, this two stage approach should be adopted in order to promote reliability in the choice of semantic unit. In terms of the syntactic unit, reliability will not be an issue if the unit is easily identifiable because discussion participants have adopted clear and consistent conventions that allow coders to isolate paragraphs or sentences. If this is the case, the syntactic unit will represent a highly reliable choice.
The syntactic unit will also be a more feasible choice in many cases of coding, since it will not require two stages of coding. In the context of this study, the paragraph represented a feasible choice of unit and it was easily identifiable. In other contexts, participants could be required in advance to ensure that their messages follow a specified convention. For example, they could be instructed to divide their messages into main ideas which are grouped into paragraphs. This approach might also satisfy the criterion of discriminant capability. If participants were instructed to present their main ideas in paragraph format, the choice of a paragraph as a unit of analysis would support high discriminant capability.
The results of this study illustrated that choice of unit has implications in terms of reliability, discriminant capability, feasibility and identifiability. The results suggest that the context of the discussion and of coding will also affect the choice. Issues of reliability will depend on whether the coders have first worked together to decide on the unit. If so, then the semantic unit will be as reliable as the syntactic one. Feasibility will depend on the number of discussion participants and the length of their messages. The choice of unit - meaning, sentence, paragraph, or message - will have to be made in relation to the resources available for coding and the number of units to be coded. Discriminant capability will also be affected by context. Whole messages could potentially yield high discriminant capability in contexts where participants are instructed to include one main idea per message. Similarly, identifiability will depend on the context of the discussion and on whether participants have been instructed to follow a particular convention. In some contexts it may not be possible to impose such a convention, in which case the syntactic unit may not be a viable choice. Ultimately, the choice of unit needs to be made carefully, taking into consideration the implications for the particular context of the discussion and the issues of reliability, discriminant capability, feasibility, and identifiability.
Aviv, R. (2001). Educational performance of ALN via content analysis. Journal of Asynchronous Learning Networks, 4(2), 53-72. http://www.aln.org/publications/jaln/v4n2/pdf/v4n2_aviv.pdf
Aviv, R., Erlich, Z., Ravid, G. & Geva, A. (2003). Network analysis of knowledge construction in asynchronous learning networks. Journal of Asynchronous Learning Networks, 7(3), 1-23. http://www.aln.org/publications/jaln/v7n3/pdf/v7n3_aviv.pdf
Blake, C. T. & Rapanotti, L. (2001). Mapping interactions in a computer conferencing environment. In P. Dillenbourg, A. Eurelings & K. Hakkarainen (Eds), Proceedings of the European perspectives on computer supported collaborative learning conference, Euro-CSCL 2001. University of Maastricht. [verified 1 Dec 2005] http://mcs.open.ac.uk/lr38/Formal/Publications/euro-cscl2001.pdf
Bullen, M. (1998). Participation and critical thinking in online university distance education. Journal of Distance Education, 13(2), 1-32.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
De Wever, B., Schellens, T., Valcke, M. & Van Keer, H. (2006). Content analysis schemes to analyze transcripts of online asynchronous discussion groups: A review. Computers & Education, 46(1), 6-28.
Fahy, P. J. (2001). Addressing some common problems in transcript analysis. IRRODL Research Notes, 1(2). [verified 1 Dec 2005] http://www.irrodl.org/content/v1.2/research.html#Fahy
Fahy, P. J. (2002). Epistolary and expository interaction patterns in a computer conference transcript. Journal of Distance Education, 17(1). [verified 1 Dec 2005] http://cade.athabascau.ca/vol17.1/fahy.html
Fahy, P. J., Crawford, G., Ally, M., Cookson, P., Keller, V. & Prosser, F. (2000). The development and testing of a tool for analysis of computer mediated conferencing transcripts. Alberta Journal of Educational Research, 46(1), 85-88.
Gall, M., Borg, W. & Gall, J. (1996). Educational research: An introduction (6th ed.). White Plains, NY: Longman.
Gunawardena, C., Lowe, C. A. & Anderson, T. (1997). Analysis of a global online debate and the development of an interaction analysis model for examining social construction of knowledge in computer conferencing. Journal of Educational Computing Research, 17(4), 397-431.
Hara, N., Bonk, C. J. & Angeli, C. (2000). Content analyses of on-line discussion in an applied educational psychology course. Instructional Science, 28(2), 115-152. [viewed 17 Mar 2004] http://crlt.indiana.edu/publications/journals/techreport.pdf
Henri, F. (1992). Computer conferencing and content analysis. In A. R. Kaye (Ed), Collaborative learning through computer conferencing (pp. 117-136). Berlin: Springer Verlag.
Hillman, D. C. A. (1996). Improved coding and data management for discourse analysis: A case study in face-to-face and computer-mediated classroom interaction. Doctoral dissertation, University of Cambridge, Cambridge, UK. [viewed 17 Mar 2004, verified 1 Dec 2005] http://www.quahog.org/thesis/
Howell-Richardson, C. & Mellar, H. (1996). A methodology for the analysis of patterns of interactions of participation within computer mediated communication courses. Instructional Science, 24, 47-69.
Jeong, A. C. (2003). The sequential analysis of group interaction and critical thinking in online threaded discussions. The American Journal of Distance Education, 17(1), 25-43.
Jonassen, D. & Kwon, H. (2001). Communication patterns in computer mediated versus face-to-face group problem solving. Educational Technology Research and Development, 49(1), 35-51.
Khine, M. S., Yeap, L. L. & Lok, A. T. C. (2003). The quality of message ideas, thinking and interaction in an asynchronous CMC environment. Educational Media International, 40(1-2), 115-126.
Kim, K. & Bonk, C. J. (2002). Cross-cultural comparisons of online collaboration. Journal of Computer-Mediated Communication, 8(1). [verified 1 Dec 2005] http://jcmc.indiana.edu/vol8/issue1/kimandbonk.html
Levin, J. A., Kim, H. & Riel, M. M. (1990). Analyzing instructional interactions on electronic message networks. In L. Harasim (Ed), Online education: Perspectives on a new environment (pp. 185-214). New York: Praeger Publishers.
Marcelo, C., Torres, J. & Perera, V. (2002). Analyzing the asynchronous online communication: The development of a qualitative instrument. Paper presented at the European Distance Education Network Meeting in Granada, June, 2002. [viewed 8 Mar 2005] http://prometeo.us.es/idea/mie/pub/marcelo/eden2002.pdf
McDonald, J. (1998). Interpersonal group dynamics and development in computer conferencing: The rest of the story. In Proceedings of the 14th conference on distance teaching and learning. Madison, WI: Continuing and Vocational Education, University of Wisconsin-Madison.
McKenzie, W. & Murphy, D. (2000). "I hope this goes somewhere": Evaluation of an online discussion group. Australian Journal of Educational Technology, 16(3), 239-257. http://www.ascilite.org.au/ajet/ajet16/mckenzie.html
Murphy, E. (2004). Promoting construct validity in instruments for the analysis of transcripts of online asynchronous discussions. Educational Media International, 41(4), 346-354.
Murphy, E. & Ciszewska-Carr, J. (2005). Identifying sources of difference in reliability in content analysis of online asynchronous discussions. International Review of Research in Open and Distance Learning, 6(2). http://www.irrodl.org/content/v6.2/murphy.html
Naidu, S. & Järvelä, S. (2006). Analyzing CMC content for what? Computers & Education, 46(1), 96-103.
Newman, D. R., Webb, B. & Cochrane, C. (1995). A content analysis method to measure critical thinking in face-to-face and computer supported group learning. Interpersonal Computing and Technology Journal, 3(5), 56-77. [viewed 25 Mar 2002, verified 1 Dec 2005] http://www.qub.ac.uk/mgt/papers/methods/contpap.html
Oriogun, P. K. (2003). Towards understanding online learning levels of engagement using the SQUAD approach to CMC discourse. Australian Journal of Educational Technology, 19(3), 371-387. http://www.ascilite.org.au/ajet/ajet19/oriogun.html
Pena-Shaff, J. B. & Nicholls, C. (2004). Analyzing student interactions and meaning construction in computer bulletin board discussions. Computers & Education, 42, 243-265.
Poscente, K. R. & Fahy, P. J. (2003). Investigating triggers in CMC text transcripts. International Review of Research in Open and Distance Learning, 4(2). [viewed 8 Mar 2005] http://www.irrodl.org/content/v4.2/poscente_fahy.html
Rourke, L., Anderson, T., Garrison, D. R. & Archer, W. (2001). Methodological issues in the content analysis of computer conference transcripts. International Journal of Artificial Intelligence in Education, 12(1), 8-22. [viewed 25 Oct 2004] http://communitiesofinquiry.com/documents/MethPaperFinal.pdf
Rourke, L. & Anderson, T. (2004). Validity in quantitative content analysis. Educational Technology Research and Development, 52(1), 5-18.
Stacey, E. & Gerbic, P. (2003). Investigating the impact of computer conferencing: Content analysis as a manageable research tool. In G. Crisp, D. Thiele, I. Scholten, S. Barker & J. Baron (Eds), Interact, integrate, impact: Proceedings of the 20th ASCILITE Conference. Adelaide, 7-10 December 2003. http://www.ascilite.org.au/conferences/adelaide03/docs/pdf/495.pdf
Turcotte, S. & Laferrière, T. (2004). Integration of an online discussion forum in a campus-based undergraduate biology class. Canadian Journal of Learning and Technology, 30(2). [verified 1 Dec 2005] http://www.cjlt.ca/content/vol30.2/cjlt30-2_art-4.htm
Veerman, A., Andriessen, J. & Kanselaar, G. (1999). Collaborative learning through computer mediated argumentation. [viewed 3 Mar 2004] http://edu.fss.uu.nl/medewerkers/gk/files/Stanford_CSCL99.PDF
Authors: Elizabeth Murphy, PhD, Associate Professor of Educational Technology and Second-Language Learning, Faculty of Education, Memorial University of Newfoundland, St. John's, NL Canada A1B 3X8 Email: email@example.com
Justyna Ciszewska-Carr, Research Assistant, Faculty of Education, Memorial University of Newfoundland, St. John's, NL Canada A1B 3X8 Email: firstname.lastname@example.org
Please cite as: Murphy, E. and Ciszewska-Carr, J. (2005). Contrasting syntactic and semantic units in the analysis of online discussions. Australasian Journal of Educational Technology, 21(4), 546-566. http://www.ascilite.org.au/ajet/ajet21/murphy2.html