PRIOR KNOWLEDGE EFFECTS

ON SIMULTANEOUS LEARNING OF MORE THAN TWO CATEGORIES

 

 

CÎMPIAN ERIKA ILDIKÓ

researcher,

Institute for Research in Social and Human Sciences,

Cluj-Napoca, Romania

 

 

 

 

 

ABSTRACT. The purpose of the present experiment is to investigate the influence of prior knowledge during the simultaneous learning of more than two new categories. In everyday life, we learn categories composed both of features that we can associate with other, previously learned entities and of features that are isolated (rote features). In the experiment, subjects learned either two or three categories and then performed a single-feature classification task. The results suggest that prior knowledge helps to integrate features, even though the increased number of categories significantly reduces the speed of learning.

Keywords: category learning, prior knowledge.

 

 

 

 

 

Categories provide a solid basis for a wide range of processes. We use them in problem solving, prediction, inference, induction, and so on (Ross, 2000; Spalding & Ross, 2000; Murphy & Ross, 2000; Yamauchi & Markman, 2000a; Yamauchi & Markman, 2000b; Smith & Minda, 2000; Kaplan & Murphy, 2000). Moreover, in order to carry out our constantly changing tasks efficiently, we continually have to learn new categories (including new categorizations of already known entities). Although at first glance the term “new categories” may suggest that learning involves only entirely new information, in reality we rely heavily on our prior knowledge. By prior knowledge we mean knowledge about past categories, general knowledge about the world, and knowledge connected to specific domains (Murphy & Medin, 1985; Wattenmaker, 1999; Kaplan & Murphy, 1999). When we learn new categories, we are influenced both by specific and general knowledge about categories with similar structure or content and by the currently observable exemplars of the new categories. Such prior knowledge, even in minimal amounts, aids learning, at least of the features directly connected with it. This small amount of knowledge is usually related to only a few features of the new objects and facilitates their learning, but there is no clear evidence that it interferes with the learning of other, unrelated features, called rote features. Following Kaplan & Murphy (1999, 2000), we use the term rote features (or rote properties) to refer to features that are not directly connected to prior knowledge and are therefore learned “by rote” (unless they are learned in some other way).

There are many possible ways in which prior knowledge can interact with the learning of new information. Prior knowledge could be triggered by the explicit use of labels of past categories, or by some other cue that activates previously formed representations. Once background knowledge is activated, information from it might be “copied” into the new concept, or it might serve to connect current features with formerly learned ones and knowledge-related features with rote features. Heit (1994, 1998; Heit & Bott, 2000) presented a similar proposal, called the integration model of categorization, which concerns the learning of whole exemplars of new categories. The fundamental claim of the integration model is that when a person is confronted with a new item, this item is compared both to actual observations and to past exemplars from previously learned, similar categories. However, regardless of the unit of analysis (features or exemplars), some crucial questions remain. How can a person use background knowledge to make category learning easier? How does this person know which knowledge will be useful, and when? And in which direction do prior knowledge structures with different contents influence learning?

Most everyday categories have a mixture of knowledge-related features and rote features. If we access some relevant background knowledge, the features related to it are learned quite efficiently. As for the rote features, there are several ways to learn them. We may associate them directly with the category’s label, learning them “by rote”; we may associate them with the knowledge-related features, finding an explanation for this association; or we may relate them to prior knowledge triggered by the context. As Kaplan & Murphy (2000) summarized, previous studies suggest four possible ways in which prior knowledge may influence category learning. One possibility is that knowledge-related features elicit previously learned knowledge structures, which in turn direct attention toward these features and inhibit learning of the rote features (Murphy & Medin, 1985; Wisniewski, 1995). Alternatively, knowledge may actually aid learning of the rote features: since knowledge-related features are learned quickly at the beginning of the task, using them to complete the categorization task leaves more processing capacity for learning the rote features. A third possibility is that subjects do not use all the features or dimensions but rely on only a few of them. Thanks to the activated prior knowledge, they discover an explanation for a few features related to this knowledge, which helps them learn to categorize the items (Murphy & Allopenna, 1994). In this case, the competition between the features “ends before it begins”. Finally, subjects may try to find some connection (not necessarily a systematic one) between different types of features, even though those features appeared together arbitrarily.

Most experiments in category learning have used only a single pair of categories from the same domain at a time. In addition, subjects had to deal only with task-relevant information, since no redundant or irrelevant information was introduced in the category descriptions. In everyday life, however, we have to deal with more than two categories simultaneously. When confronted with only two categories, our task is somewhat easier because of the very limited number and exclusive nature of the classification possibilities. Subjects have to choose between only two possibilities (“the third possibility is excluded”), so classifying a feature as belonging to one of the two categories does not necessarily imply that the feature has actually been associated with the correct category. It is also possible to give the right answer by learning the features specific to only one of the categories and deciding that, if the presented feature does not belong to this category, it must belong to the other one. Introducing a third category makes this strategy impossible. For this reason, in the present experiment we manipulated the number of categories to be learned. With three categories, of course, we expect learning speed to decrease because of the larger number of features, but without learning being less efficient in other respects. If, with three categories, prior knowledge facilitates the learning of the related features to the same extent as it does with two categories, then more processing capacity is freed for learning the other features, and subjects should learn three categories as well as two. On the other hand, if features were in competition, learning of knowledge-related features would be less efficient with three categories, because less attention could be directed toward each of the larger number of knowledge-related features. At the same time, more non-attended (rote) features would be inhibited, likewise leading to less efficient learning than with only two categories.

 

 

Method

 

Subjects. Twenty-two persons participated in this experiment. They were randomly assigned in equal numbers to one of two conditions (simultaneous learning of two categories or simultaneous learning of three categories).

 

Materials and design. In this experiment we used either two categories of books (knight novels and love novels) or three (knight novels, detective novels and love novels). The factorial structure of the categories is presented in Table 1. Each category had 5 exemplars, and each exemplar had 5 features defined on 5 different dimensions. These dimensions were selected beforehand based on the preferences of five people who did not participate in the actual experiment. They were asked to indicate the dimensions on which they would describe a book, in order of importance. The five most frequently selected dimensions were (1) the content of the book, (2) the color of the cover, (3) the age of the characters, (4) the size of the book, and (5) the type of the fonts used. The content of the book was mostly expressed by (a) the theme of the story, (b) the title of the book, (c) the logo of the book, (d) the outcome of the story, or (e) a key moment of the story. These five dimensions were grouped into a thematic or knowledge-related dimension (the content, with features T1-T5 in Table 1) and rote-property dimensions (D1-D4 in Table 1).

 


 

 

            GAMON books           MIRKO books           LIKER books
            (category A)          (category B)          (category C)
Dimension   T   D1  D2  D3  D4    T   D1  D2  D3  D4    T   D1  D2  D3  D4

Two-category condition
Exemplar 1  T1  1   1   1   1     -   -   -   -   -     T1  2   2   2   2
Exemplar 2  T2  2   1   1   1     -   -   -   -   -     T2  1   2   2   2
Exemplar 3  T3  1   2   1   1     -   -   -   -   -     T3  2   1   2   2
Exemplar 4  T4  1   1   2   1     -   -   -   -   -     T4  2   2   1   2
Exemplar 5  T5  1   1   1   2     -   -   -   -   -     T5  2   2   2   1

Three-category condition
Exemplar 1  T1  1   1   1   1     T1  2   2   2   2     T1  3   3   3   3
Exemplar 2  T2  2   1   1   1     T2  3   2   2   2     T2  1   3   3   3
Exemplar 3  T3  1   3   1   1     T3  2   1   2   2     T3  3   2   3   3
Exemplar 4  T4  1   1   2   1     T4  2   2   3   2     T4  3   3   1   3
Exemplar 5  T5  1   1   1   3     T5  2   2   2   1     T5  3   3   3   2

 

Table 1. Factorial structures of the categories used in the experiment. Each exemplar (1-5) has one of the five knowledge-related (thematic) features (T1-T5) and four rote features (D1-D4).

 

 

We considered content the most appropriate knowledge-related dimension because of its obvious relation to previously acquired knowledge. Features of the other four dimensions are not specific to any one book category, so they really must be learned “by rote”, or possibly through forced associations with the thematic (knowledge-related) features. For the thematic dimension, we derived a different feature for each of the five exemplars of a category. The rote features were largely the same for all five exemplars of a category, but each exemplar also carried one rote feature typical of another category, as indicated in Table 1. The only exception was the first exemplar of every category, which had all four rote features of the category it belonged to. This exemplar was the prototype of the category. A complete list of the features used in both conditions is presented in the Appendix.
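The structure just described can be checked with a short sketch. The rote codes below are copied from the three-category rows of Table 1; the function names and the dictionary layout are illustrative assumptions, not part of the original materials. The sketch counts how often each rote code occurs in each category on a given dimension.

```python
from collections import Counter

# Rote-feature codes (D1-D4) for exemplars 1-5, copied from the
# three-category rows of Table 1. Codes 1/2/3 denote the value that is
# typical of Gamon/Mirko/Liker on that dimension.
ROTE = {
    "Gamon": [(1, 1, 1, 1), (2, 1, 1, 1), (1, 3, 1, 1), (1, 1, 2, 1), (1, 1, 1, 3)],
    "Mirko": [(2, 2, 2, 2), (3, 2, 2, 2), (2, 1, 2, 2), (2, 2, 3, 2), (2, 2, 2, 1)],
    "Liker": [(3, 3, 3, 3), (1, 3, 3, 3), (3, 2, 3, 3), (3, 3, 1, 3), (3, 3, 3, 2)],
}


def value_counts(rote, dimension):
    """Count, per category, how often each code occurs on one dimension (0-3)."""
    return {
        category: Counter(exemplar[dimension] for exemplar in exemplars)
        for category, exemplars in rote.items()
    }


def most_frequent_category(rote, dimension, code):
    """Return the category in which a given rote code occurs most often."""
    counts = value_counts(rote, dimension)
    return max(counts, key=lambda category: counts[category][code])


if __name__ == "__main__":
    for d in range(4):
        print(f"D{d + 1}:", value_counts(ROTE, d))
```

Under this encoding, each rote value occurs in four of the five exemplars of its own category and in one exemplar of another category, so every rote feature remains diagnostic of a single category without being perfectly predictive.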

 

Procedure. Subjects in the two-category condition learned a single pair of categories (knight novels and love novels), while subjects in the three-category condition learned all three. The categories were given the following names: knight novels – Gamon, detective novels – Mirko, love novels – Liker. The procedure consisted of two sessions: learning and speeded single-feature classification.

In the learning session, the stimuli were presented on a computer screen. On each trial we presented a single exemplar from the two or three categories. The names of the categories were presented below every exemplar. One presentation of all 10 (two-category condition) or 15 (three-category condition) exemplars formed a learning block. The order of the exemplars within the learning blocks, as well as the order of the features of an exemplar, was randomized on each presentation. Thus, every description of an exemplar was “unique”, which prevented the order of the dimensions from serving as a cue.
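The randomization of a learning block can be sketched as follows. This is a minimal illustration, not the software used in the experiment: the function name, the placeholder stimulus strings, and the fixed seed are assumptions introduced here.

```python
import random


def make_learning_block(exemplars, rng):
    """Return one learning block: every exemplar once, in random order,
    with the order of each exemplar's features also randomized."""
    block = []
    for category, features in exemplars:
        shuffled = list(features)
        rng.shuffle(shuffled)  # feature order differs on each presentation
        block.append((category, shuffled))
    rng.shuffle(block)  # exemplar order differs in each block
    return block


# Hypothetical placeholder stimuli: 3 categories x 5 exemplars x 5 features.
EXEMPLARS = [
    (category, [f"{category}-ex{e}-f{i}" for i in range(1, 6)])
    for category in ("Gamon", "Mirko", "Liker")
    for e in range(1, 6)
]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    for category, features in make_learning_block(EXEMPLARS, rng)[:3]:
        print(category, features)
```

Shuffling a copy of each feature list leaves the stimulus definitions untouched, so the same exemplars can be reshuffled independently for every block.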

Every exemplar was presented for 10 s, with a break of 3 s between two successive exemplars. Thus, subjects had 10 s to decide to which category the item belonged. They gave their responses verbally and received feedback (correct or incorrect) from the experimenter. Feedback was therefore given only after the stimulus had disappeared from the screen. The learning session continued until the subject correctly classified all the exemplars within a single block.
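The learning criterion (one block with no errors) determines the "number of learning blocks" measure. A minimal sketch of how that measure could be derived from a response record is given below; the function name and the example record are hypothetical and are not data from the experiment.

```python
def blocks_to_criterion(block_results):
    """Return the 1-based number of the first block in which every
    response was correct, or None if the criterion was never reached."""
    for number, responses in enumerate(block_results, start=1):
        if all(responses):
            return number
    return None


if __name__ == "__main__":
    # Hypothetical response record (True = correct classification).
    record = [
        [True, False, True, True],
        [True, True, False, True],
        [True, True, True, True],
    ]
    print(blocks_to_criterion(record))
```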

In the speeded single-feature session, on each trial we presented on the screen a single feature from those previously learned. Subjects had to decide in which category the feature occurred most frequently. Subjects were told that it was important to classify the features quickly and correctly. They responded by pressing the right or left arrow, which indicated the corresponding category written above the arrow. There were two blocks of feature presentation. Each block of trials contained all 18 (two-category condition) or 27 (three-category condition) features. The features were randomized within each block. The number of learning blocks, the feature classification decisions and the classification reaction times were recorded.
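The block sizes follow from the design: each category contributes five thematic and four rote features, giving 2 x 9 = 18 or 3 x 9 = 27 trials per block. A minimal sketch of how such test blocks could be assembled is shown below; the function name and the feature labels are illustrative assumptions.

```python
import random

THEMATIC = ["T1", "T2", "T3", "T4", "T5"]  # knowledge-related features
ROTE = ["D1", "D2", "D3", "D4"]            # rote features


def make_test_blocks(categories, rng, n_blocks=2):
    """Return n_blocks shuffled lists of (feature, correct_category) trials."""
    trials = [
        (f"{category}:{feature}", category)
        for category in categories
        for feature in THEMATIC + ROTE
    ]
    blocks = []
    for _ in range(n_blocks):
        block = list(trials)
        rng.shuffle(block)  # features are randomized within each block
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    for categories in (["Gamon", "Liker"], ["Gamon", "Mirko", "Liker"]):
        blocks = make_test_blocks(categories, rng)
        print(len(categories), "categories:", len(blocks[0]), "trials per block")
```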

 

Results

 

As we expected, subjects learned to categorize the items significantly faster in the two-category condition than in the three-category condition (M=2.3 blocks, and M=6.0 blocks, respectively; t=4.52, p<0.0012). In fact, in the two-category condition 66% of the subjects completed their task after only two learning blocks.

Preliminary analyses indicated that there were no significant differences between reaction times (RTs) for correct single-feature classifications on the corresponding dimensions among the three categories (Gamon, Mirko, and Liker). Thus, assuming that the categories were equivalent, we report averaged reaction times for both knowledge-related and rote features. First, reaction times were entered in a repeated measures analysis of variance (ANOVA) with learning condition (two-category vs. three-category) as a between-subjects factor and testing block (first trial vs. second trial) as a within-subjects factor. There was an overall main effect of testing block, showing significantly shorter reaction times in the second testing trial [F(1,20)=25.56, p<0.001]. Both groups completed their tasks significantly faster on the second trial for both feature types (Table 2). Some of the effects were similar in both cases; however, we analyzed the data from the two testing blocks separately. For example, although there was no effect of learning condition over trials, when we treated feature types separately we found a slight learning condition effect for the knowledge-related features [F(1,20)=3.804, p<0.064]. In the second trial, subjects in the two-category condition performed marginally better at classifying knowledge-related features than those in the three-category condition (t=2.191, p<0.039). This tendency, however, was not detectable in the first trial. No such effects were found for the rote features.

 

 

Another repeated measures analysis of variance with learning condition (two-category vs. three-category) and feature type (knowledge-related vs. rote features) indicated a significant main effect of feature type [for the first trial: F(1,20)=9.913, p<0.005; for the second trial: F(1,20)=22.676, p<0.001]. No significant interaction between the two factors was found. In both trials, subjects were significantly faster at classifying knowledge-related features than they were at classifying rote features (see Figure 1). Nevertheless, in the first trial subjects in the two-category condition were just moderately faster at classifying knowledge-related features than they were at classifying rote features (t=2.065, p<0.063), while subjects in the three-category condition were significantly faster completing the same task (t=2.745, p<0.19).

Figure 1. Mean reaction times (in milliseconds) for knowledge-related and rote features in the two learning conditions (reaction times recorded in the first trial).

 

As regards the error proportions in single-feature classification, a repeated measures analysis of variance with learning condition (two-category vs. three-category) and feature type (knowledge-related vs. rote features) showed a main effect of feature type [for the first trial: F(1,20)=9.697, p<0.011; for the second trial: F(1,20)=11.305, p<0.007]. In both trials subjects performed better at classifying knowledge-related features than at classifying rote features (see Table 3). No significant effects of testing trial or learning condition were found.

 

 

 

Discussion and Conclusion

 

One major question addressed here was whether an increased number of categories influences category-learning performance. As predicted, subjects were strongly influenced by prior knowledge. Thanks to the knowledge-related features, they gained an advantage in learning even when they had to learn additional features. Not surprisingly, however, learning speed was reduced in this case. It also seems that the attentional focus hypothesis (Murphy & Medin, 1985) does not account for our findings. In the three-category condition, subjects learned both knowledge-related features and rote features as well as subjects in the two-category condition did. If subjects had learned knowledge-related features at the expense of the rote features, the larger number of the former should have inhibited the learning of the rote features more than it did in two-category learning. Yet there was no evidence that subjects in the three-category condition learned the rote features less efficiently than subjects in the two-category condition. For the knowledge-related features there was one exception: subjects in the two-category condition performed marginally better in the second trial of single-feature classification, but not in the first. This does not necessarily contradict our explanation, however, since on the second testing trial these subjects may have performed better because of overlearning of the knowledge-related features. It is worth mentioning that repetition of rote features did not lead to better performance compared to the three-category condition.

One reason why both knowledge-related features and rote features were learned equally well in the three-category condition is that, although subjects had some extra features to learn, they also had more opportunities to learn them than subjects in the two-category condition. An alternative explanation, suggested by the studies of Kaplan & Murphy (2000), may also account for our results. Knowledge-related features triggered prior knowledge to the same extent in both learning conditions. These knowledge structures made it easier to learn the features related to them, which in turn allowed subjects to allocate more processing capacity to learning rote features. On the other hand, in the two-category condition the very limited number and exclusive nature of the classification alternatives allowed subjects to avoid learning the features of both categories, since it was possible to classify correctly by learning only one set of features and knowing that features not in this set must belong to the other category. As we suggested earlier, this strategy does not necessarily imply that both sets of features were associated with the corresponding category. The use of this strategy is also suggested by the fact that on the first testing trial subjects in the two-category condition were only moderately faster at classifying knowledge-related features than at classifying rote features. Introducing a third category makes this strategy impossible. In the present experiment, the third category may have compelled subjects to use more features as cues, so that they made more associations between knowledge-related features and rote features than they would have made by relying only on the knowledge-related features. If this is the case, then paradoxically the additional features did not impede learning. Instead, they forced subjects to integrate the two types of features and to apply a more elaborate learning strategy, which is presumably more similar to the strategies we use in everyday category learning.

The present experiment cannot provide clear evidence for the use of such strategies; it merely suggests that we use different strategies to deal with different category learning conditions. The finding that prior knowledge aids the learning of knowledge-related features and, especially, of rote features even when the number of categories is increased leads to further questions. Does the amount or the complexity of the material to be learned influence which knowledge structures are used in learning, and in what form? Further investigation of such topics may provide ecologically more valid models of category learning.

 

References

 

Heit, E. (1994). Models of the effects of prior knowledge on category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1264-1282.

Heit, E. (1998). Influences of prior knowledge on selective weighting of category members. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 712-731.

Heit, E., & Bott, L. (2000). Knowledge selection in category learning. In D. L. Medin (Ed.), Psychology of Learning and Motivation, (Vol. 39), 163-199. San Diego: Academic Press.

Kaplan, A.S., & Murphy, G.L. (1999). The acquisition of category structure in unsupervised learning. Memory & Cognition, 27(4), 699-712.

Kaplan, A.S., & Murphy, G.L. (2000). Category learning with minimal prior knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(4), 829-846.

Murphy, G.L., & Medin, D.L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289-316.

Murphy, G.L., & Allopenna, P.D. (1994). The locus of knowledge effects in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 904-919.

Murphy, G.L., & Ross, B. (2000). Induction with cross-classified categories. Memory & Cognition, 27(6), 1024-1041.

Ross, B. (2000). The effects of category use on learned categories. Memory & Cognition, 28(1), 51-63.

Smith, J.D., & Minda, J.P. (2000). Thirty categorisation results in search of a model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(1), 3-27.

Spalding, T., & Ross, B. (2000). Concept learning and feature interpretation. Memory & Cognition, 28(3), 439-451.

Wattenmaker, W.D. (1999). The influence of prior knowledge in intentional versus incidental concept learning. Memory & Cognition, 27(4), 685-698.

Wisniewski, E.J. (1995). Prior knowledge and functionally relevant features in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 449-468.

Yamauchi, T., & Markman, A.B. (2000a). Learning categories composed of varying instances: The effect of classification, inference and structural alignment. Memory & Cognition, 28(1), 64-78.

Yamauchi, T., & Markman, A.B. (2000b). Inference using categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(3), 776-795.

 

 

Appendix

 

Features of the three categories used in the experiment. (All stimuli were in Hungarian.)

 

 

Knight novels

Thematic features (thematic dimension):

T1: the logo is an armored knight figure

T2: the title is: The Black Knight

T3: the main character is going to war

T4: at the end the main character dies in battle

T5: it (the story) happened a long time ago

Rote features (dimensions 1-4):

D1: the cover is brown

D2: the main character is about 45 years old

D3: it has the size of a sheet

D4: is printed with Normal characters

 

Detective novels

Thematic features (thematic dimension):

T1: the logo is a revolver

T2: the title is: Murder for Revenge

T3: the main character is interrogating everybody

T4: at the end the main character finds out everything

T5: it is about solving a mystery

Rote features (dimensions 1-4):

D1: the cover is green

D2: the main character is about 40 years old

D3: it has the size of an envelope

D4: is printed with Italic characters

 

Love novels

Thematic features (thematic dimension):

T1: the logo is a heart

T2: the title is: Love story

T3: the main character courts a waitress

T4: at the end the main character is getting married

T5: it is about a relationship between two people

Rote features (dimensions 1-4):

D1: the cover is blue

D2: the main character is about 35 years old

D3: it has the size of a notebook

D4: is printed with Bold characters