InterestMap: An Identity and Taste-based Recommender




Hugo Liu

MIT Media Laboratory

20 Ames St., Cambridge, MA, USA

hugo@media.mit.edu

Pattie Maes

MIT Media Laboratory

20 Ames St., Cambridge, MA, USA

pattie@media.mit.edu

ABSTRACT

Recommender systems generally construct models of people which are specific to a particular application domain. With the emergence of social network software on the World Wide Web – where self-descriptive personal profiles are represented by a set of keywords and phrases spanning a broad range of a person’s interests and activities – there is an opportunity to construct a much more domain-generic model of a person, which we suggest can operate at the granularity of a person’s identities and tastes.

Taking the social data mining approach to person modeling and recommendation, we used 100,000 personal profiles mined from online social networks to build an InterestMap. Put simply, it is a giant, dense connectionist-semantic network whose nodes are keywords and phrases spanning dually 1) the broad space of personal interests (i.e. hobbies, sports, music, books, television shows, movies, cuisines), and 2) the space of cultural identities (e.g. “raver,” “dog lover,” “fashionista”); and whose edges reveal the apparent strength of relatedness of two nodes. Once a novel profile is situated on the InterestMap as some pattern of node activations, the system can infer aspects of a person’s identities and tastes, as well as make recommendations for new interests.

In this paper, we examine the suitability of social network profiles to the task of identity and taste-based recommendation. By contrasting InterestMap’s “network” style recommendation mechanism against collaborative filtering systems, we expose their differing implications for transparency and trust, data source anonymity, and temporality. We explain the technical nexus of how InterestMap is constructed, and evaluate the system’s performance in a recommendation task.


Categories and Subject Descriptors

H.5.2 [Information Interfaces and Presentation]: User Interfaces – interaction styles, natural language, theory and methods, graphical user interfaces (GUI); I.2.7 [Artificial Intelligence]: Natural Language Processing – language models, language parsing and understanding, text analysis.

General Terms

Algorithms, Design, Human Factors, Languages, Theory.


Keywords

User modeling, recommender systems, cultural identity, social networks, collaborative filtering.

INTRODUCTION

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

IUI’05, January 10–13, 2005, San Diego, California, USA.

Copyright 2004 ACM 1-58113-894-6/05/0001...$5.00.

The subgenre of artificial intelligence research known as recommender systems (cf. Resnick & Varian, 1997) has thus far enjoyed remarkable practical and commercial success. One of the most prevalent techniques for recommendation, collaborative filtering (Shardanand & Maes, 1995), powers product recommendations on e-commerce sites like Amazon[1] and Ebay[2], and a related link-based rating technology called PageRank™ (Page, Brin, Motwani & Winograd, 1998) provides the technical foundation which makes search engines like Google[3] possible. Recommender systems have also been deployed and researched for a variety of subject domains, including movie recommendation (Miller et al., 2003) and research paper recommendation (McNee et al., 2002).

Typically, the user models being constructed in these systems are specific to a single subject domain or application. As such, user models must often be built anew for each new recommender domain. In addition, there is little coordination amongst the different single-domain user models, and no cross-domain recommendation capability. One proposed approach to integrating recommendations from different domains into a single meta-recommendation system (Schafer, Konstan & Riedl, 2002) calls for explicit user control of the integration parameters.

In this paper, we investigate a different approach to integration. Employing a social data mining approach (Terveen & Hill, 2001) to recommendation, we harvest rather domain-generic profiles of personal interests and identities from web-based social networks (boyd, 2004) such as Friendster[4], Orkut[5], and Thefacebook[6]. Analyzing patterns of how interests and cultural identities co-occur over this corpus, we constructed a “map” of the interrelatedness of interests and cultural identities called an InterestMap, which can itself power cross-domain (interest domain, that is) recommendations, or alternatively provide contextual-support to domain-specific recommenders by situating a person in a larger cultural and aesthetic space.

The rest of this paper is structured as follows. First, we compare InterestMap’s “network” style representation and recommendation mechanism against the collaborative filtering approach and reveal their differing implications for transparency and trust, data source anonymity, and temporality. Second, we discuss the ins and outs of self-descriptive social network profiles – their fidelity, how identity and tastes can be inferred from them, and how they can facilitate recommendation. Third, we present the technical nexus of InterestMap – how profiles were mined and normalized, and the map was constructed taking an information theoretic approach. Fourth, we evaluate performance of InterestMap in a recommendation task using five-fold cross validation. We conclude by summarizing our contribution and giving directions for future research.

THE INTERESTMAP APPROACH

The approach taken by InterestMap is to 1) mine social network profiles; 2) extract a normalized representation by mapping casually-stated keywords and phrases into a formal ontology (e.g. “Nietzsche” → “Friedrich Nietzsche”; “dogs” appearing under the “passions” category → “Dog Lover”); 3) augment the normalized profile with metadata (e.g. “War and Peace” also causes “Leo Tolstoy,” “Classical Literature,” and other metadata to be included in the profile, at a discounted value of 0.5 for example) to facilitate connection-making; and 4) apply an information-theoretic machine learning technique called pointwise mutual information (Church et al., 1991), or PMI, over the corpus of profiles to learn the semantic relatedness of every possible pair of person features (interests or identities). What results is a giant, dense semantic network whose nodes are person features and whose edges carry the semantic connectedness weights calculated by the PMI algorithm. Figure 1 is a visualization of a simplified InterestMap.

[pic]

Figure 1. A screenshot of an interactive visualization program, running over a simplified version of InterestMap (weak edges are discarded, and edge strengths are omitted). The “who am i?” node is an indexical around which a person is constructed. As interests are attached to the indexical, other interests and identity-descriptors are also pulled into the visual neighborhood.

An InterestMap can be applied in a simple manner to accomplish several tasks, such as identity recognition, interest recommendation, and person recommendation. Given a seed profile which represents a new user, the profile is normalized and the interests and identity-descriptors contained therein are mapped onto the nodes of the InterestMap, leading to a certain activation of the network. By spreading activation (Collins & Loftus, 1975) outward from these seed nodes, a surrounding neighborhood of nodes which are connected strongly to the seed nodes emerges. Coarsely speaking, these constitute interest recommendations because the immediate neighborhood describes interests which are proximal (and thus relevant) to this distributed representation of a person. A subset of the resulting semantic neighborhood of nodes will be identity-descriptor nodes, and the most proximal and strongly activated of these can be thought of as recognized identities. It is also possible to recommend other people to a user based on maximal overlap of interests and identities. To calculate the affinity between two people, two seed profiles lead to two sets of network activations, and the strength of the contextual overlap between these two activations can be used as a coarse measure of affinity.
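The spreading-activation step described above can be sketched in a few lines of Python. The decay factor, hop limit, and toy edge weights below are illustrative assumptions rather than parameters reported for the actual system:

```python
from collections import defaultdict

def spread_activation(edges, seeds, decay=0.5, max_hops=2):
    """Spread activation outward from seed nodes over a weighted
    semantic network.  `edges` maps a node to a list of
    (neighbor, weight) pairs, with weights assumed in [0, 1].
    Returns a dict of node -> accumulated activation."""
    activation = defaultdict(float)
    frontier = {node: 1.0 for node in seeds}  # seeds start fully active
    for node, energy in frontier.items():
        activation[node] += energy
    for _ in range(max_hops):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in edges.get(node, []):
                # energy attenuates with edge strength and per-hop decay
                next_frontier[neighbor] += energy * weight * decay
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)
```

Ranking the non-seed nodes by accumulated activation yields the interest recommendations; the most strongly activated identity-descriptor nodes are the recognized identities.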

Compared with collaborative filtering, our approach bears different implications for transparency and trust, data source anonymity, and temporality, and these are discussed below.

1 Transparency and Trust

That a user trusts the recommendations served to him by a recommender system is important if the recommender is to be useful and adopted. Among the different facilitators of trust, Wheeless & Grotz (1977) identify transparency as a prominent desirable property. When a human or system agent discloses its assumptions and reasoning process, the recipient of the recommendation is likely to feel less apprehensive toward the agent and recommendation. Also in the spirit of transparency, Herlocker et al. (2000) report experimental evidence suggesting that recommenders which provide explanations of their workings enjoy a greater user acceptance rate than those that do not.

Compared to a collaborative filtering (CF) recommender, InterestMap may hold two advantages in the operationalization of transparency: the communicative value of features, and the communication of group tendencies. Together, they make the recommendation rationale more intelligible and errors easier to rationalize and forgive.

First, because a CF recommender’s vector of feature ratings generally results from a trace of a user’s interactions with an application, the individual features may be arbitrary and have low communicative value; that is to say, a user does not intend any of his interactions with an application as recommendations for another user. This is a virtue in certain respects because it captures more realistic behavior, with the user not consciously manipulating recommendations. However, in the case of an erroneous recommendation of an arbitrary, obscure feature, recommendation recipients may find it even harder to justify or rationalize the purpose of the recommendation than if the erroneous feature had been at least prominent or important in some sense. For example, it might be easier to rationalize or forgive the erroneous recommendation of a prominent feature like L.v. Beethoven’s Symphony No. 5 to a non-classical-music-lover than an equally erroneous recommendation of a more obscure or arbitrary feature like Max Bruch’s Op. 28.

In contrast to CF, InterestMap’s feature-set arguably has more communicative value because its features originate from self-descriptive personal profiles on social networks. Donath & boyd (2004) report that a person’s presentation of self in profiles is in fact a strategic communication and social signalling game. Interests chosen for display are not just any subset of possessed interests; rather, they are non-arbitrary interests meant to be representative of the self, and consciously intended to be communicated socially to those reading the profile. Users may be more likely to rationalize or forgive the mis-recommendation of non-arbitrary features than otherwise; after all, human recommenders may mis-recommend, but they rarely mis-recommend completely arbitrary things.

Second, InterestMap does not recommend by citing an individual user’s vector of ratings, but rather recommends using a connectionist-semantic network which represents the tendencies of a large group of people. Being able to visualize a group’s topology as in Figure 1 may hold a particular advantage over CF in accessibility and intelligibility, because a user may be more likely to regard connections proven important to the group as a better justification in the case of an erroneous recommendation than connections arising through the behaviors of particular individuals. For example, in Figure 1, it is plain to see that “Sonny Rollins” and “Brian Eno” each straddle two cliques of different musical genres, and we suggest that this kind of intelligibility can assist a user in rationalizing a recommendation. Committing an error that another human would make is more likely to garner sympathy than a completely inhuman error.

2 Data Source Anonymity

We have proposed the mining of social networks and are interested in building a domain-generic recommender resource which can be publicly distributed and used to power other domain-specific recommender systems. However, the nature of the data source makes anonymity a particularly sensitive issue. An early phase of the InterestMap construction process is to normalize the casually-stated keywords and phrases into a formal ontology (e.g. “Nietzsche” → “Friedrich Nietzsche”; “dogs” appearing under the “passions” category → “Dog Lover”). This already means that the unrestricted, idiosyncratic language which bears traces of authorship begins to be wiped away. Generally, CF systems already start with features and ratings which conform to some formal ontology. Unfortunately, this is not a guarantee of anonymity, only pseudonymity. A user’s name is simply wiped away and replaced with a unique id, but the profile’s integrity remains intact. Because the number of features is quite large in our space of interests and identities, it may be possible to guess the identity of some nameless profiles whose constitutions are quite unique; at the least, this lends itself to the perception that the privacy of the source data may be violated.

Rather than preserving individual profiles, InterestMap simply uses these profiles to learn the strengths of connections on a network whose nodes already exist (they are simply an exhaustive enumeration of all features in the ontology). The final structure assures the anonymity of the data source, and is much easier to distribute. Additionally, it is possible to use varied data sources whose feature-sets are slightly different (for example, one social network is missing “cuisines,” another is missing “sports”) because the resulting InterestMap hides such differences.

3 Temporality

Finally, there is the issue of temporality – keeping the recommender knowledge base current and up-to-date. This is an issue where collaborative filtering easily wins out over our “map” approach. Because CF recommenders keep each individual user’s rating history separate, any change made to an individual user’s model can automatically be understood by the CF recommendation algorithm. However, InterestMap’s collapse of individual models onto a communal map does not allow for such a simple update. The whole of the map must be retrained from a new image of the source data. Computationally, this is quite expensive and so the map approach is less suitable for domains in which the feature space is changing rapidly, e.g. news stories. Fortunately, the space of interests and identities does not evolve with such rapidity, tending more to move at the pace of culture, and so requires less frequent sampling of social network profiles to stay up-to-date.

SELF-DESCRIPTIVE PERSONAL PROFILES IN SOCIAL NETWORKS

The recent emergence and popularity of web-based social network software (boyd, 2004; Donath & boyd, 2004) such as Friendster, Orkut, and Thefacebook can be seen as a tremendous source of domain-independent user models, which might be more appropriately termed person models to reflect their generality. To be sure, well over a million self-descriptive personal profiles are available across different web-based social networks. While each social network has an idiosyncratic representation, the common denominator across all the major web-based social networks we have examined is the representation of a person’s broad interests (e.g. hobbies, sports, music, books, television shows, movies, and cuisines) as a set of keywords and phrases.

In addition, more than just interests, higher-level features about a person such as cultural identities (e.g. “raver,” “extreme sports,” “goth,” “dog lover,” “fashionista”) are also articulated via a category of special interests variously named “interests,” “hobbies & interests,” or “passions.” In the web page layout of the personal profile display, this special category of interests always appears above the more specific interest categories, encouraging a different conceptualization of these interests; they are potentially interests more central to one’s own self-concept and self-identification. Of course, syntactic and semantic requirements are not enforced regarding what can and cannot be said within any of these profile entry interfaces, but based on our own experiences, with the exception of those who are intentionally tongue-in-cheek, the special interests category is usually populated with descriptors more central to the self than other categories. For example, a person may list “Nietzsche” and “Neruda” under the “books” category, and “reading,” “books,” or “literature” under the special interests category. In the normalization of profiles, identity descriptors are inferred from descriptors listed under the special interests category (e.g. “dogs” → “Dog Lover,” “reading” → “Book Lover,” “deconstruction” → “Intellectual”).

The remainder of this section exposes some of the major advantages and disadvantages of applying this corpus of personal profiles to the recommendation problem; in addition, we further evaluate our claim that recommendations made with this corpus are indeed driven by identity and taste.

1 Advantages

There are several advantages to applying this corpus of personal profiles to the recommendation problem. First, as was discussed earlier in Section 2.1, the interest features given in the corpus are demonstrated to have communicative value and are non-arbitrary by virtue of a person consciously including the interest in the profile; this represents a marked advantage over history-of-interactions data gathered within a particular application, which is not hand-crafted.

Second, as users of social networks are already motivated to maintain their profiles in the context of that end-user application for purposes such as grooming social relationships and finding dates (although boyd counters that some profiles are rarely updated), this corpus is likely to stay fresh and current and can be re-sampled periodically to update the recommender.

Third, unlike some other genres of “personal profiles” whose features are chosen from existing lists, such as those used for online marketing (e.g. “beauty,” “technology,” “sports,” etc.), social network profiles are entered as free text, posing few restrictions on what can be said and allowing a rich feature space to power recommendation; it is up to our system to opportunistically recognize those free text fragments and map them into formal ontologies.

Fourth, rather than representing identity using small demographic buckets (e.g. age 18-30; income: above $100,000), our system infers a broader set of psychographic buckets (e.g. “Dog Lover,” “Book Lover”, “Intellectual”) from descriptors listed under the special interests category. Arguably, psychographic identity features are more relevant and predictive of good recommendations than demographic ones; at least, this is the current thinking in the marketing literature (Zaltman, 2003).

2 Disadvantages

boyd’s (2004) analysis of the Friendster social network questions the accuracy and fidelity of profiles. Because social networks collapse social relations from a variety of contexts (e.g. work, high school, family) down into a singular list of friends, and because social networks like Friendster have a dual use as a dating site, there is a tremendous politics to the presentation of self. One must craft a profile which can at once attract potential suitors, yet not be perceived as false by friends. It is also necessary to negotiate the presented persona so as to be acceptable to friends from a variety of social contexts; as a result, boyd suggests that users are particularly apprehensive of including interests which might possibly be perceived as faux pas (how many hipsters secretly listen to Britney Spears but refuse to report it?). Because of these concerns, profiles tend to represent a person from a limited or skewed perspective.

There is also concern that the space of interests reported through social network profiles is limited to a few popular interests. Interests that could drive many potentially successful recommendations may, for whatever reason, be so rarely reported in profiles that they would never actually be recommended. A potential solution to this limitation, then, is to use InterestMap as a meta-level recommendation system, which simply passes some context about a user (such as identity or style of music) on to a domain-specific, feature-based recommender.

3 Identity and Taste

We have claimed that recommendations made with this corpus are indeed driven by identity and taste, but this requires some further explanation.

Earlier we described how certain identity descriptors are inferred from the “special interests” category in profiles; for example, “dogs” maps into “Dog Lover,” “reading” into “Book Lover,” and perhaps “deconstruction” into “Intellectual.” In Section 4, we explain further how these inferential mappings are made. The space of identity descriptors is smaller and also less sparse than the general space of interests: in our corpus, each identity descriptor occurs on average 18 times more frequently than the typical interest descriptor. Recalling that the connection strengths in an InterestMap are learned by calculating the pointwise mutual information (PMI) between any pair of features, the implication is that identity descriptor features are involved in many more calculations than the typical interest feature, and are semantically connected to a broader set of features. A hub-and-spoke effect emerges, with each identity descriptor feature behaving as a hub, connected broadly and strongly through spokes to other features.

When InterestMap is used to make recommendations, spreading activation outward from a set of seed nodes determines the resultant neighborhood from which recommendation candidates are selected. Since spreading activation tends to flow easily through strong links, and because a typical interest feature is, on average, more strongly connected to identity descriptor features than to other interest features, identity plays a dominant role in structuring recommendations.

Having established that identity is a major basis for recommendation, we have yet to argue that the connections between identity descriptor features and interest features are warranted and meaningful. If an identity is thought of as a high-level theme describing a person, then that theme must lend some cohesion to the rest of the profile in order for it to be a useful organizer of interests. But why are we to believe that social network profiles demonstrate this kind of cohesion around high-level themes like identity?

boyd’s (2004) observations that profiles are limited and focused in perspective and that users are sensitive to faux pas or out-of-place features offer some support to this end. Donath and boyd’s (2004) suggestion that profile-crafting be viewed as an activity meant to signal underlying qualities about a person (like tastes or identity, for example) would require that many or most of the articulated interests coherently signal a single underlying quality, because a single interest or a few interests might produce too weak or ambiguous a signal. In the sociology-of-identity literature, there is also a theoretical principle called the Diderot Effect (McCracken, 1991) which governs the assemblage of symbolic environments such as one’s possessions, or in our case, one’s self-descriptive profile. The Diderot Effect is “a force that encourages the individual to maintain a cultural consistency in his/her complement of consumer goods” (p. 123); given this tendency to be consistent with our articulated self-concepts, it is fair to conclude that one’s articulated identity descriptors predict a coherence in one’s articulated interests.

While many cultural identity descriptors are easy to articulate and can be expected to be given in the special interests category of the profile, tastes are often a fuzzy matter of aesthetics and may be harder to articulate in the special interests category. For example, a person in a European taste-echelon may like the band “Stereolab” and the philosopher “Jacques Derrida,” yet there may be no convenient keyword articulation to express this. This may be an example of the collective outsmarting the individual, because once an InterestMap is learned, cliques of interests seemingly governed by nothing other than taste clearly emerge on the network. One clique, for example, seems to demonstrate a Latin aesthetic: “Manu Chao,” “Jorge Luis Borges,” “Tapas,” “Soccer,” “Bebel Gilberto,” “Samba Music.” In place of indexicals like identity descriptors, tastes seem to emerge as small cliques of interests which together constitute an indexical in the network. Because the cohesion of cliques is strong, they tend to behave as a single hub in the same way identity descriptor nodes behave in spreading activation.

IMPLEMENTATION

As discussed above in Section 2, InterestMap is constructed in four steps. First, we mined 100,000 personal profiles from two web-based social networks between January and July of 2004. These profiles were organized as unstructured lists of interests under a fixed ontology of category headings such as “special interests,” “movies,” “books,” etc.

Second, we wrote Python scripts to segment these unstructured lists, which were delimited variously with commas, semicolons, newlines, the word “and,” etc. To normalize the casually-stated keywords, we opted to create a formal ontology covering the space of interests. For this we turned to various ontology resources on the web for music, sports, movies, television shows, and cuisines, including The Open Directory Project[7], the Internet Movie Database[8], and Wikipedia[9]. The ontology of cultural identity descriptors required the most intensive effort to assemble, finished off with some hand editing. To assist in the heuristic normalization, we also gathered statistics on the popularity of certain features (most readily available in The Open Directory Project) which could be used for disambiguation (e.g. does “Bach” → “JS Bach” or “CPE Bach”?). Using this crafted ontology of 22,000 main features, the heuristic normalization process successfully recognized 68% of all tokens across the 100,000 personal profiles, committing 8% false positives across a randomly checked sample of 1,000 mapped features. We suggest that this is a good result considering the difficulties of working with free-text input and the enormous space of potential features.
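The segmentation and popularity-based disambiguation heuristic might look roughly like the following sketch; the `ontology` and `popularity` structures here are hypothetical stand-ins for the 22,000-feature ontology and Open Directory statistics described above:

```python
import re

def normalize_tokens(raw_field, ontology, popularity):
    """Segment a free-text interest field and map each fragment onto a
    formal ontology entry.  `ontology` maps a lowercase alias to its
    candidate canonical features; `popularity` breaks ties in favor of
    the better-known candidate (e.g. "bach" -> "JS Bach" over
    "CPE Bach").  Unrecognized fragments are simply dropped."""
    fragments = re.split(r",|;|\n|\band\b", raw_field)
    tokens = [f.strip().lower() for f in fragments if f.strip()]
    normalized = []
    for token in tokens:
        candidates = ontology.get(token, [])
        if candidates:
            best = max(candidates, key=lambda c: popularity.get(c, 0))
            normalized.append(best)
    return normalized
```

In practice the recognition step would also need fuzzier matching (misspellings, partial titles), which this sketch omits.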

Third, once the interest features have been normalized, they are expanded using metadata assembled along with the formal ontology. For example, a book implies its author, and a band implies its musical genre. These metadata features are included in the profile, but at a discount of 0.5 (read: they only count half as much). The purpose of doing this is to increase the chances that the learning algorithm will discover latent semantic connections.
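The metadata expansion step can be sketched as follows; the example mapping of a book to its author and genre is illustrative, and the 0.5 discount is the value stated above:

```python
def expand_with_metadata(profile, metadata, discount=0.5):
    """Expand a normalized profile (feature -> weight) with implied
    metadata features at a discounted weight.  `metadata` maps a
    feature to the features it implies (e.g. a book to its author and
    genre).  If an implied feature was also listed explicitly, the
    higher weight is kept."""
    expanded = dict(profile)
    for feature, weight in profile.items():
        for implied in metadata.get(feature, []):
            expanded[implied] = max(expanded.get(implied, 0.0),
                                    weight * discount)
    return expanded
```

For example, a profile containing “War and Peace” at weight 1.0 would gain “Leo Tolstoy” and “Classical Literature” at weight 0.5 each.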

Fourth, we apply an information-theoretic machine learning technique called pointwise mutual information (Church et al., 1991), or PMI, over the corpus of profiles to learn the semantic relatedness of every possible pair of features (interests or identity-descriptors). For any two features f1 and f2, their PMI is given in equation (1).

PMI(f1, f2) = log [ Pr(f1, f2) / ( Pr(f1) · Pr(f2) ) ]          (1)
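Estimating these probabilities by counting feature occurrences and co-occurrences over the corpus, the PMI computation can be sketched as follows (a minimal version; the real system operates over the full 22,000-feature space):

```python
import math
from collections import Counter
from itertools import combinations

def pairwise_pmi(profiles):
    """Estimate PMI(f1, f2) = log(p(f1, f2) / (p(f1) * p(f2))) from a
    corpus of profiles, each given as a set of normalized features.
    Probabilities are maximum-likelihood estimates over the corpus."""
    n = len(profiles)
    feature_counts = Counter()
    pair_counts = Counter()
    for profile in profiles:
        feature_counts.update(profile)
        # count each unordered feature pair once per profile
        pair_counts.update(combinations(sorted(profile), 2))
    pmi = {}
    for (f1, f2), joint in pair_counts.items():
        p_joint = joint / n
        p1, p2 = feature_counts[f1] / n, feature_counts[f2] / n
        pmi[(f1, f2)] = math.log(p_joint / (p1 * p2))
    return pmi
```

Pairs that never co-occur receive no entry here; in the full matrix they correspond to the zeroed columns filtered out below.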

What results is a 22,000 x 22,000 matrix of PMIs. After filtering out features which have a completely zeroed column of PMIs, we arrive at a 12,000 x 12,000 matrix, and this is the complete form of the InterestMap. Of course, this is too dense to be visualized as a semantic network, but we have built less dense semantic networks from the complete form of the InterestMap by applying thresholds for minimum connection strength.

EMPIRICAL EVALUATION

We have thus far only performed an empirical evaluation of the performance of InterestMap versus a CF-like control in an interest recommendation task. We plan to further evaluate InterestMap through a real-world deployment so that we may measure our claims regarding improved transparency and trust, and the impact that the identity-spokes and taste-cliques have on shaping recommendations. These will be included in a forthcoming paper.

We performed five-fold cross validation to determine the accuracy of InterestMap in recommending interests, versus a control system which operates using naïve collaborative filtering. The corpus of 100,000 normalized and metadata-expanded profiles was randomly divided into five segments. One-by-one, each segment was held out as a test corpus and the other four used to either train an InterestMap using PMI, or stored as cases to be used for the CF control system.

Within each profile in the test corpus, a random half of the identity-descriptor and interest features were used as a “situation feature set” and the remaining half as the “target feature set.” The InterestMap recommender uses the situation feature set to create an activation of the InterestMap network. Using this activated network, a rank-ordered list of all interest features is created, and for each interest feature in the target feature set, a percentile relevance score is calculated, corresponding to that target interest feature’s ranking in the list of all features. The overall accuracy is the arithmetic mean of the percentile relevance scores generated for each interest feature in the target feature set. We opted to score the accuracy of a recommendation on a sliding scale, rather than requiring that target interest features be guessed exactly within n tries because the size of the target feature set is so incredibly small with respect to the space of possible guesses that accuracies will be too low and variances too high for a good performance assessment.
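The percentile relevance scoring described above can be sketched as follows; the convention that the top of the list scores the 100th percentile is our illustrative choice:

```python
def percentile_relevance(ranked_features, target_set):
    """Score a rank-ordered recommendation list against a held-out
    target feature set: each target feature earns a percentile score
    for its position in the list (top of the list = 100th percentile),
    and the overall accuracy is the arithmetic mean of those scores."""
    n = len(ranked_features)
    position = {f: i for i, f in enumerate(ranked_features)}
    scores = []
    for feature in target_set:
        if feature in position:
            scores.append(100.0 * (n - position[feature]) / n)
    return sum(scores) / len(scores) if scores else 0.0
```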

[pic]

Figure 2. Results of five-fold “hold-one-out” cross-validation of InterestMap and a CF control system on a recommendation task.

In the control system, the situation feature set is used to recall particular profiles containing those features, and profiles are scored and rank-ordered based on how many features from the situation feature set they contain. The interests contained in the rank-ordered list of profile matches are merged, duplicates are filtered, and a rank-ordered list of interest features is produced. In both systems, scoring ties are broken by randomly ordering the tied items. The results are given in Figure 2.
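The control system's recall-and-merge procedure can be sketched as follows. This is a minimal illustration with names of our own choosing; for reproducibility the sketch breaks ties deterministically (alphabetically) rather than randomly as in the actual evaluation.

```python
def cf_recommend(situation, profiles):
    """Naive CF control: rank interests via profiles sharing situation features.

    `situation` is a set of features; `profiles` is a list of feature sets.
    Profiles are scored by overlap with the situation feature set, their
    interests are merged best-scoring profile first, and duplicates (and
    the situation features themselves) are filtered out.
    """
    scored = sorted(
        (p for p in profiles if situation & p),   # recall matching profiles
        key=lambda p: len(situation & p),         # score = feature overlap
        reverse=True,
    )
    recommended, seen = [], set(situation)
    for profile in scored:
        for interest in sorted(profile - seen):   # deterministic tie-break
            recommended.append(interest)
            seen.add(interest)
    return recommended
```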


The results demonstrate that, on average, interest features in the target feature set were found in the 82nd percentile of features in InterestMap’s recommendation list, versus the 71st percentile for the CF control system; this is a rather unambiguous validation of the information-theoretic “map” approach to recommendation as compared with collaborative filtering. Of course, the patchy nature of this domain’s feature space must be considered. Some regions of the feature space, such as music, seem to be densely covered, while other areas are quite sparse. Generally, we would expect collaborative filtering to perform well in sparse domains and the information-theoretic approach to fare better in dense domains, but because this domain’s feature space is patchy, it is hard to assess the clear impact of the domain topology on the results. Standard deviation scores suggest that neither system was entirely consistent in its performance, straddling the line of acceptability. However, several odd-ball scenarios may be artificially exaggerating these scores; for example, some interests in a profile may be completely out-of-the-blue and apart from the others, and errors in the normalization process may also artificially generate out-of-the-blue features.

CONCLUSION & FUTURE WORK

Applying a social data mining approach to recommendation, we mined 100,000 self-descriptive personal profiles from web-based social networks and applied a pointwise mutual information learning mechanism to build InterestMap, a giant connectionist-semantic network of interests (i.e. hobbies, sports, music, books, television shows, movies, cuisines) and identity-descriptors (e.g. “raver,” “dog lover,” “fashionista”). We have argued that interest features derived from social network profiles are non-arbitrary and have greater communicative value than features derived from histories of interactions with applications, and thus have more positive implications for transparency and trust; furthermore, a “map” representation does a better job of protecting source-data anonymity, and so is more appropriate when the source is privacy-sensitive, as social network data is. We also demonstrated the increased role that identity (via identity-hubs) and tastes (via taste-cliques) play in the production of recommendations, and suggest that this has contributed to InterestMap’s 82%-to-71% accuracy advantage over a collaborative filtering approach in our empirical evaluation.

In future work, we hope to explore the role that recognized identities and tastes can play in generating explanations, and also how these interest grouping mechanisms can be used to learn from recommendation feedback (e.g. user did not like “ESPN Sportscenter” or “Blue Crush” so hypothesize that user is not “Sports Fan” and avoid all interests connected to this hub). We also see an opportunity to use InterestMap as a domain-generic recommender which can provide contextual support and bootstrapping recommendations to single-domain recommenders.

Entering an online community for the first time can be intimidating if a person does not understand the dynamics of the community and the attitudes and opinions espoused by its members. Right now, there seems to be only one option for these first-time entrants: to comb through the interaction logs of the community for clues about people’s personalities, attitudes, and how they would likely react to various situations. Picking up on social and personal cues, and overgeneralizing these cues into personality traits, we begin to paint a picture of a person so lucid that we seem to be able to converse with that person in our heads. Gaining understanding of the community in this manner is time consuming and difficult, especially when the community is complex. For the less dedicated, more casual community entrant, this approach is undesirable.

[pic]

Figure 1. Virtual personas representing members of the AI community react to typed text. Each virtual persona’s affective reactions are visualized by modulating graphical elements of the icon.

In our research, we are interested in giving people at-a-glance impressions of the attitudes of people in an online community so that they can more quickly and deeply understand the personalities and dynamics of the community.

We have built a system that can automatically generate a model of a person’s attitudes and opinions from an analysis of a corpus of personal texts, consisting of, inter alia, weblogs, emails, webpages, instant messages, and interviews. “What Would They Think?” (Fig. 1) displays a handful of these digital personas together, each reacting to inputted text differently. The user can see visually the attitudes and disagreements of strong personalities in a community. Personas are also capable of explaining why they react as they do, by displaying some text quoted from that person when the face is clicked.

To build a digital persona, the attitudes that a person exhibits in his/her personal texts are recorded into an affective memory system. Newly presented text triggers memories from this system and forms the basis for an affective reaction. Mining attitudes from text is achieved through natural language processing and commonsense-based textual affect sensing (Liu et al., 2003). This approach to person modeling is quite novel when compared to previous work on the topic (cf. behavior modeling, e.g. (Sison & Shimura, 1998), and demographic profiling, e.g. questionnaire-derived user profiles).

A related paper on this work (Liu, 2003b) gives a more thorough technical treatment of the system for modeling human affective memory from personal texts. This paper does not dwell on the implementation-level details of the system, but rather, describes the computational model of attitudes in a more practical light, and discusses how these models are incorporated to build the intelligent user interface “What Would They Think?”.

This paper is structured as follows. First, we introduce a computational model of a person’s attitudes, a system for automatically acquiring this model from personal texts, and methods for applying this model to predict a person’s attitudes. Second, we present how a collection of digital personas can portray a community in “What Would They Think?” and an evaluation of our approach. Third, we situate our work in the literature. The paper concludes with further discussion and presents directions for future work.

COMPUTING A PERSON’S ATTITUDES


Our approach to modeling attitudes is based on the analysis of personal texts using natural language parsing and the commonsense-based textual affect sensing work described in (Liu et al., 2003). Personal texts are broken down into units of affective memory, consisting of concepts, situations, and “episodes”, coupled with their emotional value in the text. The whole attitudes model can be seen as an affective memory system that valuates the affect of newly presented concepts, situations, and episodes by the affective memories they trigger.

In this section, we first present a bipartite model of the affective memory system. Second, we describe how such a model is acquired automatically from personal texts. Third, we discuss methods for applying the model to predict a user’s affective reaction to new texts. Fourth, we describe how some advanced features enrich our basic person modeling approach.

1 A Bipartite Affective Memory System

A person’s affective reaction to a concept, topic, or situation can be thought of as either instinctive, due to attitudes and opinions conditioned over time, or reasoned, due to the effect of a particularly vivid recalled memory. Borrowing from cognitive models of human memory function, attitudes that are conditioned over time can be best seen as a reflexive memory, while attitudes resulting from the recall of a past event can be represented as a long-term episodic memory (LTEM). Memory psychologist Endel Tulving equates LTEM with “remembering” and reflexive memory with “knowing” and describes their functions as complementary (Tulving, 1983). We combine the strengths of these two types of memories to form a bipartite, episode-reflex model of the affective memory system.

1 Affective long-term episodic memory

Long-term episodic memory (LTEM) is a relatively stable memory capturing significant experiences and events. The basic unit of memory captures a coherent series of sequential events, and is known as an episode. Episodes are content-addressable, meaning, that they can be retrieved through a variety of cues encoded in the episode, such as a person, location, or action. LTEM can be powerful because even events that happen only once can become salient memories and serve to recurrently influence a person’s future thinking. In modeling attitudes, we must account for the influence of these particularly powerful one-time events.

In our affective memory system, we compute an affective LTEM as an episode frame, coupled with an affect valence score that best characterizes that episode. In Fig. 2, we show an episode frame for the following example episode: “John and I were at the park. John was eating an ice cream. I asked him for a taste but he refused. I thought he was selfish for doing that.”

Figure 2. An episode frame in affective LTEM.

As illustrated in Fig. 2, an episode frame decomposes the text of an identified episode into simple verb-subject-argument propositions like (eat John “ice cream”). Together, these constitute the subevents of the episode. The moral of an episode is important because the episode-affect can be most directly attributed to it. Extraction of the moral, or root cause, is done through heuristics which are discussed elsewhere (Liu, 2003b). Tulving’s encoding specificity hypothesis (1983) suggests that contexts such as date, location, and topic are useful to record because an episode is more likely to be triggered when current conditions match the encoding conditions. The affect valence score is a numeric triple representing (pleasure, arousal, dominance). This will be covered in more detail later in the paper.
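The episode frame can be sketched as a small data structure. The field names and the sample PAD valence below are illustrative assumptions, not the paper's exact schema.

```python
from dataclasses import dataclass, field

@dataclass
class EpisodeFrame:
    """One unit of affective long-term episodic memory (illustrative schema)."""
    subevents: list    # verb-subject-argument propositions
    moral: tuple       # root-cause proposition to which the affect attaches
    context: dict = field(default_factory=dict)   # e.g. date, location, topic
    valence: tuple = (0.0, 0.0, 0.0)              # (pleasure, arousal, dominance)

# The ice-cream episode from the text, sketched as a frame:
park_episode = EpisodeFrame(
    subevents=[("at", "John and I", "park"),
               ("eat", "John", "ice cream"),
               ("ask", "I", "taste"),
               ("refuse", "he", None)],
    moral=("refuse", "he", None),
    context={"location": "park", "topic": "ice cream"},
    valence=(-0.4, 0.3, 0.2),   # hypothetical PAD score for mild indignation
)
```

Keeping the context fields alongside the subevents is what makes the episode content-addressable: any of them can later serve as a retrieval cue.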

2 Affective reflexive memory

While long-term episodic memory deals in salient, one-time events and must generally be consciously recalled, reflexive memory is full of automatic, instant, almost instinctive associations. Whereas LTEM is content-addressable and requires pattern-matching the current situation with that of the episode, reflexive memory is like a simple lookup-table that directly associates a cue with a reaction, thereby abstracting away the content. In humans, reflexive memories are generally formed through repeated exposures rather than one-time events, though subsequent exposures may simply be recalls of a particularly strong primary exposure (Locke, 1689). In addition to frequency of exposures, the strength of an experience is also considered. Complementing the event-specific affective LTEM with an event-independent affective reflexive memory makes sense because there may not always be an appropriate distinct episode which shapes our appraisal of a situation; often, we react reflexively – our present attitudes deriving from an amalgamation of our past experiences now collapsed into something instinctive.

Because humans undergo forgetting, belief revision, and theory change, update policies for human reflexive memory may actually be quite complex. In our computational model, we adopt a more simplistic representation and update policy that is not cognitively motivated but instead exploits the ability of a computer system to compute an affect valence at runtime.

The affective reflexive memory is represented by a lookup-table. The lookup-keys are simple concepts which can be semantically recognized as a person, action, object, activity, or named event. These keys act as the simple linguistic cues that can trigger the recall of some affect. Associated with each key is a list of exposures, where each exposure represents a distinct instance of that concept appearing in the personal texts. An exposure, E, is represented by the triple: (date, affect valence score V, saliency S). At runtime, the affect valence score associated with a given conceptual cue can be computed using the formula given in Eq. (1).

V(c) = (n / (n + 1)) · (1/n) · Σ_{i=1..n} S_i V_i (1)

where n = the number of exposures of the concept within the queried time period

This formula returns the valence of a conceptual cue averaged over a particular time period. The term n/(n+1) rewards frequency of exposures, while the factor S_i rewards the saliency of each exposure. In this simple model of an affective reflexive memory, we do not consider phenomena such as belief revision, reflexes conditioned over contexts, or forgetting.
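The reflexive memory lookup-table and its runtime valence query can be sketched as follows. The n/(n+1) frequency weight is one plausible reading of Eq. (1), chosen so that the score grows with the number of exposures; the class and method names are ours.

```python
from collections import defaultdict

class ReflexiveMemory:
    """Lookup-table mapping conceptual cues to lists of exposures.

    Each exposure is (date, pad_valence, saliency), where pad_valence is a
    (pleasure, arousal, dominance) triple. The frequency weight n/(n+1)
    is an assumed form, not necessarily the paper's exact equation.
    """
    def __init__(self):
        self.exposures = defaultdict(list)

    def record(self, concept, date, valence, saliency):
        self.exposures[concept].append((date, valence, saliency))

    def query(self, concept, until_date):
        """Saliency-weighted, frequency-rewarded PAD valence up to a date."""
        past = [(v, s) for d, v, s in self.exposures[concept] if d <= until_date]
        n = len(past)
        if n == 0:
            return (0.0, 0.0, 0.0)   # no conditioned reflex for this cue
        weight = n / (n + 1)         # rewards repeated exposure
        return tuple(weight * sum(v[k] * s for v, s in past) / n
                     for k in range(3))
```

Because the valence is computed at query time from the stored exposures, conflicting exposures (sometimes positive, sometimes negative) cancel toward a neutral valence, as described for “phone” below.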

To give an example of how affective reflexive memories are acquired from personal texts, consider Fig. 3, which shows two excerpts of text from a weblog and a snapshot sketch of a portion of the resulting reflexive memory.

Figure 3. How reflexive memories get recorded from excerpts.

In the above example, two text excerpts are processed with textual affect sensing, and concepts, both simple (e.g. telemarketer, dinner, phone) and compound (e.g. telemarketer::call, interrupt::dinner, phone::ring), are extracted. The saliency of each exposure is determined by heuristics such as the degree to which a particular concept is topicalized in a paragraph. The resulting reflexive memory can be queried using Eq. (1). Note that while a query on 3 Oct 01 for “telemarketer” returns an affect valence score of (-.15, .25, .1), a query on 5 Oct 01 for the same concept returns a score of (-.24, .29, .11). Recalling that the valence scores correspond to (pleasure, arousal, dominance), we can interpret the second annoying intrusion of a telemarketer’s call as having conditioned a further displeasure and a further arousal to the word “telemarketer”.

Of course, concepts like “phone” and “dinner” also unintentionally inherit some negative affect, though with dinner, that negative affect is not as substantial because the saliency of the exposure is lower than with “telemarketer.” (“dinner” is not so much the topic of that episode as “telemarketer”). Also, if successive exposures of “phone” are affectively ambiguous (sometimes used positively, other times negatively), Eq. (1) tends to cancel out inconsistent affect valence scores, resulting in a more neutral valence.

In summary, we have motivated and characterized the two components of the affective memory system: an episodic component emphasizing the affect of one-time salient memories, and a reflexive component, emphasizing instinctive reactions to conceptual cues that are conditioned over time. In the following subsection, we propose how this bipartite affective memory system can be acquired automatically from personal texts.

2 Model Acquisition from Personal Texts

The bipartite model of the affective memory system presented above can be acquired automatically from an analysis of a corpus of personal texts. Fig. 4 illustrates the model acquisition architecture.

[pic]

Figure 4. An architecture for acquiring the affective memory system from personal texts.

Though there are some challenging tasks in the natural language extraction of episodes and concepts, such as the heuristic extraction of episode frames, these details are discussed elsewhere (Liu, 2003b). In this subsection, we focus on three aspects of model acquisition, namely, establishing the suitability criteria for personal texts, choosing an affective representation of attitudes, and assessing the affective valence of episodes and concepts.

1 What Personal Texts are Suitable?

In deciding the suitability of personal texts, it is important to keep in mind that we want a text that is both a rich source of opinion and amenable to natural language processing by the computer. First, texts should be first-person opinion narratives. It is still rather difficult to extract a person’s attitudes from a non-autobiographical text because the natural language processing system would have to robustly decide which opinions belong to which persons (we save this for future work). It is also important that the text be of a personal nature, relating personal experiences or opinions. Attitudes and opinions are not easily accessible in third-person texts or objective writing, especially for a rather naïve computer reading program. Second, texts should explore a sufficient breadth of topics to be interesting. An insufficiently broad model gives a poor and disproportional sampling of a person and would hardly justify the embodiment of such a model in a digital persona. It should be noted, however, that there is plausible reason to intentionally partition a person’s text corpus into two or more digital personas. Perhaps it would be interesting to contrast an old Marvin Minsky with a young one, or a Marvin who is passionate about music with a Marvin who is passionate about A.I. Third, texts should cover everyday events, situations, and topics, because that is the discourse domain in which the mechanism we use to judge the affect of text performs best. Fourth, texts should ideally be organized into episodes occurring over a substantial period of time relative to the length of a person’s life. This is a softer requirement because it is still possible to build a reflexive memory without episode partitioning. Weblogs are an ideal input source because of their episodic organization, although instant messages, newsgroups, and interview transcripts are also good input sources because they are so often rich in opinion.

2 Representing Affect using the PAD Model

Affect valence in the proposed models can take one of two representations. It can take an atomistic view, in which emotions exist as part of some finite repertoire, as exemplified by Manfred Clynes’s “sentics” schema (1977). Or it can take the form of a dimensional model, represented prominently by Albert Mehrabian’s Pleasure-Arousal-Dominance (PAD) model (1995). In this model, the three nearly independent dimensions are Pleasure-Displeasure (i.e., feeling happy or unhappy), Arousal-Nonarousal (i.e., arousing one’s attention), and Dominance-Submissiveness (i.e., the amount of confidence/lack-of-confidence felt). Each dimension can assume values from –100% to +100%, and a PAD valence score is a 3-tuple of these values (e.g. [-.51, .59, .25] might represent anger).

We chose a dimensional model, namely, Mehrabian’s PAD model, over the discrete canonical emotion model because PAD represents a sub-symbolic, continuous account of affect, where different symbolic affects can be unified along one of the three dimensions. This model has robustness implications for the affective classification of text. For example, in the affective reflexive memory, a conceptual cue may be variously associated with anger, fear, and surprise, which can be unified along the Arousal dimension of the PAD model, thus enabling the affect association to be coherent and focused.

3 Affective Appraisal of Personal Text

Judging the affect of a personal text has three chief considerations. First, the mechanism for judging the affect should be robust and comprehensive enough to correctly appraise the affect of a breadth of concepts. Second, to aid in the determination of saliency, the mechanism must be able to appraise the affect of very little text, such as on the sentence-level. Third, the mechanism should recognize specific emotions rather than convolving affect onto any single dimension.

Several common approaches fail to meet these criteria. The naïve keyword spotting approach looks for surface language features like mood keywords. However, this approach is not acceptably robust on its own because affect is often conveyed without mood keywords. Statistical affect classification using learning models such as latent semantic analysis (Deerwester et al., 1990) generally requires large inputs for acceptable accuracy because it is a semantically weak method. Hand-crafted models and rules are not broad enough to analyze the desired breadth of phenomena.

To analyze personal text with the desired robustness, granularity, and specificity, we employ a model of textual affect sensing using real-world knowledge, proposed by Liu et al. (2003). In this model, defeasible knowledge of everyday people, things, places, events, and situations is leveraged to sense the affect of a text by evaluating the affective implications of each event or situation. For example, to evaluate the affect of “I got fired today,” this model evaluates the consequences of this situation and characterizes it using negative emotions such as fear, sadness, and anger. This model, coupled with a naïve keyword spotting approach, provides rather comprehensive and robust affective classification. Since the model uses knowledge rather than word statistics, it is semantically strong enough to evaluate text on the sentence level, classifying each sentence into a six-tuple of valences (ranging from a value of 0.0 to 1.0) for each of the six basic Ekman emotions of happy, sad, angry, surprised, fearful, and disgusted (an atomistic view of emotions) (Ekman, 1993). These emotions are then mapped to the PAD model.
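The final mapping step, from a six-tuple of Ekman emotion valences to a single PAD triple, can be sketched as a weighted average over anchor points in PAD space. The anchor coordinates below are illustrative placements, not the system's actual mapping.

```python
# Illustrative anchor points in PAD space for the six Ekman emotions;
# the mapping actually used by the system is not specified here.
EKMAN_TO_PAD = {
    "happy":     ( 0.8,  0.5,  0.4),
    "sad":       (-0.6, -0.3, -0.3),
    "angry":     (-0.5,  0.6,  0.3),
    "surprised": ( 0.2,  0.7, -0.1),
    "fearful":   (-0.6,  0.6, -0.4),
    "disgusted": (-0.4,  0.2,  0.1),
}

def ekman_to_pad(six_tuple):
    """Map a six-tuple of Ekman valences (each 0.0-1.0, in the order
    happy, sad, angry, surprised, fearful, disgusted) to one PAD triple
    by valence-weighted averaging of the anchor points."""
    order = ["happy", "sad", "angry", "surprised", "fearful", "disgusted"]
    total = sum(six_tuple)
    if total == 0:
        return (0.0, 0.0, 0.0)   # affectively neutral sentence
    return tuple(
        sum(w * EKMAN_TO_PAD[e][k] for w, e in zip(six_tuple, order)) / total
        for k in range(3)
    )
```

A continuous average like this is what lets symbolically distinct emotions (anger, fear, surprise) cohere along the shared Arousal dimension, as discussed above.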

One point of potential paradox should be addressed. The real-world knowledge-based model of affect sensing is based on defeasible commonsense knowledge from the Open Mind Commonsense corpus (Singh et al., 2002), which is in turn gathered from a web community of some 11,000 teachers. Therefore, the affective assessment of text made by such a model represents the judgment of a typical person. However, sometimes a personal judgment of affect contradicts the typical judgment. Thus, it would seem paradoxical to attempt to learn that a situation has a personally negative affect when the typical person judges the situation as positive. To overcome this difficulty, we implement, in parallel, a mood keyword-spotting affect sensing mechanism to confirm or contradict the assessment of the primary model. In addition, we make the assumption that although a personal affect judgment may deviate from that of a typical person on small particulars, it will not deviate on average when examining a large text. The implication is that at a slightly larger granularity than a sentence, the affective appraisal is more likely to be accurate. In fact, accuracy should increase in proportion to the size of the textual context being considered. The evaluation of Liu et al.’s affective navigation system (2003b) yields some indirect support for this idea: in that user study, users found affective categorizations of textual units on the order of chapters to be more accurate and useful to information navigation than affective categorizations of small textual units such as paragraphs.

To assess the affect of a sentence, we factor in the affective assessment of not only the sentence itself, but also of the paragraph, section, and whole journal entry or episode. Because so much context is factored into the affect judgment, only a modest amount of affective information can be learned from any given sentence. Thus we rely on the confirming effect of encountering an attitude multiple times. In exchange for only being able to learn a modest amount from a sentence, we also minimize the impact of erroneous judgments.

In summary, digital personas can be automatically acquired from personal texts. These texts should feature the explicit expression of the opinions of the person to be modeled, and should be of a certain form required by the natural language processing. Natural-language-processed texts are analyzed for their affective content at varying textual granularities (e.g. sentence-, paragraph-, and section-level) so as to minimize the possibility of error. This is necessary because our textual affect sensing tool evaluates a typical person’s affective reaction to a text, and not any particular person’s. Affect valence is represented using the PAD dimensional model of affect, whose continuity allows affect valences to be more easily summed together. The resulting affect valence is recorded with a concept in the reflexive memory, and with an episode in the episodic memory.

3 Predicting Attitudes using the Model

Having acquired the model, the digital persona attempts to predict the attitudes of the person being modeled by offering some affective reaction when it is fed some new text. This reaction is based on how the new text triggers the reflex concepts and the recall of episodes in the affective memory system. When a reflex memory or episode is triggered, the affective valence score associated with that memory gets attached to the affective context of the new text. The gestalt reaction to the new text is a weighted summation of the affect valence scores of the triggered memories.
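The gestalt reaction described above can be sketched as a normalized weighted sum over triggered memories. The combination rule and the weight semantics are our assumptions; how memories are triggered and weighted is covered separately below.

```python
def gestalt_reaction(triggered):
    """Combine triggered memories into one PAD reaction.

    `triggered` is a list of (pad_valence, weight) pairs, where each
    weight might reflect the strength of the trigger; normalizing by the
    total weight keeps the result inside the PAD valence range.
    """
    total = sum(w for _, w in triggered)
    if total == 0:
        return (0.0, 0.0, 0.0)   # nothing triggered: neutral reaction
    return tuple(sum(v[k] * w for v, w in triggered) / total
                 for k in range(3))
```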

The triggering process is somewhat complex. The triggering of episodes requires detecting an episode in the new text and heuristically pattern-matching this new episode frame against the library of episode frames. The range of concepts that can trigger a reflex memory is increased by the addition of conceptual analogy using OMCSNet, a semantic network of commonsense knowledge. The details of the triggering process are omitted here but are discussed elsewhere (Liu, 2003b).

This process of valuating some new text by triggering memories out of the context in which they were encoded, and inheriting their affect valences, is error prone. We rely on the observation that if many memories are triggered, their contextual intersection is more likely to be accurate. Ultimately, the performance of the digital persona in reproducing the attitudes of the person being modeled is determined by the breadth and quality of the corpus of personal texts gathered on that person. The digital persona cannot predict attitudes that are not explicitly exhibited in the personal texts.

4 Enriching the Basic Model

The basic model of a person’s attitudes focuses on applying a person’s self-described memories to valuate new textual episodes. While this basic model is sufficient to produce reactions to text for which there exists some relevant personal memories, the generated digital personas are often quite “sparse” in what they can react to. We have proposed and evaluated some advancements to the basic model. In particular, we have looked at how a person’s attitude model can be enriched by the attitude models of people with whom the modeled person fashions himself/herself after – perhaps a good friend or mentor. More technically, we mean an imprimer.

Marvin Minsky describes an imprimer as someone to whom one becomes attached (Minsky, forthcoming). He introduces the concept in the context of attachment-learning of goals, and suggests that imprimers help to shape a child’s values. Imprimers can be a parent, mentor, cartoon character, a cult, or a person-type. The two most important criteria for an imprimer are that 1) the imprimer embodies some image, filled with goals, ideas, or intentions, and that 2) one feels attachment to the imprimer.

We extend this idea into the affective realm and make the further claim that internal imprimers can do more than critique our goals; our attachment to them leads us to willfully emulate a portion of their values and attitudes. We keep a collection of these internal imprimers, and they help to support our identity. From the supposition that we conform to many of the attitudes of our internal imprimers, we hypothesize that affective memory models of these imprimers, if known, can complement a person’s own affective memory model in helping to predict that person’s attitudes. This hypothesis is supported by much of the work in psychoanalysis. Sigmund Freud (1991) wrote of a process he called introjection, in which children unconsciously emulate aspects of their parents, such as assuming their parents’ personalities and values. Other psychologists have referred to introjection with terms like identification, internalization, and incorporation.

We propose the following model of internal imprimers to support attitude prediction. First, it is necessary to identify the people, groups, and images that may possibly be a person’s imprimers. We can do so by analyzing the affective memory. From a list of all conceptual cues in both the episodic and reflexive memories, we use semantic recognizers to identify all people, groups (e.g. “my company”), and images (e.g. “dog” => “dog-person”) that, on average, elicit high Arousal and high Submissiveness, show a high frequency of exposure in the reflexive memory, and collocate in past episodes with self-conscious emotion keywords like “proud”, “embarrassed”, and “ashamed”.
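This filtering step might be sketched as follows. The record fields, thresholds, and helper names here are illustrative assumptions, not the paper’s implementation; the three criteria (high arousal, high submissiveness, high exposure, plus collocation with self-conscious emotion keywords) come from the description above.

```python
from dataclasses import dataclass, field

SELF_CONSCIOUS_KEYWORDS = {"proud", "embarrassed", "ashamed"}

@dataclass
class CueRecord:
    cue: str                 # e.g. "my mentor", "my company", "dog"
    kind: str                # "person", "group", or "image"
    avg_arousal: float       # mean arousal valence, in [-1.0, 1.0]
    avg_dominance: float     # mean dominance; submissiveness = -dominance
    exposure_count: int      # frequency of exposure in the reflexive memory
    episode_words: set = field(default_factory=set)  # words collocated in past episodes

def is_imprimer_candidate(r: CueRecord,
                          arousal_min: float = 0.5,
                          submissive_min: float = 0.5,
                          exposure_min: int = 5) -> bool:
    """Apply the three criteria: high arousal, high submissiveness (low
    dominance), high exposure, and collocation with self-conscious emotion
    keywords. Threshold values are illustrative."""
    return (r.avg_arousal >= arousal_min
            and -r.avg_dominance >= submissive_min
            and r.exposure_count >= exposure_min
            and bool(r.episode_words & SELF_CONSCIOUS_KEYWORDS))

mentor = CueRecord("my mentor", "person", 0.7, -0.6, 12, {"proud", "advice"})
stranger = CueRecord("the cashier", "person", 0.2, 0.1, 1, {"change"})
print(is_imprimer_candidate(mentor))    # True
print(is_imprimer_candidate(stranger))  # False
```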

[pic]

Figure 5. Affective models of internal imprimers, organized into personas, complement one’s own affective model

Once imprimers are identified, we also wish to identify the contexts under which an imprimer’s attitudes exert influence. As shown in Fig. 5, we propose organizing the internal imprimer space into personas representing different contextual realms. There is good reason to believe that humans organize imprimers by persona, because we are different people for different reasons: one might like Warren Buffett’s ideas about business, but probably not about cooking. Personas can also prevent internal conflicts by allowing a person to maintain separate systems of attitudes in different contexts. To identify an imprimer’s context, we must first agree on an ontology of personas, which can be person-general (as the personas in Fig. 5 are) or person-specific. Once imprimers are associated with personas, we gather as much “personal” text from each imprimer as desired and acquire only the reflexive memory model, thus relaxing the constraint that texts have episodic organization. In this augmented attitude prediction strategy (depicted in Fig. 3), when conceptual cues are unfamiliar to the self, we identify internal imprimers whose persona matches the genre of the new episode and give them an opportunity to react to the cue. Their affective reactions are multiplied by a coefficient representing this self’s susceptibility to influence, and the resulting valence score is added to the episode’s. Rather than maintaining all attitudes in the self, internal imprimers enable judgments about certain things to be mentally outsourced to the persona-appropriate imprimers.
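A minimal sketch of this outsourcing step, assuming reflexive memories are simple cue-to-PAD dictionaries and a single global influence coefficient (all names and values here are illustrative):

```python
# Sketch of the imprimer fallback: cues known to the self use the self's
# own affective memory; unfamiliar cues are "outsourced" to imprimers
# whose persona matches the genre of the episode, scaled by an influence
# coefficient. Structures are illustrative, not the paper's implementation.

INFLUENCE = 0.5  # coefficient: how much this self is swayed by imprimers

def react(cue, genre, self_memory, imprimers):
    """Return a (pleasure, arousal, dominance) reaction to one conceptual cue."""
    if cue in self_memory:
        return self_memory[cue]
    for imp in imprimers:
        if imp["persona"] == genre and cue in imp["memory"]:
            p, a, d = imp["memory"][cue]
            return (INFLUENCE * p, INFLUENCE * a, INFLUENCE * d)
    return None  # cue unfamiliar to both the self and the matching imprimers

self_mem = {"dogs": (0.8, 0.6, 0.4)}
buffett = {"persona": "business", "memory": {"stocks": (0.9, 0.8, 0.9)}}

print(react("dogs", "domestic", self_mem, [buffett]))    # self's own reaction
print(react("stocks", "business", self_mem, [buffett]))  # scaled imprimer reaction
print(react("stocks", "domestic", self_mem, [buffett]))  # persona mismatch -> None
```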

We have implemented and evaluated the automated identification and model acquisition of imprimer personas in cases where the imprimers are people. Our implemented system is not yet able to use abstract non-person imprimers, e.g. “dog-person”.

[pic]

Figure 6. The imprimer-augmented attitude prediction strategy. Edges represent memory triggers.

In summary, we have presented a reflex-episode model of affective memory as a memory-based representation of a person’s attitudes. The model can be acquired automatically from personal text using natural language processing and textual affect analysis. The model can be applied to new textual episodes to produce affective reactions that aim to emulate the actual reactions of the person being modeled (Fig. 6). We have also discussed how the basic attitudes model can be enriched with added information about the attitudes of the mentors of the person being modeled.

In the following section, we abstract away the details of the attitudes model presented in this section to examine how digital personas can be portrayed graphically and how a collection of digital personas can portray the personalities of a community.

WHAT WOULD THEY THINK?

While modeling a person’s attitudes is interesting in the abstract, it lacks the motivation and the verifiability of a real application of the theory and technology. What Would They Think? (Fig. 1) is a graphical realization of the modeling theory discussed in the previous section. It has been implemented and is currently being evaluated through user studies, though the underlying attitude models have already been evaluated in a separate study. In this section, we discuss the design of our interface, present some scenarios for its use, and report how this work has been evaluated.

1 Interface Design

Digital personas, acquired from an automatic analysis of personal text, are represented visually by pictures of faces, which occupy a matrix. Given some new text typed or spoken into the “fodder” box, each persona expresses an affective reaction through modulations in the graphical elements of its face icon. Each digital persona is also capable of some introspection: when clicked, a face explains what motivated its reaction by displaying a salient quote from its personal text.

Why a static face? Visualizing a digital persona’s attitudes and reactions with the face of the person being represented is better than with something textual or abstract, for several reasons. People are already wired with a cognitive faculty for quickly recognizing and remembering faces, and a face acts as a unique cognitive container for a person’s individual identity and personality. In the user task of understanding a person’s personality, it is easier to attribute personality traits and attitudes to a face than to text or an abstract graphic; people-watching, for example, is a pastime in which we imagine the personality and identity behind a stranger’s face (Whyte, 1988). A community of faces is more socially evocative than a community of textual labels or abstract representations, because those representations are not designed as convenient containers of identity and personality.

Having decided on a face representation, should the face be abstract or real, static or animated? While verisimilitude is the goal for many facial interfaces, we must be careful not to portray more detail in the face than our attitude model is capable of elucidating, for the face is fraught with social cues, and unjustified cues could do more harm than good. By conveying attitudes through modulations in the graphical elements of a static face image, rather than through modulations of expression and gaze in an animated face, we emphasize the representational aspect of the face over the real. Scott McCloud (1993) has explored extensively this representational-vs.-real tradeoff in the drawing of faces in comics.

Modulating the Face. In expressing an affective reaction, we want to preserve the detail of the continuous, dimensional output of the digital persona, while conveying the information as intuitively as possible. An intuitive mapping may best be achieved through visual metaphors for a person’s affective states (Lakoff & Johnson, 1980). We often describe a happy person as “colorful,” while a “face turning colorless” usually signals negative emotions like fear and melancholy. A person whose attention or passion is aroused has a face that “lights up,” and someone who is not sure or confident about a topic feels “fuzzy” toward it. Taking these metaphors into consideration, a rather straightforward scheme maps the three affect dimensions of pleasure, arousal, and dominance onto the three graphical dimensions of color saturation, brightness, and focus, respectively. A pleasurable reaction is manifested by a face with high color saturation, while a displeasurable reaction maps to an unsaturated, colorless face; this mapping creates an implicit constraint that the face icon be in color. An aroused reaction results in a brightly lit icon, while a non-aroused reaction results in a dimly lit one. A dominant (confident) reaction maps to a sharp, crisp image, while a submissive (unconfident) reaction maps to a blurry, unfocused image. While better mapping schemes may exist, our experience with users of this interface suggests that the current scheme conveys the affective reaction quite intuitively, assuming that the original face icons are all of good quality – in color, bright enough, and in focus.
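Assuming PAD values normalized to [−1.0, 1.0], the mapping might be sketched as a pure function producing saturation and brightness multipliers and a blur radius. The output ranges are illustrative assumptions; in practice the factors could drive an image library’s color, brightness, and blur filters.

```python
def face_modulation(pleasure, arousal, dominance):
    """Map a PAD reaction (each value in [-1.0, 1.0]) onto the three
    graphical dimensions described above. Ranges are illustrative:
    saturation and brightness are multipliers in [0.0, 2.0] (1.0 = the
    unmodified icon); blur radius is in pixels (0 = fully sharp)."""
    saturation = 1.0 + pleasure       # displeasure drains color, pleasure saturates
    brightness = 1.0 + arousal        # arousal brightens, calm dims
    max_blur = 8
    blur_radius = (1.0 - dominance) / 2.0 * max_blur  # submissiveness blurs
    return saturation, brightness, blur_radius

print(face_modulation(1.0, 0.0, 1.0))     # pleased, calm, confident -> (2.0, 1.0, 0.0)
print(face_modulation(-1.0, -1.0, -1.0))  # displeased, dull, submissive -> (0.0, 0.0, 8.0)
```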

Populating a Community. An n x n matrix holds a small collection of digital personas and can be configured either manually or automatically. Each matrix cell can be manually configured to house a digital persona by specifying a persona .MIND file and a face icon. A user can build and later augment a digital persona by specifying a weblog URL, a homepage URL, or some personal text pasted into the window. The matrix can also be configured automatically to represent a community. Plug-in scripts have been created to automatically populate the matrix with certain types of communities, including a niche community of weblogs known as a “blog ring,” a circle of friends in the online networking community Friendster, a group of potential mates on an online dating website, and a Usenet community.

Currently, only a blog ring community can generate fully specified digital personas. The personal text corpora for the Friendster and online dating communities are rather short profile texts; as a result, only a fairly shallow reflexive memory can be built, and the episodic memory is not meaningful for these texts. The personal texts of Usenet communities are rather inconsistent in quality. For example, a Usenet community based on questions and answers will not be as good a source of explicit opinions as one based on discussion of issues. Usenet communities also pose the problem of not providing a face icon for each user; in this case, the text of each person’s name labels the matrix cell, accompanied by a default face icon in the background, which is necessary to convey the affective reaction.

Introspection. A digital persona is capable of some limited introspection. To inquire what motivated a persona to express a certain reaction to some text, the user can click the face icon. An explanation is offered in the form of a quote or series of quotes from the personal text, generated by following backpointers from each affective memory to its source text. For episodic memory, a single particularly salient episode can justify a reaction, while many quotes may be needed to justify a triggered reflexive memory. With this capability for introspection and explanation, a user can verify whether or not an affective reaction is justified. This lends the interface some fail-softness, as a user will not be completely misled when a person’s attitude is erroneously represented by the system.
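The backpointer mechanism might be sketched as follows, with each reflexive-memory entry carrying the source passage it was learned from. The data structures and the ranking heuristic (most negative first) are illustrative assumptions; the sample quotes echo the telemarketer example from the paper’s figures.

```python
# Sketch of introspection via backpointers: each entry keeps the passage
# it was learned from, so a reaction can be justified by quoting it back.

reflexive_memory = {
    "telemarketer": [
        ("2oct01", (-0.3, 0.5, 0.2), "Telemarketers harassed me again today, interrupting my dinner."),
        ("4oct01", (-0.8, 0.8, 0.3), "The phone rang, and of course, it was a telemarketer."),
    ],
}

def explain(cue, max_quotes=2):
    """Return the source quotes motivating the reaction to a cue,
    most negative (lowest pleasure valence) first."""
    entries = reflexive_memory.get(cue, [])
    ranked = sorted(entries, key=lambda e: e[1][0])  # sort by pleasure valence
    return [quote for _, _, quote in ranked[:max_quotes]]

for q in explain("telemarketer"):
    print("-", q)
```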

2 Use Cases

How can a person use the What Would They Think? interface to understand the personalities and attitudes of people in a community? The system supports several use cases.

In the basic use case, the user, a new entrant to a community, is presented with an automatically generated matrix of some people in the community. The user can employ a hypothesis-testing approach to understanding personalities, typing very opinionated statements into the “fodder” box as a litmus test of the attitudes of the different people toward those statements. Faces lighting up in color versus fading to black and white provide an illustrative contrast of the strong disagreements in the community. A user can inquire into the source of a strong opinion by clicking on a face and viewing a motivating quote, and can reorganize the matrix so as to cluster personalities perceived to be similar. Assuming that the personal texts for the personas in the community are of comparable length, depth, and quality, the user may notice over a series of interactions that certain personas are negative more often than not, or that certain personas are aroused more intensely and more often than others. These observations may lead the user to conclude that certain personalities are more cynical, and others more easily excitable.

Another use case is gauging the interests and expertise of people in a community. Because people generally talk more about things that interest them and have more to say on topics they are familiar with, a digital persona modeled on such texts will necessarily exhibit more reaction to texts that interest the person being modeled or that fall within his or her area of expertise. In this use case, a user can, for example, copy-and-paste a news article into the fodder box and assess which personas are interested in or have expertise toward a particular topic.

A third use case involves community-assisted reading. The matrix fodder box can be linked to a cursor position in a text file browser. As a user reads through a webpage, story, or news article, he/she can get a sense of how the community might read and react to the text currently being read.

3 Evaluation

The quality of the attitude prediction in What Would They Think? has been formally evaluated through user studies. We are also currently conducting user studies to evaluate the effectiveness of the matrix interface in assisting a person to learn about and understand a community. These results will be available by press time.

The quality of attitude prediction was evaluated experimentally with four subjects. Subjects were between the ages of 18 and 28 and had kept diary-style weblogs for at least 2 years, with an average entry interval of three to four days. Subjects submitted their weblog URLs for the generation of affective memory models. An imprimer identification routine was run, and the examiner hand-picked the top imprimer for each of the three persona domains implemented: social, business, and domestic. A personal text corpus was built for each imprimer, and imprimer reflexive memory models were generated. The subjects were then engaged in an interview-style experiment with the examiner.

In the interview, subjects and their corresponding persona models were asked to evaluate 12 short texts representative of three genres: social, business, and domestic (corresponding to the ontology of personas in the tested implementation). The same set of texts, chosen by the examiner to be generally evocative, was presented to each participant. Subjects were asked to summarize their reaction to each text by rating three factors on Likert-5 scales:

• Feel negative about it (1) … Feel positive about it (5)

• Feel indifferent about it (1) … Feel intensely about it (5)

• Don’t feel control over it (1) … Feel control over it (5)

These factors are mapped onto the PAD valence format, assuming the correspondence: 1 → −1.0, 2 → −0.5, 3 → 0.0, 4 → +0.5, and 5 → +1.0. Subjects’ responses were not normalized. To assess the quality of attitude prediction, we record the spread between the human-assessed and computer-assessed valences,

spread = | v_human − v_computer |     (2)
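The Likert-to-valence mapping and the spread measure above are simple enough to state directly (a sketch; the function names are illustrative):

```python
# Likert-5 ratings map linearly onto the [-1.0, +1.0] valence scale.
LIKERT_TO_VALENCE = {1: -1.0, 2: -0.5, 3: 0.0, 4: +0.5, 5: +1.0}

def spread(human_likert: int, computer_valence: float) -> float:
    """Absolute difference between the human-assessed valence (from a
    Likert rating) and the computer-assessed valence, per PAD dimension.
    The maximum possible spread on this scale is 2.0."""
    return abs(LIKERT_TO_VALENCE[human_likert] - computer_valence)

print(spread(5, -1.0))  # worst case: 2.0
print(spread(2, 0.0))   # 0.5
```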

We computed the mean spread and standard deviation across all episodes along each PAD dimension. On the –1.0 to +1.0 valence scale, the maximum spread is 2.0. Table 1 summarizes the results.

Table 1. Performance of attitude prediction, measured as the spread between human and computer judged values.

|           |      Pleasure       |       Arousal       |      Dominance      |
|           | mean spread | std. dev. | mean spread | std. dev. | mean spread | std. dev. |
| SUBJECT 1 |    0.39     |   0.38    |    0.27     |   0.24    |    0.44     |   0.35    |
| SUBJECT 2 |    0.42     |   0.47    |    0.21     |   0.23    |    0.48     |   0.31    |
| SUBJECT 3 |    0.22     |   0.21    |    0.16     |   0.14    |    0.38     |   0.38    |
| SUBJECT 4 |    0.38     |   0.33    |    0.22     |   0.20    |    0.41     |   0.32    |

Assuming that human reactions obey a uniform distribution over the Likert-5 scale, we give two baselines, each simulated over 100,000 trials. In BASELINE 1, the predicted valence is fixed at 0.0 (a neutral reaction to all text). In BASELINE 2, the predicted valence is given a random value over the interval [−1.0, 1.0] with a uniform distribution (an arbitrary reaction to all text). It should be pointed out, however, that in the context of an interactive sociable computer, BASELINE 1 is not a fair comparison, because it would never produce any behavior.
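Such a baseline simulation can be reproduced in a few lines (a sketch under the stated assumptions: human ratings drawn uniformly over the five Likert valences; analytically, the expected mean spreads work out to 0.6 for BASELINE 1 and 0.75 for BASELINE 2):

```python
import random

random.seed(0)  # for reproducibility of the simulation
LIKERT_VALENCES = [-1.0, -0.5, 0.0, 0.5, 1.0]
TRIALS = 100_000

def mean_spread(predict):
    """Simulate human reactions as uniform draws over the five Likert
    valences and return the mean |human - predicted| spread."""
    total = 0.0
    for _ in range(TRIALS):
        human = random.choice(LIKERT_VALENCES)
        total += abs(human - predict())
    return total / TRIALS

baseline1 = mean_spread(lambda: 0.0)                        # neutral reaction to all text
baseline2 = mean_spread(lambda: random.uniform(-1.0, 1.0))  # arbitrary reaction to all text

print(f"BASELINE 1 mean spread: {baseline1:.2f}")  # approx. 0.60
print(f"BASELINE 2 mean spread: {baseline2:.2f}")  # approx. 0.75
```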

On average, our approach performed noticeably better than both baselines, excelling particularly at predicting arousal and having the most difficulty predicting dominance. The standard deviations were very high, reflecting the observation that predictions were often either very close to the actual valence or very far from it. This can be attributed to one of several causes. First, multiple episodes described in the same journal entry may have caused the wrong associations to be learned. Second, the reflexive memory model does not account for conflicting word senses. Third, the personal texts input for the imprimers often generated models skewed positive or negative because the text did not always have an episodic organization. While results along the pleasure and dominance dimensions are weaker, the arousal dimension recorded a mean spread of 0.22, suggesting that it alone may have immediate applicability.

Table 2. Performance of attitude prediction that can be attributed to imprimers and episodic memory

In the experiment, we also analyzed how often the episodic memory, reflexive memory, and imprimers were triggered. Episodes were, on average, four sentences long. For each episode, reflexive memory was triggered an average of 21.5 times, episodic memory 0.8 times, and imprimer reflexive memory 4.2 times. To measure the effect of imprimers and episodic memories, we re-ran the experiment turning off imprimers only, episodic memory only, and both. Table 2 summarizes the results.

These results suggest that the positive effect of episodic memory on the results was negligible. This certainly has to do with its low rate of triggering, and with the fact that episodic memories were weighted only slightly more than reflexive memories. The low trigger rate of episodic memory can also be attributed to the strict criterion that three conceptual cues in an episode frame must trigger in order for the whole episode to trigger. The results also suggest that imprimers played a measurable role in improving performance, which is a very promising result.
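The strict trigger criterion mentioned above can be sketched as a simple set-overlap test (the data structures are illustrative; only the three-cue threshold comes from the text):

```python
# Sketch of the episode-trigger criterion: an episodic memory fires only
# when at least three of its conceptual cues appear in the new text.

TRIGGER_THRESHOLD = 3

def episode_triggers(episode_cues: set, text_cues: set) -> bool:
    """An episode frame triggers only if enough of its cues match."""
    return len(episode_cues & text_cues) >= TRIGGER_THRESHOLD

episode = {"john", "ice cream", "refuse", "park"}
print(episode_triggers(episode, {"john", "ice cream", "refuse"}))  # True
print(episode_triggers(episode, {"john", "ice cream"}))            # False: only 2 cues match
```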

Overall, the evaluation demonstrates that the proposed attitude prediction approach is promising but needs further refinement. The randomized BASELINE 2 is a good comparison when considering possible entertainment applications, whose interaction is more fail-soft; the approach does quite well against this baseline and is within the performance range of such applications. Taking into account possible erroneous reactions, we were careful to pose What Would They Think? as a fail-soft interface: the reacting faces are evocative and encourage the user to click on a face for further explanation. Used in this manner, the application is fail-soft because users can decide, on the basis of the explanations, whether a reaction is justified or mistaken; we expect ongoing user studies to confirm this quality. We do not suggest that the approach is yet ready for fail-hard applications, such as deployment as a sociable software agent, because fallout (bad predictions) can be very costly in the realm of affective communication (Nass et al., 1994).

RELATED WORK

The community-of-personalities metaphor was previously explored in Guides (Oren et al., 1990), a multi-character interface that assisted users in browsing a hypermedia database. Each guide embodied a specific character (e.g. preacher, miner, settler) with a unique “life story.” Presented with the document a user was currently browsing, each guide suggested a recommended follow-up document, motivated by the guide’s own point of view. Each guide’s recommendations were based on a manually constructed bag of “interest” keywords.

Our affective-memory-based approach to modeling a person’s attitudes appears to be unique in the literature. Existing approaches to person modeling are of two kinds: behavior modeling and demographic profiling. The former models the actions that users take within the context of an application domain; for example, intelligent tutoring systems track a person’s test performance (Sison & Shimura, 1998), while online bookstores track user purchasing and browsing habits and combine these with collaborative filtering to group similar users (Shardanand & Maes, 1995). The latter uses gathered demographic information about a user, such as a “user profile,” to draw generalized conclusions about user preferences and behavior.

Neither of the existing approaches is appropriate to the modeling of “digital personas.” In behavior modeling, knowledge of user action sequences is generally only meaningful in the context of a particular application and does not significantly contribute to a picture of a person’s attitudes and opinions. Demographic profiling tends to overgeneralize people by the categories they fit into, is not motivated by personal experience, and often requires additional user action such as filling out a profile.

Memory-based modeling approaches have also been tried in related work on assistive agents. Brad Rhodes’s Remembrance Agent (Rhodes & Starner, 1996) uses an associative memory to proactively suggest relevant information. Sunil Vemuri’s project “What Was I Thinking?” (2004) is a memory prosthesis that records audio from a wearable device and intelligently segments the audio into episodes, allowing the “audio memory” to be browsed more easily.

CONCLUSION

Learning about the personalities and dynamics of online communities has until now been a difficult problem with no good technological solutions. In this paper, we propose What Would They Think?, an interactive visual representation of the personalities in a community. A matrix of digital personas reacts visually to what a user types or says to the interface, based on predictions of attitudes actually held by the persons being modeled. Each digital persona’s model of attitudes is generated automatically from an analysis of some personal text (e.g. a weblog), using natural language processing and textual affect sensing to populate an associative affective memory system. The whole application enables a person to understand the personalities in a community through interaction rather than by reading narratives. Patterns of reactions observed over a history of interactions can illustrate qualities of a person’s personality (e.g. negativity, excitability), interests and expertise, and also qualities of the social dynamics in a community, such as the points of consensus and disagreement held by a group of individuals.

The automated, memory-based personality modeling approach introduced in this paper represents a new direction in person modeling. Whereas behavior modeling only yields information about a person within some narrow application context, and demographic profiling paints an overly generalized picture of a person and often requires a profile to be filled out, modeling a person’s attitudes from a “memory” of personal experiences paints a richer, better-motivated picture with a wider range of potential applications than application-specific user models. User studies of the quality of the attitude prediction technology are promising and suggest that the currently implemented approach is strong enough for fail-soft applications. In What Would They Think?, the interface is designed to be fail-soft: the reactions given by the digital personas are meant to be evocative, and the user is encouraged to further verify and investigate a purported attitude by clicking on a persona and viewing a textual explanation of the reaction.

In future work, we intend to further develop the modeling of attitudes by investigating how particularly strong beliefs such as “I love dogs” can help to create a model of a person’s identity. We also intend to investigate other applications for our person modeling approach, such as virtual mentors and guides, marketing, and document recommendation.

ACKNOWLEDGMENTS

The authors would like to thank Deb Roy, Barbara Barry, Push Singh, Andrea Lockerd, Marvin Minsky, Henry Lieberman, and Ted Selker for their comments on this work.

REFERENCES

1] danah boyd: 2004, Friendster and publicly articulated social networks. Conference on Human Factors and Computing Systems (CHI 2004). ACM Press.

2] K.W. Church, W. Gale, P. Hanks, and D. Hindle: 1991, Using statistics in lexical analysis. In Uri Zernik (ed.), Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon, pp. 115-164. New Jersey: Lawrence Erlbaum

3] A.M. Collins, and E.F. Loftus: 1975, A spreading-activation theory of semantic processing. Psychological Review, 82, pp. 407-428

4] Judith Donath and danah boyd: 2004, Public displays of connection, BT Technology Journal 22(4)

5] Mihaly Csikszentmihalyi: 1997, Finding Flow: The Psychology of Engagement with Everyday Life. 1st ed. New York: Basic Books, pp. 71-82

6] J. Herlocker, J. Konstan and J. Riedl: 2000, Explaining Collaborative Filtering Recommendations. Conference on Computer Supported Cooperative Work, pp. 241-250

7] Grant McCracken: 1991, Culture and Consumption: New Approaches to the Symbolic Character of Consumer Goods and Activities. Indiana University Press, Indiana

8] S. McNee, I. Albert, D. Cosley, P. Gopalkrishnan, S.K. Lam, A.M. Rashid, J.A. Konstan, & J. Riedl: 2002, On the Recommending of Citations for Research Papers. Proceedings of ACM 2002 Conference on Computer Supported Cooperative Work (CSCW2002), pp. 116-125.

9] B.N. Miller, I. Albert, S.K. Lam, J.A. Konstan, & J. Riedl: 2003, MovieLens Unplugged: Experiences with an Occasionally Connected Recommender System. Proceedings of ACM 2003 International Conference on Intelligent User Interfaces (IUI'03).

10] L. Page, S. Brin, R. Motwani, T. Winograd: 1998, The PageRank Citation Ranking: Bringing Order to the Web, Stanford Digital Libraries Working Paper.

11] P. Resnick and H.R. Varian: 1997, Recommender Systems. Communications of the ACM, Vol. 40(3):pp. 56-58.

12] David A. Kolb: 1985, Experiential Learning: Experience as the Source of Learning and Development. Prentice Hall.

13] U. Shardanand and P. Maes: 1995, Social information filtering: Algorithms for automating `word of mouth'. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 210-217.

14] J.B. Schafer, J.A. Konstan, and J. Riedl: 2002, Meta-recommendation Systems: User-controlled Integration of Diverse Recommendations. Proceedings of the 11th International Conference on Information and Knowledge Management (CIKM 2002), pp. 43-51

15] L.G. Terveen and W.C. Hill: 2001, Beyond Recommender Systems: Helping People Help Each Other, in Caroll, J. (ed.), HCI In The New Millennium. Addison-Wesley.

16] L. Wheeless and J. Grotz: 1977, The Measurement of Trust and Its Relationship to Self-disclosure. Communication Research 3(3), pp. 250-257.

17] Gerald Zaltman: 2003, How Customers Think. Cambridge, MA: Harvard Business School Press


20] Jon M. Pearce & Steve Howard: 2004, Designing for Flow in a Complex Activity. 6th Asia-Pacific Conference on Computer-Human Interaction. Springer-Verlag.

21] Deerwester, S. et al. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), pp. 391-407.

22] Ekman, P. (1993). Facial expression of emotion. American Psychologist, 48, 384-392.

23] Freud, S. (1991). The essentials of psycho-analysis: the definitive collection of Sigmund Freud's writing selected, with an introduction and commentaries, by Anna Freud. London: Penguin.

24] Lakoff, G. & Johnson, M. (1980). Metaphors We Live By, University of Chicago Press.

25] Liu, H. (2003b). A Computational Model of Human Affective Memory and Its Application to Mindreading. Submitted to FLAIRS 2004. Draft available at: /~hugo/publications/drafts/Affective-Mindreading-Liu.doc

26] Liu, H., Lieberman, H., Selker, T. (2003). A Model of Textual Affect Sensing using Real-World Knowledge. Proceedings of IUI 2003, pp. 125-132.

27] Liu, H., Selker, T., Lieberman, H. (2003b). Visualizing the Affective Structure of a Text Document. Proceedings of CHI 2003, pp. 740-741.

28] Locke, J. (1689). Essay Concerning Human Understanding. Hypertext by ITL at Columbia University, 1995. Print version ed. P.H. Nidditch. Oxford, 1975.

29] McCloud, S. (1993). Understanding Comics, Kitchen Sink Press, Northampton, Maine.

30] Mehrabian, A. (1995). Framework for a comprehensive system of measures of emotional states: The PAD Model. (Available from Albert Mehrabian, 1130 Alta Mesa Road, Monterey, CA, USA 93940).

31] Minsky, M., (forthcoming). The Emotion Machine, Pantheon, New York. Several chapters are available at: .

32] Nass, C.I., Steuer, J., and Tauber, E.R. (1994). Computers are social actors. In Proceedings of CHI ’94, (Boston, MA), pp. 72-78, April 1994.

33] Oren, T., Salomon, G., Kreitman, K. and Don, A. (1990). Guides: characterizing the interface. In Laurel, B. (Eds.) The art of human-computer interface design. Addison-Wesley.

34] Rhodes, B. and Starner, T. (1996). The Remembrance Agent: A continuously running automated information retrieval system. Proceedings of PAAM '96, pp. 487-495.

35] Shardanand, U. and Maes, P. (1995). Social information filtering: Algorithms for automating "word of mouth", Proceedings of CHI'95, 210-217.

36] Singh, P., (2002). The public acquisition of commonsense knowledge. Proceedings of AAAI Spring Symposium. Palo Alto, CA, AAAI.

37] Sison, R. and Shimura, M. (1998). Student modeling and machine learning. International Journal of Artificial Intelligence in Education, 9:128-158.

38] Tulving, E. (1983). Elements of episodic memory. Oxford: New York.

39] Vemuri, S. (2004). What Was I Thinking? (memory prosthesis project).

40] Whyte, W. (1988). City. Doubleday, New York.


[Figure text: example episode frame and reflexive memory]

::: EPISODE FRAME :::
SUBEVENTS: (eat John “ice cream”), (ask I John “for taste”), (refuse John)
MORAL: (selfish John)
CONTEXTS: (date), (park), ()
EPISODE-IMPORTANCE: 0.8
EPISODE-AFFECT: (-0.8, 0.7, 0)

Text excerpts:
…2 Oct 01… Telemarketers harassed me again today, interrupting my dinner. I’m really upset…
…4 Oct 01… The phone rang, and of course, it was a telemarketer. Damn it!

::: REFLEXIVE MEMORY :::
telemarketer = {
  [2oct01, (-.3, .5, .2), .5],
  [4oct01, (-.8, .8, .3), .4] } ;
dinner = {
  [2oct01, (-.3, .5, .2), .2] } ;
“interrupt dinner” = {…} ;
…
