
Rodolfo Migon Favaretto, Paulo Knob, Soraia Raupp Musse: VHLab - Graduate Program in Computer Science - PUCRS, Av Ipiranga, 6681, Porto Alegre - RS - Brazil. Email: [email protected]
Felipe Vilanova, Ângelo Brandelli Costa: Graduate Program in Psychology - PUCRS, Av Ipiranga, 6681, Porto Alegre - RS - Brazil

Detecting Personality and Emotion Traits in Crowds from Video Sequences

Thanks to Office of Naval Research Global (USA) and Brazilian agencies: CAPES, CNPQ and FAPERGS.

Rodolfo Migon Favaretto, Paulo Knob, Soraia Raupp Musse, Felipe Vilanova, Ângelo Brandelli Costa


Abstract

This paper presents a methodology to detect personality and basic emotion characteristics of crowds in video sequences. First, individuals are detected and tracked, then groups are recognized and characterized. This information is then mapped to OCEAN dimensions, which are used to find personality and emotion in videos, based on the OCC emotion model. Although it is a clear challenge to validate our results with real-life experiments, we evaluate our method against the available literature regarding OCEAN values of different Countries and also the emergent personal distance among people. Hence, such analysis also refers to the cultural differences of each country. Our results indicate that this model generates coherent information when compared to data provided in the available literature, as shown in qualitative and quantitative results.

Keywords: Computer vision, crowd features, Big-five model, cultural dimensions, crowd emotion

Journal: Machine Vision and Applications

1 Introduction

Crowd analysis is a topic of great interest in current applications. Surveillance, entertainment and social sciences are examples of fields that can benefit from the development of this area of study. The literature presents different applications of crowd analysis, such as counting people in crowds Chan2009 ; cai2014 , group and crowd movement and formation Solmaz2012 ; Zhou2014 ; Ricky:15 ; jo2013review and detection of social groups in crowds solera_2013 ; Shao2014 ; Feng2015 ; Chandran2015 . Normally, these approaches are based on personal tracking or optical flow algorithms, and handle features such as walking speed, directions and distances over time. Specifically on this subject, one study investigated cultural differences in videos from different countries: Chattaraj et al. CHATTARAJ2009 suggested that cultural and population differences could produce deviations in speed, density and flow of the crowd.

In this paper, we propose to detect personality aspects based on the Big-five personality model, also referred to as OCEAN costa07 , using individuals' behaviors automatically detected in video sequences. For this, we used the NEO PI-R costa07 , which is the standard questionnaire measure of the Five Factor Model. We also use such psychological traits to identify some primordial emotions in the crowd. Using a mapping similar to the one proposed by Saifi et al. saifi2016approach , we were able to identify the level of some emotions for each individual, such as happiness or fear, according to his/her OCEAN levels.

This paper is organized as follows: Section 2 presents related work on personality and crowd emotion detection. Section 3 describes our model to detect personality traits for individuals and groups. Moreover, in Section 3.3, we use these psychological traits to identify different levels of emotions in the crowd, such as whether an agent is happy or angry. Experimental results are addressed in Section 4, where we deal with the challenge of evaluating our approach against real life. Conclusions and future work are presented in Section 5.

2 Related Work

This section discusses some topics related to personality and emotion detection in crowds.

Personality may be described as a deep, individual-level psychological trait cattell50 . A trait is an inference made from observed behaviors that seeks to explain their regularity hall98 . Raymond Cattell is commonly referred to as the one who developed the methodology that permitted the objective grouping of hundreds of trait descriptors into a set of higher-level factors digman90 . Cattell cattell48 developed a taxonomy of individual differences that consisted of 16 primary factors and 8 second-order factors. Nevertheless, attempts to replicate his work were unsuccessful fiske48 and researchers agreed that only the 5-factor model matched his data, originating the Big Five personality model.

Nowadays, researchers agree that there are five robust orthogonal traits which effectively match personality attributes digman90 , known as the Big Five: Openness to experience (“the active seeking and appreciation of new experiences”); Conscientiousness (“degree of organization, persistence, control and motivation in goal directed behavior”); Extraversion (“quantity and intensity of energy directed outwards in the social world”); Agreeableness (“the kinds of interaction an individual prefers from compassion to tough mindedness”); Neuroticism (how prone the individual is to psychological distress) lordw07 .

The NEO PI-R costa92 is one of the most widely used instruments based on the Big Five personality theory. It assesses the normal adult personality and is internationally recognized as a gold standard for personality assessment. One of its advantages is that it further specifies six facets within each personality trait and has data from several countries, which easily allows cross-cultural comparisons McCrae2002 ; McCrae05 . Although the empirical evidence matching individual-level traits and crowd behavior is not strong (one of the few examples is Barry97 ), the Big-Five personality model is widely used in computational crowd simulation kaup06 ; Durupinar08 ; Guy11 . In general, in such methods, the OCEAN model makes it possible to simulate a crowd with individual-level parameters based on the expected behaviors of the agents.

Figure 1: Three main steps of our approach: ( $1^{st}$ ) individual tracking and data extraction, ( $2^{nd}$ ) individual data is mapped to individual and group personality (OCEAN) traits and ( $3^{rd}$ ) personality traits are mapped to individual and group basic emotions. These steps are detailed in Figure 2.

Several models have been developed to explain and quantify basic emotions in humans and other animals. One of the most cited is the model proposed by Paul Ekman ekman1971constants , which considers the existence of 6 universal emotions based on cross-cultural facial expressions (anger, disgust, fear, happiness, sadness and surprise). Other approaches, such as Affective Neuroscience, consider, from an evolutionary perspective, other groups of emotions such as fear, rage/anger and sadness/panic montag2017primary .

Panksepp’s theory claims that individual differences in primary emotional systems may represent the phylogenetically oldest parts of human personality. In order to support this idea, the author links individual differences in primary emotions to the Big Five model of personality. For example, seeking is robustly linked with openness, high play with higher extraversion, high care and low anger with higher agreeableness, and high scores on fear, sadness and rage with high neuroticism davis2003affective ; davis2011brain . Finally, other models, such as the one proposed by Ortony, Clore, and Collins (commonly referred to as the OCC model), distinguish 22 emotion types mainly based on their expression. This is the model used in this paper clore2013psychological .

In the work proposed by Gorbova and collaborators Gorbova2017 , the authors present a system for automatic personality screening from video presentations in order to decide whether a person should be invited to a job interview. The automatic personality screening is based on visual, audio and lexical cues from short video clips. The system is built to predict candidate scores for the Big Five personality traits and to estimate a final decision, i.e. to which degree the person in the video clip should be invited to the job interview.

Baig et al. baig2015perception focus their work on the perception of emotion due to crowd behavior. To this end, they propose an approach based on probabilistic modeling, which is trained to perceive the emotions of people in a given area. It uses camera sensors to track the motion of the individuals in the crowd, and data mining techniques to identify different behaviors and events. Agents can experience different types of interactions during the simulation, which can arouse negative or positive emotions. Tests showed that the emotion detection algorithms performed well in all tested scenarios and types of interaction. Yet, the authors comment that more complex interactions and motivations still need to be included in a more complex probabilistic model in order to obtain a more realistic model.

On the subject of emotion detection in crowds, Rabiee et al. rabiee2016emotion mention that emotions can be valuable traits for understanding the crowd. Moreover, they say that the majority of proposed methods are based only on low-level visual features, leaving a huge semantic gap between these low-level features and the high-level concept of crowd behavior. Therefore, they propose an attribute-based strategy to generate a crowd dataset with annotations of both abnormal crowd behavior and crowd emotion. In fact, the dataset is the main contribution of their work; it can be used as a benchmark for computer vision as well as to help understand the correlation between crowd behavior and emotion.

Saifi et al. saifi2016approach proposed a mapping between psychological traits and emotions. In their work, they aim to model emergent emotions and behaviors in a simulated crowd, based on the personality of each individual. For this, the OCEAN costa07 personality model is combined with the OCC ortony1990cognitive emotional model in order to find the susceptibility of each of the five OCEAN personality factors to feel each OCC emotion. They use fuzzy logic to model the critical emotions when in presence of an unexpected event, which can trigger a specific behavior. In their method, it is possible to observe a crowd and the effect of events, like how calm or nervous agents react to it. Moreover, the user is able to predict future situations or events, which can lead him/her to make better decisions at the right time.

In a similar vein, Zhang et al. zhang2017exploring proposed a novel crowd representation named “crowd mood”, which is based on the spacing interactions of the individuals and the structural levels of motion patterns in crowds. They affirm that “Basic types of crowd motion can reflect representative emotions of the crowd.” Therefore, the main idea is to project varied types of statistical motion features and, based on this, assign emotional labels to the behavior of the crowd. The preliminary results showed an interesting performance in the given tasks, offering a promising tool which represents the semantics of crowd behaviors and can be applied to varied crowd tasks.

In this paper, the idea is to map parameters from individual and group behaviors, automatically detected from video sequences of different countries, to OCEAN dimensions. We extended a previous work favaretto:sib2017 in order to detect groups and find their OCEAN levels as well. Then, we use the OCEAN of groups and individuals to find their emotions in the video sequence, following a mapping similar to the one proposed by Saifi et al. saifi2016approach . While their work presents a model to simulate crowds with psychological traits and emotions, our work focuses on identifying such traits in video sequences. Further details are presented in Section 3. In this sense, our contribution is a model based on a set of equations that handle the individual parameters obtained from videos and map them to the Big-Five personality model for individuals and groups, together with a mapping function to find emotion values. Our method does not require any training or specific dataset, since we based our prototype on the theories behind OCEAN and OCC (emotion model).

3 The proposed approach

Our model comprises three main steps: video data extraction, personality analysis and emotion analysis. These steps are illustrated in the overview of the method in Figure 1. The first step aims to obtain the individual trajectories of observed pedestrians in real videos. Using these trajectories, we detect groups and extract data which are useful for the second step, which is responsible for the personality analysis of groups and individuals. Once we have concluded the second step, we have enough information to proceed to the third step, which consists of emotion detection for individuals and groups according to their OCEAN values. Figure 2 illustrates a more detailed flow chart of our approach and Section 3.1 presents further details.

3.1 Individuals and Groups Data Extraction

Initially, the information about people in real videos is obtained using a tracker Bins2013 to recover people's trajectories. In order to transform image coordinates into world coordinates, we assume that the head position is on the ground plane ( $z=0$ ), and then we use a planar homography to rectify the perspective image. In computer vision, a planar homography is defined as a projective mapping from one plane to another; in this case, it generates a “plan” view of the trajectories from a “perspective” photo. Figure 3 illustrates the mapping of the trajectory points from image coordinates to an orthogonal plan view (world coordinates). The coordinates of the trajectories are used to calculate the motion parameters (speed of travel, direction) for each agent.
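The image-to-world mapping described above can be illustrated with a minimal sketch. It assumes that a $3\times 3$ homography matrix is already known (for example, from camera calibration); the function name `image_to_world` and the matrix `H_example` are hypothetical and only serve the illustration, they are not part of the original tracker.

```python
import numpy as np

def image_to_world(points_px, H):
    """Map N x 2 image points to ground-plane coordinates with a 3 x 3 homography."""
    pts = np.asarray(points_px, dtype=float)
    # Append a 1 to each point to obtain homogeneous coordinates (N x 3).
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    mapped = homog @ H.T
    # Divide by the third coordinate to return to Cartesian coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical homography: a pure scaling of 0.01 m per pixel (assumption).
H_example = np.array([[0.01, 0.0, 0.0],
                      [0.0, 0.01, 0.0],
                      [0.0, 0.0, 1.0]])

world = image_to_world([[100, 200], [150, 200]], H_example)
```

With a real camera, the matrix would contain perspective terms in its last row, and the division by the third coordinate is what removes the perspective distortion.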

We compute information for each person $i$ at each timestep: i) 2D position $\vec{x}_{i}$ (meters); ii) speed $s_{i}$ (meters/frame); iii) angular variation $\alpha_{i}$ (degrees) w.r.t. a reference vector $\vec{r}=(1,0)$ ; iv) isolation level $\varphi_{i}$ ; v) socialization level $\vartheta_{i}$ ; and vi) collectivity $\phi_{i}$ . The isolation level $\varphi$ is computed as shown in Equation 1:

$$\varphi_{i}=\begin{cases}1, & \text{if } n_{social}=0\\ \dfrac{\frac{1}{n_{social}}\sum_{j=0}^{n_{social}-1}d(\vec{x}_{i},\vec{x}_{j})}{d_{Hall}}, & \text{otherwise}\end{cases} \tag{1}$$

where $n_{social}$ is the number of individuals in the social space according to Hall’s proxemics hall98 (the social space is related to $3.6$ m hall98 ), $d$ is a simple function to calculate the Euclidean distance between two individuals $i$ and $j$ ( $\vec{x}_{i}$ and $\vec{x}_{j}$ are, respectively, the positions of individuals $i$ and $j$ ) and $d_{Hall}=3.6$ is the distance (in meters) around an individual that represents his/her social space. The socialization level $\vartheta_{i}$ of an individual $i$ is calculated according to Equation 2:

$$\vartheta_{i}=\begin{cases}0, & \text{if } n_{social}=0\\ \dfrac{n_{social}}{\rho}, & \text{otherwise}\end{cases} \tag{2}$$

where $\rho$ is the total number of individuals in the analyzed frame. To compute the collectivity affecting individual $i$ and exerted by the $n_{social}$ individuals in his/her social space (as presented in Favaretto:Sib:2016 ), we use Equation 3, as follows:

$$\phi_{i}=\sum_{j=0}^{n-1}\gamma e^{(-\beta\varpi(i,j)^{2})}, \tag{3}$$

where the collectivity between two individuals $i$ and $j$ is calculated as a decay function of $\varpi(i,j)=s(s_{i},s_{j})\cdot w_{1}+o(\alpha_{i},\alpha_{j})\cdot w_{2}$ , where $s$ and $o$ are, respectively, their speed and orientation differences, and $w_{1}$ and $w_{2}$ are constants that regulate the offset in meters and radians. We have used $w_{1}=1$ and $w_{2}=1$ . $\gamma=1$ is the maximum collectivity value, reached when $\varpi(i,j)=0$ , and $\beta=0.3$ is the empirically defined decay constant. Hence, $\phi_{i}$ is a value in the interval $[0;1]$ . Therefore, for each individual $i$ at each frame $f$ in a video sequence, we compute a vector $\vec{V_{i,f}}$ of extracted data, where $\vec{V_{i,f}}=\left[s_{i,f},\alpha_{i,f},\varphi_{i,f},\vartheta_{i,f},\phi_{i,f}\right]$ . In addition, the average values over all frames in the video sequence are represented by the vector $\vec{V_{i}}$ of individual $i$ , where $\vec{V_{i}}=\left[\bar{s}_{i},\bar{\alpha}_{i},\bar{\varphi}_{i},\bar{\vartheta}_{i},\bar{\phi}_{i}\right]$ .
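Equations 1 to 3 can be sketched for a single frame as follows. This is only an illustration under stated assumptions: the speed and orientation differences $s$ and $o$ are taken as absolute differences, orientations are given in radians, the sum in Equation 3 runs over the neighbors inside the social space, and an individual is not counted as his/her own neighbor. All function names are hypothetical.

```python
import math

D_HALL = 3.6          # social-space distance in meters (from the text)
GAMMA, BETA = 1.0, 0.3  # maximum collectivity and decay constant (from the text)
W1, W2 = 1.0, 1.0       # weights for speed and orientation differences

def social_neighbors(i, positions):
    """Indices of individuals within D_HALL of individual i (excluding i itself)."""
    return [j for j in range(len(positions))
            if j != i and math.dist(positions[i], positions[j]) <= D_HALL]

def isolation(i, positions):
    """Equation 1: mean distance to social neighbors, relative to D_HALL."""
    nb = social_neighbors(i, positions)
    if not nb:
        return 1.0
    mean_d = sum(math.dist(positions[i], positions[j]) for j in nb) / len(nb)
    return mean_d / D_HALL

def socialization(i, positions):
    """Equation 2: social neighbors divided by all individuals in the frame."""
    return len(social_neighbors(i, positions)) / len(positions)

def collectivity(i, positions, speeds, angles):
    """Equation 3, assuming absolute differences for s(.) and o(.)."""
    total = 0.0
    for j in social_neighbors(i, positions):
        varpi = abs(speeds[i] - speeds[j]) * W1 + abs(angles[i] - angles[j]) * W2
        total += GAMMA * math.exp(-BETA * varpi ** 2)
    return total

# Hypothetical frame: two people walking together and one far away.
positions = [(0.0, 0.0), (1.8, 0.0), (10.0, 0.0)]
speeds = [1.0, 1.0, 1.0]
angles = [0.0, 0.0, 0.0]
```

In this hypothetical frame, the first individual has one neighbor at half of Hall's social distance and identical motion, while the third individual has no neighbors and is therefore fully isolated.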

For group detection, we use the computed parameters $s$ , $o$ and $d$ , which respectively stand for the speed variation, orientation variation and distance of each pair of agents $i$ and $j$ . We use the notion of distances based on the proxemics described by Hall Hall:1990 to define that two agents belong to the same group according to three empirically defined tests: $d(x_{i},x_{j})\leq 1.2$ meters, $o(\alpha_{i},\alpha_{j})\leq 15^{\circ}$ and $s(s_{i},s_{j})<\beta$ , where $\beta$ is $5\%$ of the higher speed. Based on this set of rules, agents are grouped in pairs. In the next step, we check which pairs have one individual in common, and merge them into larger groups. This process is repeated until the groups no longer share individuals, i.e. they are disjoint. For each group $g$ detected in the video sequence, we define a vector $\vec{G_{g}}=\left[n_{g},\vec{I}_{g},\bar{s}_{g},\bar{\alpha}_{g},\bar{d}_{g}\right]$ of extracted data, where $n_{g}$ is the number of individuals in $g$ , $\vec{I}_{g}$ stands for a vector containing the id of each individual in $g$ and $\bar{s}_{g}$ , $\bar{\alpha}_{g}$ and $\bar{d}_{g}$ are the average values of speed, orientation and distance of the individuals in group $g$ during the video sequence.
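The pairing and merging procedure can be sketched as below. This is a minimal illustration and not the authors' implementation: merging pairs that share an individual is realized here with a union-find structure, the speed test is read as "the speed difference is below 5% of the higher of the two speeds", and the orientation difference ignores wrap-around at $360^{\circ}$. The names `same_group` and `detect_groups` are hypothetical.

```python
import math
from itertools import combinations

def same_group(a, b):
    """Three pairwise tests; a and b are dicts with 'pos' (m), 'angle' (deg), 'speed'."""
    close = math.dist(a["pos"], b["pos"]) <= 1.2
    aligned = abs(a["angle"] - b["angle"]) <= 15.0
    # Assumption: threshold is 5% of the higher of the two speeds.
    similar = abs(a["speed"] - b["speed"]) < 0.05 * max(a["speed"], b["speed"])
    return close and aligned and similar

def detect_groups(agents):
    """Group agents in pairs, then merge pairs sharing a member until groups are disjoint."""
    parent = list(range(len(agents)))

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(agents)), 2):
        if same_group(agents[i], agents[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(agents)):
        groups.setdefault(find(i), []).append(i)
    # Keep only groups with at least two individuals.
    return sorted(g for g in groups.values() if len(g) > 1)

# Hypothetical agents: 0-1 and 1-2 pass the tests, 0-2 are too far apart, 3 is alone.
agents = [
    {"pos": (0.0, 0.0), "angle": 0.0, "speed": 1.00},
    {"pos": (1.0, 0.0), "angle": 5.0, "speed": 1.02},
    {"pos": (2.0, 0.0), "angle": 10.0, "speed": 1.03},
    {"pos": (10.0, 10.0), "angle": 0.0, "speed": 1.00},
]
found = detect_groups(agents)
```

In the hypothetical example, agents 0 and 2 do not pass the distance test directly, but they end up in the same group because both are paired with agent 1.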

At the end of this step, we have $\vec{V}$ for all individuals, per frame and averaged over the video sequence, and $\vec{G}$ for all groups, averaged over the video sequence. In the next section we describe how these data are used to find the personality traits in our work.

3.2 Mapping crowd features in OCEAN Dimensions

We used the empirical equations proposed by Favaretto et al. favaretto:sib2017 in a previous work whose goal is to map individual and group characteristics to OCEAN cultural dimensions. Basically, the method proposes to “answer” 25 items from the NEO PI-R inventory for each individual in the video sequence. Although the complete version of the NEO PI-R has $240$ items, 25 of them were selected since they have a direct relationship with crowd behavior. Each question is mapped to an equation using only information contained in $V$ for each $i$ . For example, in order to represent the item “1 - Have clear goals, work to them in orderly way”, we consider that the individual $i$ should have a high velocity $s$ and a low angular variation $\alpha$ to have an answer compatible with “Strong”. So the equation for this item is $Q_{1}=s_{i}+\frac{1}{\alpha_{i}}$ . Table 1 shows the equations used for each considered question.

Table 1: Equations from each NEO PI-R selected item.

| NEO PI-R Item | Equation |
|---|---|
| 1. Have clear goals, work to them in orderly way | $Q_{1}=s_{i}+\frac{1}{\alpha_{i}}$ |
| 2. Follow same route when go somewhere | $Q_{2}=\alpha_{i}$ |
| 3. Shy away from crowds | $Q_{3-8}=\varphi_{i}$ |
| 4. Don’t get much pleasure chatting with people | $Q_{3-8}=\varphi_{i}$ |
| 5. Usually prefer to do things alone | $Q_{3-8}=\varphi_{i}$ |
| 6. Prefer jobs that let me work alone, unbothered | $Q_{3-8}=\varphi_{i}$ |
| 7. Wouldn’t enjoy holiday in Las Vegas | $Q_{3-8}=\varphi_{i}$ |
| 8. Many think of me as somewhat cold, distant | $Q_{3-8}=\varphi_{i}$ |
| 9. Rather cooperate with others than compete | $Q_{9-10}=\phi_{i}$ |
| 10. Try to be courteous to everyone I meet | $Q_{9-10}=\phi_{i}$ |
| 11. Social gatherings usually bore me | $Q_{11}=\varphi_{i}+std(\alpha_{i})$ |
| 12. Usually seem in hurry | $Q_{12}=s_{i}+\alpha_{i}$ |
| 13. Often disgusted with people I have to deal with | $Q_{13}=\varphi_{i}+\frac{1}{\phi_{i}}$ |
| 14. Have often been leader of groups belonged to | $Q_{14}=\phi_{i}+\vartheta_{i}+\frac{1}{\alpha_{i}}$ |
| 15. Would rather go my own way than be a leader | $Q_{15}=\frac{1}{Q_{14}}$ |
| 16. Like to have lots of people around me | $Q_{16-21}=\vartheta_{i}$ |
| 17. Enjoy parties with lots of people | $Q_{16-21}=\vartheta_{i}$ |
| 18. Like being part of crowd at sporting events | $Q_{16-21}=\vartheta_{i}$ |
| 19. Would rather a popular beach than isolated cabin | $Q_{16-21}=\vartheta_{i}$ |
| 20. Really enjoy talking to people | $Q_{16-21}=\vartheta_{i}$ |
| 21. Like to be where action is | $Q_{16-21}=\vartheta_{i}$ |
| 22. Feel need for other people if by myself for long | $Q_{22-25}=\vartheta_{i}+\phi_{i}$ |
| 23. Find it easy to smile, be outgoing with strangers | $Q_{22-25}=\vartheta_{i}+\phi_{i}$ |
| 24. Rarely feel lonely or blue | $Q_{22-25}=\vartheta_{i}+\phi_{i}$ |
| 25. Seldom feel self-conscious around people | $Q_{22-25}=\vartheta_{i}+\phi_{i}$ |
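The equations of Table 1 can be evaluated for one individual as in the sketch below. It returns the raw value of each equation; how these raw values are rescaled to the questionnaire's answer scale is not specified here, so no rescaling is attempted. The guard `eps`, which avoids a division by zero when $\alpha_{i}$ , $\phi_{i}$ or $Q_{14}$ is zero, is an assumption of this sketch, as is the function name `answer_items`.

```python
def answer_items(s, alpha, iso, soc, col, alpha_std, eps=1e-6):
    """Raw values of the 25 item equations of Table 1 for one individual.

    s: speed, alpha: angular variation, iso: isolation (varphi),
    soc: socialization (vartheta), col: collectivity (phi),
    alpha_std: standard deviation of the angular variation.
    """
    def inv(x):
        # Assumption: clamp the denominator to avoid dividing by zero.
        return 1.0 / max(x, eps)

    q = {}
    q[1] = s + inv(alpha)
    q[2] = alpha
    for k in range(3, 9):        # items 3 to 8
        q[k] = iso
    for k in (9, 10):            # items 9 and 10
        q[k] = col
    q[11] = iso + alpha_std
    q[12] = s + alpha
    q[13] = iso + inv(col)
    q[14] = col + soc + inv(alpha)
    q[15] = inv(q[14])
    for k in range(16, 22):      # items 16 to 21
        q[k] = soc
    for k in range(22, 26):      # items 22 to 25
        q[k] = soc + col
    return q

# Hypothetical feature values for one individual.
q_example = answer_items(s=1.0, alpha=2.0, iso=0.5, soc=0.25, col=0.5, alpha_std=0.1)
```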

Once all questions $k$ (in the interval $[1;25]$ ) have been answered for all individuals $i$ , we have $\vec{Q_{i,k}^{f}}$ for each frame $f$ . In addition, we compute the average values to obtain one vector $\vec{Q_{i,k}}$ per video. According to the NEO PI-R definition, each of the questions $\vec{Q^{\prime}_{k}}$ is associated with one of the Big Five dimensions, and some questions should have their values inverted, because an item score of 4 (Strongly Agree) can represent either a high or a low value of a certain personality trait. So, to get the correct values, we applied a factor to the questions whose score should be inverted: $\vec{Q^{*}_{i,k}}=4-\vec{Q^{\prime}_{i,k}}$ , as shown in the next equations:

$$O_{i}=\frac{Q^{*}_{i,2}}{\varrho}, \tag{4}$$

$$C_{i}=\frac{Q^{\prime}_{i,1}}{\varrho}, \tag{5}$$

$$E^{\prime}_{i}=Q^{\prime}_{i,3}+Q^{\prime}_{i,12}+Q^{\prime}_{i,14}+\sum_{q=16}^{23}Q^{\prime}_{i,q}, \tag{6}$$

$$E^{*}_{i}=\sum_{q=4}^{8}Q^{*}_{i,q}+Q^{*}_{i,11}+Q^{*}_{i,15}, \tag{7}$$

$$E_{i}=\frac{(E^{\prime}_{i}+E^{*}_{i})}{\varrho}, \tag{8}$$

$$A_{i}=\frac{\sum_{q=9}^{10}Q^{\prime}_{i,q}}{\varrho}, \tag{9}$$

$$N_{i}=\frac{Q^{\prime}_{i,13}+\sum_{q=24}^{25}Q^{*}_{i,q}}{\varrho}, \tag{10}$$

where $\varrho$ represents the percentage of questions from the total, in each dimension (O, C, E, A and N), respectively 4%, 4%, 72%, 8% and 12%.
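Equations 4 to 10 can be sketched as follows. The sketch assumes that the item scores $Q^{\prime}$ are already on the $0$ to $4$ answer scale and that $\varrho$ is used as a fraction (e.g. $0.04$ for $4\%$ ); under this reading each dimension lies between $0$ and $100$ . Both points are assumptions of this illustration, and the names `RHO` and `ocean_from_items` are hypothetical.

```python
# Share of the 25 items associated with each dimension (from the text), as fractions.
RHO = {"O": 0.04, "C": 0.04, "E": 0.72, "A": 0.08, "N": 0.12}

def ocean_from_items(q):
    """Equations 4 to 10; q maps item number (1..25) to a score Q' in [0, 4]."""
    def inverted(k):
        # Q* = 4 - Q' for items whose score must be inverted.
        return 4 - q[k]

    o = inverted(2) / RHO["O"]
    c = q[1] / RHO["C"]
    e_direct = q[3] + q[12] + q[14] + sum(q[k] for k in range(16, 24))
    e_inverted = sum(inverted(k) for k in range(4, 9)) + inverted(11) + inverted(15)
    e = (e_direct + e_inverted) / RHO["E"]
    a = (q[9] + q[10]) / RHO["A"]
    n = (q[13] + inverted(24) + inverted(25)) / RHO["N"]
    return {"O": o, "C": c, "E": e, "A": a, "N": n}

# Hypothetical individual who answers every item with the neutral score 2.
neutral = ocean_from_items({k: 2 for k in range(1, 26)})
```

Under the stated assumptions, an individual with neutral answers on every item obtains the midpoint value in all five dimensions, which gives a quick sanity check of the item-to-dimension assignment (1, 1, 18, 2 and 3 items for O, C, E, A and N).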

3.3 Emotion Detection

In this paper we connect personality and emotion traits in order to detect both in individuals in video sequences. As mentioned by revelle2009personality :

“A helpful analogy is to consider that personality is to emotion as climate is to weather. That is, what one expects is personality, what one observes at any particular moment is emotion.” That is the reason why, in our work, we correlate emotion traits with visual behaviors that can be perceived in pedestrian motion.

Our work proposes an emotion mapping based on the personality traits (i.e. OCEAN) found for each individual present in the video sequence. First, we selected four emotions from the OCC ortony1990cognitive model: Fear, Happiness, Sadness and Anger. It is important to notice that we chose only the four emotions that, in our opinion, are the most visible in motion behavior, which is our case in video sequences with pedestrians. Any other of the 22 emotions proposed in the OCC model could also be mapped.

Once we have computed the personality traits for an individual, we propose a way to map these personality traits into the four considered emotions. We extended the vector $\vec{V}_{i,f}$ of individual $i$ at frame $f$ with personality and emotion parameters, as follows: $\vec{V}_{i,f}=\left[s_{i,f},\alpha_{i,f},\varphi_{i,f},\vartheta_{i,f},\phi_{i,f},\vec{P}_{i,f},\vec{E}_{i,f}\right]$ . $\vec{P}_{i,f}$ stands for the values of OCEAN personality $\left[O_{i,f},C_{i,f},E_{i,f},A_{i,f},N_{i,f}\right]$ , while $\vec{E}_{i,f}$ stands for the values of emotion based on the OCC model: $\left[F_{i,f},H_{i,f},S_{i,f},An_{i,f}\right]$ . Values of $\vec{P}$ are in the interval $[0;1]$ while values of $\vec{E}$ are in the interval $[-3;3]$ . For group analysis, we only considered the personality and emotion parameters of the groups $g$ detected in the video sequence. In addition to the previously defined $\vec{G_{g}}=\left[n_{g},\vec{I}_{g},\bar{s}_{g},\bar{\alpha}_{g},\bar{d}_{g}\right]$ , we include $\vec{P}_{g}$ and $\vec{E}_{g}$ , containing the average values of the $n_{g}$ individuals in $g$ .

In order to map from OCEAN to emotion parameters we consider some aspects from the literature costa07 :

• O- : person is closed to interaction with others;
• O+ : person is aware of his/her feelings;
• C+ : person is optimistic;
• C- : person is pessimistic;
• E+ : person has a strong relationship with positive emotions;
• E- : person presents a relationship with negative emotions;
• A+ : person has a strong relationship with positive reactions;
• A- : person presents a relationship with negative reactions;
• N- : person is known for emotional stability;
• N+ : person feels negative emotions.

Such data resulted in the empirical definitions included in Table 2, which shows the mapping from OCEAN traits to the chosen emotions. In fact, we were not the first to propose this type of mapping. Saifi et al. saifi2016approach propose similar data for a different approach, where the authors were interested in providing critical emotions in crowd simulators. Davis and Panksepp davis2011brain also proposed a similar approach unifying basic emotions with personality.

In Table 2, the plus/minus signs along each factor represent the positive/negative value of each one. For example, O+ stands for positive values (i.e. O $\geq$ 0.5) and O- stands for negative values (i.e. O $<$ 0.5). A positive value for a given factor (i.e. 1) means that the stronger the OCEAN trait is, the stronger the emotion is too. A negative value (i.e. -1) does the opposite: the stronger the factor’s value, the weaker the given emotion. A zero value means that a given emotion is not affected at all by the given factor. To better illustrate, a hypothetical example is given: if an individual has a high value for Extraversion (for example, E = 0.9), then, following the mapping in Table 2, this individual can present signs of happiness (i.e. if E+ then Happiness = 1) and should not be angry (i.e. if E+ then Anger = -1).

Table 2: Emotion mapping from OCEAN to OCC. The plus/minus signals along each factor represent the positive/negative value of each one.

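| Factor | Fear | Happiness | Sadness | Anger |
|---|---|---|---|---|
| O+ | 0 | 0 | 0 | -1 |
| O- | 0 | 0 | 0 | 1 |
| C+ | -1 | 0 | 0 | 0 |
| C- | 1 | 0 | 0 | 0 |
| E+ | -1 | 1 | -1 | -1 |
| E- | 1 | 0 | 0 | 0 |
| A+ | 0 | 0 | 0 | -1 |
| A- | 0 | 0 | 0 | 1 |
| N+ | 1 | -1 | 1 | 1 |
| N- | -1 | 1 | -1 | -1 |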

Finally, emotions are normalized in the interval $[0;1]$ considering the lowest and highest achieved values in the video, in order to keep the obtained scale.
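The mapping of Table 2 and the final normalization can be sketched as below. The rule used to combine the five factors is not stated explicitly, so this sketch assumes that the table entries selected by each factor's sign are simply summed; under this assumption Anger can reach $-4$ , which falls outside the stated interval $[-3;3]$ , so the actual combination rule may differ. The names `TABLE2`, `emotions_from_ocean` and `normalize` are hypothetical.

```python
EMOTIONS = ("Fear", "Happiness", "Sadness", "Anger")

# Table 2: contribution of each signed factor to (Fear, Happiness, Sadness, Anger).
TABLE2 = {
    "O+": (0, 0, 0, -1), "O-": (0, 0, 0, 1),
    "C+": (-1, 0, 0, 0), "C-": (1, 0, 0, 0),
    "E+": (-1, 1, -1, -1), "E-": (1, 0, 0, 0),
    "A+": (0, 0, 0, -1), "A-": (0, 0, 0, 1),
    "N+": (1, -1, 1, 1), "N-": (-1, 1, -1, -1),
}

def emotions_from_ocean(ocean):
    """ocean maps 'O','C','E','A','N' to values in [0, 1]; returns emotion scores."""
    totals = [0, 0, 0, 0]
    for factor, value in ocean.items():
        # A factor is positive when its value is at least 0.5 (from the text).
        key = factor + ("+" if value >= 0.5 else "-")
        for idx, weight in enumerate(TABLE2[key]):
            totals[idx] += weight  # assumption: contributions are summed
    return dict(zip(EMOTIONS, totals))

def normalize(values):
    """Min-max normalization to [0, 1] over the values observed in a video."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical individual: extraverted (E = 0.9) and emotionally stable (N = 0.2).
scores = emotions_from_ocean({"O": 0.6, "C": 0.6, "E": 0.9, "A": 0.6, "N": 0.2})
```

For this hypothetical individual, Happiness is the only positive score, which agrees with the E+ example given in the text.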

4 Experimental Results

In this section we discuss some results obtained with our approach. We organized it into four different analyses: i) OCEAN and emotion recognition in spontaneous videos (Sections 4.1 and 4.2, respectively), ii) an OCEAN and emotion comparison with the literature (Section 4.3), iii) an OCEAN and emotion analysis in the Fundamental Diagram experiment (Section 4.4) and iv) a quantitative analysis of personal space (Section 4.5).

4.1 OCEAN analysis

In this section, we present the OCEAN analysis of spontaneous videos (crowds in public spaces). First, we calculate the OCEAN of each individual in the video at each frame. Once we have the individuals' OCEAN, the group OCEAN is computed as the average of the OCEAN of the individuals that are part of the group. Figure 4 shows a representation of each individual's OCEAN in a given frame. We used five colored boxes to represent the five dimensions, where blue is related to Openness, cyan indicates Conscientiousness, green indicates Extraversion, yellow means Agreeableness and red, Neuroticism.

The group of individuals highlighted in Figure 4 is composed of two people. The OCEAN of this group (named $G1$ ) over time is illustrated in Figure 5(a), and is obtained by averaging the OCEAN of its individuals ( $P9$ and $P10$ ), presented in Figures 5(b) and (c). As can be seen in these figures, the two individuals present more motion variation at the beginning of the video and then keep the same motion characteristics until the end of the short movie (whose duration is 100 frames). Their Openness is high because they keep a low angular variation along their trajectories. On the other hand, their Conscientiousness dimension is lower than the other dimensions because they keep low speeds in comparison to other groups.
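The group-level computation is a per-dimension average over the group members, as in the minimal sketch below. The OCEAN values used for the two individuals are hypothetical and do not correspond to the values plotted in Figure 5; the function name `group_ocean` is also an assumption.

```python
DIMENSIONS = ("O", "C", "E", "A", "N")

def group_ocean(members):
    """Average each OCEAN dimension over the individuals of a group."""
    n = len(members)
    return {dim: sum(m[dim] for m in members) / n for dim in DIMENSIONS}

# Hypothetical values for two group members at one frame.
p9 = {"O": 0.8, "C": 0.2, "E": 0.5, "A": 0.6, "N": 0.4}
p10 = {"O": 0.6, "C": 0.4, "E": 0.5, "A": 0.4, "N": 0.2}
g1 = group_ocean([p9, p10])
```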

The group OCEAN reflects the individuals' OCEAN. An analysis of OCEAN at the country level is presented later in this section. Once we have the OCEAN values, individuals' emotions are detected in the analyzed videos, as discussed in the next section.

4.2 Emotion analysis

This section presents the emotion analysis of spontaneous videos. Figure 6 shows an example of emotion detection in a video from Austria. A filled square indicates that the person has a positive value for that emotion, a half-filled square means that the emotion is neutral and an empty square means that the person has a negative value for the emotion. For example, the person highlighted with the blue arrow has a negative value for the Anger emotion (the red square is not filled). This happens because this individual is interacting with another one (walking in the same direction and with similar angular variation), so the collectivity and socialization levels are high and the isolation level is low; consequently, Anger receives a negative score.

Still in Figure 6, the person highlighted with the orange arrow gets a zero value for the Happiness emotion (the green square is half filled). This happens because she/he is alone, changing orientation (angular variation), with high isolation. The person highlighted with the red arrow gets a positive value for Fear (the yellow square is completely filled). This happens because this individual is moving slower and with high angular variation, so his/her dimension C is low, generating a high value for Fear. The color legend is the following: red is related to Anger, yellow means Fear, green indicates Happiness and blue, Sadness.

In another example, shown in Figure 7(a), we highlight two different situations: a group (green circle) and an individual who is alone (red circle). It is interesting to notice that individuals who are part of a bigger group or have a high collectivity tend to be happy, as we can see in the highlighted group in Figure 7(b). On the other hand, individuals who are alone and distant from others tend to experience negative emotions (see an example in Figure 7(c)).

4.3 OCEAN and Emotion Comparison with Literature

In the next experiment we compare our results with the available literature costa07 about the OCEAN values of different Countries. In the present work we focus on Germany and Brazil, since in the next section we also present a comparison between these two Countries. It is important to emphasize that the data reported in costa07 were acquired from subjects' answers to surveys and not from their visual behavior in video sequences. We use the OCEAN dimensions of the two analyzed Countries (Brazil and Germany), as presented by Costa et al. costa07 , as ground truth in our approach. We evaluated our method on a set of $10$ available videos, where 2 videos were from Germany and 8 from Brazil.

We evaluate the accuracy of our approach in detecting the OCEAN values of the Countries based on the percentage difference with respect to the literature results, considering all dimensions among all videos. Figure 8 shows the results obtained with our approach for the two countries in all OCEAN dimensions, in comparison with the literature costa07 . It is interesting to highlight that the results achieved for Brazil showed a higher accuracy than those for Germany. At the same time, Brazil was the country with more videos to be analyzed.
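One plausible reading of an accuracy based on the percentage difference is sketched below. The exact formula is not given in the text, so the expression $100\cdot(1-|est-ref|/ref)$ is an assumption of this illustration, and the numbers in the example are hypothetical placeholders, not values from costa07 or from our experiments.

```python
def percent_accuracy(estimated, reference):
    """Per-dimension accuracy (%) from the relative difference to a reference value.

    Assumption: accuracy = 100 * (1 - |estimated - reference| / reference).
    """
    return {
        dim: 100.0 * (1.0 - abs(estimated[dim] - reference[dim]) / reference[dim])
        for dim in reference
    }

# Hypothetical placeholder values, used only to exercise the function.
estimated = {"O": 45.0, "E": 50.0}
reference = {"O": 50.0, "E": 50.0}
acc = percent_accuracy(estimated, reference)
```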

In the same way, we compute the emotions in the country level. For this, similar to the OCEAN approach, the emotion of each country is computed as the mean emotion from the videos from that country. Figure 9 shows the mean emotion from the analyzed countries in this experiment. As far as we know, there is no data to be compared in this matter.

4.4 OCEAN and emotion analysis in the Fundamental Diagram experiment

In the previous sections we were interested in detecting personality and emotion traits in videos from different countries, recorded in public spaces. However, the contexts are not the same: people can be close to or far away from others not just because of their personality, but because of the context they are in or their relationship with the space. Consequently, evaluating and validating results in this research is a difficult challenge. Ideally, one would discard the spatial context in order to analyze only the behavior of people from different countries while they execute the same task. For this reason, we adopted the Fundamental Diagram applied to pedestrians, as proposed in Chattaraj:2009 . We detect personality and emotion traits in video sequences from two countries (Brazil and Germany; we had access to these videos thanks to the authors of the database of PED experiments, available at http://ped.fz-juelich.de/db/), where all individuals perform exactly the same task, i.e. walking in a controlled space. The experiments were performed in an environment set up as illustrated in Figure 10.

This experiment was conducted as described in Chattaraj:2009 , in the same way in both countries and with the same populations ( $N=15$ , $25$ and $34$ ). The corridor was built with markers and tape on the ground. Its size and shape are presented in Figure 10. The length of the corridor is $17.3$ m, while the width of the passageway is $0.8$ m, which is sufficient for a single person to walk through. In addition, the bottom of Figure 10 shows a rectangle of 2 x 0.8 meters that illustrates the Region of Interest (ROI) where the populations were captured for analysis, as proposed in Chattaraj:2009 .
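To give a rough sense of how crowded the corridor becomes for each population, the short sketch below computes a global line density, i.e. the number of people divided by the corridor length. This is an assumption made for illustration only: the analysis in Chattaraj:2009 relies on measurements inside the ROI, and the function name below is hypothetical.

```python
CORRIDOR_LENGTH_M = 17.3      # corridor length stated in the text
POPULATIONS = (15, 25, 34)    # population sizes stated in the text


def global_line_density(n_people: int, length_m: float = CORRIDOR_LENGTH_M) -> float:
    """People per meter of corridor (a coarse proxy, not the ROI-based density)."""
    if length_m <= 0:
        raise ValueError("corridor length must be positive")
    return n_people / length_m


densities = {n: round(global_line_density(n), 2) for n in POPULATIONS}
print(densities)  # {15: 0.87, 25: 1.45, 34: 1.97}
```

Under this proxy, the density for $N=34$ is a little more than twice the density for $N=15$ .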

Figure 11 shows examples of the emotions detected for individuals in the Fundamental Diagram experiment. The pictures show the tests performed with different population sizes in both countries.

Figure 12 shows the OCEAN values for Brazil and Germany in the experiments with the extreme population sizes, i.e. $N=15$ and $N=34$ . By visual inspection alone, the OCEAN values of the two countries are more similar when the density of people in the experiment is higher. This may indicate that, because of the higher density and the lack of free space, people assume group-level behavior instead of individual-level behavior. This agrees with several theories about mass behavior, as discussed in vilanova2017 and LE_BON_THE_CROWD .

In this analysis, we also compute Pearson’s correlation to measure the similarity between the two countries in both OCEAN and emotion aspects. Figure 13 shows the Pearson’s correlation of the OCEAN values between the countries, while Figure 14 shows the Pearson’s correlation of the emotion values.
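As an illustration of this similarity measure, the sketch below computes Pearson's correlation coefficient between two vectors in plain Python. It assumes that each country is represented, for a given population size, by a vector with one value per OCEAN dimension (or per emotion); this representation and the numbers used are hypothetical placeholders, not data from our experiments.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equally sized sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder OCEAN vectors (O, C, E, A, N) for one population size; hypothetical values.
brazil = [0.55, 0.40, 0.62, 0.48, 0.35]
germany = [0.50, 0.45, 0.58, 0.52, 0.30]
print(pearson(brazil, germany))
```

A value close to 1 means that the two countries rank the dimensions in a similar way, regardless of differences in absolute magnitude.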

Considering Figures 13 and 14 together, it is possible to see that, except for the population $N=25$ in the OCEAN experiment, the correlation between the countries increases as the density of people in the experiment increases. We also investigated how emotion and OCEAN are affected by the density of people. This was possible because the performed task is the same (i.e. individuals walk in the same predefined environment) and only the number of people increases. Figure 15(a) shows how the emotion values vary with the density of people in the videos from Germany. The emotions Anger, Fear and Sadness increase proportionally as the density increases, whereas Happiness decreases proportionally as a function of the observed density. Indeed, the data observed for Germany agree with what was empirically expected in our hypothesis, i.e. the only positive emotion (H) decreases as the density increases. However, the emotions computed for Brazil were not as well behaved as those for Germany, which deserves further investigation in future work. One possible explanation is that the Brazilian participants were colleagues or friends, while the German participants were reported in Chattaraj:2009 as volunteers for the experiment.

Although we did not find in the literature any work on emotion and personality detection in videos from different countries, these results seem promising for understanding people's personality, emotion and cultural differences in video sequences.

4.5 Quantitative Analysis: Personal Space

In this experiment we compare the preferred distances people keep from others, as evaluated in the study by sorokowska:2017 . The results obtained with our approach are compared with the results of sorokowska:2017 , as shown in Figure 16. In the Sorokowska study, answers were given on a distance scale (0-220 cm) anchored by two human-like figures, labeled A and B. Participants were asked to imagine themselves as Person A and to rate how close Person B could approach so that they would still feel comfortable in a conversation with Person B.

In our approach we measure the distance that a person A keeps from a person B directly in front of them in the video sequences. For the comparison, we use the distances from the Fundamental Diagram video sequences with varying numbers of people ( $N=15$ , $N=20$ , $N=25$ and $N=30$ ); from Sorokowska’s study we select the evaluation for acquaintances, i.e. people who are neither close nor strangers, similar to the people in our experiment. As Figure 16 shows, although the distances from our approach are larger than those from Sorokowska when there are few people ( $N=15$ and $N=20$ ), the proportion is similar in both scenarios. In addition, for $N=30$ , the distances observed in our approach are close to those obtained in the Sorokowska study. In all scenarios, people from Brazil keep larger distances from others than people from Germany, which also happens in Sorokowska’s research. Although the experiments differ (videos versus surveys), our method suggests that there is a correlation between how people feel in an abstract setting (as in surveys) and how they behave in video sequences.
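The distance measurement can be sketched as follows. This is a simplified illustration under stated assumptions: each tracked person is reduced to a one-dimensional position (in centimeters) along the walking direction inside the ROI, and the person "in front" is the next one in that ordering. The function names and the sample positions are hypothetical.

```python
def distances_to_person_ahead(positions_cm):
    """Gap between each person and the next person ahead, for a single frame.

    Assumes 1-D positions along the walking direction; the leading person
    has nobody ahead inside the ROI and therefore contributes no gap.
    """
    ordered = sorted(positions_cm)
    return [ahead - behind for behind, ahead in zip(ordered, ordered[1:])]


def mean_front_distance(frames):
    """Mean gap over all frames, or None when no frame holds two or more people."""
    gaps = [gap for frame in frames for gap in distances_to_person_ahead(frame)]
    return sum(gaps) / len(gaps) if gaps else None


# Placeholder positions (cm) for two frames; hypothetical values.
frames = [[20, 150, 90], [10, 100]]
print(distances_to_person_ahead(frames[0]))  # [70, 60]
print(mean_front_distance(frames))
```

Averaging such gaps per country and per population size gives values that can be placed next to the survey-based distances on the same centimeter scale.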

5 Final considerations

In this paper, we described a way to detect individual-level traits of emotion and personality observed in video sequences, based on features of individuals and groups. We proposed to detect OCEAN personality traits and compared them with data from different countries available in the literature. In addition, based on OCEAN, we compute 4 emotion traits defined in the OCC model. We believe the results are promising and that video sequences can be used to detect personality and emotion, which can help us understand people's behavior in video sequences.

An important challenge in this area is the comparison with real-life data. In this work we successfully compared our results with OCEAN values of two specific countries (Brazil and Germany) reported in the psychological literature. Since one particular aspect to be considered in behavior analysis is the context and environment in which individuals behave, we decided to exclude such variations by fixing the task that the tested populations were required to execute. This is why we used the Fundamental Diagram (FD) for pedestrians. To do that, we performed a full experiment in Brazil to serve as a benchmark for our research. In addition, we measured the preferred personal distance of individuals in the FD experiment in order to present a quantitative data analysis.

The results obtained from our approach indicate that our model generates coherent information when compared to data provided in the available literature, as shown in various analyses. It is important to note that the mapping to OCEAN and emotion dimensions was empirically defined through equations using data extracted with computer vision. NEO PI-R results in the literature measured these dimensions by considering a different type of information (subjective responses of individuals collected through questionnaires). The preferred distances presented by Sorokowska et al. sorokowska:2017 were also based on subjective responses given by individuals.

For our future work, we intend to select items related to collectivity from the International Personality Item Pool (http://ipip.ori.org/), which originated the NEO PI-R, to establish a questionnaire that can identify individual-level traits that correspond to group behaviors and could be analyzed computationally. One way to do this would be to ask participants to answer the proposed questionnaire, then divide these participants into groups with high and low scores, and, finally, ask these groups to perform some behavioral tasks that could be mapped in videos of crowds. Such a set of individual-level traits related to group behavior may increase the accuracy not only of video-based group analysis tools but also of group simulations based on the unique characteristics of each agent.

References

(1) Baig, M.W., Baig, M.S., Bastani, V., Barakova, E.I., Marcenaro, L., Regazzoni, C.S., Rauterberg, M.: Perception of emotions from crowd dynamics. In: Digital Signal Processing (DSP), 2015 IEEE International Conference on, pp. 703–707. IEEE (2015)
(2) Barry, B., Stewart, G.L.: Composition, process, and performance in self-managed groups: The role of personality. Journal of Applied Psychology 82, 62 (1997)
(3) Bins, J., Dihl, L.L., Jung, C.R.: Target tracking using multiple patches and weighted vector median filters. MIV 45(3), 293–307 (2013). DOI 10.1007/s10851-012-0354-y. URL http://dx.doi.org/10.1007/s10851-012-0354-y
(4) Bon, G.: The Crowd: A Study of the Popular Mind. Cosimo classics personal development. Cosimo Classics (2006)
(5) Hall, C.S., Lindzey, G., Campbell, J.B.: Theories Of Personality, fourth edn. John Wiley & Sons, New Jersey (1998)
(6) Cai, Z., Yu, Z.L., Liu, H., Zhang, K.: Counting people in crowded scenes by video analyzing. In: 9th IEEE ICIEA, pp. 1841–1845 (2014). DOI 10.1109/ICIEA.2014.6931467
(7) Cattell, R.B.: The primary personality factors in women compared with those in men. British Journal of Statistical Psychology 1(2), 114–130 (1948). DOI 10.1111/j.2044-8317.1948.tb00231.x. URL http://dx.doi.org/10.1111/j.2044-8317.1948.tb00231.x
(8) Cattell, R.B.: Personality: A systematic, theoretical, and factual study, first edn. McGraw-Hill, New York (1950)
(9) Chan, A.B., Vasconcelos, N.: Bayesian poisson regression for crowd counting. In: 12th IEEE ICCV, pp. 545–551 (2009)
(10) Chandran, A., Poh, L.A., Vadakkepat, P.: Identifying social groups in pedestrian crowd videos. In: ICAPR, pp. 1–6 (2015). DOI 10.1109/ICAPR.2015.7050677
(11) Chattaraj, U., Seyfried, A., Chakroborty, P.: Comparison of pedestrian fundamental diagram across cultures. ACS 12(03), 393–405 (2009). DOI 10.1142/S0219525909002209. URL http://www.worldscientific.com/doi/abs/10.1142/S0219525909002209
(12) Chattaraj, U., Seyfried, A., Chakroborty, P.: Comparison of pedestrian fundamental diagram across cultures. ACS 12 (2009). DOI 10.1142/S0219525909002209
(13) Clore, G.L., Ortony, A.: Psychological construction in the occ model of emotion. Emotion Review 5(4), 335–343 (2013)
(14) Davis, K.L., Panksepp, J.: The brain’s emotional foundations of human personality and the affective neuroscience personality scales. Neuroscience & Biobehavioral Reviews 35(9), 1946–1958 (2011)
(15) Davis, K.L., Panksepp, J., Normansell, L.: The affective neuroscience personality scales: Normative data and implications. Neuropsychoanalysis 5(1), 57–69 (2003)
(16) Digman, J.M.: Personality structure: Emergence of the five-factor model. Annual Review of Psychology 41(1), 417–440 (1990). DOI 10.1146/annurev.ps.41.020190.002221
(17) Durupinar, F., Allbeck, J., Pelechano, N., Badler, N.: Creating crowd variation with the ocean personality model. In: Proc. of the 7th International Joint Conf. on Autonomous Agents and Multiagent Systems - Volume 3, pp. 1217–1220. IFAAMAS, Richland, SC (2008). URL http://dl.acm.org/citation.cfm?id=1402821.1402835
(18) Ekman, P., Friesen, W.V.: Constants across cultures in the face and emotion. Journal of personality and social psychology 17(2), 124 (1971)
(19) Favaretto, R.M., Dihl, L., Musse, S.R.: Detecting crowd features in video sequences. In: Proceedings of Conference on Graphics, Patterns and Images (SIBGRAPI). IEEE Computer Society's Conference Publishing Services (2016). DOI 10.1109/SIBGRAPI.2016.34
(20) Favaretto, R.M., Dihl, L., Musse, S.R., Vilanova, F., Costa, A.B.: Using big five personality model to detect cultural aspects in crowds. In: 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 223–229 (2017). DOI 10.1109/SIBGRAPI.2017.36
(21) Feng, L., Bhanu, B.: Understanding dynamic social grouping behaviors of pedestrians. IEEE STSP 9(2), 317–329 (2015)
(22) Fiske, D.W.: Consistency of The Factorial Structures in Personality Ratings from Different Sources. University of MICHIGAN (1948). URL https://books.google.com.br/books?id=o74lnQEACAAJ
(23) Gorbova, J., Lüsi, I., Litvin, A., Anbarjafari, G.: Automated screening of job candidate based on multimodal video processing. In: 2017 IEEE CVPRW, pp. 1679–1685 (2017). DOI 10.1109/CVPRW.2017.214
(24) Guy, S.J., Kim, S., Lin, M.C., Manocha, D.: Simulating heterogeneous crowd behaviors using personality trait theory. In: Proceedings of the 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA ’11, pp. 43–52. ACM, New York, USA (2011). DOI 10.1145/2019406.2019413. URL http://doi.acm.org/10.1145/2019406.2019413
(25) Hall, E.: The Hidden Dimension. A Doubleday anchor book. Anchor Books (1990). URL http://books.google.com.br/books?id=zGYPwLj2dCoC
(26) Jo, H., Chug, K., Sethi, R.J.: A review of physics-based methods for group and crowd analysis in computer vision. Journal of Postdoctoral Research 1(1), 4–7 (2013)
(27) Costa Jr., P.T., McCrae, R.R.: Revised NEO Personality Inventory (NEO-PI-R) and the NEO Five-Factor Inventory (NEO-FFI) professional manual. Psychological Assessment Resources, Odessa, FL (1992)
(28) Costa Jr., P.T., McCrae, R.R.: NEO PI-R: inventário de personalidade NEO revisado. São Paulo (2007)
(29) Kaup, D., Clarke, T., Malone, L., Oleson, R.: Society for computer simulation. SIMULATION SERIES 38(4), 365–370 (2006)
(30) Lord, W.: Neo Pi-R – A Guide to Interpretation and Feedback in a Work Context, first edn. Hogrefe Ltd (2007)
(31) McCrae, R., Terracciano, A.: Universal features of personality traits from the observer’s perspective : data from 50 cultures. Journal of Personality and Social Psychology 88(3), 547–561 (2005). URL http://sro.sussex.ac.uk/14937/
(32) McCrae, R.R.: NEO-PI-R Data from 36 Cultures, pp. 105–125. Springer US, Boston, MA (2002). DOI 10.1007/978-1-4615-0763-5_6. URL http://dx.doi.org/10.1007/978-1-4615-0763-5_6
(33) Montag, C., Panksepp, J.: Primary emotional systems and personality: An evolutionary perspective. Frontiers in psychology 8, 464 (2017)
(34) Ortony, A., Clore, G.L., Collins, A.: The cognitive structure of emotions. Cambridge university press (1990)
(35) Rabiee, H., Haddadnia, J., Mousavi, H., Nabi, M., Murino, V., Sebe, N.: Emotion-based crowd representation for abnormality detection. arXiv preprint arXiv:1607.07646 (2016)
(36) Revelle, W., Scherer, K.R.: Personality and emotion. Oxford companion to emotion and the affective sciences pp. 304–306 (2009)
(37) Saifi, L., Boubetra, A., Nouioua, F.: An approach for emotions and behavior modeling in a crowd in the presence of rare events. Adaptive Behavior 24(6), 428–445 (2016)
(38) Sethi, R.J.: Towards defining groups and crowds in video using the atomic group actions dataset. In: 2015 IEEE International Conference on Image Processing, ICIP 2015, Quebec City, QC, Canada, September 27-30, 2015, pp. 2925–2929 (2015). DOI 10.1109/ICIP.2015.7351338. URL http://dx.doi.org/10.1109/ICIP.2015.7351338
(39) Shao, J., Loy, C.C., Wang, X.: Scene-independent group profiling in crowd. In: IEEE CVPR, pp. 2227–2234 (2014). DOI 10.1109/CVPR.2014.285
(40) Solera, F., Calderara, S., Cucchiara, R.: Structured learning for detection of social groups in crowd. In: 10th IEEE AVSS (2013)
(41) Solmaz, B., Moore, B.E., Shah, M.: Identifying behaviors in crowd scenes using stability analysis for dynamical systems. IEEE PAMI 34(10), 2064–2070 (2012). DOI 10.1109/TPAMI.2012.123. URL http://dx.doi.org/10.1109/TPAMI.2012.123
(42) Sorokowska, A., Sorokowski, P., Hilpert, P., et al.: Preferred interpersonal distances: A global comparison. Journal of Cross-Cultural Psychology 48(4), 577–592 (2017). DOI 10.1177/0022022117698039. URL https://doi.org/10.1177/0022022117698039
(43) Vilanova, F., Beria, F., Costa, A., et al.: Deindividuation: From le bon to the social identity model of deindividuation effects. CP 4(1) (2017). DOI 10.1080/23311908.2017.1308104
(44) Zhang, Y., Qin, L., Ji, R., Zhao, S., Huang, Q., Luo, J.: Exploring coherent motion patterns via structured trajectory learning for crowd mood modeling. IEEE Transactions on Circuits and Systems for Video technology 27(3), 635–648 (2017)
(45) Zhou, B., Tang, X., Zhang, H., Wang, X.: Measuring crowd collectiveness. IEEE PAMI 36(8), 1586–1599 (2014). DOI 10.1109/TPAMI.2014.2300484
