The resource would become even more useful if we could deduce complete and correct metadata from the various available information sources, such as the provided metadata, user relations, profile photos, and the text of the tweets. In this paper, we start modestly, by attempting to derive just the gender of the authors automatically, purely on the basis of the content of their tweets, using author profiling techniques.
In the following sections, we first present some previous work on gender recognition (Section 2). Currently the field is getting an impulse for further development now that vast sets of user-generated data are becoming available. (2012) show that authorship recognition is also possible (to some degree) if the number of candidate authors is as high as 100,000 (compared with the usual fewer than ten in traditional studies).
For tweets in Dutch, we first look at the official user interface for the TwiNL data set. Among other things, it shows gender and age statistics for the users producing the tweets found for user-specified searches.
These statistics are derived from the users' profile information by way of some heuristics.
Their highest score when using just text features was 75.5%, testing on all the tweets by each author (with a training set of 3.3 million tweets and a test set of about 418,000 tweets). (2012) used SVMlight to classify gender on Nigerian Twitter accounts, with tweets in English, with a minimum of 50 tweets.
Their features were hashtags, token unigrams, and psychometric measurements provided by the Linguistic Inquiry and Word Count software (LIWC; Pennebaker et al. [...]). Although LIWC appears to be a very interesting addition, it hardly adds anything to the classification.
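To make the feature types concrete, the following is a minimal sketch of how hashtag and token unigram counts could be collected for one author. It is not the cited authors' implementation: the tokenizer pattern, the function name extract_features, and the feature-name prefixes are assumptions made for illustration, and the LIWC measurements are left out because they require the LIWC software.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not taken from the source):
# hashtags, user mentions, and plain word tokens.
TOKEN_RE = re.compile(r"#\w+|@\w+|\w+(?:'\w+)?")


def extract_features(tweets):
    """Count hashtags and token unigrams over all tweets of one author."""
    features = Counter()
    for tweet in tweets:
        for token in TOKEN_RE.findall(tweet.lower()):
            if token.startswith("#"):
                features["hashtag=" + token] += 1
            elif token.startswith("@"):
                # Mentions are skipped here; this is an illustrative choice.
                continue
            else:
                features["unigram=" + token] += 1
    return features


if __name__ == "__main__":
    counts = extract_features(["Love this #sunset", "this is #Sunset again"])
    for name, value in sorted(counts.items()):
        print(name, value)
```

The resulting sparse counts would then be turned into a vector per author and passed to a classifier such as an SVM.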
2004), with and without preprocessing the input vectors with Principal Component Analysis (PCA; Pearson 1901; Hotelling 1933).
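As an illustration of what PCA preprocessing does to input vectors, here is a minimal sketch in plain Python. It computes only the first principal component, using power iteration on the sample covariance matrix, and projects the centered vectors onto it. The function names, the start vector, and the iteration count are assumptions for illustration; a real experiment would normally rely on a numerical library and keep many components.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the first principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Arbitrary non-constant start vector (an illustrative choice).
    v = [1.0 + j for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the direction reached so far
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Project each centered row onto the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    mean, direction = first_principal_component(data)
    print(direction)
    print(project(data, mean, direction))
```

Running a classifier once on the original vectors and once on the projected ones corresponds to the "with and without" comparison described above.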