Language-independent gender identification through keystroke analysis

Authors: Tsimperidis, I., Katos, V. and Clarke, N.

Journal: Information and Computer Security

Volume: 23

Issue: 3

Pages: 286-301

eISSN: 2056-4961

DOI: 10.1108/ICS-05-2014-0032

Abstract:

Purpose - The purpose of this paper is to investigate the feasibility of identifying the gender of an author by measuring the keystroke duration when typing a message.

Design/methodology/approach - Three classifiers were constructed and tested. The authors empirically evaluated the effectiveness of the classifiers by using empirical data. The authors used primary data as well as a publicly available dataset containing keystrokes from a different language to validate the language independence assumption.

Findings - The results of this paper indicate that it is possible to identify the gender of an author by analyzing keystroke durations with a probability of success in the region of 70 per cent.

Research limitations/implications - The proposed approach was validated with a limited number of participants and languages, yet the statistical tests show the significance of the results. However, this approach will be further tested with other languages.

Practical implications - Having the ability to identify the gender of an author of a certain piece of text has value in digital forensics, as the proposed method will be a source of circumstantial evidence for "putting fingers on keyboard" and for arbitrating cases where the true origin of a message needs to be identified.

Social implications - If the proposed method is included as part of a text-composing system (such as e-mail, and instant messaging applications), it could increase trust toward the applications that use it and may also work as a deterrent for crimes involving forgery.

Originality/value - The proposed approach combines and adapts techniques from the domains of biometric authentication and data classification.

https://eprints.bournemouth.ac.uk/24545/

Source: Scopus

Language-independent gender identification through keystroke analysis

Authors: Tsimperidis, I., Katos, V. and Clarke, N.

Journal: Information and Computer Security

Volume: 23

Issue: 3

Pages: 286-301

ISSN: 2056-4961

DOI: 10.1108/ICS-05-2014-0032

https://eprints.bournemouth.ac.uk/24545/

Source: Web of Science (Lite)

Language Independent Gender Identification Through Keystroke Analysis

Authors: Tsimperidis, I., Katos, V. and Clarke, N.

Journal: Information and Computer Security

Volume: 23

Issue: 3

Pages: 286-301

https://eprints.bournemouth.ac.uk/24545/

Source: Manual

Preferred by: Vasilis Katos

Language Independent Gender Identification Through Keystroke Analysis

Authors: Tsimperidis, I., Katos, V. and Clarke, N.

Journal: Information and Computer Security

Volume: 23

Issue: 3

Pages: 286-301

ISSN: 2056-4961

Abstract:

Purpose – In this work we investigate the feasibility of identifying the gender of an author by measuring the keystroke duration when typing a message.

Design/methodology/approach – Three classifiers were constructed and tested. We empirically evaluated the effectiveness of the classifiers by using empirical data. We used primary data as well as a publicly available dataset containing keystrokes from a different language to validate the language independence assumption.

Findings – The results of this work indicate that it is possible to identify the gender of an author by analyzing keystroke durations with a probability of success in the region of 70%.

Research limitations/implications – The proposed approach was validated with a limited number of participants and languages, yet the statistical tests show the significance of the results. However, this approach will be further tested with other languages.

Practical implications – Having the ability to identify the gender of an author of a certain piece of text has value in digital forensics, as the proposed method will be a source of circumstantial evidence for “putting fingers on keyboard” and for arbitrating cases where the true origin of a message needs to be identified.

Social implications – If the proposed method is included as part of a text-composing system (such as email and instant messaging applications), it could increase trust toward the applications that use it and may also work as a deterrent for crimes involving forgery.

Originality/value – The proposed approach combines and adapts techniques from the domains of biometric authentication and data classification.

https://eprints.bournemouth.ac.uk/24545/

Source: BURO EPrints