Posted 22 Apr 2002

Name genderization

Matt Gullett, 22 Apr 2002, CPOL
Extrapolate the gender of a person based on their first name

Sample app


Over time I have compiled a database of roughly six thousand unique first names, each paired with the gender usually associated with it.  The names are primarily English, but the database also contains names from other languages such as German, French and Russian.  I have used it on a number of occasions for tasks such as data entry validation and data extrapolation.  I believe other people could benefit from this database and its worker class, so I decided to post it to Code Project.

Why would anyone need this?

Gender is a very common dimension in data marts.  A project I worked on some time ago had a database of about 1.2 million names, addresses and phone numbers.  Gender had not been collected with the list, so if the client had wanted it, it would have been unavailable.  The only ways to get it would have been to contact these people directly (unreasonable) or to infer an approximate gender from each individual's first name.

Another example is a data entry application where the operator did a poor job of entering the data.  I used this database to cross-check the gender entered by the operator against the approximate gender suggested by the first name.  Roughly 15% of the entered data required review, and nearly 10% was actually incorrect.
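
A cross-check of this kind can be sketched roughly as follows. This is only an illustration of the idea, not code from the original application: the `EntryRecord` type, the `FindRecordsToReview` function and the in-memory name map are all hypothetical, and the lookup assumes names have already been normalized to the same case as the list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record type: one data-entry row holding a first name and
// the gender code the operator typed ('M', 'F' or 'U').
struct EntryRecord {
    std::string firstName;
    char enteredGender;
};

// Returns the indexes of records whose entered gender disagrees with the
// gender suggested by the name list. Names that are missing from the list,
// or listed as 'U', are never flagged because the list has no opinion.
std::vector<std::size_t> FindRecordsToReview(
    const std::vector<EntryRecord>& records,
    const std::unordered_map<std::string, char>& nameToGender)
{
    std::vector<std::size_t> flagged;
    for (std::size_t i = 0; i < records.size(); ++i) {
        auto it = nameToGender.find(records[i].firstName);
        if (it == nameToGender.end())
            continue;  // name not on the list
        // The reference list itself should only hold valid codes.
        assert(it->second == 'M' || it->second == 'F' || it->second == 'U');
        if (it->second == 'U')
            continue;  // ambiguous name, nothing to compare against
        if (records[i].enteredGender != it->second)
            flagged.push_back(i);
    }
    return flagged;
}
```

A flagged record is only a candidate for review. Since the list itself is approximate, a mismatch does not prove that the entry is wrong.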

What is this?

This project includes three items:

  1. An MS Access database (NameDB.mdb) with a single table (FRST_NM_GNDR) which contains approximately 6,000 names and associated genders.
  2. The source code for a class (CFPSGenderizer) which loads this database and provides a simple API for looking up a name and returning the associated gender.
  3. A demo project which demonstrates how to use the CFPSGenderizer class.

Where did these names come from?

The names in this database have been collected from three primary sources: 1) a customer database, 2) freely available web site downloads, and 3) the Social Security Administration's web site.  There are no license requirements for using these names, nor are there any warranties as to the accuracy of the name/gender associations.

How accurate is the list?

Who knows!  In the few verifiable scenarios where I have used the list, it appeared to be at least 85% accurate for the names it contains.  This means, of course, that it could be as much as 15% inaccurate.  I do not use this data when high precision is needed, only for cross-verification and data extrapolation.

How to use this class?

  1. Add the FPSGenderizer.cpp and FPSGenderizer.h files to your project.
  2. Create an instance of the CFPSGenderizer class at an appropriate location in your program.  The instance must be initialized through the Load function before use, so plan to create and load it only once if possible.
  3. Call one of the overloaded Load functions to load the list from a database or a serialized file.
  4. Call the CFPSGenderizer::Genderize function and pass in an LPCTSTR containing a first name you want to genderize.  It will return a char which will either be 'M' (Male), 'F' (Female) or 'U' (Unknown).  This function will return 'U' for names not on the list as well as for names on the list explicitly associated with 'U'.
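
The load-once, look-up-many pattern described in these steps can be sketched with a small stand-in class. This is not the CFPSGenderizer source: the real class takes an LPCTSTR and loads from the Access database or a serialized file, and the parameters of its Load overloads are not shown in this article. The `SimpleGenderizer` name, its in-memory `Load`, and the case-insensitive matching below are assumptions made so that the example is self-contained and uses only standard C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for illustration only; this is NOT the CFPSGenderizer class.
class SimpleGenderizer {
public:
    // Hypothetical Load: takes name/gender pairs already in memory instead
    // of reading NameDB.mdb or a serialized file.
    void Load(const std::vector<std::pair<std::string, char>>& rows) {
        m_names.clear();
        for (const auto& row : rows) {
            assert(row.second == 'M' || row.second == 'F' || row.second == 'U');
            m_names[Normalize(row.first)] = row.second;
        }
    }

    // Returns 'M', 'F' or 'U'. 'U' covers both "not on the list" and
    // "on the list but explicitly associated with 'U'".
    char Genderize(const std::string& firstName) const {
        auto it = m_names.find(Normalize(firstName));
        return it == m_names.end() ? 'U' : it->second;
    }

private:
    // Upper-cases the name so lookups ignore case (an assumption about
    // the desired behavior, not something the article specifies).
    static std::string Normalize(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::toupper(c));
        });
        return s;
    }

    std::unordered_map<std::string, char> m_names;
};
```

In this sketch a caller would construct one `SimpleGenderizer`, call `Load` once at startup, and then call `Genderize` for each first name. Because 'U' is returned both for unlisted names and for names listed as unknown, callers cannot tell the two cases apart from the return value alone.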

Future Development?

As my job requires, I will keep updating the database by adding names and changing the associations of names already on the list.  I also plan to incorporate edit-distance and metaphone algorithms (see my earlier Spell Checker app) to find suggestions for a name and then suggest a gender based on how often the suggested names are male, female or unknown.  Before I release this enhancement, though, I need to test the results to see whether they are even remotely reliable.
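
As a rough sketch of what such an enhancement might look like, the code below combines a standard Levenshtein edit distance with a simple majority vote over nearby names. It is one possible reading of the idea, not the planned implementation: the function names, the `maxDistance` threshold and the strict-majority rule are assumptions, and the metaphone step is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Classic Levenshtein distance computed with two rolling rows.
std::size_t EditDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = j;  // cost of building b's prefix from an empty string
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // cost of deleting the first i characters of a
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({prev[j] + 1, curr[j - 1] + 1, subst});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Counts the genders of listed names within maxDistance edits of the query.
// Returns 'M' or 'F' only when one strictly outnumbers the other, else 'U'.
char SuggestGender(const std::string& name,
                   const std::vector<std::pair<std::string, char>>& list,
                   std::size_t maxDistance = 1) {
    int male = 0, female = 0;
    for (const auto& entry : list) {
        assert(entry.second == 'M' || entry.second == 'F' || entry.second == 'U');
        if (EditDistance(name, entry.first) > maxDistance)
            continue;  // too different to count as a suggestion
        if (entry.second == 'M') ++male;
        else if (entry.second == 'F') ++female;
    }
    if (male > female) return 'M';
    if (female > male) return 'F';
    return 'U';
}
```

Scanning the whole list for every query is linear in the list size, which is acceptable for a few thousand names but would need indexing for bulk runs. Whether a vote like this is reliable enough is exactly the open question noted above.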


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Matt Gullett
Web Developer
United States


Comments and Discussions

Re: Ambiguous names
Philippe Lhoste, 2-May-02 23:17
I appreciate your article (even if I have no use for it :) ) because it is honest (no 100% accuracy promised) and it explains well why it is needed (I wondered).

FYI, there are not many ambiguous names in French with the same spelling.
The main ones I recall are:
Claude (m), Camille (f), Dominique (u)
The gender given is, as far as I know, the most frequent.

Some variants have little difference, at least when spoken, like René/Renée, Frédéric/Frédérique, Fabien/Fabienne.

Note that now, French people can create any first name they want. It used to be much more restrictive in the past (only calendar and historic names).
We don't see many names created from scratch, but we do see a lot of spelling variants, meant to stand out...
E.g. a regular spelling was Alain, now we see Allain, Alin, Alyn, etc. Phonetic rules can help here.


Philippe Lhoste (Paris -- France)
Professional programmer and amateur artist

Last Updated 23 Apr 2002
Article Copyright 2002 by Matt Gullett