11/1/2015
Using the Gender Package in R: Part 3 - Running a List of First Names from Tableau


In part 1 of this series, I discussed the basics of the Gender package in R. In part 2, I demonstrated how to leverage parallel processing to speed up the processing of names, for example when you need to determine the gender for a list of 10,000 names. In this post, I am going to discuss how to use the Tableau integration with R to run gender on a list of names, with and without parallel processing.


Step 1: Install and load the gender package in R, start Rserve, and connect Tableau to R
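
The R side of this step could look like the minimal sketch below. It assumes a local R installation with access to CRAN. Depending on the version of the gender package, the name data may live in a separate data package that also has to be installed, so check the package documentation if the first lookup fails.

```r
# One-time setup: install the packages used in this post.
install.packages("gender")   # gender prediction from first names
install.packages("Rserve")   # lets Tableau send R code to a running R session

# Load the packages and start Rserve.
# By default Rserve listens on port 6311, the port entered in Tableau below.
library(gender)
library(Rserve)
Rserve()
```

Leave Rserve running while you work in Tableau. The calculated fields in the later steps are evaluated by this process.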



Set up the R connection in Tableau.

From the "Help" menu, choose "Settings and Performance" and then select "Manage R Connection"
Choose Server "localhost" and Port "6311" and click OK

Step 2: Load the list of names into Tableau

Load a list of first names into Tableau using a field named "First Name". If you want to try this out, copy and paste this short list of first names into Tableau.



Step 3: Create a calculated field in Tableau

Calculated Field: Gender from R
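
The original calculated field is not reproduced here, so the following is only a sketch of one way it could be written. Tableau passes the names to R as a vector called .arg1 and expects a vector of the same length back. The helper name gender_for_names and the "unknown" label are my own assumptions, not part of the gender package. The match() step is there because gender() may return fewer rows than it was given, for example when a name is repeated or not found.

```r
# Hypothetical helper (not part of the gender package): returns one gender
# label per input name, in the same order as the input.
# `lookup` defaults to gender::gender and is only a parameter so the
# function can be tried out without the package installed.
gender_for_names <- function(first_names, lookup = gender::gender) {
  first_names <- as.character(first_names)

  # Look up each distinct name once.
  res <- lookup(unique(first_names))

  # Map the results back onto the full input vector, ignoring case.
  labels <- as.character(res$gender)[match(tolower(first_names),
                                           tolower(as.character(res$name)))]

  # Names that were not found get a placeholder instead of NA.
  labels[is.na(labels)] <- "unknown"
  labels
}

# Inside a Tableau calculated field, the function body above would be wrapped
# in a SCRIPT_STR call and applied to .arg1, roughly:
#   SCRIPT_STR("<function definition>; gender_for_names(.arg1)",
#              ATTR([First Name]))
```

Because SCRIPT_STR is a table calculation, it should be computed along "First Name" so that the whole list is sent to R in a single call.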



Step 4: Build a Quick Viz in Tableau

   Move "First Name" to Rows
   Move "Gender from R" to Color

You should now have a list of names that are color-coded by gender, either male or female. You could also change the shapes at this point, using the built-in Tableau shapes for male and female. Below is an example that simply adjusts the colors and uses those default Tableau shapes for gender.

    

Using Parallel Processing

The same parallel processing technique demonstrated in part 2 of this series also works from Tableau. After you follow the steps above, create another calculated field and use it in place of the "Gender from R" field.

Calculated Field: Gender from R Parallel Processing
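
The original parallel calculated field is not reproduced here either. As a rough sketch, the same idea can be expressed with R's built-in parallel package: split the distinct names into chunks, look up each chunk on a separate worker, and then map the combined results back onto the input. The function names, the n_cores parameter, and the "unknown" label are assumptions for illustration, and this is not necessarily the approach used in part 2.

```r
library(parallel)

# Hypothetical worker function: looks up one chunk of names and keeps only
# the two columns needed to map results back to the input.
lookup_chunk <- function(chunk, lookup) {
  r <- lookup(chunk)
  data.frame(name = as.character(r$name),
             gender = as.character(r$gender),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: parallel version that returns one label per input name.
gender_for_names_parallel <- function(first_names, n_cores = 2,
                                      lookup = gender::gender) {
  first_names <- as.character(first_names)
  unique_names <- unique(first_names)

  # Deal the distinct names out round-robin into at most n_cores chunks.
  chunks <- split(unique_names, (seq_along(unique_names) - 1) %% n_cores)

  # Start the workers and make sure they are shut down afterwards.
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl))

  # Each worker looks up one chunk; the results are stacked into one table.
  parts <- parLapply(cl, chunks, lookup_chunk, lookup = lookup)
  res <- do.call(rbind, parts)

  # Map the results back onto the full input vector, ignoring case.
  labels <- res$gender[match(tolower(first_names), tolower(res$name))]
  labels[is.na(labels)] <- "unknown"
  labels
}
```

In Tableau, this would be called from SCRIPT_STR in the same way as the non-parallel version, with gender_for_names_parallel(.arg1) as the last expression. parallel::detectCores() reports how many cores are available if you want to choose n_cores from the machine rather than hard-coding it.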



Note - Parallel processing will not be useful on short lists, where the overhead of starting the worker processes can outweigh any gain, but if you have a multicore processor and a long list of names then this approach could be very useful.

Download a sample workbook here.


I hope you find this information useful. If you have any questions, feel free to email me at Jeff@DataPlusScience.com

Jeffrey A. Shaffer

Follow on Twitter @HighVizAbility