Overlearning speaker race in sociolinguistic auto-coding

Dan Villarreal
Concerns about AI fairness have been raised in domains like the American criminal justice system, where algorithms that assess the risk posed by pretrial defendants may inadvertently use defendants’ race as a decision criterion. Similar risks apply to sociolinguistic auto-coding, in which machine learning classifiers assign categories to variable data (e.g., car vs. “cah”) based on acoustic features. The proposed project addresses this possibility using sociolinguistic data, assessing the extent to...