Stata ado file to calculate the Jaro-Winkler string distance between two strings.
Use ssc install jarowinkler to install the ado file from Stata or see here.
jarowinkler calculates the distance between two string variables using the Jaro-Winkler distance metric. The metric is often used in record linkage to compare first or last names from different sources. Jaro-Winkler modifies the standard Jaro distance metric by giving extra weight to agreement at the start of the strings being compared, so a difference in the first few characters lowers the score more than the same difference later on. The metric is scaled between 0 (not similar at all) and 1 (exact match).
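To make the metric concrete, here is a minimal Python sketch of the textbook Jaro-Winkler calculation. It is not the ado file's code: the function names are made up for illustration, and the prefix scale of 0.1 and the four-character prefix cap are the conventional values, assumed here rather than taken from the Stata implementation.

```python
def jaro(s1, s2):
    """Textbook Jaro similarity in [0, 1] (illustrative, not the ado code)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match if they are equal and no farther apart than this.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that line up in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro score plus a bonus for a shared prefix (assumed defaults)."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return sim + prefix * prefix_scale * (1 - sim)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
print(round(jaro_winkler("DIXON", "DICKSONX"), 4))
```

The prefix bonus is what separates Jaro-Winkler from plain Jaro: two names that agree on their first few letters get pulled toward 1, while names that differ at the very first character get no bonus at all.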
For more detail on the Jaro-Winkler method, see Wikipedia and http://www.gabormelli.com/RKB/Jaro-Winkler_Distance_Function.
Jaro-Winkler implementation based on code from http://cs.anu.edu.au/~Peter.Christen/Febrl/febrl-0.4.01/stringcmp.py and https://github.com/miguelvps/c/blob/master/jarowinkler.c.
99% of the time, when you run keep in Stata, you follow it with order, right? Well, keeporder saves you a line of code, running keep and then order on the variable list.
Use ssc install keeporder to install the ado file from Stata or see here.
This package draws inspiration from estout for Stata and stargazer for R. It turns regression output into tables. The tables are tex fragments meant to work with the threeparttable method of putting tables in LaTeX described here.
To install the latest version from GitHub:
install.packages("devtools")
devtools::install_github("jamesfeigenbaum/textablr")
extract_bib is a simple R script to create a minimal bibtex library file for an article. Some journals require tex and bib files with submissions. However, I have only one master bib file. That file currently contains 3543 references. Rather than send all of that to the journal to help typeset my manuscript, I only want to send a bib file with the references included in my article. This script does that.
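The idea can be sketched in a few lines of Python. This is not the R script itself, only an illustration under simplifying assumptions: citations use \cite-style commands (\cite, \citep, \citet and similar), every bib entry begins on its own line as @type{key, and @string macros or cross-references are not handled. All function names are hypothetical.

```python
import re

# Matches \cite{...}, \citep[p. 3]{...}, \citet*{...}, \textcite{...}, etc.
CITE_RE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the first line of an entry, e.g. "@article{smith2020,".
ENTRY_START_RE = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def cited_keys(tex):
    """Return the set of citation keys used in a tex source string."""
    keys = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def split_entries(bib):
    """Map each entry key to its full text, in master-file order."""
    starts = list(ENTRY_START_RE.finditer(bib))
    entries = {}
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib)
        entries[match.group(2)] = bib[match.start():end].strip()
    return entries


def minimal_bib(tex, bib):
    """Keep only the master-file entries that the tex source cites."""
    keys = cited_keys(tex)
    kept = [text for key, text in split_entries(bib).items() if key in keys]
    return "\n\n".join(kept) + "\n"


tex = r"See \citet{smith2020} and \citep[see][p. 3]{jones2019, smith2020}."
bib = (
    "@article{smith2020,\n  title={A},\n}\n\n"
    "@book{jones2019,\n  title={B},\n}\n\n"
    "@article{unused2001,\n  title={C},\n}\n"
)
print(minimal_bib(tex, bib))
```

In practice the two strings would be read from the manuscript's tex file and the master bib file, and the result written to a new bib file to send to the journal.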
My personal R package, including my ggplot2 theme, a function to pull variable codes from IPUMS (ipums_codes()), and other junk.