Here is some code to help convert the USDA’s National Nutrient Database (ASCII format) into Stata format. This database is useful for quantifying nutritional information and nutritional content of various common food items. Prior to using State, this method requires the use of the linux ‘sed’ command first (as performed on a Mac in Terminal).
After unzipping the SR24.zip (or SR25.zip etc) file downloaded, in Terminal (or a bash shell):
# Make sure you're in the directory containing ONLY the # downloaded files before running the next commands, or you # could end up modifying important files here! # This converts quotation marks to two single quotes LANG=ASCII sed -i '' 's/\"/'\'\''/g' *.txt # This converts tildes to quotation marks LANG=ASCII sed -i '' 's/~/\"/g' *.txt
Then in Stata:
insheet using FOOD_DES.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
rename v2 FdGrp_Cd
rename v3 Long_Desc
rename v4 Shrt_Desc
rename v5 ComName
rename v6 ManufacName
rename v7 Survey
rename v8 Ref_desc
rename v9 Refuse
rename v10 SciName
rename v11 N_Factor
rename v12 Pro_Factor
rename v13 Fat_Factor
rename v14 CHO_Factor
save FOOD_DES,replace
insheet using FD_GROUP.txt,clear delimiter("^")
rename v1 FdGrp_Cd
rename v2 FdGrp_Desc
save FD_GROUP,replace
insheet using LANGUAL.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
rename v2 Factor_Code
save LANGUAL,replace
insheet using LANGDESC.txt,clear delimiter("^")
rename v1 Factor_Code
rename v2 Description
save LANGDESC,replace
insheet using NUT_DATA.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
gen str3 Nutr_No = string(v2,"%03.0f")
drop v2
rename v3 Nutr_Val
rename v4 Num_Data_Pts
rename v5 Std_Error
gen str2 Src_Cd = string(v6,"%03.0f")
drop v6
rename v7 Deriv_Cd
rename v8 Ref_NDB_No
rename v9 Add_Nutr_Mark
rename v10 Num_Studies
rename v11 Min
rename v12 Max
rename v13 DF
rename v14 Low_EB
rename v15 Up_EB
rename v16 Stat_cmt
rename v17 AddMod_Date
rename v18 CC
save NUT_DATA,replace
insheet using NUTR_DEF.txt,clear delimiter("^")
gen str3 Nutr_No = string(v1,"%03.0f")
drop v1
rename v2 Units
rename v3 Tagname
rename v4 NutrDesc
rename v5 Num_Dec
rename v6 SR_Order
save NUTR_DEF,replace
insheet using SRC_CD.txt,clear delimiter("^")
gen str2 Src_Cd = string(v1,"%03.0f")
drop v1
rename v2 SrcCd_Desc
save SRC_CD,replace
insheet using DERIV_CD.txt,clear delimiter("^")
rename v1 Deriv_Cd
rename v2 Deriv_Desc
save DERIV_CD,replace
insheet using WEIGHT.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
rename v2 Seq
rename v3 Amount
rename v4 Msre_Desc
rename v5 Gm_Wgt
rename v6 Num_Data_Pts
rename v7 Std_Dev
save WEIGHT,replace
insheet using FOOTNOTE.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
rename v2 Footnt_No
rename v3 Footnt_Typ
gen str3 Nutr_No = string(v4,"%03.0f")
drop v4
rename v5 Footnt_Txt
save FOOTNOTE,replace
insheet using DATSRCLN.txt,clear delimiter("^")
gen str5 NDB_No = string(v1,"%05.0f")
drop v1
gen str3 Nutr_No = string(v2,"%03.0f")
drop v2
rename v3 DataSrc_ID
save DATASRCLN,replace
insheet using DATA_SRC.txt,clear delimiter("^")
rename v1 DataSrc_ID
rename v2 Authors
rename v3 Title
rename v4 Year
rename v5 Journal
rename v6 Vol_City
rename v7 Issue_State
rename v8 Start_Page
rename v9 End_Page
save DATA_SRC,replace





